Episodes

  • Gemini 3 is Here: 11 Details You Might Have Missed
    Nov 19 2025

    Gemini 3 Pro is out, and records fell like snowflakes in Svalbard.

    No long description, chapters, or links today: there were huge technical difficulties, including with audio, so I just want to publish ASAP.


    https://app.grayswan.ai/ai-explained


    https://lmcouncil.ai
    AI Insiders ($9!): https://www.patreon.com/AIExplained



    Non-hype Newsletter: https://signaltonoise.beehiiv.com/
    Podcast: https://aiexplainedopodcast.buzzsprout.com/

    22 min
  • Is GPT-5.1 Really an Upgrade? But Models Can Auto-Hack Govts, so … there’s that
    Nov 14 2025

    A lot just got released in the last 36 hours, and it will all affect hundreds of millions of people. 10 details you would miss if you just read the headlines, from GPT 5.1 regressions, to how Claude hacked Govt Agencies, to SIMA 2, and Musical Turing Tests.

    https://assemblyai.com/aiexplained

    Chapters:
    00:00 - Introduction

    00:56 - GPT 5.1 Smarter?

    01:47 - Some Regressions

    03:22 - Sycophancy?

    05:22 - Claude Auto-Hacking

    06:16 - Jailbreaking through Granularity

    08:22 - This Will be Re-used

    09:30 - Hallucinating Hacker

    09:57 - Surprisingly Neutral Tone

    12:18 - SIMA 2

    14:10 - Alpha Parallels

    17:24 - AI Music



    GPT 5.1 Announcement: https://openai.com/index/gpt-5-1/

    System Card: https://cdn.openai.com/pdf/4173ec8d-1229-47db-96de-06d87147e07e/5_1_system_card.pdf

    Benchmarks: https://openai.com/index/gpt-5-1-for-developers/

    Simple Bench: https://lmcouncil.ai/benchmarks


    Auto-Hacking: https://x.com/AnthropicAI/status/1989033793190277618

    https://www.anthropic.com/news/disrupting-AI-espionage

    Report: https://assets.anthropic.com/m/ec212e6566a0d47/original/Disrupting-the-first-reported-AI-orchestrated-cyber-espionage-campaign.pdf



    SIMA 2 Announcement: https://deepmind.google/blog/sima-2-an-agent-that-plays-reasons-and-learns-with-you-in-virtual-3d-worlds/

    https://x.com/amoufarek/status/1988986075331858693

    Scepticism: https://www.technologyreview.com/2025/11/13/1127921/google-deepmind-is-using-gemini-to-train-agents-inside-goat-simulator-3/

    Voyager: https://voyager.minedojo.org/


    Reuters Music: https://www.reuters.com/legal/litigation/are-you-listening-bots-survey-shows-ai-music-is-virtually-undetectable-2025-11-12/


    18 min
  • Bubble or No Bubble, AI Keeps Progressing (ft. Relentless Learning + Introspection)
    Nov 10 2025

    Don’t let headlines about bubbles distract you from the real avenues of progress being explored in AI every week, including what had been thought to be a long-term blocker - continual learning (learning on the fly).

    https://app.grayswan.ai/ai-explained

    This, plus models introspecting (hesitate before you berate), Nano Banana 2 possibly spotted, Chinese imagen and more.

    AI Insiders ($9!): https://www.patreon.com/AIExplained

    Chapters:
    00:00 - Introduction
    01:26 - Continual Learning (Nested Learning / HOPE)
    07:00 - Introspection
    10:54 - Image-Gen Progress

    Nested Learning Post: https://research.google/blog/introducing-nested-learning-a-new-ml-paradigm-for-continual-learning/

    Nested Learning Paper: https://abehrouz.github.io/files/NL.pdf

    Original Titans Paper: https://arxiv.org/pdf/2501.00663

    Siri News: https://www.bloomberg.com/news/articles/2025-11-05/apple-plans-to-use-1-2-trillion-parameter-google-gemini-model-to-power-new-siri

    Introspection: https://www.anthropic.com/research/introspection

    Full Paper: https://transformer-circuits.pub/2025/introspection/index.html#mechanisms

    Earlier Work: https://www.anthropic.com/research/mapping-mind-language-model
    https://transformer-circuits.pub/2024/scaling-monosemanticity/index.html

    Release Post: https://x.com/AnthropicAI/status/1983584136972677319

    https://lmcouncil.ai



    Non-hype Newsletter: https://signaltonoise.beehiiv.com/

    Podcast: https://aiexplainedopodcast.buzzsprout.com/

    13 min
  • Sora 2 - It will only get more realistic from here
    Oct 1 2025

    Sora 2 - the start of the infinite slop-feed or a key step to a generalist agent? Better than VEO 3 or over-hyped? I bring out 6 details you may have missed, contrast the announcement to Periodic Labs and even squeeze in some Claude Sonnet 4.5 analysis. Maybe I should make my videos longer…

    https://80000hours.org/aiexplained

    AI Insiders ($9!): https://www.patreon.com/AIExplained

    Chapters:
    00:00 - Introduction
    00:40 - Two models?
    01:15 - Rollout Details
    01:43 - Versus Sora 1 / Veo 3
    04:30 - Sora App / Social Media
    06:40 - Masterplan
    09:30 - Generalist Agent? Periodic Labs
    12:05 - Claude Sonnet 4.5
    13:42 - Future Outlook

    Announcement: https://openai.com/index/sora-2/
    Launch Video: https://www.youtube.com/live/gzneGhpXwjU
    System Card: https://cdn.openai.com/pdf/50d5973c-c4ff-4c2d-986f-c72b5d0ff069/sora_2_system_card.pdf
    Sam Altman Blog Post on Sora App: https://blog.samaltman.com/sora-2

    Most Intelligent Claim: https://x.com/willdepue/status/1973089331284681110
    GTA: https://x.com/AndrewCurran_/status/1973298436536766666

    Meta Vibes: https://x.com/alexandr_wang/status/1971295156411433228?s=46

    Altman on Regulations: https://www.lesswrong.com/posts/5jjk4CDnj9tA7ugxr/openai-email-archives-from-musk-v-altman
    OpenAI Profit: https://www.theinformation.com/articles/openais-first-half-results-4-3-billion-sales-2-5-billion-cash-burn?rc=sy0ihq

    Periodic Labs: https://periodic.com/
    https://www.nytimes.com/2025/09/30/technology/ai-meta-google-openai-periodic.html
    https://x.com/LiamFedus/status/1973055380193431965
    https://baincapitalventures.com/insight/we-must-know-we-will-know/?s=09

    Sonnet 4.5: https://www.anthropic.com/news/claude-sonnet-4-5
    https://simple-bench.com/


    Non-hype Newsletter: https://signaltonoise.beehiiv.com/

    Podcast: https://aiexplainedopodcast.buzzsprout.com/

    16 min
  • OpenAI Tests if GPT-5 Can Automate Your Job - 4 Unexpected Findings
    Sep 26 2025

    An OpenAI report released in the last 24 hours is the best look we have as to whether 2025 AI can automate your job. I’ll go through 4 unexpected findings, from which model is best at what, to practical tips and massive caveats. Plus UFC robots, radiologist essay, don’t trust videos and the blockers to the singularity.


    Gray Swan: https://app.grayswan.ai/ai-explained



    GDPval: https://cdn.openai.com/pdf/d5eb7428-c4e9-4a33-bd86-86dd4bcf12ce/GDPval.pdf


    GDP Impact: https://fred.stlouisfed.org/release/tables?rid=331&eid=211

    Task List: https://www.onetonline.org/link/summary/11-9141.00

    Summers Tweet: https://x.com/LHSummers/status/1971252567981146347

    Emad: https://x.com/EMostaque/status/1971254153067593739


    Robots: https://x.com/cixliv/status/1967663286679478759

    Unitree G1: https://x.com/UnitreeRobotics/status/1970039940022239491


    Don’t Trust Video: https://x.com/AISafetyMemes/status/1970453369446871420


    AGI Tweet: https://x.com/hyhieu226/status/1968378785709133915


    Blockers to the Singularity: https://www.patreon.com/posts/blockers-to-and-139264812


    Framework: https://gemini.google.com/share/f4b9c85a6ae9


    METR Study (Dev Slowdown): https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/


    Karpathy Tweet: https://x.com/karpathy/status/1971220449515516391

    Radiology Essay: https://worksinprogress.co/issue/the-algorithm-will-see-you-now/


    Chapters:

    00:00 - Introduction

    00:55 - OpenAI Report Summary

    02:40 - Tipping Point Speed-up

    04:11 - Better than Industry Experts?

    06:33 - Big Caveat

    11:10 - Karpathy and the Radiologist Analogy

    13:30 - Outro

    14 min
  • ChatGPT Will Guess your Age, Flirt if Asked, and Can Call the Cops
    Sep 16 2025

    Sam Altman, CEO of OpenAI, announced a set of new ‘protections’ and ‘privileges’ for ChatGPT users, requiring a significant amount of trust from users. These range from predicting your age based on your chat, to calling law enforcement if you are at risk of harm, to allowing non-minors to flirt. But amidst all of these announcements, there are interview snippets you may have missed, as Altman dramatically revises his predictions of AI impact on jobs. Plus a Hassabis backtrack to boot.

    https://80000hours.org/aiexplained


    Calling the Cops: https://openai.com/index/teen-safety-freedom-and-privacy/


    Age Prediction: https://openai.com/index/building-towards-age-prediction/


    Not Everyone Will Agree: https://x.com/sama/status/1967955739911364693?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Etweet


    Theory 1: NYT Lawsuit: https://openai.com/index/response-to-nyt-data-demands/


    Theory 2: FTC Investigation into AI Companions: https://x.com/AndrewCurran_/status/1966167585994764743


    YT Does the Same: https://www.cbsnews.com/news/youtube-ai-powered-technology-teen-users/


    Carlsen Interview: https://www.youtube.com/watch?v=5KmpT-BoVf4


    vs Senate Testimony (70% Jobs): https://www.youtube.com/watch?v=5CWVP8-XVjQ


    Hallucinations Paper: https://cdn.openai.com/pdf/d04913be-3f6f-4d2b-b283-ff432ef4aaa5/why-language-models-hallucinate.pdf


    Hassabis Quote 1: https://www.youtube.com/watch?v=toShbNUGAyo


    vs Quote 2: https://www.youtube.com/watch?v=Kr3Sh2PKA8Y


    12 min
  • An ‘AI Bubble’? What Altman Actually said, the Facts and Nano Banana
    Aug 26 2025

    Wait, why did Sam Altman say AI was in a bubble? Or did he? Is it? 8 points for you to consider, before we all get distracted by Nano Banana.

    Chapters:
    00:00 - Introduction
    01:14 - Sam Altman Clarification
    02:30 - Media Calls a Bubble (for the tenth time)
    03:40 - MIT and McKinsey Analysed
    08:21 - Incremental Progress Deceptive
    12:07 - Reasoning Breakthroughs
    15:31 - CEOs might not know their products
    17:25 - But did stocks go down?
    17:31 - Media is Contradictory of course


    https://donate.redcross.org.uk/appeal/gaza-crisis-appeal


    Bubble about to burst: https://www.telegraph.co.uk/business/2025/08/20/ai-report-triggering-panic-and-fear-on-wall-street/

    Nano Banana: https://blog.google/products/gemini/updated-image-editing-model/
    https://ai.studio/banana

    McKinsey Report: https://www.mckinsey.com/capabilities/quantumblack/our-insights/seizing-the-agentic-ai-advantage#/
    https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai#/
    Revenue: https://www.wsj.com/tech/ai/mckinsey-consulting-firms-ai-strategy-89fbf1be

    MIT Report: https://mlq.ai/media/quarterly_decks/v0.1_State_of_AI_in_Business_2025_Report.pdf

    Safe Superintelligence: https://techcrunch.com/2025/04/12/openai-co-founder-ilya-sutskevers-safe-superintelligence-reportedly-valued-at-32b/

    Thinking Machines Lab: https://techcrunch.com/2025/07/15/mira-muratis-thinking-machines-lab-is-worth-12b-in-seed-round/

    WSJ Prediction 2024: https://www.wsj.com/tech/ai/the-ai-revolution-is-already-losing-steam-a93478b1
    WP Prediction 2023: https://www.washingtonpost.com/technology/2023/08/05/ai-hype-bubble-chatgpt/

    Companies are Pouring Billions into AI: https://www.nytimes.com/2025/08/13/business/ai-business-payoff-lags.html

    Consumer Surplus: https://www.wsj.com/opinion/ais-overlooked-97-billion-contribution-to-the-economy-users-service-da6e8f55
    Figure AI robot: https://x.com/adcock_brett/status/1958193476639826383

    GDP Bet: https://x.com/adamdangelo/status/1627726566259318784?lang=en

    Genie 3 Immersion: https://x.com/holynski_/status/1953879983535141043

    https://x.com/elonmusk/status/1953861448431718662
    https://simple-bench.com
    MMMU: https://mmmu-benchmark.github.io/#leaderboard
    Prophet Arena: https://www.prophetarena.co/leaderboard

    NYT Jobs: https://www.nytimes.com/2025/08/19/opinion/ai-job-loss-deindustrialization.html

    Dawn of Reasoning?: https://openreview.net/pdf?id=FkKBxp0FhR
    vs: https://arxiv.org/pdf/2403.04121

    ARC-AGI: https://arcprize.org/arc-agi/1/
    https://x.com/fchollet/status/1870169764762710376?lang=en-GB

    Turing Test: https://arxiv.org/pdf/2503.23674

    Mathematics of Starvation: https://www.theguardian.com/world/2025/jul/31/the-mathematics-of-starvation-how-israel-caused-a-famine-in-gaza
    https://donate.redcross.org.uk/appeal/gaza-crisis-appeal

    https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/

    METR Interview: https://www.patreon.com/c/aiexplained/posts

    AlphaEvolve: https://deepmind.google/discover/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/
    Paper: https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/AlphaEvolve.pdf

    Amodei: https://kantrowitz.medium.com/the-making-of-anthropic-ceo-dario-amodei-449777529dd6
    https://www.theloganbartlettshow.com/archive/ep-82-dario-amodeis-ai-predictions-through-2030#:~:text=DARIO%3A%20I%20think%20our%20concern,being%20responsible%20to%20accelerate%20things
    Unreleased OpenAI: https://x.com/alexwei_/status/1954966393419599962

    VLMs Tricked: https://x.com/an_vo12/status/1943715159559545186



    AI Insiders ($9!): https://www.patreon.com/AIExplained

    19 min
  • GPT-5 has Arrived
    Aug 7 2025

    GPT-5 will change how hundreds of millions of people use AI. Yes, you might have to forgive the chart crimes, the underwhelming livestream and the Altman hype… But it’s a good model. I have read the 50-page system card in full, and have the benchmark scores, coding tests, and things you might have missed.

    https://app.grayswan.ai/ai-explained

    Announcement: https://openai.com/index/introducing-gpt-5/

    System Card: https://cdn.openai.com/pdf/8124a3ce-ab78-4f06-96eb-49ea29ffb52f/gpt5-system-card-aug7.pdf

    Extra Paper: https://cdn.openai.com/pdf/be60c07b-6bc2-4f54-bcee-4141e1d6c69a/gpt-5-safe_completions.pdf

    Altman tweet: https://x.com/sama/status/1953551377873117369

    Livestream:
    https://www.youtube.com/watch?v=0Uu_VJeVVfo

    METR Report: https://metr.github.io/autonomy-evals-guide/gpt-5-report/

    ARC-AGI-2: https://x.com/fchollet/status/1953511631054680085

    Claude Opus 4.1:
    https://www.anthropic.com/news/claude-opus-4-1

    MMMU: https://mmmu-benchmark.github.io/

    Cursor Praise: https://x.com/ryolu_/status/1953531724895596669


    15 min