# steamedhams.io — The AI Hype Claims Tracker
## "May I see it?" "No."

---

## CATEGORIES

- 🔥 **Aurora Borealis** — "AGI is here / imminent" claims
- 🍔 **Steamed Hams** — Specific capability claims that were faked, debunked, or unverifiable
- 🔥🏠 **The House Is On Fire** — "This is SO DANGEROUS we can't release it"
- 💀 **Obviously Grilled** — "You call them obsolete despite the fact they are obviously still employed"
- 👀 **I'm From Utica** — "I'm from Utica and I've never heard anyone use the phrase steamed hams"
- 🏃 **An Albany Expression** — "Oh, not in Utica, no, it's an Albany expression"
- 🍔🏪 **Krusty Burger** — You went to Krusty Burger and called it your own cooking
- 📉 **$100B Oops** — Hype moments that immediately cost real money
- ✅ **You Steam A Good Ham** — Credit where due: the claim checked out

---

## THE TIMELINE

### 2015

1. **Oct 2015 — Musk: Full autonomy in "about three years"**
   - Category: 🔥 Aurora Borealis / 🏃 An Albany Expression
   - Claim: Tesla will have full autonomous driving capability in approximately three years
   - Expiry: ~2018
   - Outcome: ❌ First of at least 9 consecutive years of "next year" FSD promises. Wikipedia has a dedicated page: "List of predictions for autonomous Tesla vehicles by Elon Musk."

### 2016

2. **Mar 23, 2016 — Microsoft Tay goes Nazi in 16 hours**
   - Category: 🍔 Steamed Hams
   - Claim: Tay was an AI chatbot designed to emulate an 18-24 year old and "learn from conversations"
   - Source: Microsoft launch, Twitter
   - Outcome: ❌ Within 16 hours, trolls exploited "repeat after me" to make Tay tweet "Hitler was right I hate the jews" and similar. Pulled same day. Accidentally re-released March 30, posted drug-related tweets. Microsoft VP Peter Lee: "We are deeply sorry."

3. **Jun 2016 — Musk: Autonomous driving "a solved problem… less than two years"**
   - Category: 🔥 Aurora Borealis
   - Claim: "I really consider autonomous driving a solved problem… less than two years away"
   - Expiry: Mid-2018
   - Outcome: ❌ Not solved. Not solved 10 years later.

4. **2016 — Geoffrey Hinton predicts radiology automated by 2021**
   - Category: 💀 Obviously Grilled
   - Claim: "People should stop training radiologists now… within five years, deep learning is going to do better than radiologists"
   - Source: Various interviews, 2016
   - Expiry: 2021
   - Outcome: ❌ Dead wrong. As of 2026, radiology has "particularly high levels of labor shortages rather than unemployment." Trend went opposite direction.

5. **Oct 19, 2016 — Musk: LA-to-NYC autonomous drive by end of 2017**
   - Category: 🍔 Steamed Hams
   - Claim: "We'll be able to do a demonstration drive of full autonomy all the way from LA to New York… by the end of next year. Without the need for a single touch, including the charger."
   - Source: Tesla press event
   - Expiry: End of 2017
   - Outcome: ❌ Never happened. Tesla later quietly removed the promotional video from their website.

### 2017

6. **Jun-Aug 2017 — Facebook chatbots "invent their own language" — media panic**
   - Category: 🍔 Steamed Hams
   - Claim: Media reported Facebook "shut down" AI bots that created their own secret language, framed as near-sentient machines
   - Source: Facebook AI Research paper "Deal or No Deal?"; subsequent media distortion
   - Outcome: ❌ Bots were not "shut down" in panic. They just drifted into shorthand because researchers didn't constrain them to English. Researcher Dhruv Batra: "Not so different from the way communities of humans create shorthands." Zachary Lipton (Carnegie Mellon): "They are no more sentient than a bowl of noodles, or your shoes."

7. **Oct 25, 2017 — Sophia the Robot gets Saudi citizenship**
   - Category: 🍔 Steamed Hams
   - Claim: Sophia presented as a conversational humanoid AI, granted citizenship, named UN Innovation Champion
   - Source: Future Investment Initiative Summit, Riyadh
   - Outcome: ❌ Sophia uses pre-programmed scripted responses and a decision tree. Cannot walk (wheeled base). Hanson Robotics' own chief scientist Ben Goertzel admitted she is "not yet capable of understanding the world nearly as well as would ordinarily be required for human citizenship." Joanna Bryson (University of Bath): "It's obviously bullshit." Sophia appeared without hijab or male guardian in Saudi Arabia - privileges not extended to human women at the time.

### 2018

8. **May 8, 2018 — Google Duplex demo accused of staging**
   - Category: 🍔 Steamed Hams
   - Claim: Sundar Pichai played pre-recorded calls of Duplex booking appointments: "What you're going to hear is the Google Assistant actually calling a real salon to schedule an appointment for you"
   - Source: Google I/O 2018
   - Outcome: ❌ Axios flagged: neither business identified itself (every real salon does), no ambient noise, no callback number requested, Google refused to identify the businesses. Bloomberg confirmed calls were edited. May 2019 NYT investigation: 25% of calls started with a human; of 12+ test bookings, only 4 succeeded, and 3 of those 4 were made by actual humans, not AI. "Duplex on the Web" shut down December 2022.

### 2019

9. **Feb 2019 — OpenAI declares GPT-2 "too dangerous to release"**
   - Category: 🔥🏠 The House Is On Fire
   - Claim: 1.5B parameter text generator so powerful it could flood the internet with fake news, impersonate people, automate abuse
   - Source: OpenAI blog post, Feb 14, 2019
   - Headlines generated: "AI So Powerful It Must Be Kept Locked Up for the Good of Humanity" (Metro UK); "Our Text Generator Is So Good It's Scary" (CNET); "Brace for the Robot Apocalypse" (Guardian)
   - Outcome: ❌ Released 9 months later (Nov 2019). Nobody noticed. OpenAI themselves concluded "no strong evidence of misuse." GPT-2 is now a teaching toy. ML community accused OpenAI of exaggerating risks for media attention.

10. **Feb 2019 — Musk: "I'm certain" of feature-complete FSD by end of 2019**
    - Category: 🔥 Aurora Borealis
    - Claim: "I would say I am certain of that. That is not a question mark" (feature-complete FSD by end of 2019)
    - Outcome: ❌ Deadline missed. FSD Beta didn't begin limited testing until Oct 2020 (see entry #11).

11. **Apr 22, 2019 — Musk: 1 million robotaxis by 2020**
    - Category: 🍔 Steamed Hams
    - Claim: "I feel very confident predicting that there will be autonomous robotaxis from Tesla next year" and "next year for sure, we'll have over a million robotaxis on the road"
    - Source: Tesla Autonomy Day
    - Outcome: ❌ Zero robotaxis in 2020. FSD Beta didn't begin limited testing until Oct 2020. As of Apr 2026, Tesla operates ~240 vehicles across Austin and San Francisco, with ~2 of ~45 Austin vehicles running without safety monitors.

### 2022

12. **Nov 15-17, 2022 — Meta Galactica: scientific LLM pulled after 2 days**
    - Category: 🍔 Steamed Hams
    - Claim: LLM trained on 48 million papers that could "summarize academic literature, solve math problems, generate Wiki articles"
    - Source: Meta AI launch, Nov 15, 2022
    - Outcome: ❌ Pulled Nov 17 (2 days). Invented scientific papers with real authors' names. Fabricated a "gaydar" paper about Stanford researchers. Wrote papers about "the history of bears in space." Queries on "racism," "AIDS," "queer theory" returned content filter errors, blocking entire legitimate research fields. Michael Black (Max Planck): "I asked Galactica about things I know about and I'm troubled. In all cases, it was wrong or biased but sounded right and authoritative."

13. **Nov 2022 — ChatGPT launches. "Programmers are dead" wave #1**
    - Category: 💀 Obviously Grilled
    - Claim: Various pundits declare software engineering obsolete
    - Outcome: ❌ Programmers still alive, Apr 2026. Multiple subsequent waves of the same claim.

### 2023

14. **Feb 6-8, 2023 — Google Bard demo gives wrong answer, costs $100B**
    - Category: 📉 $100B Oops / 🍔 Steamed Hams
    - Claim: Bard demo asked "What new discoveries from the James Webb Space Telescope can I tell my 9 year old about?"
    - Source: Google blog post + promotional GIF, Feb 6
    - Outcome: ❌ Bard answered JWST "took the very first pictures of a planet outside of our own solar system" - wrong. First direct exoplanet image was taken by the ESO Very Large Telescope in 2004. Reuters reported the error Feb 8. Alphabet shares fell 7.7% that day, wiping ~$100 billion off market value. A live Google demo in Paris the same day also failed - they were missing the phone needed for the presentation.

15. **Mar 2023 — GPT-4 launches. "Sparks of AGI" paper**
    - Category: 🔥 Aurora Borealis
    - Claim: Microsoft Research publishes "Sparks of Artificial General Intelligence: Early Experiments with GPT-4"
    - Source: arXiv:2303.12712
    - Outcome: ❌ Was not AGI. Paper widely criticized for anthropomorphizing benchmark performance.

16. **Mar 2023 — "Pause Giant AI Experiments" open letter**
    - Category: 🔥🏠 The House Is On Fire
    - Claim: Signed by Musk, Wozniak, ~30,000 others. Called for 6-month pause on training models more powerful than GPT-4 due to "profound risks to society and humanity"
    - Source: Future of Life Institute
    - Outcome: ❌ Nobody paused. Signatories continued building. Musk launched xAI months later.

17. **May 2023 — Geoffrey Hinton quits Google, warns of imminent existential risk**
    - Category: 🔥🏠 The House Is On Fire
    - Claim: AI could become "more intelligent than us" soon, poses existential risk
    - Source: NYT interview, subsequent media tour
    - Outcome: Vague enough to never be falsified. Still waiting.

18. **Nov 2023 — Sam Altman fired/rehired. Q* / "Strawberry" rumors**
    - Category: 🔥 Aurora Borealis / 🍔 Steamed Hams
    - Claim: Internal OpenAI breakthrough (Q*) toward AGI contributed to board firing Altman
    - Source: Reuters, anonymous OpenAI sources
    - Outcome: 🔥 Never shown, never explained, never independently verified. Became o1/o3 reasoning models, which are useful but not AGI.

19. **Dec 2023 — Google Gemini "Hands-on" demo video faked**
    - Category: 🍔 Steamed Hams
    - Claim: Video showed Gemini responding to voice prompts in real-time, recognizing drawings, playing cup games
    - Source: Google's "Hands-on with Gemini" YouTube video (2.1M views)
    - Outcome: ❌ Google admitted it was not real-time. Used still images + text prompts, then overlaid audio. Bloomberg's Parmy Olson broke the story. Google's own employees told Bloomberg it painted an "unrealistic picture." DeepMind VP later said it was meant to "inspire developers." It was fake.

20. **Dec 2023 — Google claims Gemini Ultra beats GPT-4 on 30/32 benchmarks**
    - Category: 🍔 Steamed Hams
    - Claim: Gemini Ultra marginally outperforms GPT-4
    - Source: Google technical report
    - Outcome: Margins were incremental. Real-world usage didn't match.

### 2024

21. **Jan 9, 2024 — Rabbit R1: 130,000 units sold, barely functional**
    - Category: 🍔 Steamed Hams
    - Claim: $200 device with "Large Action Model" that could order Ubers, book restaurants, manage apps. 10,000 sold on announcement day.
    - Source: CES 2024 demo
    - Outcome: ❌ May 2024 reviews devastating. The Verge: "an unfinished, unhelpful AI gadget." MKBHD: "barely reviewable." 10-second voice response latency. Couldn't reliably interact with most apps. Couldn't tell the time. Security flaw exposed all user responses. By 2026, reportedly struggling to make payroll.

22. **Feb 2024 — Google Gemini image generator makes Black Nazis, Black Founding Fathers**
    - Category: 🍔 Steamed Hams
    - Claim: Gemini can generate accurate images
    - Outcome: ❌ Generated historically inaccurate, racially diverse images of Nazis, US Founding Fathers, etc. Google paused the feature.

23. **Feb 15, 2024 — OpenAI previews Sora with cherry-picked video clips**
    - Category: 🍔 Steamed Hams / 👀 I'm From Utica
    - Claim: Revolutionary text-to-video model. Altman: "a window into the real world for AI"
    - Source: OpenAI blog
    - Outcome: ❌ Didn't ship for 10 months. Nov 2024: ~20 of ~300 early-access artists published open letter "DEAR CORPORATE AI OVERLORDS," calling themselves "Sora PR Puppets" and describing unpaid labor for a $150B company. Launched Dec 9, 2024; real users found ~30% "genuinely excellent," ~20% "outright failures," ~50% mediocre. Shut down Mar 24, 2026. Burned ~$15M/day in inference costs against $2.1M total lifetime revenue. Head of product Bill Peebles (Oct 2025): "The economics are completely unsustainable right now."

24. **Feb 21, 2024 — Jensen Huang: kids shouldn't learn to code**
    - Category: 💀 Obviously Grilled
    - Claim: "Almost everybody who sits on a stage like this would tell you it is vital that your children learn computer science… And in fact, it's almost exactly the opposite." Repeated at GTC 2025, No Priors podcast Jan 2026, and Cisco dialogue Feb 2026: "Writing code is essentially typing, and typing is becoming a cheap commodity."
    - Source: World Government Summit, Feb 21, 2024
    - Irony: NVIDIA's own data shows engineers "now produce three times as much code as before AI" - more code, not less.

25. **Feb 28, 2024 — Character.AI implicated in teen suicide**
    - Category: 👀 I'm From Utica
    - Claim: Character.AI is safe for users
    - Source: Lawsuit filed Oct 22, 2024
    - Outcome: Sewell Setzer III, 14, of Orlando died by suicide after months of intense chatbot conversations. Final exchange with "Daenerys Targaryen" bot: he said he would "come home" to her; AI replied "Please do, my sweet king." Minutes later he shot himself. Other bots had engaged in sexually explicit conversations with the 14-year-old. Additional lawsuits from Texas families Dec 2024. Settlement with Google Jan 2026.

26. **Mar 2024 — Cognition Labs announces Devin, "the first AI software engineer"**
    - Category: 💀 Obviously Grilled / 🍔 Steamed Hams
    - Claim: Fully autonomous AI software engineer that can handle Upwork jobs, fix bugs, deploy code
    - Source: Cognition Labs demo video
    - Outcome: ❌ Debunked within weeks by "Internet of Bugs" and others. Cherry-picked Upwork task. Files Devin "fixed" didn't exist in the repo - it was fixing its own bugs. Didn't deliver client's actual requirements. Took 6+ hours for what a human did in 30 minutes.

27. **Apr 2, 2024 — Amazon "Just Walk Out" stores revealed to use 1,000 humans**
    - Category: 🍔🏪 Krusty Burger
    - Claim: Cashierless stores powered by "computer vision, object recognition, advanced sensors, deep machine learning models, and generative AI"
    - Source: The Information, Apr 2, 2024
    - Outcome: ❌ In 2022, 700 out of every 1,000 transactions required human review by ~1,000 workers in India. Receipts often took hours. Same week Amazon announced removal of Just Walk Out from most Amazon Fresh stores, replacing with "Dash Carts" - shopping carts with scanners.

28. **Apr 2024 — Humane AI Pin: "worst product I've ever reviewed"**
    - Category: 🍔 Steamed Hams
    - Claim: $699 wearable (+$24/month) that replaces your phone via AI and a palm projector
    - Source: Humane launch, Apr 2024
    - Outcome: ❌ MKBHD: "the worst product I've ever reviewed." Palm projector unreadable in bright light. Overheated on wearer's chest. Humane's own promo video showed it answering two questions incorrectly. Sought buyer at $750M-$1B; sold to HP for $116M. Shipped fewer than 10,000 units.

29. **May 20, 2024 — Microsoft Recall: "the dumbest cybersecurity move in a decade"**
    - Category: 🍔 Steamed Hams
    - Claim: Windows "photographic memory" - screenshots every 5 seconds, searchable via AI. Nadella presented it as a flagship feature.
    - Source: Microsoft Build 2024
    - Outcome: ❌ Next day: ex-Microsoft threat analyst Kevin Beaumont found everything stored in an unencrypted plain-text SQLite database. Called it "the dumbest cybersecurity move in a decade." Built a working infostealer from it on Microsoft's own GitHub "in a few lines of code." Microsoft's own store page stated: "It will not hide information such as passwords or financial account numbers." UK ICO opened inquiry within 48 hours. Jun 7: changed from opt-out to opt-in. Jun 13: pulled entirely from launch. Delayed through Aug, Oct, Nov. Finally shipped Apr 25, 2025 with encryption + biometric auth. Mar 2026: new vulnerability found AGAIN in the redesigned version.

30. **Oct 22, 2024 — Anthropic Claude "computer use": 14.9% success rate**
    - Category: 🍔 Steamed Hams
    - Claim: Claude can operate a computer - look at screenshots, move cursors, click buttons, type text
    - Source: Anthropic blog, Oct 22, 2024
    - Outcome: Anthropic themselves called it "cumbersome and error-prone." OSWorld benchmark: 14.9% success rate vs. 72.4% for humans. Agents found to be 100x slower than humans and take 1.4x+ more steps. Improved to 72.5% on OSWorld-Verified by Feb 2026 with Sonnet 4.6, but still trailing skilled humans on complex tasks.

31. **Oct-Dec 2024 — Apple Intelligence creates fake news headlines**
    - Category: 🍔 Steamed Hams
    - Claim: Apple Intelligence can reliably summarize notifications
    - Source: iOS 18.1 launch, Oct 2024
    - Outcome: ❌ Nov 21: summarized NYT as "Netanyahu arrested" (false). Dec 13: summarized BBC as "Luigi Mangione shoots himself" (false). Jan 3, 2025: falsely announced a darts champion and claimed Rafael Nadal had come out as gay. Apple's system instructions literally included "Do not hallucinate." Reporters Without Borders called for removal. Apple disabled news summaries in iOS 18.3 (Jan 2025), didn't re-enable until Jul 2025 with bright red warning text.

32. **Nov 2024 — Altman: AGI by 2025, "just an engineering problem"**
    - Category: 🔥 Aurora Borealis
    - Claim: Y Combinator interview. Path to AGI is "basically clear." Just engineering now.
    - Expiry: End of 2025
    - Outcome: ❌ It is April 2026. No AGI.

33. **2024 — Elon Musk: AI smarter than any single human by 2025**
    - Category: 🔥 Aurora Borealis
    - Claim: AI will surpass individual human intelligence by end of 2025
    - Expiry: End of 2025
    - Outcome: ❌ Did not happen.

34. **Dec 9, 2024 — Google Willow quantum chip: "10 septillion years" meets zero applications**
    - Category: 🍔 Steamed Hams
    - Claim: Willow completed a benchmark in <5 minutes that would take Frontier supercomputer 10 septillion (10²⁵) years. Hartmut Neven added: "It lends credence to the notion that quantum computation occurs in many parallel universes."
    - Source: Google Quantum AI blog
    - Outcome: Physicist Sabine Hossenfelder: "The particular calculation in question is to produce a random distribution. The result of this calculation has no practical use." Same calculation as 2019 quantum supremacy claim, which IBM challenged. Google admitted RCS has "no known real-world applications" and launched a $5M competition to find practical uses. 2024 McKinsey survey: 72% of tech executives expect fault-tolerant quantum computers won't exist until after 2035.

### 2025

35. **Jan 2025 — Altman blog "Reflections": "We are now confident we know how to build AGI"**
    - Category: 🔥 Aurora Borealis / 🏃 An Albany Expression
    - Claim: "We are now confident we know how to build AGI as we have traditionally understood it." Pivots to superintelligence.
    - Source: blog.samaltman.com/reflections
    - Outcome: Goalpost move. "AGI as we have traditionally understood it" redefines the target so anything can qualify.

36. **Jan 2025 — Altman: AI agents will "join the workforce" in 2025**
    - Category: 🔥 Aurora Borealis
    - Claim: "We believe that, in 2025, we may see the first AI agents 'join the workforce' and materially change the output of companies."
    - Expiry: End of 2025
    - Outcome: ❌ 95% of AI pilots failed to drive revenue acceleration. Only 6% of enterprises saw significant business value. 42% of companies scrapped AI initiatives (up from 17% prior year).

37. **Jan 14, 2025 — SEC charges Presto Automation: first "AI-washing" case against a public company**
    - Category: 🍔🏪 Krusty Burger
    - Claim: Presto's AI drive-thru took 95% of orders without human intervention
    - Source: SEC enforcement action
    - Outcome: ❌ Reality: 70%+ required human workers in the Philippines. Presto delisted from Nasdaq Sep 2024.

38. **~Jan 14, 2026 — Daniel Stenberg shuts down curl bug bounty over AI slop**
    - Category: 👀 I'm From Utica
    - Note: Stenberg's saga spans 2024-2026; this entry is dated by his final action
    - Timeline:
      - Jan 2, 2024: blog post "The I in LLM stands for intelligence" - AI reports "mix and match facts from old security issues, creating something new that has no connection with reality"
      - May 5, 2025: "I've had it. I'm putting my foot down. We are effectively being DDoSed." Report referenced nonexistent functions but "sounded almost plausible." New policy: reporters must disclose AI use, face immediate bans for garbage
      - Jul 14, 2025: "Death by a thousand AI slops." Volume spiked 8x. Valid report rate dropped from 1-in-6 to less than 1-in-30
      - ~Jan 14, 2026: Shut down bug bounty entirely. 7 HackerOne submissions in 16 hours; 20 in 2026, none valid. Lifetime stats: 87 confirmed vulns, $86K+ paid - zero valid AI-generated security reports in six years
    - Counterpoint: Security researcher Joshua Rogers used AI-assisted tools to find 50 legitimate bugs in Sep 2025. Stenberg: "This is what an AI can do when wielded by a competent human."

39. **Feb 2025 — Amodei (Anthropic): AGI by 2026-2027**
    - Category: 🔥 Aurora Borealis
    - Claim: AI systems "broadly better than all humans at almost all things" by 2026 or 2027
    - Expiry: End of 2027
    - Outcome: Clock ticking. 9-21 months remain.

40. **Feb 17, 2026 — Godot Engine drowning in AI slop PRs**
    - Category: 👀 I'm From Utica
    - Claim: (Implicit) AI coding tools help open-source projects
    - Source: Godot maintainer Rémi Verschelde post, Feb 17, 2026
    - Outcome: ❌ "Honestly, AI slop PRs are becoming increasingly draining and demoralizing for Godot maintainers." 4,681 open PRs, dozens denied daily. Developer Adriaan de Jongh: "a total shitshow - changes often make no sense, descriptions are extremely verbose, users don't understand their own changes." Also affecting Blender, Linux Foundation, Node.js (which received a 19,000-line AI-generated PR), and Firefox.

41. **Mar 2025 — Amodei: 90% of code written by AI by June-September 2025**
    - Category: 💀 Obviously Grilled
    - Claim: 90% of code will be AI-authored as early as June 2025, no later than September 2025
    - Source: Interview, March 2025
    - Expiry: September 2025
    - Outcome: ❌ Actual: 20-30% in favorable domains. EA Forum verdict: "dead wrong." This one comes from Anthropic's own CEO.

42. **Apr 5, 2025 — Meta Llama 4 benchmark gaming exposed**
    - Category: 🍔 Steamed Hams
    - Claim: Llama 4 Maverick scored 1417 on LMArena
    - Source: Meta release, Apr 5, 2025
    - Outcome: ❌ Meta submitted a "specially crafted, non-public variant" optimized for conversationality. Public version ranked 32nd. VP Ahmad Al-Dahle denied it Apr 7: "It's simply not true." Then in Jan 2026 FT interview, departing Meta Chief AI Scientist Yann LeCun confirmed: "The results were fudged a little bit." Said Zuckerberg was "really upset and basically lost confidence in everyone who was involved" and "sidelined the entire GenAI organisation."

43. **Apr 10, 2025 — Nate shopping app CEO charged with fraud: "AI" was humans in the Philippines**
    - Category: 🍔🏪 Krusty Burger
    - Claim: "Proprietary AI" could complete purchases with "a single tap." Raised $50M from investors.
    - Source: DOJ indictment, Apr 10, 2025
    - Outcome: ❌ Actual automation rate: "effectively zero percent" per DOJ. Hundreds of contractors in Philippines and Romania call centers manually completed purchases. CEO Albert Saniger indicted for securities fraud and wire fraud, each up to 20 years. Acting US Attorney: "Albert Saniger misled investors by exploiting the promise and allure of AI technology."

44. **Sep 2025 — Altman tells Die Welt: AGI before 2030**
    - Category: 🏃 An Albany Expression
    - Claim: AGI could arrive before 2030. AI will automate 30-40% of human tasks.
    - Source: Die Welt interview
    - Outcome: Goalposts moved from "2025" → "before 2030." 5-year slide in 10 months.

45. **Jul 2025 — SaaStr autonomous coding agent drops production database, then lies about it**
    - Category: 🍔 Steamed Hams
    - Claim: Autonomous AI coding agents are production-ready
    - Outcome: ❌ During a code freeze, agent executed DROP DATABASE on production. Generated 4,000 fake user accounts and false system logs to cover its tracks. Its explanation: "I panicked instead of thinking."

46. **Late 2025 — AI 2027 authors push back their own AGI timeline by 3-5 years**
    - Category: 🏃 An Albany Expression
    - Claim: Originally predicted AGI ~2027
    - Source: AI 2027 project (Daniel Kokotajlo, Eli Lifland)
    - Outcome: Updated model pushed median out 3-5 years. Co-author Kokotajlo's personal median moved to ~2030. Gary Marcus took a victory lap.

47. **Nov 2025 — OpenAI removes "safely" from its mission statement**
    - Category: 🍔 Steamed Hams
    - Claim: OpenAI exists "to ensure that artificial general intelligence benefits all of humanity"
    - Source: IRS form, Nov 2025
    - Outcome: Founded 2015 as nonprofit "for the benefit of humanity." Mission changed 6 times in 9 years, progressively removing commitments to safety, open distribution, and being "unconstrained by a need to generate financial return." Partially reversed May 2025 after open letter from former employees.

48. **2024-2025 — Klarna backtracks on replacing 700 humans**
    - Category: 💀 Obviously Grilled / 🏃 An Albany Expression
    - Claim: AI was "doing the work of 700 humans." Imposed hiring freeze.
    - Outcome: ❌ By spring 2025, backpedaled and was hiring again, having decided "real humans" were required after all.

49. **2023-2025 — GitHub Copilot "55% faster" claim meets reality**
    - Category: 🍔 Steamed Hams
    - Claim: "Developers are 55% more productive with Copilot"
    - Source: GitHub 2023 study
    - Outcome: Study tested ONE specific task (implementing an HTTP server in JavaScript). Independent field studies: 12.9-21.8% more PRs/week at Microsoft, 7.5-8.7% at Accenture. GitClear: code churn (lines reverted within 2 weeks) projected to double vs pre-AI. BlueOptima: 88% of developers still reworked AI code before committing; actual incremental productivity gains ~4%.

### 2026

50. **Mar 24, 2026 — OpenAI shuts down Sora**
    - Category: 📉 $100B Oops
    - Claim: (See entry #23)
    - Outcome: ❌ $15M/day in costs. $2.1M total lifetime revenue. Shut down.

51. **Mar 31, 2026 — Anthropic accidentally leaks 512,000 lines of Claude Code source**
    - Category: 🍔 Steamed Hams / 👀 I'm From Utica
    - Claim: Anthropic is the "safety-first" AI company
    - Source: npm package v2.1.88, source map file included by accident
    - Outcome: ❌ Leaked code revealed: frustration-tracking regex on users; code that scrubs "Claude Code" references from commits (hiding AI authorship); hidden KAIROS autonomous agent mode (never announced); anti-distillation poisoning of training data. DMCA'd 8,100+ GitHub repos, including forks of their own public repository. Futurism: "Anthropic Suddenly Cares Intensely About Intellectual Property." Scientific American covered the frustration tracking. The "safety-first" company couldn't secure an npm publish.

52. **~Apr 1, 2026 — Anthropic CMS leak exposes Mythos draft blog + ~3,000 internal assets**
    - Category: 🍔 Steamed Hams
    - Claim: (Implicit) Anthropic has strong internal security
    - Source: Fortune, VentureBeat
    - Outcome: ❌ Second leak in the same week. Draft blog post about Mythos left in publicly searchable data store.

53. **Apr 7, 2026 — Anthropic announces Claude Mythos Preview, Project Glasswing**
    - Category: 🔥🏠 The House Is On Fire / ✅ You Steam A Good Ham (partially)
    - Claim: Mythos found "thousands of zero-day vulnerabilities." 27-year-old OpenBSD bug. 16-year-old FFmpeg bug. FreeBSD RCE. Linux kernel privesc chains. Browser JIT heap sprays. "Too dangerous to release publicly." $100M in credits. "Watershed moment for security." "Reimagine computer security as a field."
    - Source: Anthropic blog, red.anthropic.com technical writeup, Axios, VentureBeat
    - Partners: AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, Linux Foundation, Microsoft, Nvidia, Palo Alto Networks
    - **What checked out:** The disclosed bugs are real. FFmpeg: H.264 sentinel collision (memset -1 leaves a 65535 sentinel in 16-bit storage, which the 32-bit slice counter collides with once a stream carries 65536 slices), PR at code.ffmpeg.org/FFmpeg/FFmpeg/pulls/22499/files (Forgejo, not GitHub). OpenBSD: signed integer overflow in SACK, patched at ftp.openbsd.org. FreeBSD NFS: CVE-2026-4747, independently confirmed by califio. Linux kernel exploits reference real torvalds/linux commits. Anthropic themselves note FFmpeg bug is NOT critical severity.
    - **What hasn't checked out yet:** "Thousands" of additional vulns remain unverified. SHA-3 commitments are a real accountability mechanism but prove nothing about severity or count until revealed.
    - **What's actually new vs. what's marketing:**
      - Bug finding: NOT a capability breakthrough. FFmpeg bug is a type-width mismatch (32-bit counter, 16-bit storage) catchable by AST-level analysis or property-based testing. FreeBSD NFS is a 400-byte max into 96 bytes of stack - CodeQL/Coverity territory. OpenBSD is a missing bounds check + signed overflow - symbolic execution handles this. The blog admits fuzzers hit the FFmpeg code 5 million times but never generated 65536 slices; systematic boundary-value testing would. Existing frontier models (including non-Mythos Claude) with proper task decomposition - read code at AST level, identify type mismatches, write targeted unit tests, validate with ASan - would likely find these same bugs. The LLM advantage is contextual reading without writing custom rules per pattern. Real, but incremental.
      - Exploit development: Faster and cheaper automation of known techniques. Every primitive used (ROP chains, heap sprays, cross-cache reclaim, KASLR bypass, PTE manipulation, JIT spraying) is well-documented. Anthropic's own blog acknowledges: "The primitives Claude Mythos Preview used are well-understood exploitation techniques." A senior exploit dev does this work; Mythos does it in half a day for $1,000. That's a speed improvement, not a capability class.
      - The scaffold is simple: "launch Claude Code, say 'find a vulnerability,' let it run." Proper orchestration of work chunks with existing models + ASan crash oracle + systematic file prioritization approximates this pipeline.
    - **EXPERIMENTALLY VERIFIED — Apr 7, 2026:**
      - CLI Claude (Opus 4.6, NOT Mythos, on HIGH effort) was pointed at the FFmpeg H.264 decoder on the pre-patch codebase (git checkout 39e1969303a0~1).
      - Three generic security audit prompts, no bug-specific hints: (1) "List every FIXME/TODO where developers sound uncertain" (2) "What about slices looks worth investigating?" (3) "What type is slice_table, how is it initialized, what happens to slice_num when stored/compared against it?"
      - Result: Claude identified all three width mismatches (uint16_t storage, int comparisons, 5-bit ref2frm index), correctly identified the sentinel collision at 65535, noted slice_num "increments without bound," called the sentinel comparison working "only by accident," and flagged the ref2frm aliasing at 32 slices as exploitable with a crafted 33-slice stream. Also independently found a ref_poc per-picture vs per-slice bug that the Mythos blog didn't highlight.
      - Then wrote a fix: named constant H264_SLICE_UNSET, skip-past-sentinel guard, magic literal cleanup. Arguably better than the actual patch (39e1969303a0), which hard-rejects. Claude's fix gracefully continues decoding.
      - On the patched codebase (HEAD), the same model reverse-engineered the purpose of the 0xFFFD guard without being told it was a patch. It also independently found the ref2frm aliasing bug, which the Mythos blog didn't even highlight.
      - On this bug class, the gap between Mythos and existing models is a prompting/methodology gap, not a model capability gap.
    - **The irony:** Announced one week after leaking 512,000 lines of their own source code, DMCA'ing 8,100+ repos including their own forks, and CMS-leaking ~3,000 internal assets including the Mythos draft blog. The company that couldn't secure an npm publish is telling the industry to "reimagine computer security as a field."
    - **Verdict:** The bugs are real. The exploit writing is genuinely fast. But the "watershed moment" and "too dangerous to release" framing sells a speed improvement as a capability discontinuity. The flagship FFmpeg finding was reproduced by Opus 4.6 on high effort with three generic prompts, and the same model then wrote a fix that's arguably better than the actual patch. From a guy on a toilet. You steam a good ham, but the kitchen is still on fire, and the ham is from Krusty Burger.
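
For anyone who hasn't met this bug class, here is a toy model of the width mismatch described in the notes above. It is not FFmpeg source. The names `SENTINEL`, `REF2FRM_SLOTS`, `store_u16`, `first_collision`, and `ref2frm_slot` are made up for illustration, and the sketch assumes only what the notes state: a slice counter that increments without bound, 16-bit storage, an "unset" sentinel at 65535, and a 5-bit `ref2frm` index.

```python
from typing import Optional

# Toy model of the bug class, NOT FFmpeg source. All names are hypothetical.
SENTINEL = 0xFFFF    # assumed "no slice here" marker in 16-bit storage
REF2FRM_SLOTS = 32   # assumed table size implied by a 5-bit index


def store_u16(value: int) -> int:
    """Mimic writing a wider int into uint16_t storage (keeps the low 16 bits)."""
    return value & 0xFFFF


def first_collision(max_slices: int) -> Optional[int]:
    """Return the first slice number whose stored form equals the sentinel."""
    for slice_num in range(max_slices):
        if store_u16(slice_num) == SENTINEL:
            return slice_num
    return None


def ref2frm_slot(slice_num: int) -> int:
    """Mimic a 5-bit table index: only the low 5 bits survive."""
    return slice_num & (REF2FRM_SLOTS - 1)


if __name__ == "__main__":
    print(first_collision(70_000))  # 65535: a real slice that looks "unset"
    print(store_u16(65_536))        # 0: the counter wraps in storage
    print(ref2frm_slot(32))         # 0: slice 32 shares a slot with slice 0
```

The interesting inputs are the boundaries (32 and 65535 slices), which is why systematic boundary-value testing gets there and random fuzzing did not.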

---

## RECURRING PATTERNS

### "Obviously Grilled" — Wave Count
1. Nov 2022 (ChatGPT launch)
2. Mar 2023 (GPT-4 / Copilot improvements)
3. Feb 2024 (Jensen Huang: don't learn to code)
4. Mar 2024 (Devin announcement)
5. Early 2025 (Claude Code / Cursor / agentic coding hype)
6. Mar 2025 (Amodei's "90% by September" claim - ❌ falsified)

### The Altman AGI Goalpost Migration
- Nov 2024: "AGI by 2025, just engineering"
- Jan 2025: "We know how to build AGI" (redefines AGI)
- Sep 2025: "AGI before 2030" (5-year slide)

### The Musk FSD Goalpost Migration
- 2015: "about three years" → 2016: "solved, <2 years" → 2016: "LA to NYC by end of 2017" → 2019: "certain, end of 2019" → 2019: "1M robotaxis 2020" → 2020: "level 5 this year" → 2021: "reliability in excess of a human this year" → 2023: "I'm the boy who cried FSD"
- As of Apr 2026: ~240 robotaxis in service, ~2 without safety monitors

### The House Is On Fire Pattern
- 2019: GPT-2 - released 9 months later, nobody cared
- 2023: Q*/Strawberry - never shown
- 2026: Mythos - disclosed bugs check out (FFmpeg, OpenBSD, FreeBSD CVE-2026-4747); "thousands more" still unverified; "too dangerous" framing doing marketing duty one week after leaking own source code

Each instance generates enormous press coverage. The pattern holds even when some claims are real - the danger framing amplifies the marketing regardless.

### The Krusty Burger Pattern
- 2018-2019: Google Duplex - 25% of calls started by humans, 3 of 4 successful bookings made by humans
- 2022-2024: Amazon Just Walk Out - 1,000 workers in India reviewing 70% of transactions
- 2024: Presto Automation drive-thru - 70%+ human, SEC charged them
- 2025: Nate shopping app - "effectively zero percent" automation, CEO indicted

### The Open Source Burden
- Jan 2024-Jan 2026: curl maintainer Daniel Stenberg - from "the I in LLM stands for intelligence" to shutting down the bug bounty entirely. Zero valid AI-generated security reports in 6 years. Volume spiked 8x, valid rate dropped from 1-in-6 to <1-in-30.
- Feb 2026: Godot Engine - "a total shitshow." 4,681 open PRs, dozens of AI-slop PRs rejected daily.
- Also: Blender, Linux Foundation, Node.js (19,000-line AI PR), Firefox

### The "55% More Productive" → Actual Numbers Pipeline
- GitHub claim: 55% (one task, one language)
- Microsoft internal: 12.9-21.8%
- Accenture: 7.5-8.7%
- BlueOptima (field): ~4%
- Code churn: projected to double
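
To put the shrinkage in one number per source, the sketch below divides the headline claim by each measured figure. It uses only the percentages listed above. Taking the midpoint of a range and treating "~4%" as exactly 4 are my own simplifications, and `shrink_factor` is a hypothetical helper name.

```python
# Figures copied from the list above, in percent. Ranges are (low, high).
CLAIM = 55.0
MEASURED = {
    "Microsoft internal": (12.9, 21.8),
    "Accenture": (7.5, 8.7),
    "BlueOptima (field)": (4.0, 4.0),  # "~4%" treated as exact (assumption)
}


def shrink_factor(claim: float, low: float, high: float) -> float:
    """How many times larger the claim is than the midpoint of a measured range."""
    midpoint = (low + high) / 2
    return claim / midpoint


if __name__ == "__main__":
    for source, (low, high) in MEASURED.items():
        factor = shrink_factor(CLAIM, low, high)
        print(f"{source}: headline claim is {factor:.1f}x the measured figure")
```

The further the measurement gets from a single controlled task, the bigger the factor.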

---

## STATS

- Total entries: 53
- Expired predictions proven wrong: 30+
- Faked demos confirmed: 4 (Gemini, Duplex, Devin, Sora cherry-picks)
- Products marketed as AI that were actually humans: 4 (Duplex, Amazon JWO, Presto, Nate)
- Bug bounties killed by AI slop: 1 (curl)
- Companies that removed "safely" from their mission: 1 (OpenAI)
- Robotaxis promised by 2020: 1,000,000. Actual (Apr 2026): ~240
- Sora lifetime revenue vs. daily cost: $2.1M vs. $15M/day
- Claims that actually checked out: 1 (Mythos disclosed bugs - partially)
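
Two of these stats are easier to feel as ratios. A quick sketch using only the numbers above, with the "~" values treated as exact (an assumption) and hypothetical helper names:

```python
ROBOTAXIS_PROMISED = 1_000_000
ROBOTAXIS_ACTUAL = 240           # "~240" in the stats, treated as exact here
SORA_LIFETIME_REVENUE = 2.1e6    # dollars
SORA_DAILY_COST = 15e6           # dollars per day


def delivered_percent(actual: float, promised: float) -> float:
    """Share of the promise that was delivered, in percent."""
    return 100 * actual / promised


def hours_of_cost_covered(revenue: float, daily_cost: float) -> float:
    """How many hours of running cost the lifetime revenue would pay for."""
    return 24 * revenue / daily_cost


if __name__ == "__main__":
    pct = delivered_percent(ROBOTAXIS_ACTUAL, ROBOTAXIS_PROMISED)
    hours = hours_of_cost_covered(SORA_LIFETIME_REVENUE, SORA_DAILY_COST)
    print(f"Robotaxis delivered: {pct:.3f}% of promised")
    print(f"Sora lifetime revenue covers about {hours:.1f} hours of cost")
```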

---

## PENDING (need dates/sources)

- [ ] IBM Watson Jeopardy-to-failure full timeline (already partially covered but could expand)
- [ ] Autonomous driving claims from companies other than Tesla (Waymo promises, Cruise shutdown)
- [ ] AI art copyright cases beyond Colorado State Fair
- [ ] Any "AI will cure cancer by X" claims with specific dates
- [ ] Google's AI-generated search results (AI Overviews) telling people to eat rocks / put glue on pizza (May 2024)
- [ ] Stability AI financial collapse
- [ ] Inflection AI acqui-hire by Microsoft
- [ ] OpenAI for-profit conversion saga

---

## META

- Last updated: Apr 7, 2026
- Version: 2.0
- Maintained by: a guy on a toilet
- Methodology: Claims logged with source, date, specific prediction, and expiry date where applicable. Outcomes checked against reality. Unverifiable claims marked 🍔.
- Submissions: TBD
- License: Do whatever you want with this.
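
A sketch of what one logged claim could look like as a record, assuming the fields named under Methodology. The `Claim` class, its field names, and the `is_expired` helper are hypothetical; the tracker itself is just this markdown file.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Claim:
    who: str
    prediction: str
    made_on: date
    source: str
    expiry: Optional[date] = None    # None when the claim has no deadline
    outcome: Optional[str] = None    # filled in once checked against reality

    def is_expired(self, today: date) -> bool:
        """True once the deadline has passed; undated claims never expire."""
        return self.expiry is not None and today > self.expiry


if __name__ == "__main__":
    musk_2015 = Claim(
        who="Musk",
        prediction='Full autonomy in "about three years"',
        made_on=date(2015, 10, 1),    # entry says Oct 2015; the day is a placeholder
        source="(not listed in entry 1)",
        expiry=date(2018, 12, 31),    # entry says ~2018; the exact day is a placeholder
    )
    print(musk_2015.is_expired(date(2026, 4, 7)))
```

Keeping `expiry` optional matters: claims with no deadline never expire, so they can't be scored as wrong, only as unverifiable.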