Anthropic's 'Mythos': The AI Safety Wake-Up Call We're Not Taking Seriously
Anthropic said it built a model called Mythos that it believes is too powerful to release broadly. This is not spin. This is a company admitting it created capabilities it can't responsibly deploy.
Context matters here. Anthropic is reportedly approaching $19 billion in annualized revenue, so this is not a company that lacks the customers or infrastructure to sell a frontier model. Declining to deploy Mythos means deliberately forgoing revenue it could otherwise capture, and that decision has real financial consequences.
Meanwhile, major U.S. AI companies, including OpenAI, Google, and Anthropic, are sharing intelligence about Chinese firms allegedly using 'distillation' to extract capabilities from American AI models. Anthropic has gone further: it blocks Chinese-controlled companies from using Claude and has identified three Chinese AI labs it says are illicitly extracting model capabilities. The distilled copies often lack the original models' safety guardrails.
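For readers unfamiliar with the term, 'distillation' here means training a smaller "student" model to imitate the outputs of a larger "teacher" model. The minimal PyTorch sketch below shows the classic soft-label version of the technique; the toy models, data, and hyperparameters are illustrative assumptions, not a reconstruction of any lab's actual pipeline, and real-world distillation of a hosted model would typically work from sampled text outputs collected through its API rather than raw logits.

```python
# Minimal knowledge-distillation sketch. Everything here (sizes, data,
# hyperparameters) is an illustrative assumption, not any lab's real setup.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, DIM_TEACHER, DIM_STUDENT, SEQ = 1000, 256, 64, 32

# Stand-in "teacher": a larger model whose outputs can be observed.
teacher = nn.Sequential(nn.Embedding(VOCAB, DIM_TEACHER),
                        nn.Flatten(),
                        nn.Linear(DIM_TEACHER * SEQ, VOCAB))
# Smaller "student" trained to imitate the teacher's output distribution.
student = nn.Sequential(nn.Embedding(VOCAB, DIM_STUDENT),
                        nn.Flatten(),
                        nn.Linear(DIM_STUDENT * SEQ, VOCAB))

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 2.0  # softens teacher logits so relative preferences are visible

for step in range(100):
    prompts = torch.randint(0, VOCAB, (16, SEQ))   # illustrative query batch
    with torch.no_grad():
        teacher_logits = teacher(prompts)          # the "extracted" behavior
    student_logits = student(prompts)
    # KL divergence between softened distributions: the student copies the
    # teacher's behavior without inheriting its training data or guardrails.
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The detail that matters for this story is in the last comment of the loop: the student absorbs behavior, not safeguards, which is why distilled copies so often ship without the original guardrails.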
So here's the dilemma Anthropic faces: if they won't release Mythos publicly for safety reasons, but Chinese labs are allegedly distilling and replicating their models anyway, what does restraint actually buy?
My take: This moment signals a fracture in AI strategy. Anthropic is willing to leave billions of dollars on the table for safety. OpenAI and Google are not; they're racing ahead regardless. By September 2026, we'll know which strategy wins. If Mythos gets leaked or distilled by competitors, Anthropic's safety-first positioning collapses, because the restraint will have bought nothing. If it stays contained, Anthropic has demonstrated a path forward that competitors won't follow. Either way, someone loses.