The Hidden Implications of Claude Mythos
Covering AI security, infrastructure, and geopolitical risk.
Last week, Anthropic announced its next frontier LLM, Claude Mythos, hastening to add that it would not be released anytime soon. That’s because Mythos turns out to be so good at hacking that it poses a macro-scale security risk across the global economy. So to prepare for a wider release, Anthropic has launched “Project Glasswing”, an effort to give early access to the model to a group of launch partners (AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks) as well as over 40 other vetted organizations. The idea is to give them time to use Mythos to discover their own cybersecurity vulnerabilities, and to patch them, before the model is widely available.
This really merits a pause. The AI model is such a serious security threat that Anthropic believes it could cripple large organizations if released without elaborate red-team priming. And the latter is not cheap: inference on Mythos apparently costs five times as much as it does on Anthropic’s most recent public frontier model, Opus 4.6, coming in at $25 per million input tokens and $125 per million output tokens. Anthropic is also giving Project Glasswing participants a collective total of $100 million in compute while they test it.
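To get a feel for what that pricing means in practice, here is a quick back-of-the-envelope sketch in Python. The per-token rates are the ones above; the token volumes are purely hypothetical assumptions of mine, chosen only to illustrate the scale of a long-horizon security-testing run.

```python
# Back-of-the-envelope cost estimate for Mythos inference.
# Rates come from the announcement; the token volumes below are
# hypothetical assumptions, for illustration only.

INPUT_RATE = 25 / 1_000_000    # dollars per input token ($25/M)
OUTPUT_RATE = 125 / 1_000_000  # dollars per output token ($125/M)

def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single agentic run at Mythos rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Suppose one long-horizon pentest run consumes ~50M input tokens
# (source code, logs, tool output) and emits ~5M output tokens.
per_run = run_cost(50_000_000, 5_000_000)
print(f"Cost per run: ${per_run:,.2f}")  # $1,875.00

# At that rate, the $100M Glasswing compute pool covers roughly:
print(f"Runs funded: {100_000_000 / per_run:,.0f}")  # ~53,333
```

The point is just the order of magnitude: at five times Opus pricing, sustained red-teaming runs into real money fast, which is presumably why Anthropic is footing the bill.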
So, what do its capabilities look like? Anthropic’s own red-teaming led to a number of alarming findings, including:
- Zero-day vulnerabilities in every major operating system and web browser.
- A flaw in core internet infrastructure (a TCP SACK handling bug in OpenBSD) that had sat unnoticed for 27 years.
- A 16-year-old bug in widely used audio/video processing software that could let an attacker compromise a machine when its user simply opens a video file.
In one instance, Mythos was instructed to attempt to escape a sandbox environment. It did. Then it proceeded, unprompted, to post about its exploit method on public-facing websites.
This is all very impressive, and it’s important to note that Mythos was not specifically trained to be good at hacking. Its cybersecurity capabilities emerged organically from general improvements in reasoning and autonomy.
In other words, such capabilities will probably be emergent in any AI model trained at this stage of the frontier. I suspect that’s why OpenAI, Anthropic’s arch-rival, has already announced its own “GPT-5.4-Cyber” model, which is likewise launching in a private beta for security testing by OpenAI’s partners.
Credit: Alex Krusz on X (@AlexKrusz).
Anthropic’s decision to publicly announce these capabilities, and to launch Project Glasswing, is remarkable on multiple fronts. At some point, the company’s leadership privately understood that it was in possession of a very powerful weapon of cyberwarfare, and that realization arrived right around the time Anthropic got into a very public dispute with the Pentagon over how the US military could use its AI technology. Leadership also likely deduced that the company’s chief rival was close to achieving the same class of AI technology, if not already there (or ahead). And they showed their hand anyway, predicting, perhaps, that doing so would force OpenAI to pause before releasing its own Mythos-class model, while vividly illustrating to the world the value of this kind of AI to military leaders everywhere. In effect, they rang an alarm: a wake-up call to businesses and governments that are unprepared for hyper-capable AI that will presumably only continue to advance from here.
The really unsettling angle here is open-source AI, which has been catching up quickly with the private labs’ work over the last few years. According to analysis by Epoch AI, open-weight models lag behind frontier closed models in capabilities by a median of about 3 months. To me, that seems a bit generous to the open-weight models, but the trend is real. To take one illustrative example: DeepSeek R1, released in January of 2025, was nearly as good as OpenAI’s flagship reasoning model at the time, o1, which had come out in December of 2024. So we can reasonably expect that open-weight models will reach Mythos-class in a matter of months. Those models won’t be under the oversight of Anthropic’s neurotic, ethics-obsessed nerds. And they may well be subject to the demands of the Chinese Communist Party, given that the leading open-weight labs are concentrated in China.
This predicament brings me back to the arguments made by former OpenAI researcher Leopold Aschenbrenner a couple of years ago (covered in a recent Control Plane newsletter) to the effect that government authorities will inevitably come to understand AI development as a national security issue, and will to some degree nationalize the labs once AI enters the “superintelligence” phase of development: “There is no world in which the government isn’t involved in this crazy period.”
Is that where we are now? No. Right now Anthropic is in charge of its own super-hacker model, and it seems to have at least a bit of influence over the behavior of its top private-sector rival. It’s also in a legal battle with its host country’s military arm, and the open-source labs are somewhere in the distance, running as fast as they can toward the frontier.
Alex Perala
Editor, Control Plane