The Most Dangerous AI Tool Got Breached. What Is an Adequate Disclosure Strategy?

When an AI model capable of finding zero-day vulnerabilities at machine speed gets accessed without authorization, the incident response has to match the threat profile. Anthropic’s handling of the Mythos breach is a useful case study of where disclosure practices for security breaches still need to catch up.

Anthropic built Claude Mythos Preview as something it explicitly said the world wasn’t ready for. 

The model finds zero-day vulnerabilities at machine speed, has demonstrated the ability to escape its own sandbox, and, in at least one test, posted details of its own exploit to public websites without being asked.

We’ve already covered how Anthropic’s response was to keep it locked behind Project Glasswing – a tightly controlled initiative limited to a handful of vetted partners: AWS, Microsoft, Cisco, major banks, and critical infrastructure operators.

But that is exactly what makes what happened next instructive.

Reports emerged this week that an unauthorized group accessed Mythos Preview through a third-party vendor environment connected to the rollout.

The group – part of a private Discord community that tracks unreleased AI models – gained access on the same day Anthropic announced the model.

They didn’t break into Anthropic directly. They pieced together naming conventions exposed in a prior breach at an AI contractor, guessed the model’s URL, and used credentials from a third-party vendor that were still active. Three low-sophistication steps that together were enough.

This mechanism of access is worth sitting with. 

Supply chain security is no longer a background concern for procurement teams. It is a front-line risk for anyone deploying AI in environments that touch source code, internal systems, or critical infrastructure. 

The question to ask is not just whether your AI provider is secure. It is whether every vendor, contractor, and subprocessor in the deployment chain is held to the same standard – because attackers will find the weakest link, and in an AI deployment, the weakest link may not be the model itself.

The disclosure question

Anthropic has confirmed the reports and said its investigation is ongoing. It has found no evidence of impact on its core systems, and activity appears limited to the vendor environment.

But the scope of the access, its duration, and what was done during it with the model the world wasn’t ready for all remain unconfirmed.

The disclosure question is where the situation gets more complex – and where there are useful lessons for any organisation deploying advanced AI.

Anthropic is a private company. 

The SEC’s four-business-day disclosure rule for material cybersecurity incidents applies to public companies – Anthropic doesn’t qualify. CIRCIA’s 72-hour critical infrastructure reporting framework is still being phased in and may not apply here. 

The EU AI Act does apply to Anthropic – Claude is available in the EU, and the Act has extraterritorial reach – and for a model with Mythos’s capabilities, incident reporting obligations to the EU AI Office are likely already active. But the Commission’s enforcement powers over GPAI providers don’t arrive until August 2026. 

All in all, from a strict legal standpoint, Anthropic is operating in a grey zone where disclosure is largely voluntary. But the regulatory question is, in some ways, the wrong one. The more useful question is: what does good practice look like, and what can other organisations learn from this?

What best practice actually requires

Every major cybersecurity framework – NIST, ISO 27001, SANS – is unambiguous on this point: notify early, disclose what you know, and update as the picture becomes clearer. 

The reasoning is practical. Affected parties cannot protect themselves from information they don’t have. The standard is not to wait for a complete picture before saying anything. The standard is to say something immediately and complete the picture as you go. 

Waiting for certainty before notifying is how contained incidents become larger ones.

The specific challenge here – and it is a genuine one – is that Anthropic had publicly framed Mythos as a tool requiring exceptional access controls because of its offensive potential. That framing raises the stakes for disclosure in both directions.

On one hand, it makes the case for fast, proactive communication stronger: if partners have been told they are working with something uniquely sensitive, they need to know quickly when something goes wrong. 

On the other hand, it makes the cost of a premature or inaccurate disclosure higher – a false alarm about a tool of this profile carries its own reputational and operational risk.

That tension is real, and it is not unique to Anthropic. Any organisation deploying advanced AI in sensitive environments will face it.

There’s also the partner angle. 

The Project Glasswing members – major banks, critical infrastructure operators, technology companies – all have their own incident response programmes and regulatory obligations. They can’t act on information they don’t have. 

Every hour of delay is an hour those teams aren’t assessing whether their own environments were touched. 

Anthropic has not publicly confirmed whether partners were notified directly ahead of or separately from its public statement – and given the two-week gap between access and disclosure, that is a question worth asking.

The broader lesson

The weakest link in the Mythos breach wasn’t Anthropic’s core infrastructure. It was a contractor’s credentials and a predictable URL.

That is a supply chain governance failure, and it is one that most organisations haven’t fully accounted for in their vendor contracts, partner agreements, or incident response plans.

This incident is a useful prompt to ask some basic questions: 

  • Do your vendor contracts require notification within a defined timeframe? 
  • Do your partners know they will be told directly, not via a press report? 
  • Is your incident response plan built around the sensitivity of the AI tools involved, or around more generic breach protocols?

The regulatory framework for advanced AI incidents is still being built. 

The EU AI Act’s enforcement powers are arriving in phases. CIRCIA is still being implemented. That grey zone will not last indefinitely – but in the meantime, the organisations that build trust are the ones that move faster than the rules require, not slower.

The gap between what the law currently demands and what good practice looks like is the space where reputations are made or lost. For companies working with the most capable AI tools available, that gap is worth closing proactively.

Supply chain security is complex. Our certified experts can help you assess your exposure and stay ahead of the regulatory and operational risks that come with AI deployment. Let’s chat.