A class of AI models touted as terrifyingly powerful apparently scared the government too much and are now deactivated

According to a statement posted on its website on FridayAnthropic was forced to “abruptly deactivate” two of its most prized frontier AI models in response to a highly restrictive order from the US government. “We believe this is a misunderstanding and are working to restore access as soon as possible,” the statement said.

The government action in question is an “export control directive” that says foreign nationals cannot use the models inside or outside the U.S., and was motivated by what Anthropic says was an unspecified national security concern.

But national security concerns and other security fears have been at the center of the launch of these models, which arguably made an event like this foreseeable.

Instead of releasing its Claude Mythos Preview model to the public in early April, Anthropic turned the creation of the model into a kind of awareness campaign about the ostensible dangers of frontier AI models.

He launched a system card explaining why the model would not be made public and detailing terrifying capabilities such as deception and the ability to supposedly break the containment of a limited system. It could also supposedly be useful in the development of advanced weapons. For example, the system card described it as “capable of performing important cross-domain synthesis relevant to the catastrophic development of biological weapons.”

At the same time, the company launched Project Glasswing, a program in which a limited group of partners and organizations were allowed to test the model to learn what new horrors it could inflict on the world of cybersecurity. “We formed Project Glasswing because of the capabilities we have observed in a new Anthropic-trained frontier model that we believe could reshape cybersecurity,” said the Anthropic blog post about Project Glasswing says.

Soon, despite the inherent nerdiness of the topic, Mythos Preview became a sensational story. An article in the New York Post quoted computer scientist Roman Yampolskiy prophesying that, much like what Mythos advertises, AI could soon develop “hacking tools, biological weapons, chemical weapons (and) novel weapons we can’t even imagine.” The phrase “Weapons We Can’t Even Imagine” even made headlines.

British government officials and UK financial sector leaders They scrambled to put together a plan of action in light of the perceived danger. According to the New York TimesThe Trump Administration’s “hands-off policy” toward AI changed after the Mythos announcement, and its mere existence helped lead to the development of a security-focused AI executive order. Trump signed one of those orders About a week ago.

However, last week Anthropic released Claude Fable 5 and Mythos 5. The company described Fable 5 as “a Mythos-class model that we have made safe for general use”, but with capabilities that “exceed those of any model we have made available to the public.” Meanwhile, Mythos 5 got a very limited release as part of Project Glasswing.

Brian Merchant in Blood in the Machine He described it like this:

After sparking a major news cycle in the tech media with its April announcement that it had built an AI model, Mythos, so powerful, so dangerous that it threatened to disrupt the entire civilizational order, and that it was diligently hiding the product from the public to protect us from it, the now number one AI company in the country decided to put Mythos on for sale after all.

Hours after Merchant wrote those words, the export control directive was handed over to Anthropic, and Fable 5 and Mythos 5 were left inaccessible due to apparent national security concerns. It appears that Anthropic was only ordered to revoke access to users who are not US citizens, but it is understandable that Anthropic would find it impractical to allow anyone to access them anywhere in the world for fear of disobeying the order. Among many issues, non-Americans work at Anthropic. It is clearly easier to simply remove the models completely until the situation is resolved.

Interestingly, Anthropic’s statement on the export control directive noted that Anthropic had “worked with the US government,” along with the UK government and “multiple private third-party organizations” in an effort to create a satisfactory set of safeguards for the models. After his release, the safeguards were, in many ways, The most prominent feature of media narrative. around Fable 5. One of the toughest barriers, designed to silently punish users who abused the model, was even considered ill-conceived, prompting Anthropic to apologize.

But according to Anthropic’s account, the government panicked after learning of a Fable 5 leak that bypassed those important safeguards:

“It is our understanding that the government believes it has become aware of a method to bypass or ‘unlock’ Fable 5. We reviewed a demonstration of this specific technique that is used to identify a small number of previously known minor vulnerabilities. These vulnerabilities all appear relatively simple, and we have found that other publicly available models can also discover them without requiring a bypass.

Anthtropic points out, rightly, that when it released Fable 5, the section in your blog post Regarding the safety of the model, he made it clear that some leaks were still possible. It is “probably impossible completely prevent universal jailbreaks, but our goal is to make the remaining jailbreaks slow and expensive enough that we can detect and prevent them before they are used at scale,” Anthropic wrote. Essentially, since it is not yet possible to make a perfectly jailbreak-proof model, Anthropic sought to make jailbreaks either expensive to produce or too “narrow” to be a threat. Anthropic is also public about the fact that retains user data from Mythos class models much longer than usual.

Still, it’s strange to see Anthropic now downplaying the importance of the perceived dangers of its models, writing that these vulnerabilities are “minor,” “previously known,” and “relatively simple,” as well as noting that “other publicly available models can also discover them without needing to overlook them.”

Again, when Anthropic first published this class of models, it told the world that it had created something of unprecedented power with the potential to cause real harm to the world. Two months later, a “Mythos-class” model was a product for public consumption, available as a premium product for users of the “seat-based Pro, Max, Team, and Enterprise plans at no additional cost,” but only for a limited time. On June 23, Anthropic’s intention was to “remove Fable 5 from those plans” and instead mandate a pay-as-you-go plan.

Anthropic claims that government actions like this could, if they became standard, “stop all new model deployments for all frontier model vendors.” And maybe that’s true. For a product launch to be halted when that product launch involved a precursor piece of technology that supposedly merited a global cybersecurity reassessment, an overreaction to holes in that product’s safeguards probably shouldn’t be surprising, even if that overreaction is bad for business.

Source link

A class of AI models touted as terrifyingly powerful apparently scared the government too much and are now deactivated

Leave a ReplyCancel Reply

Amazon’s Fire TV Stick 4K Select drops to $17.99

Eight ways I optimize my Motorola Razr 2026 camera to help me take better photos

PixelRAG Outperforms Text Analyzers in Accuracy and Reduces AI Agent Token Costs by 10X

Leave a ReplyCancel Reply

Trending now

Amazon’s Fire TV Stick 4K Select drops to $17.99

Eight ways I optimize my Motorola Razr 2026 camera to help me take better photos

PixelRAG Outperforms Text Analyzers in Accuracy and Reduces AI Agent Token Costs by 10X