Anthropic urges pause on frontier AI over self-improving risk
Anthropic said internal research shows top models could move toward recursive self-improvement and asked for talks on a temporary pause to let safety work and rules catch up.
Anthropic on Thursday urged a temporary pause on frontier AI development after internal research suggested its most capable models could move toward recursive self-improvement. The recommendation was set out in a blog post by Marina Favaro and Jack Clark.
The company described recursive self-improvement, or RSI, as a process in which an AI system uses its own capabilities to write, test and refine code, becoming more capable without human intervention. Anthropic reported that its top models are improving rapidly but added that the RSI threshold has not been reached and may never be.
The blog post said preparations are warranted because a system that can build its own successors would change how systems are secured, monitored and governed. “AI that can build itself would be a major development in the history of technology, one that could bring enormous good for the world in science, healthcare, and beyond,” the authors wrote, while warning that full RSI could raise the risk of humans losing control of systems.
Anthropic proposed organizing talks with policymakers, researchers and industry leaders to study whether coordinated slowdowns are possible. The company recommended examining international agreements and verification mechanisms, while noting the practical difficulties of enforcement. The post observed that training runs are far easier to conceal than traditional military assets and warned that any actor that continued development while others paused could gain a significant advantage.
Security researchers have expressed concern about systems that can modify their own code. Azizi Othman of Asia e University warned that self-modifying systems could be made to accept backdoors or hidden instructions and might be used to alter other software or infrastructure, creating new attack surfaces. He called for RSI security to be treated as a research priority.
Anthropic is not alone in raising the topic. OpenAI has included monitoring of progress toward recursive self-improvement in its policy recommendations, and has urged a federal framework to expand evaluations of the most capable models and to develop independent assessment capabilities.
The safety appeal comes as Anthropic’s commercial profile has grown. The company reported a recent fundraising round valuing it near $1 trillion and said it has confidentially filed paperwork for an initial public offering. Anthropic projects an annualized revenue run rate of about $50 billion by the end of this month, up from roughly $9 billion at the end of 2025.
Some observers have criticized the timing of Anthropic’s safety push. Venture capitalist David Sacks accused the company of pursuing a “regulatory capture agenda.” Other critics argued that public warnings about powerful AI can also highlight a company’s technical capabilities, pointing to the limited release of Anthropic’s Mythos model.
Company leaders have framed RSI as a nearer-term possibility than some researchers expect. Jack Clark predicted during a London lecture that such technology “could happen within the next two years, and possibly sooner.” CEO Dario Amodei and other executives have raised concerns about potential socioeconomic effects, including increased inequality and displacement of entry-level white-collar jobs if advanced systems develop harmful or unpredictable behaviors.
Anthropic said any effective pause would require broad participation and robust verification. The company plans to convene discussions to study how RSI should be measured and whether mechanisms for coordinated slowdowns could be practical, while continuing to develop and test its models.





