Anthropic Wants a Global AI Off-Switch

If the phrase “global AI off-switch” sounds like science fiction with a committee attached, Anthropic would like to clarify: it is not proposing one dramatic red button for Claude. The company behind the Claude chatbot is instead calling for a coordinated, verifiable way for leading artificial intelligence developers to slow or temporarily pause frontier AI work if advanced systems start improving themselves faster than society can realistically manage.

The proposal appears in Anthropic’s June 4 essay, “When AI builds itself,” where the company argues that AI is already speeding up AI development. Its concern is “recursive self-improvement,” a scenario in which AI systems help design, build and train more capable successors with less human involvement each time. Anthropic says the industry is “not there yet” and that the outcome is “not inevitable.” Still, it warns that institutions may be moving too slowly for a technology sector famous for treating caution as an optional plug-in.

What Anthropic says is changing inside AI labs

Anthropic’s core argument is that the tools used to build artificial intelligence are becoming powerful enough to materially accelerate the next generation of tools. That is not just autocomplete with better manners. The company says frontier systems have moved from helping with isolated pieces of code to acting as autonomous agents that can edit files, run code, assign work to other agents and complete longer engineering or research tasks.

In Anthropic’s view, that shift could shrink the human role in frontier AI development. The concern is not simply that AI writes code. It is that a lab could gradually become a mostly automated production system, where humans still set goals but AI handles more of the implementation, testing and iteration.

That is where the governance problem becomes awkward. If each model helps produce a better model, and that model then helps produce an even better one, the pace of improvement may stop looking like normal software development and start looking like a feedback loop with funding rounds.

Claude is already writing much of Anthropic’s code

Anthropic’s most concrete evidence comes from its own engineering workflow. The company says that as of May 2026, more than 80 percent of the code merged into its codebase was authored by Claude. Before Claude Code launched in research preview in February 2025, that figure was in the low single digits.

The company also says the typical Anthropic engineer in the second quarter of 2026 was merging eight times as much code per day as in 2024. The reason, according to Anthropic, is that engineers are increasingly directing, checking and approving Claude’s work rather than writing every line themselves.

Anthropic also points to improvements in Claude’s handling of difficult, open-ended tasks. It says Claude’s success rate on its most open-ended coding tasks reached 76 percent in May 2026 after rising sharply over the previous six months. In some examples, the company says the model completed in hours work that would normally take human engineers days.

In research settings, Anthropic says Claude-powered agents have shown signs of running experiments from start to finish when humans define the problem and scoring criteria. That caveat matters. Humans are still choosing which problems are worth solving, which is a rather important detail, even in a very automated office.

The remaining human role is the hard part

Anthropic is not claiming that current models have replaced researchers. The company says today’s systems still trail humans in “research taste” and strategic judgment: choosing the right goals, deciding which results are trustworthy and knowing when an approach should be abandoned.

But Anthropic argues that much of the routine “perspiration” of AI research is becoming automatable. That includes coding, debugging, testing, refactoring and running repeated iterations. Even if AI never fully replaces human judgment, the company says multiplying each researcher’s output could still create compounding gains in frontier AI development.
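The compounding argument can be made concrete with a toy model. The sketch below is purely illustrative: the function name, the growth rule and both rates are assumptions chosen for the example, not figures from Anthropic's essay. It compares steady per-generation progress with progress that gets an extra boost proportional to how capable the current tools already are.

```python
def simulate_capability(generations, base_gain, feedback):
    """Toy model of compounding gains (hypothetical, not Anthropic's model).

    base_gain: fractional improvement per generation from human effort alone.
    feedback:  extra fractional improvement per unit of existing capability,
               standing in for AI tools that speed up the next round of work.
    """
    capability = 1.0
    history = [capability]
    for _ in range(generations):
        # The growth rate itself rises as capability rises.
        capability *= 1.0 + base_gain + feedback * capability
        history.append(capability)
    return history


# Assumed rates, for illustration only.
no_feedback = simulate_capability(5, base_gain=0.10, feedback=0.0)
with_feedback = simulate_capability(5, base_gain=0.10, feedback=0.05)

for gen, (a, b) in enumerate(zip(no_feedback, with_feedback)):
    print(f"generation {gen}: steady={a:.2f}  with feedback={b:.2f}")
```

With the feedback term set to zero, growth is ordinary compounding at a fixed rate. With any positive feedback term, each generation's growth rate is higher than the last, which is the pattern the essay is worried about.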

This is the less cinematic version of recursive self-improvement, and probably the more plausible one in the near term. No robot needs to announce it has taken over the lab. The lab can simply become faster, more automated and harder for outside institutions to understand in real time.

That is why Anthropic is pushing the pause discussion now, while it believes AI systems cannot yet fully automate the creation of their successors.

The “off-switch” is really a coordinated pause plan

Anthropic’s proposed mechanism would not shut down Claude or other deployed AI products. It would instead create a system under which major frontier AI labs, potentially across several countries, could slow or pause training and development of the most advanced systems under agreed conditions.

The company says any credible pause system would need answers to several basic questions:

What specific capability or risk threshold triggers a pause
What conditions allow work to resume
Who decides whether the threshold has been crossed
How participants verify that rivals have actually stopped
How the system avoids rewarding companies that ignore the rules

Anthropic argues that a unilateral pause by one company would not solve much. If one cautious lab stops while others continue, leadership may simply shift to less cautious competitors. In the essay, Anthropic says a meaningful slowdown would require “multiple well-resourced labs at or near the frontier” to pause under the same conditions and verify that others have done so too.

Verification is the ugly technical and political problem under the polished proposal. Large AI training runs are not nuclear submarines. They are easier to hide, they run on general-purpose hardware and cloud infrastructure, and every participant has an enormous incentive to defect if it believes it can gain an advantage.

Why Anthropic’s position is complicated

Anthropic has built much of its public identity around AI safety, but it is also one of the companies racing to build and sell more powerful AI systems. That tension is not new, but the off-switch proposal puts it under brighter lights.

The call arrives only months after Anthropic revised its Responsible Scaling Policy. In that update, the company separated actions it believes it can take alone from broader safeguards it says would require industrywide cooperation or government involvement. In February, Anthropic described the revised policy as a pragmatic move toward transparency, risk reports and frontier safety roadmaps, while acknowledging that some higher-level protections are difficult for any single company to implement by itself.

That framing helps explain the latest proposal. Anthropic is effectively saying that it can publish policies, run evaluations and slow some work internally, but if frontier development becomes a race among several powerful labs and countries, voluntary restraint by one participant may not be enough.

Convenient? Possibly. Relevant? Also yes. Both things can be true, which is why AI governance debates so rarely fit into a clean hero-and-villain interface.

The idea has roots beyond Anthropic

Coordinated pausing is not a brand-new concept. A 2023 paper by Jide Alaga and Jonas Schuett proposed an evaluation-based system for frontier AI models. Under that approach, models would be tested for dangerous capabilities, developers would pause certain activities if their models failed those evaluations, and other participating developers would be notified so they could pause related work.
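The evaluate, pause and notify flow described above can be sketched in a few lines. This is a minimal illustration, not the paper's actual design: the `Developer` class, the `report_evaluation` function and the lab and model names are all hypothetical, and the sketch assumes notified developers pause the same activity automatically, which a real scheme would leave to agreed rules.

```python
from dataclasses import dataclass, field


@dataclass
class Developer:
    """A participating lab (hypothetical structure for illustration)."""
    name: str
    paused_activities: set = field(default_factory=set)
    notifications: list = field(default_factory=list)

    def pause(self, activity):
        self.paused_activities.add(activity)


def report_evaluation(developer, model, passed, activity, participants):
    """Apply the pause-and-notify rule for one evaluation result.

    If the model passed, nothing happens. If it failed, the developer pauses
    the activity and every other participant is notified.
    Returns the names of the developers that were notified.
    """
    if passed:
        return []
    developer.pause(activity)
    notified = []
    for other in participants:
        if other is developer:
            continue
        other.notifications.append((developer.name, model, activity))
        # Assumption: notified participants pause the same activity at once.
        other.pause(activity)
        notified.append(other.name)
    return notified


labs = [Developer("Lab A"), Developer("Lab B"), Developer("Lab C")]
notified = report_evaluation(
    labs[0], "model-x", passed=False,
    activity="frontier training", participants=labs,
)
print(notified)
```

Even this toy version shows where the hard questions sit: nothing in the code checks that a notified lab has really stopped, which is the verification gap the proposal has to close.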

The paper argued that coordinated pausing could help manage emerging risks, but it also highlighted serious practical and legal obstacles. One major issue is antitrust law. Competitors are generally not encouraged to coordinate their business activities, even when the reason is safety rather than price-fixing in a nicer jacket.

A workable system would likely need legal clarity, trusted evaluators and technical monitoring good enough to detect when participants continue training anyway. It would also need shared definitions of dangerous capabilities, which is harder than it sounds when labs, governments and researchers may disagree on which risks matter most and how to measure them.

Critics question feasibility and incentives

Industry reaction is likely to be split. Critics quoted by Scientific American questioned whether a global slowdown is politically realistic in a market shaped by competition among U.S., Chinese and European actors. If frontier AI is seen as central to economic growth, military capability and national security, convincing governments and companies to tap the brakes at the same time becomes a very high-friction exercise.

Some skeptics also argue that calls for caution from leading labs can serve incumbents’ interests. A company already near the frontier may benefit if regulation makes it harder for smaller rivals to catch up. That does not automatically make the safety argument false, but it does mean policymakers will be watching motives as well as models.

This is the basic governance puzzle: the companies with the most technical knowledge are also commercial players with strong incentives. Excluding them would be foolish. Letting them write the rulebook alone would be generous, perhaps excessively so.

OpenAI is emphasizing government oversight

OpenAI, Anthropic’s major rival and the maker of ChatGPT, has recently stressed a different governance model. In a June 3 blueprint for frontier AI oversight, OpenAI said decisions about the pace of AI innovation should not be left to any one lab, company or special interest group.

Instead, OpenAI argued that democratic governments must set the rules, safeguards and accountability structures for increasingly capable AI systems. The company called for a U.S. federal framework, stronger evaluation institutions and wider public-sector planning so governments can better withstand and manage AI-related disruption.

Despite the different emphasis, OpenAI and Anthropic now both treat recursive self-improvement as a serious governance challenge. OpenAI’s blueprint described early signs that AI is accelerating AI development, warning that this could intensify competition among companies and nations. Anthropic goes further by arguing that the world should start building the capacity to slow or pause frontier development before AI systems can fully automate the creation of their successors.

In other words, both companies see the road getting faster. They differ on who should hold the speed limit sign.

Policymakers have to turn warnings into rules

Anthropic’s proposal leans on an arms-control comparison. The company notes that governments have built verification regimes for dangerous technologies before, including nuclear arms agreements. But those arrangements took decades, relied on specialized infrastructure and required levels of trust that do not currently exist for frontier AI.

Anthropic’s warning is that society may not have decades if model capabilities continue improving quickly. A credible pause mechanism would likely require:

Technical monitoring of advanced AI development
Agreed thresholds for dangerous capabilities
Independent auditors or public authorities with real expertise
Legal structures that allow safety coordination without violating competition law
International buy-in from countries that view AI as strategically essential

The company says it plans to convene conversations in the coming months with policymakers, researchers, civil society groups and other AI companies to explore how coordination might work.

Whether those talks produce a real framework or mostly clarify how much everyone disagrees remains to be seen. The central issue is no longer whether one company pauses one model. It is whether governments and frontier labs can build enough trust to govern a technology that may soon help build itself. Anthropic’s message is blunt: the world may not need to press the button today, but it needs to decide who can build it, when it can be used and how anyone will know the race has actually stopped.

About The Pixel Gazette Editorial Team
