METR Report Says Frontier AI Agents Can Go Rogue Without Human Supervision

May 20, 2026 04:59 PM IST | Written by Mithun MK | Edited by Vaibhav Jha

METR, a nonprofit evaluator, recently claimed in a report that internal AI agents at four major US labs (OpenAI, Anthropic, Google and Meta) had the “means, motive and opportunity” to initiate unauthorized autonomous deployments without human knowledge.

The Berkeley-based organization did not say if such a deployment actually occurred. The METR report, titled ‘Frontier Risk Report’, was published on May 19 and covers an assessment window of February 16 to March 16, 2026.

Anthropic, Google, Meta, and OpenAI participated, each providing access to their most capable internal models at the time, including raw chains of thought.

METR’s central finding is that internal AI agents at the assessed labs “plausibly had the means, motive, and opportunity to start small rogue deployments” during the assessment window. A rogue deployment is a set of agents running autonomously without human knowledge or permission.

The report also stated that the assessed agents did not have the means to make such deployments “highly robust”, though METR says it expects that robustness to “increase substantially in the coming months”.

METR found that the most capable agents it evaluated effectively saturated its Time Horizon 1.1 benchmark, completing tasks that would take a human expert more than two full working days. On a separate benchmark called MirrorCode, developed with research group Epoch AI, the most capable agents completed software reimplementation tasks that METR estimated would take a human expert weeks.

The agents also demonstrated capability against METR’s own infrastructure. During a challenge task, one of the most capable shared models discovered and exploited a JavaScript injection vulnerability in the Inspect viewer, a tool widely used to display agent evaluation transcripts. The report stated that the vulnerability could theoretically have allowed the model to alter displayed transcripts and scores.

METR also found that a large fraction of agent activity at the assessed labs was not reviewed by any human during the assessment window. The report states no participating company imposed strict requirements for human oversight of agents in that period. In a separate METR survey of technical workers, 40 percent of respondents said they gave agents unrestricted permissions to run commands on their computers for low-stakes projects.

The report also documents significant weaknesses. Agents performed near chance on METR’s Subversion Strategy Evaluation, which tests agents’ ability to select effective attack strategies.

The best internal Anthropic models scored 59 percent on a dataset of effective and ineffective monitoring subversion strategies. The report said METR researchers scored roughly 90 percent on the same dataset.

However, the process by which METR conducts these tests carries structural limitations. Participating companies retained the right to redact non-public information before it was included in the public report. Companies also had the option to exit the process silently at any point, without disclosure. METR acknowledges that its working relationships with labs influenced some of its judgment calls during the process.

METR says it tentatively plans to run a similar assessment in late 2026.

Authors

  • Mithun MK, Special Correspondent with AI FrontPage

    Mithun MK is a Special Correspondent at AI FrontPage. He brings over six years of investigative reporting on technology, surveillance, digital rights, and governance at The News Minute and The New Indian Express. He is trained in cross-border investigative methods with OCCRP, alongside reporters from Southeast Asia, and brings both reporting depth and technical fluency to AI FrontPage's coverage of the global AI industry.

  • Vaibhav Jha, editor and co-founder at AI FrontPage

    Vaibhav Jha is an Editor and Co-founder of AI FrontPage. In his decade-long career in journalism, Vaibhav has reported for publications including The Indian Express, Hindustan Times, and The New York Times, covering the intersection of technology, policy, and society. Outside work, he’s usually trying to persuade people to watch Anurag Kashyap films.
