What did Google DeepMind release on June 18, 2026?
Google DeepMind published its AI Control Roadmap — a 35-page technical report — on June 18, 2026. The document details how the company plans to monitor and contain AI agents that might not behave as intended. DeepMind also released a companion policy paper aimed at policymakers, covering security across individual agents, multi-agent systems, and broader ecosystem resilience.
The roadmap is the first of its kind published by a frontier AI company, according to DeepMind researcher Victoria Krakovna, as reported by Fortune.
Why is DeepMind treating its own AI agents as threats?
The core premise is simple: alignment training alone may not be enough. Alignment is the practice of training an AI so its actions reliably match human intentions and values. DeepMind's roadmap acknowledges this problem may never be fully solved.
"If the first line of defense — alignment — fails, how can we mitigate harm anyway?" Rohin Shah, who leads the AGI safety and alignment team at Google DeepMind, told Fortune.
The answer is a second line of defense: treat AI agents the same way security teams treat potentially malicious human insiders.
How does the AI Control Roadmap work?
The roadmap layers 15 practical controls across two tiers: detection and response. Detection ranges from monitoring an agent's chain-of-thought reasoning up to scrutinizing opaque internal reasoning. Response options range from asynchronous human review all the way to real-time blocking.
Here's how the two tiers break down based on the sources:
| Tier | What it covers |
|---|---|
| Detection | Chain-of-thought monitoring, opaque reasoning scrutiny, behavioral pattern analysis |
| Response | Async review, escalation to human oversight, real-time access blocking |
The safeguards escalate as models become more capable. DeepMind also proposes using other AI systems as "supervisors" to review an agent's reasoning and flag when it goes off track, according to Axios.
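The step from detection to response can be pictured as a simple escalation policy. The sketch below is illustrative only and is not taken from the roadmap: the `Response` names mirror the response tier in the table above, while the `suspicion` score, the `choose_response` function, and every threshold are assumptions invented for the example.

```python
from enum import Enum


class Response(Enum):
    """Response options, loosely mirroring the response tier described above."""
    ALLOW = "allow"
    ASYNC_REVIEW = "async_review"
    ESCALATE_TO_HUMAN = "escalate_to_human"
    BLOCK_IN_REAL_TIME = "block_in_real_time"


def choose_response(
    suspicion: float,
    review_at: float = 0.3,    # hypothetical threshold
    escalate_at: float = 0.6,  # hypothetical threshold
    block_at: float = 0.9,     # hypothetical threshold
) -> Response:
    """Map a detector's suspicion score in [0, 1] to the mildest sufficient response."""
    if not 0.0 <= suspicion <= 1.0:
        raise ValueError("suspicion must be between 0 and 1")
    if suspicion >= block_at:
        return Response.BLOCK_IN_REAL_TIME
    if suspicion >= escalate_at:
        return Response.ESCALATE_TO_HUMAN
    if suspicion >= review_at:
        return Response.ASYNC_REVIEW
    return Response.ALLOW
```

Under this assumed design, lowering the thresholds as models grow more capable would be one way to make the safeguards escalate over time.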
Why doesn't role-based access control work for AI agents?
Traditional access control ties permissions to an employee's role. A systems administrator gets different access than a receptionist. Some AI orchestration software tries to port this idea to agents — giving an agent the same permissions as the human it works for.
John "Four" Flynn, Google DeepMind's vice president of security and privacy, told Fortune this breaks down quickly. In automated workflows, it can be unclear which human an agent is working for. A single agent may also perform tasks tied to many different roles. And as AI systems grow more capable, organizations may want one agent to operate across many workflows at once.
Flynn's proposed fix: a dynamic access control system that updates in real time based on the specific task an agent is doing at that moment. The same agent might have permission to access a database for one task, then lose that permission minutes later in a different workflow.
This approach to AI agent monitoring, in which normal behavior is defined per task and per workflow, is central to DeepMind's plan.
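A minimal sketch of what task-scoped permissions could look like follows. It is not DeepMind's design: the `Task` and `AgentSession` classes, the resource names, and the deny-by-default rule are all hypothetical, chosen only to show permissions following the task rather than the agent or a human's role.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Task:
    """A unit of work plus the resources it legitimately needs (hypothetical model)."""
    name: str
    allowed_resources: FrozenSet[str]


class AgentSession:
    """Tracks which task an agent is doing right now and derives access from it."""

    def __init__(self, agent_id: str) -> None:
        self.agent_id = agent_id
        self.current_task: Optional[Task] = None

    def start_task(self, task: Task) -> None:
        # Starting a new task replaces the previous task's permissions entirely.
        self.current_task = task

    def finish_task(self) -> None:
        # With no active task, the agent has no access at all.
        self.current_task = None

    def can_access(self, resource: str) -> bool:
        # Deny by default; allow only what the current task declares.
        if self.current_task is None:
            return False
        return resource in self.current_task.allowed_resources


# The same agent gains database access for one task and loses it in the next.
session = AgentSession("agent-1")
session.start_task(Task("process_refund", frozenset({"orders_db"})))
refund_access = session.can_access("orders_db")
session.start_task(Task("send_newsletter", frozenset({"mailing_list"})))
newsletter_access = session.can_access("orders_db")
```

Here `refund_access` is true and `newsletter_access` is false, even though the agent itself never changed. A real system would also need to authenticate who assigned the task, which this sketch leaves out.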
What has DeepMind already built?
DeepMind has moved beyond theory on at least one front. An internal prototype asynchronously monitors coding agent trajectories and escalates suspicious findings for human review. That system has analyzed roughly 1 million coding agent tasks.
The work has been used to build a live monitor for the Gemini Spark agent, allowing DeepMind to respond in real time to issues such as unintentional data deletion. Most alerts so far trace to simple misinterpretation rather than deliberate sabotage, according to Digg's summary of the release.
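The asynchronous pattern described above, in which agent trajectories are checked after the fact and only suspicious ones reach a human, can be sketched with a queue and a background worker. Everything here is assumed for illustration: the `Trajectory` shape, the `SUSPICIOUS_MARKERS` list, and the string-matching check are toy stand-ins and say nothing about how DeepMind's prototype actually scores trajectories.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical markers; a real monitor would use far richer signals.
SUSPICIOUS_MARKERS = ("rm -rf", "DROP TABLE")


@dataclass
class Trajectory:
    """A recorded sequence of commands from one coding agent task (assumed shape)."""
    task_id: str
    commands: List[str]


def looks_suspicious(trajectory: Trajectory) -> bool:
    """Toy detector: flag a trajectory if any command contains a marker."""
    return any(
        marker in command
        for command in trajectory.commands
        for marker in SUSPICIOUS_MARKERS
    )


async def monitor(queue: asyncio.Queue, human_review: List[Trajectory]) -> None:
    """Consume finished trajectories and escalate suspicious ones for review."""
    while True:
        item: Optional[Trajectory] = await queue.get()
        if item is None:  # sentinel: no more trajectories
            break
        if looks_suspicious(item):
            human_review.append(item)


async def run_demo() -> List[str]:
    queue: asyncio.Queue = asyncio.Queue()
    human_review: List[Trajectory] = []
    worker = asyncio.create_task(monitor(queue, human_review))
    await queue.put(Trajectory("task-1", ["git status", "pytest"]))
    await queue.put(Trajectory("task-2", ["rm -rf ./data"]))
    await queue.put(None)
    await worker
    return [t.task_id for t in human_review]


escalated = asyncio.run(run_demo())
```

Because the monitor runs off a queue, the agent is never blocked while its work is reviewed. That is the trade-off of asynchronous review compared with real-time blocking: lower overhead, but harm can occur before anyone looks.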
How is this different from standard AI safety work?
Most AI safety research focuses on the alignment problem — getting the model's values right during training. DeepMind's roadmap doesn't abandon alignment, but it explicitly treats it as one layer among many.
"The first line of defense is always to align the AI systems, but it's always good to have multiple layers of defense," Shah told Axios. "That's the responsible thing to do."
The key difference is the security framing. DeepMind borrows directly from insider-threat prevention, a well-established discipline in corporate cybersecurity. Shah noted, however, that "AI is systematically different from humans." AI agents can act far faster and at far greater scale than any individual rogue employee.
This parallels concerns the AI safety community has raised around AI alignment and governance at the international level.
Who are the key people behind this roadmap?
Here are the named contributors from the sources:
- Rohin Shah — leads the AGI safety and alignment team at Google DeepMind
- John "Four" Flynn — vice president of security and privacy at Google DeepMind
- Victoria Krakovna — member of the GDM control team, noted this is the first control roadmap from a frontier AI company
- Mary Phuong — Google DeepMind researcher who announced the release
- Séb Krier — Google DeepMind, announced both publications on June 18
As we read the roadmap's framing, the most notable move is the explicit acknowledgment that alignment may never be fully solved — and that security-layer thinking needs to fill that gap now, not later.
DeepMind says truly dangerous autonomous agents are not here yet, but the company is building the monitoring infrastructure before they arrive. The Gemini Spark live monitor is the clearest sign that this work has already left the research stage.
