Meta Bans Claude Code & Codex Over Distillation

Meta restricted internal use of Claude Code and OpenAI Codex after company documents warned that rival AI outputs could seep into Meta's training data, raising distillation concerns.

What did Meta actually ban, and why?

Meta restricted engineers from using Anthropic's Claude Code and OpenAI's Codex, according to internal documents reported by The Information. The concern was not cost or productivity. Meta's stated worry was distillation — and what it could do to the company's own AI training data.

An internal memo reportedly told teams to pause certain tasks that relied on those models. The document warned that allowing rival AI outputs into Meta's training data could trigger "serious escalations with partner companies." That language suggests the risk is partly contractual.

What is a distillation attack?

A distillation attack trains one AI model to mimic a more capable one. The attacker floods the target model with prompts, collects its responses, and uses those responses to train their own model cheaply, without building from scratch.

Distillation is not always malicious. As InformationWeek reports, frontier AI labs use distillation legitimately to create smaller versions of their own models. Shatabdi Sharma, CIO at Capacity, described it as "a teacher model and a student model that is still learning."
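
To make the teacher and student picture concrete, here is a minimal sketch of the classic soft-label form of distillation, in which a student is penalized for diverging from the teacher's output distribution. This is an illustration under assumptions, not a description of how any lab named in this article trains its models: the function names, the example logits, and the temperature value are all hypothetical, and attacks run through a public API typically collect text responses rather than raw logits.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    The student is trained to push this value toward zero, which means
    reproducing the teacher's relative preferences across all outputs.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical scores over three candidate outputs.
teacher = [4.0, 1.0, 0.5]
untrained_student = [0.2, 0.1, 0.3]
print(distillation_loss(teacher, untrained_student))  # positive: student disagrees
print(distillation_loss(teacher, teacher))            # zero: perfect mimicry
```

The same objective serves both the legitimate and the unauthorized case. What differs is whether the party collecting the teacher's outputs has permission to do so.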

The problem arises when the method is used without authorization — and at scale.

How did distillation attacks reach industrial scale?

Anthropic published a blog post describing how three AI laboratories — DeepSeek, Moonshot, and MiniMax — used thousands of fraudulent accounts and proxy services to extract capabilities from Claude. OpenAI has also accused DeepSeek of running distillation attacks against its models.

Here's what we know so far: these weren't small-scale experiments. The operations involved coordinated infrastructure designed to systematically pull model capabilities at volume.

Anthropic flagged that distilled models often lack the safeguards of the originals. The company also noted that this poses national security risks, in addition to threatening the competitive advantage of frontier model providers.

Why did Anthropic's terms update matter to Meta?

Anthropic updated its consumer terms in August and September of 2025 to allow opt-in training on select datasets. That revision reportedly sharpened attention inside legal and security teams at companies like Meta.

The mechanics are straightforward. When an engineer uses Claude Code to debug a model training script, chunks of that codebase can travel to external servers. That is normal behavior for AI coding tools. But it becomes a meaningful risk when the codebase is proprietary — and when the model provider's terms around training data have recently changed.
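
One way a team might reduce what leaves the building is to scrub obvious secrets from a snippet before it is pasted into or sent to any external tool. The sketch below is a hypothetical pre-send filter, not a feature of Claude Code, Codex, or any Meta system. The `redact` function and its two patterns are assumptions chosen for illustration, and a real deployment would need patterns tuned to its own secrets and identifiers.

```python
import re

# Hypothetical patterns for illustration only.
# Matches assignments such as API_KEY = "..." or db_password: '...'.
SECRET_ASSIGNMENT = re.compile(
    r"""(?i)(\w*(?:api_key|token|password|secret)\w*)(\s*[:=]\s*)(["'])(.*?)\3"""
)
# Matches simple email addresses.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def redact(source: str) -> str:
    """Return a copy of `source` with quoted secrets and emails masked."""
    # Keep the variable name and quotes, replace only the secret value.
    masked = SECRET_ASSIGNMENT.sub(r"\g<1>\g<2>\g<3><REDACTED>\g<3>", source)
    return EMAIL.sub("<EMAIL>", masked)


snippet = 'API_KEY = "sk-12345"\nowner = "alice@example.com"\n'
print(redact(snippet))
```

Pattern-based redaction only catches what it is told to look for. It does nothing about proprietary logic in the code itself, which is the exposure the reported restrictions appear to target.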

Who else is at risk from distillation attacks?

Distillation risk is not limited to frontier AI labs. Enterprises building proprietary models in high-value verticals are also targets.

Tony Garcia, chief information and security officer at Infineo, put it plainly: "If somebody has a particularly good model that they develop in a certain vertical, whether it's legal or healthcare, et cetera, then certainly [they] can be open to attacks, for somebody to do it better, faster, cheaper."

Users of illicitly distilled models face their own exposure. Those models may lack safeguards, and enterprise data fed into them could be at risk of leakage. John Bruggeman, consulting CISO at CBTS, warned: "There's going to be legal risk to organizations that are using pirated LLM models."

What defenses can enterprises put in place?

Security experts named several concrete measures enterprises can take:

  • Rate limiting — Bruggeman recommended capping the number of queries allowed in a given time window as a first line of defense, even though it cannot stop threat actors running thousands of accounts.
  • Watermarking — The Open Worldwide Application Security Project (OWASP) is developing a watermarking project aimed at verifying model authenticity and cutting down unauthorized usage.
  • The Glaze Project — An initiative out of the University of Chicago that develops tools making unauthorized AI training more difficult, also cited by Bruggeman.
  • Data anonymization — Garcia recommended that CIOs minimize exposure by anonymizing data before it enters any frontier model.
  • Business impact assessment — Bruggeman advised calculating the value of data at risk: "What's it going to cost if this data gets away?"
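
The rate limiting measure in the list above can be sketched as a per-account sliding window. This is a minimal illustration under assumptions: the class name, the `account_id` key, and the limits are hypothetical, and a production service would usually enforce this in shared storage rather than in process memory.

```python
import time
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most `max_queries` per `window_seconds` for each account."""

    def __init__(self, max_queries, window_seconds, clock=time.monotonic):
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._hits = defaultdict(deque)  # account_id -> recent timestamps

    def allow(self, account_id):
        """Return True and record the query if the account is under its cap."""
        now = self.clock()
        hits = self._hits[account_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_queries:
            return False
        hits.append(now)
        return True


# Hypothetical limit: 100 queries per account per minute.
limiter = SlidingWindowRateLimiter(max_queries=100, window_seconds=60)
print(limiter.allow("account-1"))
```

Because the cap is keyed by account, it illustrates the limitation Bruggeman points out: an operator running thousands of accounts stays under every individual limit, so rate limiting slows extraction without stopping it.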

Sharma also raised the question of model provenance: "Are there any watermarks that exist so that we can confirm the lineage of the model and make sure that it isn't a result of a distillation attack?"

How does this connect to the broader AI supply chain risk?

The DeepSeek distillation accusations and Meta's internal restrictions point to the same underlying problem: AI model outputs are now a supply chain input, and that supply chain has attack surfaces.

For CIOs, InformationWeek frames distillation as a supply chain risk on par with any other. Governance is the foundation — understanding what data enters external models, what terms govern its use, and what safeguards exist on the receiving end.

Claude's paid user growth and the broader expansion of AI coding tools inside enterprises mean more proprietary code is touching external model infrastructure than ever before. That exposure is structural, not incidental.

The best-supported claim in the sources is that Meta's internal memo explicitly warned that rival AI outputs in its training data could trigger "serious escalations with partner companies." That makes it a contractual red line, not just a competitive one.

Frequently asked questions

**Why did Meta ban Claude Code and OpenAI Codex internally?**
According to internal documents reported by The Information, Meta restricted both tools because of distillation risk. The concern was that outputs from rival AI models could enter Meta's own training data. An internal memo warned this could trigger "serious escalations with partner companies," suggesting the risk is partly contractual as well as competitive.

**What is a distillation attack in AI?**
A distillation attack is when an attacker floods a target AI model with prompts, collects the responses, and uses them to train a cheaper copycat model. Anthropic documented how DeepSeek, Moonshot, and MiniMax used thousands of fraudulent accounts and proxy services to extract capabilities from Claude at industrial scale. OpenAI has also accused DeepSeek of the same practice.

**What did Anthropic change in its terms of service in 2025?**
Anthropic updated its consumer terms in August and September of 2025 to allow opt-in training on select datasets. That change reportedly heightened concern inside legal and security teams at companies like Meta, because it altered the conditions under which data passing through Claude-based tools could be used.

**How can enterprises defend against distillation attacks?**
Experts recommend several measures: rate limiting queries to slow systematic extraction, watermarking models to verify provenance, anonymizing data before it enters external AI tools, and conducting business impact assessments to understand the cost of data exposure. OWASP is also developing a dedicated AI model watermarking project to help verify model authenticity.

**Are distilled AI models dangerous for enterprise users?**
Yes, according to security experts cited by InformationWeek. Distilled models often lack the safeguards of the originals, which creates risk for enterprise data fed into them. John Bruggeman, consulting CISO at CBTS, warned there is "legal risk to organizations that are using pirated LLM models," whether they chose them knowingly or not.

Verified claims

Each key claim below was checked against its source — the exact supporting passage is quoted so you can confirm it yourself.

  1. Shatabdi Sharma, CIO at Capacity, described distillation as 'a teacher model and a student model that is still learning.'

    a teacher model and a student model that is still learning
    Verified informationweek.com

Sources

  1. InformationWeek reports informationweek.com
