Meta CEO Mark Zuckerberg has pledged to make artificial general intelligence (AGI) — which is roughly defined as AI that can accomplish any task a human can — openly available one day. But in a new policy document, Meta suggests that there are certain scenarios in which it may not release a highly capable AI system it developed internally.
The document, which Meta is calling its Frontier AI Framework, identifies two types of AI systems the company considers too risky to release: “high-risk” and “critical-risk” systems.
As Meta defines them, both “high-risk” and “critical-risk” systems are capable of aiding in cybersecurity, chemical, and biological attacks, the difference being that “critical-risk” systems could result in a “catastrophic outcome [that] cannot be mitigated in [a] proposed deployment context.” High-risk systems, by contrast, might make an attack easier to carry out, but not as reliably as a critical-risk system.
What sorts of attacks are we talking about here? Meta gives a few examples, like the “automated end-to-end compromise of a best-practice-protected corporate-scale environment” and the “proliferation of high-impact biological weapons.” The list of possible catastrophes in Meta’s document is far from exhaustive, the company acknowledges, but it includes those that Meta believes to be “the most urgent” and most likely to arise as a direct result of releasing a powerful AI system.
Somewhat surprisingly, according to the document, Meta classifies system risk not with any single empirical test but based on input from internal and external researchers, which is subject to review by “senior-level decision-makers.” Why? Meta says that it doesn’t believe the science of evaluation is “sufficiently robust as to provide definitive quantitative metrics” for deciding a system’s riskiness.
If Meta determines a system is high-risk, the company says it will limit access to the system internally and won’t release it until it implements mitigations to “reduce risk to moderate levels.” If, on the other hand, a system is deemed critical-risk, Meta says it will implement unspecified security protections to prevent the system from being exfiltrated and stop development until the system can be made less dangerous.
Meta’s Frontier AI Framework, which the company says will evolve with the changing AI landscape, appears to be a response to criticism of the company’s “open” approach to system development. Meta has embraced a strategy of making its AI technology openly available — albeit not open source by the commonly understood definition — in contrast to companies like OpenAI that opt to gate their systems behind an API.
For Meta, the open release approach has proven to be a blessing and a curse. The company’s family of AI models, called Llama, has racked up hundreds of millions of downloads. But Llama has also reportedly been used by at least one U.S. adversary to develop a defense chatbot.
In publishing its Frontier AI Framework, Meta may also be aiming to contrast its open AI strategy with Chinese AI firm DeepSeek’s. DeepSeek also makes its systems openly available. But the company’s AI has few safeguards and can be easily steered to generate toxic and harmful outputs.
“[W]e believe that by considering both benefits and risks in making decisions about how to develop and deploy advanced AI,” Meta writes in the document, “it is possible to deliver that technology to society in a way that preserves the benefits of that technology to society while also maintaining an appropriate level of risk.”
Kyle Wiggers is a senior reporter at TechCrunch with a special interest in artificial intelligence. His writing has appeared in VentureBeat and Digital Trends, as well as a range of gadget blogs including Android Police, Android Authority, Droid-Life, and XDA-Developers. He lives in Brooklyn with his partner, a piano educator, and dabbles in piano himself occasionally, if mostly unsuccessfully.