A coalition of developers of large foundation models has begun adopting corporate protocols to evaluate and mitigate risks associated with their artificial intelligence (AI) technologies. Beginning in September 2023, several key AI companies have voluntarily published such protocols, aimed at addressing severe risks posed by their models. The initiative gained momentum at the AI Seoul Summit in May 2024, where sixteen companies committed to the Frontier AI Safety Commitments; four more companies have joined since. Currently, twelve organizations, including Anthropic, OpenAI, Google DeepMind, Meta, and Microsoft, have made their frontier AI safety policies public.
The initial report, released in August 2024, focused on commonalities in the safety policies of Anthropic, OpenAI, and Google DeepMind. By March 2025, with twelve policies publicly available, the document was updated to incorporate new insights and developments. The latest version, published in December 2025, reflects updates to several developers' safety policies as well as relevant provisions of the EU AI Act and California's Senate Bill 53.
Each policy examined in the reports defines capability thresholds: levels of model capability, such as the ability to facilitate biological weapons development, enable cyberattacks, or replicate autonomously, that could lead to severe or catastrophic outcomes. The developers commit to running evaluations to determine whether their models are approaching these thresholds. When a model nears a threshold, the policies call for securing model weights and applying deployment mitigations, especially for models found to have capabilities of concern.
In response to these risks, developers have pledged to secure model weights against theft by sophisticated adversaries and to implement safeguards that minimize the risk of misuse. The policies also include provisions to halt development and deployment should mitigation efforts prove inadequate. To manage risk effectively, evaluations are designed to thoroughly assess model capabilities during training, before deployment, and after deployment. The policies further emphasize exploring accountability mechanisms, including potential oversight by third parties or advisory boards, to monitor policy implementation and assist with evaluations.
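As a loose illustration of the workflow these policies describe, the Python sketch below models the decision logic: run capability evaluations, flag domains where a model approaches a threshold, then either proceed, deploy with mitigations, or halt. The capability domains, threshold values, scores, and names (CAPABILITY_THRESHOLDS, assess_model, decide_action) are hypothetical assumptions for illustration only and do not reproduce any developer's actual framework.

```python
# Hypothetical sketch of the policy decision flow described above.
# Domain names, threshold values, and scores are illustrative
# assumptions, not any developer's actual evaluation framework.

from dataclasses import dataclass

# Illustrative capability domains with hypothetical threshold scores.
CAPABILITY_THRESHOLDS = {
    "bioweapons_uplift": 0.8,
    "cyberattack_assistance": 0.8,
    "autonomous_replication": 0.8,
}


@dataclass
class EvaluationResult:
    domain: str
    score: float      # hypothetical 0-1 score from capability evaluations
    threshold: float

    @property
    def approaches_threshold(self) -> bool:
        return self.score >= self.threshold


def assess_model(eval_scores: dict) -> list:
    """Flag domains where the model approaches a capability threshold.

    In the policies, such checks are run during training, before
    deployment, and after deployment.
    """
    flagged = []
    for domain, threshold in CAPABILITY_THRESHOLDS.items():
        result = EvaluationResult(domain, eval_scores.get(domain, 0.0), threshold)
        if result.approaches_threshold:
            flagged.append(domain)
    return flagged


def decide_action(flagged: list, mitigations_adequate: bool) -> str:
    """Map evaluation outcomes to the policy responses described above."""
    if not flagged:
        return "proceed"                      # no thresholds approached
    if mitigations_adequate:
        return "deploy_with_mitigations"      # secure weights, restrict deployment
    return "halt_development_and_deployment"  # pause until mitigations suffice


if __name__ == "__main__":
    scores = {"bioweapons_uplift": 0.85, "cyberattack_assistance": 0.40}
    flagged = assess_model(scores)
    print(flagged, decide_action(flagged, mitigations_adequate=False))
```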
As developers refine their evaluation processes and deepen their understanding of AI-related risks, the policies are expected to be updated over time. This ongoing evolution reflects heightened awareness within the industry of the potential consequences of advanced AI technologies and the necessity of stringent safety measures. As AI capabilities advance rapidly, the commitment to accountability and risk mitigation will help shape the future landscape of responsible AI deployment.
See also
AI Safety Conclave Unveils Guidelines for Ethical AI Deployment in Critical Sectors
New York Enacts RAISE Act Mandating AI Safety Reporting for Developers by 2027
Mississippi Lawyer Fined $20K for AI Hallucinations in Legal Documents
Trump Prepares Executive Order to Centralize AI Regulation, Blocking State Laws
Trump’s Health Agencies Accelerate AI Adoption, Reducing Patient Protections