Anthropic Launches Claude Opus 4.6, Achieving 144 Elo Point Lead Over GPT-5.2

Anthropic’s Claude Opus 4.6 launches with a lead of roughly 144 Elo points over GPT-5.2 on the GDPval-AA benchmark, alongside productivity and safety improvements aimed at enterprise use.

Anthropic has unveiled its latest AI model, Claude Opus 4.6, which brings significant enhancements over its predecessor. Announced on February 5, 2026, the model features improved coding skills, a 1-million-token context window (in beta), and a stronger ability to carry out complex tasks autonomously. It is designed to assist with everyday knowledge work, including financial analysis, research, and document creation.

Claude Opus 4.6 performed strongly across multiple evaluations. It achieved the highest score on the Terminal-Bench 2.0 coding evaluation and surpassed other models on Humanity’s Last Exam, a challenging multidisciplinary reasoning test. It also outperformed OpenAI’s GPT-5.2 by approximately 144 Elo points on GDPval-AA, a benchmark that evaluates economically valuable knowledge work in domains such as finance and law. In addition, Claude Opus 4.6 led on BrowseComp, which measures how well a model can locate hard-to-find information online.
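To put the 144-point gap in perspective, here is a minimal sketch that converts an Elo difference into an expected head-to-head win rate. It assumes GDPval-AA ratings follow the conventional logistic Elo formula with a 400-point scale, which the article does not confirm; the function name `elo_expected_score` is purely illustrative.

```python
def elo_expected_score(rating_gap: float, scale: float = 400.0) -> float:
    """Expected score of the higher-rated side under the standard Elo model.

    Assumption: the benchmark uses the conventional 400-point logistic scale.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


if __name__ == "__main__":
    # A 144-point lead under the assumed formula.
    print(f"{elo_expected_score(144):.3f}")  # prints 0.696
```

Under that assumption, a 144-point lead corresponds to the higher-rated model being preferred in roughly 70% of pairwise comparisons.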

The model’s safety profile also stands out: its rate of misaligned behavior is as low as, or lower than, that of any other leading AI model. According to Anthropic’s safety evaluations, Claude Opus 4.6 shows low rates of undesirable behaviors across the areas tested, including those that bear on user well-being.

In addition to these capabilities, Claude Opus 4.6 introduces several new features aimed at enhancing collaborative work. The model allows users to assemble teams of autonomous agents within the Claude Code environment, enabling multiple agents to tackle tasks concurrently. Furthermore, it incorporates adaptive thinking, allowing the model to determine when to engage in deeper reasoning, and offers developers new controls over intelligence, speed, and cost through various effort settings.

Substantial upgrades have also been made to Claude in Excel, and a research preview of Claude in PowerPoint has been released. These updates make the model more adept at the intricate tasks common in office settings, such as processing and structuring data in Excel before presenting it visually in PowerPoint.

Feedback from early-access partners reflects Claude Opus 4.6’s advancements. Notion users highlighted the model’s capability to handle ambitious requests autonomously, while developers noted its effectiveness in managing complex, multi-step coding workflows. Other users emphasized the model’s proficiency in agentic planning, where it successfully breaks down intricate tasks into manageable subtasks and executes them with accuracy. This responsiveness has led to enhanced collaboration and efficiency across various teams.

Reported performance figures support these accounts. In a blind ranking against its predecessor across 40 cybersecurity investigations, Claude Opus 4.6 reportedly produced the best results in 38 cases. It also scored 90.2% on BigLaw Bench, a test of legal reasoning.

Looking forward, Claude Opus 4.6 could change how enterprises use AI in their operations. Anthropic pairs the productivity gains with comprehensive safety evaluations, positioning the model as both more capable and more carefully tested than its predecessor.

Claude Opus 4.6 is available today on claude.ai, via Anthropic’s API, and across major cloud platforms. Pricing is unchanged at $5 per million input tokens and $25 per million output tokens, keeping it accessible to developers and organizations that want to integrate advanced AI capabilities into their workflows.
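As a rough illustration of what that pricing means per request, the sketch below estimates cost from token counts. It assumes the two quoted figures are the input and output rates respectively and ignores any surcharges or discounts the article does not mention; the constant and function names are hypothetical.

```python
# Assumed rates from the article: USD per million tokens.
INPUT_USD_PER_MTOK = 5.0
OUTPUT_USD_PER_MTOK = 25.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the assumed flat rates."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 200,000-token prompt with a 10,000-token response.
    print(estimate_cost_usd(200_000, 10_000))  # prints 1.25
```

In this example the prompt accounts for $1.00 and the response for $0.25, which shows how output tokens cost five times as much per token as input tokens at these rates.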

Written by AiPressa Staff

© 2025 AIPressa · Part of Buzzora Media · All rights reserved.