OpenAI Launches GPT-5.2, Achieving 92.4% on PhD-Level Science Benchmarks

OpenAI launches GPT-5.2, achieving 92.4% on PhD-level science benchmarks, enhancing professional workflows with significant time savings and improved reasoning.

OpenAI has officially launched GPT-5.2, a significant upgrade designed to tackle complex, multi-step tasks involving spreadsheets, presentations, code, images, and long documents. The company says the new model improves reasoning and tool use for agentic workloads. The rollout follows a detailed announcement on OpenAI’s blog and a series of promotional posts from senior executives on LinkedIn emphasizing the model’s potential applications in professional settings.

Fidji Simo, CEO of Applications at OpenAI, asserted on LinkedIn that “GPT-5.2 is here and it’s the best model out there for everyday professional work.” The company positions GPT-5.2 not merely as an upgrade for chat functionalities but as a robust engine for professional knowledge work. OpenAI has launched three variants of the model within ChatGPT: GPT-5.2 Instant, GPT-5.2 Thinking, and GPT-5.2 Pro, with initial availability under paid plans.

Kevin Weil, OpenAI’s VP for Science, highlighted the model’s results on several specialist benchmarks. These include 92.4% on GPQA Diamond, a set of PhD-level questions across scientific fields, and 40.3% on FrontierMath, an improvement over GPT-5.1’s score. GPT-5.2 also reached 70.9% on GDPval, a benchmark that evaluates professional work across 44 occupations.

The company claims that GPT-5.2 is particularly well-suited for “professional knowledge work,” citing reports that average ChatGPT Enterprise users save 40 to 60 minutes a day, while heavy users save more than 10 hours a week. OpenAI presents GPT-5.2 Thinking as the primary variant for intricate workflows: on GDPval, it matched or surpassed top industry professionals in 70.9% of the cases evaluated.

On the engineering front, GPT-5.2 Thinking has reportedly set a new high score of 55.6% on SWE-Bench Pro, which evaluates real-world software engineering across four programming languages. According to OpenAI, the improvement translates into better debugging of production code, implementation of feature requests, and refactoring of large codebases with minimal manual intervention. Early feedback indicates a marked improvement in front-end tasks and complex UI work, including the creation of interactive web apps from single prompts.

In addition to coding enhancements, OpenAI asserts that GPT-5.2 Thinking has achieved a new high of 98.7% on the Tau2-bench Telecom benchmark for multi-turn customer support tasks, which the company presents as evidence of reliable tool use. The model also performs strongly in long-context reasoning, with near-100% accuracy on OpenAI’s MRCRv2 evaluation, which supports deep analysis of contracts, research papers, and multi-file projects without loss of coherence.

OpenAI has also detailed gains on scientific workloads and reasoning benchmarks. On GPQA Diamond and FrontierMath, GPT-5.2 Pro achieved 93.2% and 40.3% accuracy, respectively. The company additionally reports that the new model surpassed previous results in abstract reasoning on the ARC-AGI assessments.

Despite these advancements, OpenAI acknowledges the limitations of GPT-5.2, particularly concerning safety and reliability. The company reports a 30% reduction in errors compared to GPT-5.1, although it emphasizes the necessity of double-checking results for critical applications. OpenAI has also made strides in safety measures, improving the model’s responses to prompts indicating potential mental health concerns.

As for pricing, GPT-5.2 is available in different tiers, with costs reflecting each tier's capabilities. The Instant model is priced at $1.75 per million input tokens and $14 per million output tokens, while the Pro version is set at $21 per million input tokens and $168 per million output tokens. OpenAI says that although GPT-5.2 is priced higher than its predecessor, its greater efficiency may result in lower overall costs for users.
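To make the per-token prices concrete, the following is a minimal Python sketch of how a request's cost could be estimated from the figures quoted above. The tier keys ("instant", "pro"), the `Pricing` class, and the `estimate_cost` function are hypothetical names chosen for illustration; they are not OpenAI API identifiers, and the token counts in the usage note are made-up inputs.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Pricing:
    """Price in US dollars per million tokens (figures as quoted in the article)."""
    input_per_million: float
    output_per_million: float


# Hypothetical tier keys for illustration; not official model identifiers.
PRICING = {
    "instant": Pricing(input_per_million=1.75, output_per_million=14.0),
    "pro": Pricing(input_per_million=21.0, output_per_million=168.0),
}


def estimate_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in dollars for one request on the given tier."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    price = PRICING[tier]
    total = (
        input_tokens * price.input_per_million
        + output_tokens * price.output_per_million
    )
    return total / TOKENS_PER_MILLION
```

For an assumed request of 200,000 input tokens and 50,000 output tokens, `estimate_cost("instant", 200_000, 50_000)` comes to about $1.05 and `estimate_cost("pro", 200_000, 50_000)` to about $12.60. Both quoted Pro rates are exactly 12 times the Instant rates, so the Pro estimate is always 12 times the Instant estimate for the same token counts.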

Developed in collaboration with partners including NVIDIA and Microsoft, GPT-5.2 is the latest step in OpenAI’s model development. The company describes the release as part of a broader roadmap, with ongoing efforts to improve safety, reliability, and performance in high-stakes applications. The future of AI in professional workflows appears promising, driven by the capabilities introduced with GPT-5.2.

Written by David Park

At AIPressa, my work focuses on discovering how artificial intelligence is transforming the way we learn and teach. I've covered everything from adaptive learning platforms to the debate over ethical AI use in classrooms and universities. My approach: balancing enthusiasm for educational innovation with legitimate concerns about equity and access. When I'm not writing about EdTech, I'm probably exploring new AI tools for educators or reflecting on how technology can truly democratize knowledge without leaving anyone behind.

© 2025 AIPressa · Part of Buzzora Media · All rights reserved. This website provides general news and educational content for informational purposes only. While we strive for accuracy, we do not guarantee the completeness or reliability of the information presented. The content should not be considered professional advice of any kind. Readers are encouraged to verify facts and consult appropriate experts when needed. We are not responsible for any loss or inconvenience resulting from the use of information on this site. Some images used on this website are generated with artificial intelligence and are illustrative in nature. They may not accurately represent the products, people, or events described in the articles.