Inception launched its new large language model, Mercury 2, last week, marking a notable departure in the generative AI landscape. Unlike the autoregressive models used by the major AI labs, Mercury 2 employs a diffusion approach, as Inception CEO and co-founder Stefano Ermon explained on a recent episode of The New Stack Agents. The model's main selling points are speed and efficiency, which Inception argues will change how AI applications are built.
Traditional large language models (LLMs) generate text sequentially, processing one token at a time from left to right, a method Ermon likens to “fancy autocomplete.” In contrast, diffusion models begin with an approximate output and refine it in parallel, similar to how image models like Stable Diffusion convert noise into coherent images. Inception’s own testing indicates that Mercury 2 can produce more than 1,000 tokens per second, achieving speeds five to ten times faster than optimized models from industry leaders such as OpenAI, Anthropic, and Google.
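To make the contrast concrete, here is a minimal toy sketch of the two decoding styles. The "models" are random stand-ins, not Mercury 2's actual architecture; the point is the control flow: autoregressive decoding needs one model call per generated token, while diffusion-style decoding runs a small, fixed number of parallel refinement passes regardless of output length.

```python
# Toy contrast between autoregressive and diffusion-style decoding.
# Random choices stand in for real model predictions; only the control
# flow is meant to be representative.
import random

VOCAB = ["the", "cat", "sat", "on", "a", "mat"]
MASK = "<mask>"

def autoregressive_decode(length: int) -> list[str]:
    """One token per step, left to right ("fancy autocomplete")."""
    tokens: list[str] = []
    for _ in range(length):          # `length` sequential model calls
        tokens.append(random.choice(VOCAB))
    return tokens

def diffusion_decode(length: int, steps: int = 4) -> list[str]:
    """Start fully masked, refine every position in parallel each pass."""
    tokens = [MASK] * length
    for step in range(steps):        # a fixed, small number of passes
        # Each pass proposes values for all positions at once; a real
        # diffusion LM would keep high-confidence tokens and re-mask the rest.
        fill_prob = (step + 1) / steps
        for i in range(length):
            if tokens[i] == MASK and random.random() < fill_prob:
                tokens[i] = random.choice(VOCAB)
    # Fill any positions still masked after the final pass.
    return [t if t != MASK else random.choice(VOCAB) for t in tokens]

print(autoregressive_decode(6))  # cost grows with output length
print(diffusion_decode(6))       # cost fixed at ~`steps` parallel passes
```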
Ermon noted, “What we’re seeing is that our Mercury 2 model, which is a reasoning model, is actually able to match the quality of these speed-optimized models from frontier labs OpenAI, Anthropic, Meta, and Google, while being five to ten times faster in terms of, like, the end-to-end latency, how long you need to wait before it gives you an answer.” This capability is particularly significant as the demand for rapid response times in AI applications continues to grow.
Autoregressive models are slower because decoding is sequential and memory-bound: generating each token requires streaming the model's weights from GPU memory before work on the next token can begin. Diffusion models instead refine many token positions in a single parallel pass, a workload that plays to the strengths of modern GPUs, which are built for massively parallel computation. Nvidia, a key investor in Inception, is also helping optimize Mercury 2's serving engine to push performance further.
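A back-of-envelope calculation shows why sequential decoding hits a memory wall. The figures below (GPU bandwidth, model size, weight precision) are illustrative assumptions, not Mercury 2 benchmarks:

```python
# Why batch-1 autoregressive decoding is memory-bandwidth-bound: every new
# token requires streaming all model weights from GPU memory, so the token
# rate is capped at (memory bandwidth) / (weight bytes). Illustrative numbers.
HBM_BANDWIDTH_GBPS = 3350   # assumed H100-class GPU, ~3.35 TB/s
PARAMS_BILLIONS = 7         # assumed 7B-parameter model
BYTES_PER_PARAM = 2         # fp16/bf16 weights

weight_gb = PARAMS_BILLIONS * BYTES_PER_PARAM        # 14 GB of weights
max_tokens_per_sec = HBM_BANDWIDTH_GBPS / weight_gb  # roofline upper bound
print(f"~{max_tokens_per_sec:.0f} tokens/s upper bound per sequence")
# A diffusion model refining many positions per pass amortizes that same
# weight read across the whole sequence, shifting the workload toward
# being compute-bound, which is where GPUs excel.
```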
Ermon, who developed diffusion models for images during his time at Stanford, was candid about the trade-offs. Mercury 2 matches the quality of Claude Haiku and Google Flash-class models, but it does not yet reach the level of Claude Opus or OpenAI's GPT-4. Ermon maintains, however, that as models scale, the economics of the diffusion approach will become increasingly compelling. He pointed in particular to reinforcement learning, the technique underpinning current reasoning models: because RL training is dominated by generating rollouts at inference time, the efficiency of diffusion architectures directly attacks that bottleneck.
Currently, Inception stands out as the only company offering a production-grade diffusion LLM; Google's text diffusion model is still labeled “experimental.” Mercury 2 is available today through an OpenAI-compatible API, with AWS Bedrock integration planned.
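Because the API is OpenAI-compatible, calling it should look like any other integration built on the official openai Python client. The base URL and model name in this sketch are assumptions for illustration; consult Inception's documentation for the actual values.

```python
# Calling Mercury 2 through its OpenAI-compatible API with the official
# `openai` client. Base URL and model name are assumed placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.inceptionlabs.ai/v1",  # assumed endpoint
    api_key="YOUR_INCEPTION_API_KEY",
)

response = client.chat.completions.create(
    model="mercury-2",  # assumed model identifier
    messages=[
        {"role": "user", "content": "Summarize diffusion LLMs in one sentence."}
    ],
)
print(response.choices[0].message.content)
```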
As the competitive landscape of AI continues to evolve, the introduction of Mercury 2 may signal a broader shift in the industry, highlighting the potential for new methodologies to redefine traditional approaches to AI development.