DeepSeek has launched its V4 model family through its public API, introducing two distinct inference tiers, Flash and Pro, designed to challenge the pricing structures of established players like OpenAI and Anthropic. The release, which went live this morning, marks a significant expansion in capability, headlined by a 2-million-token context window on the Pro tier.
The Hangzhou-based lab’s decision to split its offerings into Flash and Pro reflects a strategic focus on performance tailored to different applications. Flash is engineered for speed, boasting an inter-token latency under 15 milliseconds, making it competitive with leading models such as GPT-4o-mini and Claude Haiku. With pricing set at $0.40 per million input tokens and $1.20 per million output tokens, DeepSeek has positioned Flash as an attractive option for developers engaged in real-time applications, where latency is a critical factor.
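At those rates, per-request cost is simple arithmetic. A minimal sketch using the Flash prices quoted above (the function name and the example token counts are illustrative, not part of DeepSeek's API):

```python
# Flash pricing from DeepSeek's published rates:
# $0.40 per million input tokens, $1.20 per million output tokens.
FLASH_INPUT_RATE = 0.40 / 1_000_000   # dollars per input token
FLASH_OUTPUT_RATE = 1.20 / 1_000_000  # dollars per output token

def flash_request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of a single Flash API call."""
    return input_tokens * FLASH_INPUT_RATE + output_tokens * FLASH_OUTPUT_RATE

# A typical chat turn: 10,000 input tokens, 2,000 output tokens.
print(f"${flash_request_cost(10_000, 2_000):.4f}")  # → $0.0064
```

At well under a cent per sizable request, the economics explain why latency-sensitive, high-volume applications are the obvious target for this tier.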
The Pro tier, by contrast, expands the context window from V3's 128,000 tokens to 2 million. Users can feed extensive inputs, such as entire codebases or multi-year document archives, directly into the model without additional retrieval architecture. Retrieval-augmented generation (RAG) remains relevant, but it becomes optional for many use cases, simplifying engineering pipelines and eliminating a class of errors associated with data retrieval.
DeepSeek’s V4 Pro employs a 16×16 expert routing architecture, an evolution of the mixture-of-experts approach used in its previous version. Early assessments put V4 Pro at 88.5% on the MMLU benchmark, up from V3’s 85.5%. The gain may look marginal, but in a benchmark-driven competitive culture even slight advancements are keenly scrutinized. Pro is priced at $2.80 per million input tokens and $8.80 per million output tokens, significantly undercutting the frontier-tier pricing set by Western labs.
The choice to launch V4 directly via API rather than through a staged product rollout signals DeepSeek’s confidence in its positioning within the AI infrastructure landscape. Unlike competitors that may focus on consumer-facing applications, DeepSeek is explicitly aiming to serve as a foundational technology provider. This shift complicates the competitive landscape for OpenAI and Anthropic, who may find it challenging to respond effectively to this infrastructure-oriented strategy.
As a direct consequence of this release, the broader orchestration ecosystem—encompassing tools like LangChain and LlamaIndex—is expected to integrate V4 updates rapidly, benefiting from the developer adoption DeepSeek achieved in 2025. However, the more significant impact will likely be felt in API pricing across the industry. With DeepSeek’s benchmark performance coupled with aggressive pricing, other providers will face increasing pressure to justify their own rate structures.
Looking ahead, industry observers will be keen to see how OpenAI and Anthropic respond. The introduction of a 2-million-token context window in V4 Pro presents a direct challenge to Claude’s long-context capabilities. Furthermore, it raises the possibility that enterprise procurement teams may begin evaluating V4 Pro alongside their current vendors—not as a substitute but as a strategic leverage point in negotiations. In this sense, DeepSeek’s most critical offering may not just be the model itself, but the invoice it provides to CTOs seeking better terms from their AI providers.