DeepSeek Launches V4, Surpassing GPT-5 and Claude in Key AI Benchmarks

DeepSeek’s V4-Pro beats GPT-5.4 and Claude Opus 4.6 on several coding and agentic benchmarks, posting a Codeforces rating of 3,206 while pricing output tokens roughly 88% below OpenAI’s rate per million.

China’s DeepSeek has released its new model, the V4, which it showcased at a recent technology event in Silicon Valley. The Hangzhou-based company is drawing attention because the model outperforms several well-known American models on specific benchmarks, a sign that the global AI race may be shifting.

DeepSeek launched two models: V4-Pro, a 1.6-trillion-parameter model aimed at expert users, and V4-Flash, a lighter 284-billion-parameter model. Both have a one-million-token context window, which helps with large, complex inputs. Both are also open source and available for download via Hugging Face, so users can deploy them locally, although V4-Pro requires substantial VRAM.
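To give a sense of what "substantial VRAM" means, here is a rough back-of-the-envelope sketch. It multiplies the parameter counts quoted above by an assumed number of bytes per parameter. The storage formats (`fp16`, `fp8`, `int4`) are illustrative assumptions, not formats the article says DeepSeek ships, and the result covers the weights only, ignoring the KV cache, activations, and runtime overhead.

```python
# Rough weights-only memory estimate. The precision choices below are
# assumptions for illustration, not figures published in the article.
PARAMS = {"V4-Pro": 1.6e12, "V4-Flash": 284e9}  # parameter counts from the article
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "int4": 0.5}  # assumed storage formats


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return the gigabytes (10^9 bytes) needed to hold the weights alone."""
    return num_params * bytes_per_param / 1e9


for model, num_params in PARAMS.items():
    for fmt, width in BYTES_PER_PARAM.items():
        print(f"{model} @ {fmt}: {weight_memory_gb(num_params, width):,.0f} GB")
```

Under these assumptions, V4-Pro's weights alone would occupy about 3,200 GB at 16 bits per parameter and about 800 GB at 4 bits, which is far beyond a single consumer GPU. V4-Flash would need about 142 GB at 4 bits.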

V4-Pro performs especially well on coding benchmarks. It achieved a Codeforces rating of 3,206, surpassing GPT-5.4’s 3,168 and Gemini 3.1’s 3,052, which establishes it as the leading open model for competitive programming. On LiveCodeBench, it scored 93.5, ahead of Gemini’s 91.7 and Claude Opus 4.6’s 88.8. The pattern holds for agentic tasks: V4-Pro scored 51.8 on Toolathlon, again outperforming Gemini (48.8) and Claude (47.2). The faster V4-Flash model competes effectively on simpler tasks, offering a cost-efficient alternative for those workloads.

Despite these successes, V4-Pro trails its rivals in some areas. Claude Opus 4.6 leads in long-context retrieval, scoring 92.9 on MRCR 1M against V4-Pro’s 83.5. GPT-5.4 also holds an advantage on Terminal Bench 2.0, scoring 75.1 against V4-Pro’s 67.9. Nevertheless, DeepSeek’s pricing could reshape customer choices: V4-Pro costs $3.48 per million output tokens, compared with OpenAI’s $30 and Anthropic’s $25 for similar workloads.
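The size of the price gap is easy to check with a few lines of arithmetic. The sketch below uses only the three output-token prices quoted above. The monthly workload of 500 million output tokens is a hypothetical figure chosen for illustration, and the function names are not from any vendor SDK.

```python
# Output-token prices in USD per million tokens, as quoted in the article.
PRICE_PER_MILLION = {
    "DeepSeek V4-Pro": 3.48,
    "OpenAI": 30.00,
    "Anthropic": 25.00,
}


def savings_pct(cheaper: float, pricier: float) -> float:
    """Percentage by which `cheaper` undercuts `pricier`."""
    return (1 - cheaper / pricier) * 100


def output_cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost in USD of generating `tokens` output tokens."""
    return tokens / 1_000_000 * price_per_million


MONTHLY_OUTPUT_TOKENS = 500_000_000  # hypothetical workload, not from the article

base = PRICE_PER_MILLION["DeepSeek V4-Pro"]
for vendor, price in PRICE_PER_MILLION.items():
    cost = output_cost_usd(MONTHLY_OUTPUT_TOKENS, price)
    line = f"{vendor}: ${cost:,.2f}/month"
    if vendor != "DeepSeek V4-Pro":
        line += f" (V4-Pro is {savings_pct(base, price):.1f}% cheaper)"
    print(line)
```

At the quoted prices, V4-Pro comes out about 88.4% cheaper than OpenAI and about 86.1% cheaper than Anthropic per million output tokens. For the hypothetical workload, that is $1,740 a month versus $15,000 and $12,500.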

This pricing advantage may attract developers who want to build AI capabilities into their applications, since cost remains a major barrier to entry. The gap could make DeepSeek a compelling alternative for businesses and developers that want advanced AI without prohibitive expenses.

As AI technology continues to evolve rapidly, DeepSeek’s advancements underscore the potential for increased competition in a space traditionally dominated by American firms. The release of the V4 models not only highlights DeepSeek’s growing influence but also signals a broader trend of innovation and diversity within the global AI landscape. With these developments, the stage is set for further advancements as companies strive to meet the escalating demands of AI applications across various sectors.

Written by Staff

The AiPressa Staff team brings you comprehensive coverage of the artificial intelligence industry, including breaking news, research developments, business trends, and policy updates. Our mission is to keep you informed about the rapidly evolving world of AI technology.

© 2025 AIPressa · Part of Buzzora Media · All rights reserved. This website provides general news and educational content for informational purposes only. While we strive for accuracy, we do not guarantee the completeness or reliability of the information presented. The content should not be considered professional advice of any kind. Readers are encouraged to verify facts and consult appropriate experts when needed. We are not responsible for any loss or inconvenience resulting from the use of information on this site. Some images used on this website are generated with artificial intelligence and are illustrative in nature. They may not accurately represent the products, people, or events described in the articles.