AI Technology

Nvidia Achieves 10x Cost Savings in AI Inference with Open-Source Models on Blackwell

Nvidia says its Blackwell architecture, paired with open-source models, cuts AI inference costs by up to 90% in a healthcare deployment and brings the reported cost down to 5 cents per million tokens.

Staff

Published

14 February, 2026

Nvidia has announced significant advances in its AI infrastructure, reporting that the cost per million tokens dropped from 20 cents on the older Hopper platform to 10 cents on the new Blackwell architecture. Using Blackwell’s native low-precision NVFP4 format reduced the cost further, to 5 cents per million tokens. Going from 20 cents to 5 cents is a fourfold improvement in cost efficiency, achieved while maintaining the accuracy customers expect.
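The arithmetic behind the fourfold figure can be checked with a short sketch. The three prices are the ones quoted above; the function names are hypothetical and chosen only for illustration, and the ratios do not depend on the unit the prices are quoted in.

```python
# Quoted prices in cents (the same unit for all three platforms).
HOPPER_COST = 20.0
BLACKWELL_COST = 10.0
BLACKWELL_NVFP4_COST = 5.0


def fold_improvement(old_cost: float, new_cost: float) -> float:
    """Return how many times cheaper new_cost is than old_cost."""
    if new_cost <= 0:
        raise ValueError("new_cost must be positive")
    return old_cost / new_cost


def percent_saved(old_cost: float, new_cost: float) -> float:
    """Return the cost reduction as a percentage of old_cost."""
    if old_cost <= 0:
        raise ValueError("old_cost must be positive")
    return 100.0 * (old_cost - new_cost) / old_cost


if __name__ == "__main__":
    for label, cost in [("Blackwell", BLACKWELL_COST),
                        ("Blackwell + NVFP4", BLACKWELL_NVFP4_COST)]:
        fold = fold_improvement(HOPPER_COST, cost)
        saved = percent_saved(HOPPER_COST, cost)
        print(f"{label}: {fold:.1f}x cheaper than Hopper ({saved:.0f}% saved)")
```

Note that a fourfold improvement corresponds to a 75% saving, not 90%; the 90% figure in this article refers to a separate deployment.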

In a recent blog post, Nvidia outlined four industry deployments that showcase how the integration of Blackwell infrastructure, NVFP4, optimized software stacks, and open-source models can lead to substantial cost reductions. One of the highlighted sectors is healthcare, which faces challenges such as time-consuming tasks related to medical coding, documentation, and insurance management. These routine activities often detract from the time healthcare professionals can spend with patients.

Sully.ai addresses these challenges by using AI agents to perform the repetitive tasks. However, the proprietary, closed-source models it initially used did not scale well enough for widespread adoption. Sully.ai therefore moved to Baseten’s Model API, which serves open-source models on Blackwell GPUs and incorporates the NVFP4 data format, the TensorRT-LLM library, and the Dynamo inference framework. The switch cut inference costs by 90%, a tenfold reduction compared with the previous closed-source implementation. Response times for critical workflows, such as generating medical notes, improved by 65%.
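A 90% decrease and a tenfold reduction are the same claim expressed two ways, as the following minimal sketch shows. The helper name is hypothetical. Applying it to the 65% figure assumes that "improved by 65%" means response time fell by 65%, which the article does not state explicitly.

```python
def fold_from_percent_decrease(percent_decrease: float) -> float:
    """Convert a percentage decrease into an 'N times lower' factor.

    A decrease of p percent leaves (100 - p) percent of the original,
    so the original is 100 / (100 - p) times the new value.
    """
    if not 0 <= percent_decrease < 100:
        raise ValueError("percent_decrease must be in [0, 100)")
    return 100.0 / (100.0 - percent_decrease)


if __name__ == "__main__":
    # Inference cost: 90% lower, as reported for Sully.ai.
    print(f"cost: {fold_from_percent_decrease(90):.1f}x lower")
    # Response time: assumes the 65% improvement is a 65% drop in latency.
    print(f"latency: {fold_from_percent_decrease(65):.2f}x lower")
```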

These cost and performance gains highlight the growing role of open-source models in healthcare. Lower costs and faster responses can ease the administrative burden on healthcare providers, leaving them more time for direct patient care.

Organizations in other sectors are also turning to AI to streamline operations. Nvidia’s infrastructure advances reflect a broader trend of integrating open-source models into commercial applications, one that is likely to spread as companies try to use AI while keeping operational costs under control.

The implications extend beyond immediate cost savings. As inference becomes cheaper, smaller firms and startups may be better equipped to compete with larger players, and Nvidia’s Blackwell platform may serve as a catalyst for further innovation and efficiency across a variety of sectors.


AIPRESSA.COM
