Sarvam AI Surpasses Google and OpenAI with Homegrown Speech and Vision Models

Sarvam AI’s homegrown models, Sarvam Audio and Sarvam Vision, outperform Google and OpenAI models on the benchmarks the company cites, including a 93.28% score on the OmniDoc document benchmark.

Staff

Published

1 hour ago

New Delhi: Sarvam AI, a startup based in Bengaluru, has launched two artificial intelligence models built specifically for Indian users: Sarvam Audio and Sarvam Vision. The models are positioned to compete with offerings from global giants like Google and OpenAI, and the reported results show them performing better in contexts relevant to India.

In a country where voice communication is paramount, Sarvam AI has developed systems that prioritize speech over traditional text-based interaction. Many Indians, including farmers and delivery workers, rely on verbal instructions in their daily activities. In response, Sarvam Audio has been trained from the ground up on 22 Indian languages, enabling it to handle “code-mixing,” where speakers switch fluidly between languages, a common feature of everyday Indian speech.

On the benchmarks cited, both models come out ahead of larger rivals. Sarvam Audio outperformed Google’s Gemini-3-Flash and OpenAI’s GPT-4o in transcription accuracy on the IndicVoices benchmark, recording a lower Word Error Rate (WER). The visual model, Sarvam Vision, achieved 84.3 percent accuracy on olmOCR-Bench, ahead of both Gemini 3 Pro and DeepSeek. In document analysis, Sarvam Vision scored 93.28 percent on the OmniDoc benchmark, suggesting that smaller, specialized models can outperform larger global systems on Indian-specific documents, tables, and formulas.
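For readers unfamiliar with the metric, WER is conventionally computed as the word-level edit distance between a reference transcript and the system's output (substitutions, deletions, and insertions), divided by the number of words in the reference. The Python sketch below illustrates that standard definition only. It is not Sarvam's or IndicVoices' evaluation code, the function name `word_error_rate` is hypothetical, and the sample sentences are invented for illustration. Real evaluations usually also normalize casing, punctuation, and numerals before scoring, which this sketch does not do.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER: word-level edit distance divided by reference length.

    Illustrative sketch of the standard definition; tokenization is a
    plain whitespace split with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Invented example: one wrong word out of four gives a WER of 0.25.
    print(word_error_rate("mujhe nau ticket chahiye", "mujhe no ticket chahiye"))
```

A lower value means fewer word-level mistakes. Because insertions count as errors, WER can exceed 1.0 when the output is much longer than the reference.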

Sarvam Audio also introduces a Speech-to-Command capability. Unlike traditional pipelines that require speech to be transcribed into text before any action is taken, Sarvam Audio can initiate actions directly from voice commands. This removes the delay of a separate transcription step and reduces misunderstandings, particularly in noisy environments. For instance, when a user says “Nau” in Hindi, Sarvam Audio interprets it as the numeral “9,” while other systems may misinterpret it as the English word “No.”

Additionally, Sarvam Audio integrates speaker diarization, allowing it to distinguish between up to eight different voices in a single audio recording. This feature is particularly beneficial in the Indian context, where overlapping voices are commonplace in busy offices and call centers. The model is also optimized for 8 kHz telephony audio, so it performs reliably even on the low-quality audio often found in traditional customer service calls.

Sarvam AI’s growth is backed by the IndiaAI Mission and government-supported GPU clusters, emphasizing a commitment to developing sovereign AI technology within India. By crafting models that cater exclusively to Indian users, Sarvam AI aims to reduce reliance on foreign systems, aligning with a broader vision of ensuring that India retains control over its digital landscape.

With Sarvam Audio and Sarvam Vision, Sarvam AI is presenting itself not merely as an alternative to major tech companies but as a leader in AI built around the needs of Indian users. That approach is particularly relevant as India seeks to serve its next billion users.

Sarvam AI has also drawn partners. Microsoft has partnered with the company to advance voice-based GenAI applications, and Republic worked with Sarvam AI to provide real-time translation of Finance Minister Nirmala Sitharaman’s budget speech, showcasing a practical application of the technology.

As Sarvam AI continues to carve its niche in the competitive AI sector, its focus on local needs and challenges may serve as a beacon for future developments in technology tailored for specific regions and cultures.

AIPRESSA.COM
