Sarvam AI Surpasses Google and OpenAI with Homegrown Speech and Vision Models

Sarvam AI’s local models, Sarvam Audio and Sarvam Vision, outpace Google and OpenAI with 93.28% document accuracy, revolutionizing AI for Indian users.

New Delhi: Sarvam AI, a startup based in Bengaluru, has launched two artificial intelligence models built specifically for Indian users: Sarvam Audio and Sarvam Vision. The models are positioned to compete with global giants such as Google and OpenAI, and reported benchmark results show them performing better in contexts relevant to India.

In a country where much everyday communication is spoken, Sarvam AI has built systems that prioritize speech over text-based interaction. Many Indians, including farmers and delivery workers, rely on verbal instructions in their daily work. Sarvam Audio has been trained from the ground up on 22 Indian languages, which helps it handle “code-mixing,” where speakers switch fluidly between languages within a conversation, a common feature of everyday Indian speech.

The reported benchmark results support that positioning. On the IndicVoices benchmark, Sarvam Audio posted a lower Word Error Rate (WER) than Google’s Gemini 3 Flash and OpenAI’s GPT-4o, meaning its transcripts contained fewer errors. The visual model, Sarvam Vision, achieved 84.3 percent accuracy on olmOCR-Bench, ahead of both Gemini 3 Pro and DeepSeek. In document analysis, Sarvam Vision scored 93.28 percent on the OmniDoc benchmark, suggesting that smaller, specialized models can outperform larger global systems on Indian-specific documents, tables, and formulas.
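For readers unfamiliar with the metric, Word Error Rate is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn a system's transcript into the reference transcript, divided by the number of words in the reference. The short Python sketch below illustrates that standard definition only. It is not Sarvam AI's or the IndicVoices benchmark's scoring code, the function name `word_error_rate` is hypothetical, and the romanized Hindi sentence is an invented example rather than benchmark data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: real benchmarks usually normalize text
    (case, punctuation, numerals) before scoring.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Invented example: "nau" (nine) misheard as "no".
reference = "mujhe nau kilo chawal chahiye"
hypothesis = "mujhe no kilo chawal chahiye"
print(word_error_rate(reference, hypothesis))  # one substitution over five words
```

A lower value is better: a WER of 0.2 means one word in five was wrong relative to the reference, so a single misheard numeral in a short command can account for a large share of the error.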

Sarvam Audio also introduces a Speech-to-Command capability. Unlike conventional systems, which transcribe speech into text before any action is taken, Sarvam Audio can initiate actions directly from voice commands. Skipping the intermediate transcription step reduces latency and the risk of misunderstandings, particularly in noisy environments. For instance, when a user says “Nau” in Hindi, Sarvam Audio interprets it as the numeral “9,” while other systems may mishear it as the English word “No.”

Sarvam Audio also includes speaker diarization, which identifies who spoke when, and can distinguish up to eight different voices in a single audio recording. This is particularly useful in the Indian context, where overlapping voices are common in busy offices and call centers. The model is also optimized for 8 kHz telephony audio, so it remains reliable on the low-quality recordings typical of traditional customer service calls.

Sarvam AI’s growth is backed by the IndiaAI Mission and government-supported GPU clusters, emphasizing a commitment to developing sovereign AI technology within India. By crafting models that cater exclusively to Indian users, Sarvam AI aims to reduce reliance on foreign systems, aligning with a broader vision of ensuring that India retains control over its digital landscape.

With Sarvam Audio and Sarvam Vision, Sarvam AI is presenting itself not merely as an alternative to major tech companies but as a leader in AI built around the needs of Indian users. That approach is particularly relevant as India seeks to bring its next billion users online and to shape how artificial intelligence can improve everyday life in the country.

Separately, Microsoft has partnered with Sarvam AI to advance voice-based GenAI applications, a sign of growing recognition of Sarvam’s capabilities in the tech ecosystem. In another notable collaboration, Republic and Sarvam AI enabled real-time translation of Finance Minister Nirmala Sitharaman’s budget speech, showcasing a practical application of the technology.

As Sarvam AI continues to carve its niche in the competitive AI sector, its focus on local needs and challenges may serve as a beacon for future developments in technology tailored for specific regions and cultures.

Written by AiPressa Staff

The AiPressa Staff team brings you comprehensive coverage of the artificial intelligence industry, including breaking news, research developments, business trends, and policy updates. Our mission is to keep you informed about the rapidly evolving world of AI technology.

© 2025 AIPressa · Part of Buzzora Media · All rights reserved. This website provides general news and educational content for informational purposes only. While we strive for accuracy, we do not guarantee the completeness or reliability of the information presented. The content should not be considered professional advice of any kind. Readers are encouraged to verify facts and consult appropriate experts when needed. We are not responsible for any loss or inconvenience resulting from the use of information on this site. Some images used on this website are generated with artificial intelligence and are illustrative in nature. They may not accurately represent the products, people, or events described in the articles.