
Google and Cohere Launch Advanced Audio AI Models, With Cohere Transcribe Reaching a 5.42% Word Error Rate

Google and Cohere have launched new audio AI models: Google’s Gemini 3.1 Flash Live scored 90.8% on the ComplexFuncBench Audio benchmark, and Cohere’s Transcribe reached an average word error rate of 5.42%. Both target customer service and transcription tasks.

Google LLC and Cohere Inc. unveiled new artificial intelligence models optimized for audio processing tasks, aiming to enhance automation in customer service and transcription services. The announcements, made today, underline a growing trend in utilizing AI to streamline communication processes across various sectors.

Google’s latest model, Gemini 3.1 Flash Live, is designed to automate customer service interactions. Businesses can use it to build voice agents that handle customer inquiries and requests, such as processing product returns. The model interprets spoken language and also accepts visual inputs, so users can upload images of a malfunctioning device to assist in troubleshooting. It can also pick up on emotional cues, such as frustration or confusion, and adjust its responses accordingly.

Gemini 3.1 Flash Live scored 90.8% on the ComplexFuncBench Audio benchmark, nearly a 20% improvement over its predecessor. It also set a record on another benchmark, Audio MultiChallenge.

Beyond customer support, Gemini 3.1 Flash Live can be used to build voice interfaces for a variety of applications. It already underpins features in Google’s Gemini chatbot and the Search Live multimodal search tool. According to Google product manager Valeria Wu and software engineer Yifan Ding, the model responds faster and can maintain the context of a conversation for longer, which is particularly useful during extended discussions.

Meanwhile, Cohere has introduced Cohere Transcribe, a new AI model built specifically for transcription. The company claims it has the highest accuracy in its category, with an average word error rate of 5.42% (lower is better). That result has put it at the forefront of the Hugging Face Open ASR Leaderboard, a notable ranking in the field of automatic speech recognition.
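For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the model's transcript into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that general definition only. The function name and the sample sentences are made up for illustration, and this is not Cohere's or Hugging Face's evaluation code. Real evaluations typically normalize text (punctuation, casing, numerals) before scoring, while this sketch only lowercases and splits on whitespace.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration; not taken from any vendor's tooling.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution ("my" -> "the") and one deletion ("today")
# against a six-word reference.
wer = word_error_rate(
    "please process my product return today",
    "please process the product return",
)
print(f"{wer:.2%}")  # 33.33%
```

By this definition, a 5.42% word error rate corresponds to roughly five to six word-level errors per 100 reference words. Because the figure Cohere cites is an average, individual recordings can score better or worse.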

Cohere Transcribe first converts raw audio into mathematical representations using a Conformer, an architecture that combines a convolutional neural network with a transformer. A standalone transformer then generates the transcript from those representations. The model supports output in over a dozen languages, making it suitable for a range of global applications.

With 2 billion parameters across its Conformer and transformer components, Cohere Transcribe requires relatively little computing power, according to the company. The model is available under an open-source Apache 2.0 license, so companies can run it on their own infrastructure or use Cohere’s Model Vault managed inference service. Cohere also plans to incorporate the model into its North productivity platform, which aims to improve document search and automate repetitive tasks.

As businesses increasingly turn to AI to improve operational efficiency, the two launches show how quickly audio processing technology is advancing. Automating customer interactions and transcribing speech accurately could change how companies in many industries communicate with their users.

The releases are also part of a broader trend toward building advanced machine learning capabilities into customer service and productivity tools, a shift that may shape how businesses interact with their clients and manage their workflows.

Written by the AiPressa Staff



© 2025 AIPressa · Part of Buzzora Media · All rights reserved.