Google and Cohere Launch Advanced Audio AI Models, With Cohere Transcribe Reaching a 5.42% Word Error Rate

Google and Cohere launch advanced audio AI models: Google’s Gemini 3.1 Flash Live scores 90.8% on the ComplexFuncBench Audio benchmark, and Cohere Transcribe reaches an average word error rate of 5.42%. The models target customer service and transcription tasks.

Google LLC and Cohere Inc. unveiled new artificial intelligence models optimized for audio processing tasks, aiming to enhance automation in customer service and transcription services. The announcements, made today, underline a growing trend in utilizing AI to streamline communication processes across various sectors.

Google’s latest model, Gemini 3.1 Flash Live, is designed to automate customer service interactions. Businesses can use it to build voice agents that handle customer inquiries and requests, such as processing product returns. The model interprets spoken language and also accepts visual inputs, so users can upload images of malfunctioning devices to assist in troubleshooting. It can also pick up on emotional cues, such as user frustration or confusion, and adjust its responses accordingly.

Gemini 3.1 Flash Live scored 90.8% on the ComplexFuncBench Audio benchmark, an improvement of nearly 20% over its predecessor. It also set a record on a second benchmark, Audio MultiChallenge.

Beyond customer support, Gemini 3.1 Flash Live can power voice interfaces for other applications, and it underpins features in Google’s Gemini chatbot and the Search Live multimodal search tool. According to Google product manager Valeria Wu and software engineer Yifan Ding, the model responds faster and can maintain the context of a conversation for longer, which is particularly useful during extended discussions.

Meanwhile, Cohere has introduced Cohere Transcribe, a new AI model built specifically for transcription. The company says it is the most accurate model in its category, with an average word error rate of 5.42%. That result puts it at the top of the Hugging Face Open ASR Leaderboard, a widely followed ranking of automatic speech recognition models.
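Word error rate is conventionally computed as the number of word-level substitutions, deletions, and insertions needed to turn the model's transcript into the reference transcript, divided by the number of words in the reference. The following is a minimal sketch of that calculation, not the leaderboard's scoring code: the function name and the sample sentences are hypothetical, and it assumes plain whitespace tokenization with no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substitution and one deletion in six words.
reference = "please process my product return today"
hypothesis = "please process my product returns"
print(f"{word_error_rate(reference, hypothesis):.2%}")
```

By this measure, a 5.42% word error rate corresponds to roughly five or six word-level errors per 100 reference words, so lower is better.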

Cohere Transcribe converts raw audio into numerical representations using a Conformer, a neural network architecture that combines convolutional layers with a transformer. A separate transformer then generates the transcript from those representations. The model supports output in more than a dozen languages.

Cohere Transcribe has 2 billion parameters across its Conformer and transformer components, and Cohere says it runs efficiently on relatively modest computing power. The model is available under the open-source Apache 2.0 license, so companies can run it on their own infrastructure or use Cohere’s Model Vault managed inference service. The company also plans to incorporate the model into its North productivity platform, which is designed to improve document search and automate repetitive tasks.
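To put the 2 billion parameter figure in context, the memory needed just to hold the weights can be estimated from the numeric precision they are stored in. The sketch below is back-of-the-envelope arithmetic only: the article does not say which precision Cohere ships, so the precisions listed are assumptions, and the estimate ignores activations and other runtime overhead.

```python
PARAMETERS = 2_000_000_000  # parameter count reported for Cohere Transcribe

# Assumed storage precisions; the article does not state which one is used.
BYTES_PER_PARAMETER = {
    "float32": 4,
    "float16/bfloat16": 2,
    "int8": 1,
}


def weight_memory_gib(parameters: int, bytes_per_parameter: int) -> float:
    """Memory for the weights alone, in GiB (2**30 bytes)."""
    return parameters * bytes_per_parameter / 2**30


for precision, size in BYTES_PER_PARAMETER.items():
    print(f"{precision}: {weight_memory_gib(PARAMETERS, size):.2f} GiB")
```

At 16-bit precision the weights come to just under 4 GiB, which is consistent with the claim that the model can run on relatively modest hardware.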

The two launches reflect a broader trend of building machine learning into customer service and productivity tools. As businesses turn to AI to improve operational efficiency, models that automate customer interactions and transcribe speech accurately could change how companies communicate with their clients and manage their workflows.

Written by AiPressa Staff

The AiPressa Staff team brings you comprehensive coverage of the artificial intelligence industry, including breaking news, research developments, business trends, and policy updates. Our mission is to keep you informed about the rapidly evolving world of AI technology.

© 2025 AIPressa · Part of Buzzora Media · All rights reserved. This website provides general news and educational content for informational purposes only. While we strive for accuracy, we do not guarantee the completeness or reliability of the information presented. The content should not be considered professional advice of any kind. Readers are encouraged to verify facts and consult appropriate experts when needed. We are not responsible for any loss or inconvenience resulting from the use of information on this site. Some images used on this website are generated with artificial intelligence and are illustrative in nature. They may not accurately represent the products, people, or events described in the articles.