Mistral Launches Voxtral Transcribe 2 with 13-Language Support and Sub-200ms Latency

Mistral launches Voxtral Transcribe 2, a pair of speech-to-text models with 13-language support, latency configurable below 200 ms, and batch pricing of $0.003 per minute.

Staff

Published 6 February, 2026

Mistral has unveiled Voxtral Transcribe 2, a suite of next-generation speech-to-text models built for high transcription quality, speaker diarization, and ultra-low latency. The release includes Voxtral Mini Transcribe V2 for batch transcription and Voxtral Realtime for live applications, and targets voice-driven workflows across industries. Voxtral Realtime is also available with open weights under the Apache 2.0 license, which makes it more accessible to developers.

The newly launched audio playground in Mistral Studio lets users try Voxtral Transcribe 2 immediately, with features such as diarization and timestamps.

Voxtral Mini Transcribe V2 offers speaker diarization, context biasing, and word-level timestamps across 13 languages. Mistral positions it as combining state-of-the-art accuracy with a price of just $0.003 per minute. Voxtral Realtime, by contrast, is optimized for live transcription, with latency configurable to below 200 milliseconds, which is crucial for voice agents and other real-time applications.

Voxtral Realtime uses a novel streaming architecture that transcribes audio as it arrives, which minimizes delay. At a 2.4-second delay it matches the performance of Mini Transcribe V2, and at 480 milliseconds it stays within 1-2% word error rate. The model is also multilingual, with robust performance across languages such as English, Chinese, Hindi, and Spanish, which opens new possibilities for voice-first applications.

Mini Transcribe V2 improves both transcription and diarization quality, achieving a word error rate of approximately 4% on the FLEURS benchmark and outperforming competitors such as GPT-4o Mini Transcribe and Deepgram Nova in accuracy. It also processes audio about three times faster than ElevenLabs’ Scribe v2 at a fraction of the cost, which gives it a strong price-performance ratio.
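For readers unfamiliar with the metric, word error rate (WER) is the number of word-level substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of reference words. The sketch below is a minimal illustration in Python, not the evaluation code behind the FLEURS figures. The function name and the plain whitespace tokenization are assumptions made for this example, and real benchmarks normally normalize casing and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: tokenizes on whitespace and applies no text
    normalization (both are assumptions, not a benchmark's actual setup).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


# One substituted word out of four reference words.
print(word_error_rate("a b c d", "a x c d"))  # 0.25
```

By this definition, a WER of about 4% corresponds to roughly one word error for every 25 words of reference text.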

Key features of the Mini Transcribe V2 include advanced speaker diarization, which generates precise start and end timestamps, making it ideal for applications like meeting transcription and interview analysis. Context biasing allows users to input specific phrases to guide the model’s understanding of technical terms or proper nouns, particularly valuable in specialized industries. The model also maintains accuracy in noisy environments and can handle longer audio recordings of up to three hours in a single request.

Mistral’s new audio playground lets users upload multiple audio files in various formats, up to 1 GB each, toggle diarization, and choose timestamp granularity, so the new transcription capabilities can be tested immediately.

Mistral highlights several use cases for the Voxtral models. In meeting intelligence, they transcribe multilingual recordings with clear speaker attribution, which makes meeting content easier to annotate. They can also be combined with large language models and text-to-speech systems to build responsive voice interfaces for virtual assistants.

In contact centers, real-time transcription capabilities allow AI systems to analyze sentiment and populate customer relationship management fields during live conversations. Media and broadcast applications benefit from the ability to generate live multilingual subtitles with minimal latency, while compliance and documentation processes are streamlined through accurate monitoring and transcription of interactions.

Both models can be deployed on-premise or in private cloud environments, which supports use under regulations such as GDPR and HIPAA.

Voxtral Mini Transcribe V2 is available now via API for $0.003 per minute, while Voxtral Realtime is offered at $0.006 per minute and as open weights on the Hugging Face Hub. Mistral encourages interested developers to explore its audio and transcription capabilities through its documentation and invites those passionate about speech AI to consider joining its team.
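To put the per-minute prices in perspective, the following Python sketch estimates API cost from audio duration. It uses only the two list prices quoted above. The mode labels "batch" and "realtime" and the function name are hypothetical names chosen for this example, not identifiers from Mistral's API, and the estimate ignores any discounts, taxes, or other fees.

```python
from decimal import Decimal

# Per-minute list prices quoted in the article, in US dollars.
# The keys are hypothetical labels for this sketch, not API model names.
PRICE_PER_MINUTE_USD = {
    "batch": Decimal("0.003"),     # Voxtral Mini Transcribe V2
    "realtime": Decimal("0.006"),  # Voxtral Realtime
}


def transcription_cost(minutes, mode="batch"):
    """Estimated cost in USD for `minutes` of audio at the quoted price."""
    if mode not in PRICE_PER_MINUTE_USD:
        raise ValueError(f"unknown mode: {mode!r}")
    duration = Decimal(str(minutes))
    if duration < 0:
        raise ValueError("minutes must be non-negative")
    return duration * PRICE_PER_MINUTE_USD[mode]


# A three-hour recording (the stated single-request limit) in batch mode.
print(transcription_cost(180))                    # 0.540
# 1,000 hours of live audio through the realtime model.
print(transcription_cost(1000 * 60, "realtime"))  # 360.000
```

Decimal arithmetic is used here so that small per-minute prices do not pick up floating-point rounding noise when multiplied by large durations.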


AIPRESSA.COM
