Mistral AI Launches OCR 3, Reducing Costs to $1 per 1,000 Pages with Enhanced Accuracy

Mistral AI unveils OCR 3, reporting a 74% win rate over OCR 2 in internal benchmarks, with pricing of $2 per 1,000 pages that drops to $1 per 1,000 pages through the Batch API for high-volume processing.

Staff

Published

20 December, 2025

Mistral AI has unveiled its latest optical character recognition service, Mistral OCR 3, designed to enhance the company’s Document AI stack. The model, designated mistral-ocr-2512, specializes in extracting interleaved text and images from PDFs and other documents while maintaining their original structure. It is priced at $2 per 1,000 pages, and the Batch API offers a 50% discount, bringing the cost to $1 per 1,000 pages for high-volume processing.

Mistral OCR 3 is optimized for common enterprise document workloads, including forms, scanned documents, complex tables, and handwritten text. According to internal benchmarks based on real business scenarios, the model achieves a 74% win rate over its predecessor, Mistral OCR 2, when outputs are scored with a fuzzy match metric against established ground truth datasets.
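To make the idea of a win rate under a fuzzy match metric concrete, here is a minimal sketch. The article does not say which metric Mistral used, so `difflib.SequenceMatcher` is a stand-in, the sample strings are invented toy data, and counting ties as non-wins is an assumption.

```python
from difflib import SequenceMatcher

def fuzzy_score(candidate, truth):
    """Similarity in [0, 1]; a stand-in for the unspecified fuzzy match metric."""
    return SequenceMatcher(None, candidate, truth).ratio()

def win_rate(samples):
    """Fraction of samples where the new output scores strictly higher than the old one.

    Each sample is (ground_truth, new_output, old_output). Ties count as non-wins,
    which is an assumption; a real benchmark may treat them differently.
    """
    wins = sum(
        1
        for truth, new, old in samples
        if fuzzy_score(new, truth) > fuzzy_score(old, truth)
    )
    return wins / len(samples)

# Invented toy data, not benchmark results.
TOY_SAMPLES = [
    ("Invoice 1234", "Invoice 1234", "Invoice 1284"),
    ("Total: $50", "Total: $5O", "Total: $50"),
    ("Signed by A. Martin", "Signed by A. Martin", "Signed by A. Martln"),
    ("Due 2025-12-20", "Due 2025-12-20", "Due 2025-12-2O"),
]
```

On this toy set the new output wins three of four comparisons, so `win_rate(TOY_SAMPLES)` is 0.75. The figure says how often one model beats the other, not by how much.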

The new model outputs markdown that not only preserves the document layout but also enriches it with HTML-based table representations when specifically enabled. This ensures that downstream systems receive both the content and the structural information essential for retrieval pipelines, analytics, and automated workflows.
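As an illustration of why HTML tables help downstream systems, the sketch below turns an HTML table string into a list of rows using only the Python standard library. The helper name `html_table_to_rows` and the sample markup are hypothetical; the exact HTML that the service emits is not shown in the article, and this simple parser ignores merged cells and nested tables.

```python
from html.parser import HTMLParser

class _TableRows(HTMLParser):
    """Collects the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

def html_table_to_rows(html):
    """Return the table as a list of rows, each a list of cell strings."""
    parser = _TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows
```

Given `"<table><tr><th>Item</th><th>Qty</th></tr><tr><td>Paper</td><td>3</td></tr></table>"`, the function returns `[["Item", "Qty"], ["Paper", "3"]]`, which can be loaded into an analytics or retrieval pipeline.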

As a crucial component of Mistral Document AI, the OCR 3 model integrates seamlessly with the company’s broader document processing capabilities, which combine OCR with structured data extraction and Document QnA. This functionality is now showcased within the Document AI Playground in Mistral AI Studio, allowing users to upload PDFs or images and receive either clean text or structured JSON outputs without requiring any coding knowledge. The same underlying OCR pipeline can also be accessed via a public API, facilitating a smooth transition from exploratory use to production workloads.

The OCR processor supports multiple document formats through a unified API. The document field of a request accepts several source types: document_url for PDFs and other document formats, and image_url for image formats such as PNG and JPEG. Uploaded or base64-encoded images and PDFs are also supported, so a wide range of inputs can be handled through the same interface.
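The sketch below shows one way a client might build a request body that routes a URL to document_url or image_url. The payload layout (a `type` key next to the URL key), the top-level placement of `table_format`, the helper name `build_ocr_request`, and the extension-based routing are all assumptions for illustration; check the official API reference for the actual schema.

```python
from urllib.parse import urlparse

# Extensions sent as image_url; anything else is treated as a document.
# This split is an assumption based on the formats named in the article.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")

def build_ocr_request(source_url, model="mistral-ocr-2512", table_format=None):
    """Build a request body dict (assumed shape); no network call is made."""
    path = urlparse(source_url).path.lower()
    kind = "image_url" if path.endswith(IMAGE_EXTENSIONS) else "document_url"
    body = {"model": model, "document": {"type": kind, kind: source_url}}
    if table_format is not None:
        # The article mentions table_format="html"; its position in the body is assumed.
        body["table_format"] = table_format
    return body
```

The function only assembles a dictionary, so it can be tested without credentials before it is wired to an HTTP client.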

The response is a JSON object containing a pages array. Each page includes an index, a markdown string, a list of images, optional tables (if table_format="html" is enabled), detected hyperlinks, and additional fields for headers and footers, if extraction is enabled. The response also includes a document_annotation field for structured annotations and a usage_info block for accounting details.
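A minimal sketch of consuming such a response follows. The keys `pages`, `index`, `markdown`, `images`, `document_annotation`, and `usage_info` come from the description above; the sample values, the helper names, and the contents of each image entry are invented for illustration.

```python
def pages_to_markdown(response):
    """Join per-page markdown in page order, separated by blank lines."""
    pages = sorted(response.get("pages", []), key=lambda page: page["index"])
    return "\n\n".join(page.get("markdown", "") for page in pages)

def count_images(response):
    """Total number of extracted images across all pages."""
    return sum(len(page.get("images", [])) for page in response.get("pages", []))

# Invented sample; real responses carry more fields per page.
SAMPLE_RESPONSE = {
    "pages": [
        {"index": 1, "markdown": "Total: 42", "images": []},
        {"index": 0, "markdown": "# Invoice", "images": [{"id": "img-0"}]},
    ],
    "document_annotation": None,
    "usage_info": {},  # contents are not specified in the article
}
```

Sorting by `index` is defensive: it keeps the output in reading order even if the pages arrive out of order, as they do in the sample.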

Mistral OCR 3 improves on Mistral OCR 2 in four main areas. First, it recognizes handwriting better, including more accurate interpretation of cursive text and of annotations on printed templates. Second, it improves forms processing through better detection of boxes, labels, and handwritten entries in complex layouts, which are common in invoices and compliance documents. Third, it is more robust on scanned pages, coping with compression artifacts and low resolution. Finally, it reconstructs complex table structures with various hierarchies and can return HTML tables that maintain proper layout.

The pricing structure for Mistral OCR 3 is straightforward, with costs set at $2 per 1,000 pages for standard OCR and $3 per 1,000 pages for pages with structured annotations. When used through the Batch Inference API, the effective cost can drop to $1 per 1,000 pages, incentivizing large-scale processing. The model also integrates structured annotations and bounding box extraction features, enabling developers to label specific document regions and retrieve bounding boxes for better content mapping in downstream systems.
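The pricing arithmetic can be sketched as a small estimator. The rates and the 50% batch discount are taken from the article; applying that discount to annotated pages as well is an assumption, since the article only gives the discounted figure for standard OCR.

```python
STANDARD_RATE = 2.0    # USD per 1,000 pages, standard OCR
ANNOTATED_RATE = 3.0   # USD per 1,000 pages, with structured annotations
BATCH_DISCOUNT = 0.5   # 50% off through the Batch API

def estimate_cost(pages, annotated=False, batch=False):
    """Estimated cost in USD for a number of pages.

    Assumption: the batch discount also applies to the annotated rate.
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    rate = ANNOTATED_RATE if annotated else STANDARD_RATE
    if batch:
        rate *= 1 - BATCH_DISCOUNT
    return pages * rate / 1000
```

For example, `estimate_cost(1_000_000, batch=True)` evaluates to 1000.0, that is, $1,000 for a million pages at the batch rate, against $2,000 at the standard rate.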

In summary, Mistral OCR 3 represents a significant advancement in optical character recognition technology, combining competitive pricing with enhanced capabilities. With its robust feature set, the model positions itself as a strong contender in both traditional and AI-native OCR landscapes, delivering valuable tools for document processing and analysis. As businesses increasingly seek efficiency and accuracy in document handling, Mistral’s latest offering could play a pivotal role in meeting these demands.
