DeepSeek Unveils OCR Model Achieving 97% Accuracy with 10x Data Compression

DeepSeek introduces DeepSeek-OCR, achieving 97% accuracy with 10x data compression, challenging AI efficiency norms and transforming input processing for LLMs

Staff

Published

15 November, 2025

Last month, a team of researchers in China introduced a new Optical Character Recognition (OCR) model named DeepSeek-OCR. The release drew little attention, but it has the potential to substantially improve the efficiency of AI models.

Initial expert feedback on DeepSeek-OCR has been favorable. Although it is primarily a proof of concept and is not marketed as a state-of-the-art solution, it challenges prevailing assumptions in AI. Notably, Andrej Karpathy, a co-founder of OpenAI, suggests that DeepSeek-OCR calls into question the assumption that text is the natural input for these models: “Perhaps (…) all inputs to LLMs should always be images.” The rationale behind this claim is that images may offer a more efficient processing route for large language models (LLMs) than traditional text.

Revolutionizing Data Compression

The AI industry is preoccupied with data compression, because a smaller data footprint translates into savings in time, energy, and cost. This push for compression is taking place amid a rush to build large AI factories that house advanced AI chips. The prevailing belief is that AI infrastructure must be expansive and ambitious, whatever efforts are made to streamline data.

However, DeepSeek-OCR suggests an often-overlooked alternative route to data reduction. Visual information, traditionally sidelined in generative AI in favor of text, appears to fit more efficiently within the context window, or short-term memory, of LLMs. That would let a model take in dozens of pages at once rather than being capped at tens of thousands of words, which could improve performance. In essence, pixels may prove to be a better compression medium for AI than text.

DeepSeek-OCR uses a compact visual encoder with 380 million parameters. The encoder translates visual information, typically images of text documents, into a more compact representation. That compressed representation is then passed to a decoder with only 3 billion parameters, of which just 570 million are activated for any given computation. With this architecture, the model achieves a roughly tenfold compression of the input while maintaining an accuracy of about 97 percent.
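The figures above lend themselves to a little arithmetic. The following Python sketch is illustrative only: the parameter counts come from the article, while the token counts, the 128,000-token context window, and the function names are hypothetical values chosen for the example, not anything published by DeepSeek.

```python
# Illustrative arithmetic only. Parameter counts are the ones quoted in the
# article; token counts and the context window size are hypothetical.

DECODER_TOTAL_PARAMS = 3e9      # decoder size quoted in the article
DECODER_ACTIVE_PARAMS = 570e6   # decoder parameters active per computation


def compression_ratio(text_tokens: int, vision_tokens: int) -> float:
    """Number of text tokens represented by each vision token."""
    if vision_tokens <= 0:
        raise ValueError("vision_tokens must be positive")
    return text_tokens / vision_tokens


def effective_text_capacity(context_window: int, ratio: float) -> int:
    """Text tokens' worth of content that fits when the window holds vision tokens."""
    return int(context_window * ratio)


def active_fraction(active: float, total: float) -> float:
    """Share of the decoder's parameters that is used for one computation."""
    return active / total


if __name__ == "__main__":
    # Hypothetical page: 1,000 text tokens rendered into 100 vision tokens.
    ratio = compression_ratio(text_tokens=1000, vision_tokens=100)
    print(f"compression ratio: {ratio:.1f}x")

    # Hypothetical 128,000-token context window filled with vision tokens.
    capacity = effective_text_capacity(128_000, ratio)
    print(f"effective capacity: {capacity:,} text tokens")

    share = active_fraction(DECODER_ACTIVE_PARAMS, DECODER_TOTAL_PARAMS)
    print(f"active decoder share: {share:.0%}")
```

Under these assumptions, a tenfold ratio means the same window holds ten times as much text, and only about a fifth of the decoder's parameters do work at any one time.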

DeepSeek’s Growing Influence

Earlier this year, DeepSeek made headlines with the launch of DeepSeek-R1, a 671-billion-parameter AI model with remarkable capabilities for its size. The model was released as open source and was reportedly developed at a relatively low cost of less than €300,000. Although models from OpenAI still dominate performance benchmarks, DeepSeek’s efficiency has drawn attention in the AI community.

The controversy surrounding DeepSeek-R1 stems from its possible reliance on outputs from ChatGPT or the OpenAI API, which raises questions about whether it merely mimicked or compressed the capabilities of existing models. With the introduction of DeepSeek-OCR, DeepSeek is solidifying its role as a compression specialist within generative AI. Unlike the proprietary models of companies such as OpenAI, Meta, or Google, DeepSeek’s research is openly accessible, which fosters collaboration and innovation in the sector.

It remains uncertain whether other AI models use similar compression techniques. Google, for instance, has not disclosed whether its Gemini models rely on strategies like DeepSeek’s. Nonetheless, the optimization methods seen in DeepSeek’s work may soon become standard practice across the industry, as with Mixture-of-Experts, an approach in which only the relevant components of an AI model are activated for a given task.
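To make the Mixture-of-Experts idea concrete, here is a minimal Python sketch of top-k routing, in which only the highest-scoring experts run for a given input. It is a toy under stated assumptions, not DeepSeek's implementation: the experts are simple hypothetical functions, and the gate scores are passed in by hand, whereas a real model would compute them from the input with a learned router.

```python
import math


def softmax(scores):
    """Convert raw gate scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(gate_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(gate_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and blend their outputs by weight."""
    routed = top_k_route(gate_scores, k)
    return sum(weight * experts[i](x) for i, weight in routed)


if __name__ == "__main__":
    # Four hypothetical experts; only two of them will run per input.
    experts = [
        lambda x: x + 1,
        lambda x: x * 2,
        lambda x: x - 3,
        lambda x: x * x,
    ]
    gate_scores = [0.1, 2.0, -1.0, 2.0]  # hand-picked for illustration

    print(top_k_route(gate_scores, k=2))
    print(moe_forward(3.0, experts, gate_scores, k=2))
```

With these hand-picked scores, experts 1 and 3 are selected with equal weight and the other two are never called, which is the source of the compute savings.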

Implications for the Future

While DeepSeek-OCR itself may not represent a groundbreaking shift for AI applications, it indicates a broader possibility for enhancing the efficiency of AI workloads. Unanswered questions linger, such as whether LLMs will need to convert all inputs to images. Additionally, it remains unclear if major players like Google and OpenAI have already adopted similar strategies.

The implications of DeepSeek-OCR could be twofold. First, LLMs might process the information in prompts more effectively by converting text into images, with minimal loss of accuracy. Second, this could allow AI models to handle larger inputs, such as extensive business documents or compliance materials, ultimately leading to more comprehensive and precise outputs than current capabilities permit.



AIPRESSA.COM
