Google’s AI search overviews, which rely on the company’s Gemini large-language model (LLM), reportedly contain significant inaccuracies, raising concerns within the tech community. A recent analysis conducted by AI startup Oumi and commissioned by the New York Times claims that while 91 percent of searches return accurate results, the remaining errors still translate to tens of millions of incorrect answers, given that Google processes over five trillion searches annually.
The volume of misinformation is alarming, with Futurism describing the situation as a potential “misinformation crisis.” Google spokesperson Ned Adriance has contested the report, calling it flawed. He criticized the methodology, which involved one AI grading another, and described the underlying test as an “old benchmark that is known for being full of errors” that, he argues, does not adequately reflect how people actually use Google Search.
The research used a system called SimpleQA, a benchmark from OpenAI that assesses how accurately an LLM can answer short, fact-based questions. Although OpenAI maintains that SimpleQA is reliable, its scope is narrow: it only measures questions with a single verifiable answer. As the report notes, whether performance on concise factual questions predicts the quality of longer, more comprehensive responses remains an open question.
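To make the contested methodology concrete, here is a minimal sketch of a SimpleQA-style evaluation loop. It is an illustration of the general LLM-as-judge pattern, not OpenAI’s actual harness: ask_model is a hypothetical stand-in for any chat-completion call, and the grading prompt and labels are simplified assumptions.

    # A minimal SimpleQA-style grading loop (illustrative sketch only).
    # ask_model is a hypothetical stand-in for any chat-completion call;
    # the prompt wording and labels are simplified, not OpenAI's harness.

    def ask_model(prompt: str) -> str:
        """Placeholder for a call to the model under test or to the grader."""
        raise NotImplementedError("wire this to an actual model API")

    def grade(question: str, gold: str, predicted: str) -> str:
        """Have a second model label an answer CORRECT, INCORRECT, or NOT_ATTEMPTED."""
        prompt = (
            f"Question: {question}\n"
            f"Reference answer: {gold}\n"
            f"Model answer: {predicted}\n"
            "Reply with one word: CORRECT, INCORRECT, or NOT_ATTEMPTED."
        )
        return ask_model(prompt).strip().upper()

    def accuracy(dataset: list[dict]) -> float:
        """Fraction of questions graded CORRECT. Note the two failure modes
        at issue in the article: the graded model can be wrong, but so can
        the grader or the 'gold' reference answer itself."""
        correct = sum(
            grade(item["question"], item["gold"], ask_model(item["question"])) == "CORRECT"
            for item in dataset
        )
        return correct / len(dataset)

Because both the answering step and the grading step go through a model, any systematic error in the grader, or in the reference answers themselves, propagates directly into the headline accuracy figure. That is precisely the weakness Google’s rebuttal points to.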
Despite the high accuracy rate reported, Oumi’s examination of Google’s AI revealed instances where verifiable questions resulted in incorrect responses. The report highlighted several factual errors, with the AI sometimes citing unreliable sources or misinterpreting information from credible sites. In some cases, while the initial answer was correct, the additional context provided was inaccurate. Furthermore, the AI’s susceptibility to manipulation was evident, as even a blog post could mislead it into recognizing someone as an expert in an unrelated field.
Google has also pointed out flaws within the SimpleQA framework itself, citing a study by researchers at Google DeepMind that identified incorrect “ground truths,” the reference answers the benchmark treats as verified facts. The company emphasized the irony of using one imperfect AI model to evaluate another, a point that raises broader questions about the reliability of AI assessment methods in general.
Adriance pointed to two specific examples from the New York Times report. In one, Gemini incorrectly stated that Bob Marley’s house became a museum in 1987, when the correct date is 1986. Google provided a screenshot of the Wikipedia entry Gemini drew on, which at the time contained conflicting dates; the entry has since been corrected and now consistently states 1986.
In another example, Gemini reportedly misidentified the Neuse River’s course in North Carolina, claiming it ran “west” of Goldsboro. Google contended that while the river primarily flows south, it does run southwest in places, making the answer “plausible” rather than entirely incorrect. The defense itself underscores how difficult it is for AI systems to capture nuanced geographical information.
The ongoing debate over AI accuracy highlights how high the stakes have become for platforms like Google, which underpin much of the web’s information ecosystem. As scrutiny of AI systems continues, the industry must grapple with the balance between innovation and reliability, ensuring that users receive accurate information in an era increasingly reliant on automated systems.