Five widely used AI chatbots frequently deliver problematic answers to health-related questions, according to a study published in BMJ Open on April 15, 2026. The research tested five AI models (Gemini, DeepSeek, Meta AI, ChatGPT, and Grok) with 50 prompts across five categories prone to misinformation: cancer, vaccines, stem cells, nutrition, and athletic performance. The findings raise significant concerns about deploying AI in health settings without adequate oversight.
The researchers designed the questions to test how the chatbots handled prompts likely to elicit misleading advice. Of the 250 total responses, nearly half (49.6%) were rated as problematic: 30% were considered somewhat problematic and 19.6% highly problematic. While the analysis found no statistically significant differences in overall performance among the chatbots, Grok produced a higher share of highly problematic responses.
The chatbots' performance varied across health categories, with stronger results on questions about vaccines and cancer. By contrast, they struggled most with prompts related to stem cells, nutrition, and athletic performance. The study also noted that open-ended questions elicited significantly more highly problematic responses than closed-ended ones.
The chatbots also fell short on citation quality. For the 25 closed-ended questions, the tools produced references roughly 81% of the time, yet the median completeness score was only about 40%. Notably, none of the chatbots generated a fully accurate and complete reference list, raising further concerns about the reliability of the information provided.
Readability was another issue: answers were often written at a level difficult for the average user to comprehend, requiring a higher level of education to understand. The study's authors expressed alarm at the implications of these findings, warning that continued use of AI chatbots in health contexts without enhanced oversight could exacerbate the spread of misinformation.
As AI technology evolves and becomes increasingly integrated into various sectors, the implications of such findings could influence regulatory discussions around AI deployment in healthcare. Stakeholders may need to consider stringent guidelines to ensure that AI systems offer safe, accurate, and accessible information to users, particularly in sensitive areas like health and medicine. The urgency of addressing these issues highlights the need for robust oversight as reliance on AI tools continues to grow.