Five widely used AI chatbots frequently deliver problematic answers to health-related questions, according to a study published in BMJ Open on April 15, 2026. The research tested five AI models (Gemini, DeepSeek, Meta AI, ChatGPT, and Grok) with 50 prompts across five categories prone to misinformation: cancer, vaccines, stem cells, nutrition, and athletic performance. The findings raise significant concerns about deploying AI in health settings without adequate oversight.
The researchers designed the questions to test how the chatbots handled prompts likely to elicit misleading advice. Of the 250 total responses, nearly half (49.6%) were rated as problematic: 30% were considered somewhat problematic and 19.6% highly problematic. While the analysis found no statistically significant differences in overall performance among the chatbots, Grok produced a higher share of highly problematic responses.
The chatbots' performance varied across health categories, with stronger results on questions about vaccines and cancer. By contrast, they struggled most with prompts related to stem cells, nutrition, and athletic performance. The study also noted that open-ended questions elicited significantly more highly problematic responses than closed-ended ones.
The chatbots also fell short on citation quality. For the 25 closed-ended questions, the tools produced references roughly 81% of the time, yet the median completeness score was only about 40%. Notably, none of the chatbots generated a fully accurate and complete reference list, raising further concerns about the reliability of the information provided.
Readability was another issue: answers were often written at a level difficult for the average user to comprehend, requiring a higher level of education to understand. The study's authors expressed alarm at the implications of these findings, warning that continued use of AI chatbots in health contexts without enhanced oversight could exacerbate the spread of misinformation.
As AI technology evolves and becomes increasingly integrated into various sectors, the implications of such findings could influence regulatory discussions around AI deployment in healthcare. Stakeholders may need to consider stringent guidelines to ensure that AI systems offer safe, accurate, and accessible information to users, particularly in sensitive areas like health and medicine. The urgency of addressing these issues highlights the need for robust oversight as reliance on AI tools continues to grow.