Newswise — New York, NY [February 24, 2026] — A recent evaluation by researchers from the Icahn School of Medicine at Mount Sinai has raised serious concerns about the performance of ChatGPT Health, a consumer artificial intelligence (AI) tool designed to provide health guidance. The study, published in the February 23, 2026 online issue of Nature Medicine, suggests that the tool may inadequately direct users to emergency care in numerous serious situations, particularly in cases of self-harm.
ChatGPT Health, launched in January 2026 by OpenAI, has already attracted around 40 million daily users seeking health information and advice on urgent care. However, the researchers highlighted a lack of independent evidence regarding its safety and reliability, which prompted their investigation. Lead author Ashwin Ramaswamy, MD, noted, “We wanted to answer a very basic but critical question: if someone is experiencing a real medical emergency and turns to ChatGPT Health for help, will it clearly tell them to go to the emergency room?”
The study used 60 structured clinical scenarios spanning 21 medical specialties, with three independent physicians rating the urgency of each case against guidelines from 56 medical societies. The results indicated that while ChatGPT Health handled straightforward emergencies appropriately, it under-triaged more than half of the cases the physicians deemed urgent. For example, the system correctly flagged clear-cut emergencies such as strokes but struggled with nuanced situations where clinical judgment is crucial.
In tests involving suicide-risk alerts, the tool was intended to direct users to the 988 Suicide and Crisis Lifeline in high-risk situations. However, the researchers found that these alerts were inconsistent, sometimes triggering in lower-risk scenarios while failing to appear when users outlined specific self-harm plans. Girish N. Nadkarni, MD, MPH, a senior author of the study, expressed alarm over this finding, stating, “When someone talks about exactly how they would harm themselves, that’s a sign of more immediate and serious danger, not less.”
In total, the team conducted 960 interactions with ChatGPT Health across varied contextual conditions, including differences in race and gender and barriers to care such as lack of insurance. The findings showed that the AI tool often recognized dangerous indicators in its explanations yet continued to reassure users, ultimately failing to prompt necessary action in critical scenarios. In one asthma case, for instance, the tool identified early signs of respiratory failure but still recommended delaying emergency treatment.
The researchers caution users that for serious symptoms such as chest pain, shortness of breath, severe allergic reactions, or suicidal thoughts, they should seek medical assistance directly rather than relying solely on AI recommendations. While the findings raise significant concerns, the authors do not advocate for abandoning AI health tools altogether. Alvira Tyagi, a first-year medical student and co-author of the study, emphasized the need to integrate such technologies thoughtfully into medical care rather than view them as substitutes for professional clinical judgment.
As AI models continue to evolve, the researchers stress the importance of ongoing independent evaluations to ensure that updates lead to improved safety in patient care. Tyagi remarked, “Starting medical training alongside tools that are evolving in real time makes it clear that today’s results are not set in stone.” The study aims to continue assessing updated versions of ChatGPT Health, with future research focusing on pediatric care, medication safety, and non-English-language usage.
The paper, titled “ChatGPT Health performance in a structured test of triage recommendations,” highlights the urgent need for a careful examination of AI tools in healthcare settings. With millions turning to these technologies for health guidance, ensuring their reliability and safety is more critical than ever.