Google DeepMind’s AI Co-Clinician Surpasses GPT-5.4 in Blind Doctor Tests

Google DeepMind’s AI co-clinician outperformed GPT-5.4 in blind doctor tests: physicians preferred its answers to primary care queries by 63 to 30, and it earned a 95% quality score on open-ended medication questions.

Google DeepMind is advancing healthcare technology with its development of an “AI co-clinician” designed to assist doctors in patient care. Initial simulation studies indicate promising results, though the AI system has not yet matched the performance of seasoned physicians. Additionally, the research highlights limitations of ChatGPT’s voice mode for serious applications, particularly in medical consultations.

The AI co-clinician operates within a framework termed “triadic care,” in which AI agents support patients under the supervision of doctors, who retain clinical authority. This collaborative approach aims to improve patient treatment while keeping oversight in the hands of qualified medical professionals.

To assess the system from a clinician’s viewpoint, researchers collaborated with academic physicians to implement the NOHARM framework, which evaluates two categories of mistakes: errors of commission and errors of omission. In a blind comparison involving 98 primary care queries, doctors favored the AI co-clinician’s responses over leading evidence synthesis tools. The AI co-clinician achieved 67 preferences compared to an existing clinical AI system’s 26, and it outperformed GPT-5.4-thinking-with-search by a score of 63 to 30. Notably, the AI co-clinician made a critical error in only one of the 98 cases evaluated.
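The preference figures above can be turned into shares with a few lines of arithmetic. This is a minimal sketch that assumes the reported numbers are counts out of the 98 queries, which the article does not state explicitly; the function name `summarize` and the treatment of the remainder as "no stated preference" are illustrative assumptions, not part of the study.

```python
TOTAL_QUERIES = 98  # size of the blind comparison reported in the article


def summarize(preferred: int, other: int, total: int = TOTAL_QUERIES) -> dict:
    """Convert head-to-head preference counts into shares of all queries.

    Assumption: both figures are counts out of `total`, and whatever is
    left over had no stated preference either way.
    """
    remainder = total - preferred - other
    return {
        "preferred_share": round(preferred / total, 3),
        "other_share": round(other / total, 3),
        "remainder": remainder,
    }


# AI co-clinician vs. the existing clinical AI system (67 to 26)
vs_clinical_tool = summarize(67, 26)

# AI co-clinician vs. GPT-5.4-thinking-with-search (63 to 30)
vs_gpt = summarize(63, 30)

# One critical error in 98 evaluated cases
critical_error_rate = round(1 / TOTAL_QUERIES, 4)

print(vs_clinical_tool)
print(vs_gpt)
print(critical_error_rate)
```

Under that assumption, both comparisons leave the same small remainder of queries, so the two results can be compared directly as shares of the same 98 queries.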

The lead over physicians was particularly pronounced in medication inquiries. The RxQA benchmark, which includes 600 questions on active ingredients, interactions, and dosages sourced from national drug directories and vetted by licensed pharmacists, proved difficult for primary care physicians. With reference materials, doctors answered 61.3 percent correctly; without external assistance, that dropped to 48.3 percent. The AI co-clinician scored 73.3 percent, narrowly ahead of GPT-5.4-thinking-with-search at 72.7 percent. The gap between the two models widened when questions were posed in an open-ended format, typical of real-world searches: here, the AI co-clinician achieved a quality score of 95.0 percent, compared to 90.9 percent for OpenAI’s model.
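To make the size of each gap explicit, the reported percentages can be compared in percentage points. The sketch below uses only the figures quoted above; the dictionary keys and the helper `gap_points` are illustrative names, not terms from the study.

```python
# RxQA accuracy figures as reported in the article (percent correct)
RXQA_SCORES = {
    "physicians_without_references": 48.3,
    "physicians_with_references": 61.3,
    "gpt_5_4_thinking_with_search": 72.7,
    "ai_co_clinician": 73.3,
}

# Open-ended quality scores as reported in the article (percent)
OPEN_ENDED_QUALITY = {
    "gpt_5_4_thinking_with_search": 90.9,
    "ai_co_clinician": 95.0,
}


def gap_points(a: float, b: float) -> float:
    """Difference between two percentages, in percentage points."""
    return round(a - b, 1)


lead_over_gpt = gap_points(
    RXQA_SCORES["ai_co_clinician"],
    RXQA_SCORES["gpt_5_4_thinking_with_search"],
)
lead_over_physicians = gap_points(
    RXQA_SCORES["ai_co_clinician"],
    RXQA_SCORES["physicians_with_references"],
)
open_ended_lead = gap_points(
    OPEN_ENDED_QUALITY["ai_co_clinician"],
    OPEN_ENDED_QUALITY["gpt_5_4_thinking_with_search"],
)

print(lead_over_gpt, lead_over_physicians, open_ended_lead)
```

Read this way, the lead over the other model is well under one point on the benchmark questions and grows to several points in the open-ended format, while the lead over physicians using reference materials is in double digits.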

In addition to text-based support, Google DeepMind is exploring the AI co-clinician’s capabilities in telemedicine through real-time audio and video interactions. Partnering with physicians at Harvard and Stanford, researchers conducted a randomized simulation study with 20 synthetic clinical scenarios and 10 doctors acting as patient representatives, for a total of 120 hypothetical telemedicine consultations. The AI co-clinician demonstrated abilities that extend beyond text-only systems, such as correcting a patient’s inhaler technique and guiding patients through shoulder exams to identify rotator cuff injuries.

In patient-facing dialogues, the AI co-clinician employs a dual-agent configuration: a “Planner” module oversees the conversation to ensure the “Talker” agent adheres to safe clinical practices. When utilized by doctors, the system emphasizes solid clinical evidence and conducts verification and citation checks during information retrieval.
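The dual-agent idea can be illustrated with a toy example. Everything in this sketch is hypothetical: the class names mirror the "Planner" and "Talker" roles described above, but the phrase list, the fallback message, and the review logic are stand-ins invented for illustration and say nothing about how DeepMind's system is actually built.

```python
from dataclasses import dataclass

# Hypothetical rule list; a real system would rely on far richer clinical checks.
UNSAFE_PHRASES = ("stop taking", "double your dose")

# Hypothetical safe reply used when a draft is rejected.
FALLBACK = "I'd like your doctor to weigh in on that before we go further."


@dataclass
class Review:
    approved: bool
    reason: str


class Talker:
    """Stand-in for the conversational agent that drafts replies."""

    def draft(self, patient_message: str) -> str:
        # Placeholder for a model call: simply echoes the message back.
        return f"Thanks for telling me that. You said: {patient_message}"


class Planner:
    """Stand-in for the overseeing agent that checks each draft."""

    def review(self, draft: str) -> Review:
        lowered = draft.lower()
        for phrase in UNSAFE_PHRASES:
            if phrase in lowered:
                return Review(False, f"draft contains '{phrase}'")
        return Review(True, "no rule triggered")


def respond(planner: Planner, talker: Talker, patient_message: str) -> str:
    """Send the Talker's draft only if the Planner approves it."""
    draft = talker.draft(patient_message)
    review = planner.review(draft)
    return draft if review.approved else FALLBACK


if __name__ == "__main__":
    planner, talker = Planner(), Talker()
    print(respond(planner, talker, "I have had a cough for three days."))
    print(respond(planner, talker, "Should I stop taking my inhaler?"))
```

The point of the separation is that the component generating fluent replies is not the one deciding whether a reply is safe to send, so a rejected draft never reaches the patient.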

Despite these advancements, the study found that experienced physicians outperformed the AI co-clinician overall across 140 assessed aspects of consultation quality, spanning seven domains: triage, history taking, clinical reasoning, communication and counseling, treatment steps, recognizing warning signs, and conducting physical exams. The AI co-clinician matched or exceeded primary care physicians in 68 of the 140 aspects, but it lagged behind seasoned doctors, especially in identifying critical warning signs and carrying out thorough physical examinations. OpenAI’s GPT-realtime ranked lowest across all seven domains. The researchers concluded that AI systems like this are best used as supportive tools for healthcare professionals rather than substitutes for their clinical judgment.

It is not yet clear whether this research initiative will evolve into a commercially available product. Although the results show progress in AI-driven evidence synthesis and telemedicine applications, a clear gap persists between the system and experienced physicians, particularly in safety-critical scenarios. “While it’s early days, the promise is clear,” noted DeepMind researcher Alan Karthikesalingam.

Written by Staff

The AiPressa Staff team brings you comprehensive coverage of the artificial intelligence industry, including breaking news, research developments, business trends, and policy updates. Our mission is to keep you informed about the rapidly evolving world of AI technology.

© 2025 AIPressa · Part of Buzzora Media · All rights reserved. This website provides general news and educational content for informational purposes only. While we strive for accuracy, we do not guarantee the completeness or reliability of the information presented. The content should not be considered professional advice of any kind. Readers are encouraged to verify facts and consult appropriate experts when needed. We are not responsible for any loss or inconvenience resulting from the use of information on this site. Some images used on this website are generated with artificial intelligence and are illustrative in nature. They may not accurately represent the products, people, or events described in the articles.