Researchers at Flinders University in Australia are investigating the potential of vision-enabled AI systems, such as smart glasses, to transform healthcare documentation. Their study suggests that these technologies could significantly enhance the accuracy of doctors’ notes compared to traditional audio-only AI scribes, particularly in capturing visual elements of medical consultations.
Published in npj Digital Medicine in February, the study found that AI systems capable of analyzing video recordings of medical appointments produced more accurate documentation. While audio-based AI scribes have become increasingly prevalent, multimodal AI, which incorporates both visual and auditory data, marks a notable advancement in health-tech capabilities.
In the study, ten clinical pharmacists simulated medical interviews while wearing Meta Ray-Ban smart glasses, recording video that was later analyzed by an AI scribe built on Google’s Gemini 2.5 Pro model. The pharmacists presented personal information and details about mock medications, allowing the AI to process both spoken words and visual cues, such as medication packaging.
The vision-enabled scribe documented more than 2,000 data points with 98 percent accuracy, compared with 81 percent for the audio-only scribe. The gap was especially large for medication dosing details, where the vision-enabled scribe reached 97 percent accuracy against just 28 percent for the audio-only version.
“A lot of clinically important information is visual,” said study author and academic pharmacist Bradley Menz. He emphasized the value of visual cues in enhancing patient care, such as observing medicine containers, prescriptions, and even patients’ body language. According to Menz, a dual-capability AI system could help capture “more of the details that matter” during consultations.
Co-author and associate professor Ashley Hopkins noted that the integration of visual analysis into AI scribing could reduce the time clinicians spend editing documentation, allowing them to focus more on patient interactions. “These findings suggest the next step may be that all scribe systems can interpret visual information as well as speech, which could open the door to wider clinical uses,” he stated.
Despite the promising accuracy rates, the study also highlighted significant concerns regarding patient privacy and data security. The researchers underscored the need for robust safeguards, particularly given the sensitive nature of video recordings taken during clinical encounters. The visual documentation of a patient’s health concerns could potentially lead to discomfort and might inhibit patients from sharing sensitive information.
As outlined in the study, engaging patients and stakeholders in implementation planning is crucial. This includes developing informed consent processes and considering alternative methods, such as capturing still images instead of continuous video, to mitigate privacy concerns. Even with its high accuracy, the vision-enabled scribe still made 46 errors, which underscores the need for clinician oversight of the medication history process.
“This is an augmented tool, not a replacement for clinical judgment,” Menz stressed. He added that clinicians must review and approve AI-generated documents to maintain high standards of care. The researchers further noted the importance of “robust human-in-the-loop processes” to prevent healthcare providers from becoming complacent and over-relying on AI technology.
In response to the growing integration of AI in healthcare, Australia’s medical regulator, the Therapeutic Goods Administration (TGA), announced in September 2025 that it was stepping up its regulatory efforts surrounding AI scribes. The move follows calls for increased oversight as these technologies become more widely adopted in clinical settings.