The rise of artificial intelligence (AI) is reshaping rare disease epidemiology. With machine learning (ML) and deep learning (DL) techniques, researchers can analyze large bodies of literature efficiently and extract data on epidemiological trends and treatment efficacy. One condition that illustrates this shift is essential tremor (ET), a prevalent neurological disorder characterized by involuntary, rhythmic shaking that primarily affects the hands but can also involve the head, voice, and other body parts. ET is linked to abnormal cerebellar function and often has a genetic component, with familial patterns identified in numerous cases. Despite affecting approximately 1% of the global population, ET is frequently misdiagnosed, undiagnosed, or undertreated, which complicates life for many people in the workforce.
This article explores the multifaceted advantages of AI in managing ET, detailing how these technologies can enhance accuracy and speed in data curation (section 1) while also identifying patterns for early diagnosis and assessing disease severity based on patient characteristics (section 2). Through the lens of ET, we uncover AI’s potential to illuminate the complexities surrounding this disorder, ultimately leading to improved patient outcomes.
A New AI Tool for Rare Disease Data Curation
Globally, around 400 million individuals are living with a rare disease. To aid in understanding these conditions, Clarivate's Incidence & Prevalence Database (IPD) and Epidemiology teams conduct targeted literature searches for epidemiological data. However, data remain scarce for many rare diseases, in part because of the constraints of manual curation. Manual curation is time-consuming, requires specialized knowledge, and involves integrating data from diverse sources such as medical records and research articles. AI offers a scalable and efficient alternative that can quickly identify patterns and correlations that manual efforts may miss.
AI tools such as self-supervised image search for histology (SISH), DL algorithms, and ML models excel at analyzing complex datasets and can improve diagnostic accuracy for rare diseases. A notable advancement is EpiPipeline4RD, developed by W.Z. Kariampuzha et al., which automates the extraction of epidemiological data from rare disease literature. The pipeline has demonstrated high precision and recall in extracting this data, with results comparable to Orphanet's collection model. Its advantages include faster data processing and improved accuracy in disease prediction, and its aims align with the United Nations' resolution on enhancing rare disease data collection.
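For readers less familiar with the metrics, the following minimal Python sketch shows how precision and recall are typically computed when automatically extracted items are compared with a manually curated reference set. The function name and the example records are hypothetical illustrations; they are not taken from EpiPipeline4RD or Orphanet.

```python
def precision_recall_f1(predicted, gold):
    """Compare extracted items against a curated reference set.

    precision = share of extracted items that are correct
    recall    = share of reference items that were found
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical (disease, measure, location) records, for illustration only.
gold = {
    ("disease A", "prevalence", "Europe"),
    ("disease A", "incidence", "Europe"),
    ("disease B", "prevalence", "Japan"),
    ("disease B", "prevalence", "United States"),
    ("disease C", "prevalence", "Worldwide"),
}
predicted = {
    ("disease A", "prevalence", "Europe"),
    ("disease B", "prevalence", "Japan"),
    ("disease C", "prevalence", "Worldwide"),
    ("disease C", "incidence", "Worldwide"),  # not in the reference set
}

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case three of the four extracted records are correct, while two of the five reference records are missed, so precision is higher than recall.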
The Data Science Behind EpiPipeline4RD
EpiPipeline4RD incorporates a new epidemiology dataset for Named Entity Recognition (NER) and a fine-tuned BioBERT deep learning model for data extraction. It works together with ES_Predict, which identifies epidemiological studies, and uses APIs from the European Bioinformatics Institute (EBI) and the National Center for Biotechnology Information (NCBI) to retrieve PubMed articles, applying strict filters to minimize false positives. The pipeline's full implementation and supplementary data are accessible on GitHub, and the fine-tuned models can be downloaded from Hugging Face.
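To illustrate the retrieval step, here is a minimal sketch of how PubMed search and fetch requests can be formed against NCBI's public E-utilities service. This is not the pipeline's own code: the helper names and the search term are assumptions for illustration, and the sketch only builds the request URLs without sending them.

```python
from urllib.parse import urlencode

# Base address of NCBI's public E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term, max_results=20):
    """Build an ESearch URL that returns matching PubMed IDs as JSON."""
    params = {
        "db": "pubmed",
        "term": term,
        "retmax": max_results,
        "retmode": "json",
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def build_efetch_url(pubmed_ids):
    """Build an EFetch URL that returns plain-text abstracts for the IDs."""
    params = {
        "db": "pubmed",
        "id": ",".join(str(pmid) for pmid in pubmed_ids),
        "rettype": "abstract",
        "retmode": "text",
    }
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


# Hypothetical query; the queries the pipeline actually issues may differ.
search_url = build_esearch_url('"essential tremor" AND prevalence')
print(search_url)
```

The IDs returned by the search request would then be passed to `build_efetch_url` to retrieve the abstracts that a downstream classifier and NER model could process.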
Challenges in Authenticity
The tool has limitations, however. David Lapidus notes that although EpiPipeline4RD significantly outperforms Orphanet in quickly summarizing prevalence studies, it analyzes only PubMed abstracts and may therefore overlook critical information contained in the full texts. This limitation could skew data interpretation. For instance, the tool failed to identify certain pivotal studies, such as Baujat's 2017 research on fibrodysplasia ossificans progressiva (FOP), a recognized standard in the field.
Understanding Essential Tremor
Although often overlooked, ET has considerable implications for daily activities and quality of life. It affects about 1% of the global population, amounting to approximately 24.91 million individuals in 2020, with around 7 million in the U.S. alone. From 2015 to 2019, around 1 million individuals sought treatment for ET, and in 2023, Clarivate's Real World Data (RWD) indicated over 410,000 diagnosed cases in the U.S.
The prevalence of ET rises with age, from 0.04% in those under 20 to 2.87% in individuals aged 80 and older. Risk factors for worse disease progression include being female, lacking a family history, and presenting with a rest tremor at baseline. Comorbidities are common, with essential hypertension occurring in about 69% of ET cases, followed by hyperlipidemia in nearly 50%. Other frequently reported conditions include type 2 diabetes, anxiety disorders, and gastroesophageal reflux disease (GERD) without esophagitis.
Machine Learning in Diagnosis
Currently, ET diagnosis relies on clinical symptoms and neurological assessments. However, ML can enhance diagnostic accuracy by analyzing patient data to identify probable ET cases, which can then be validated against existing literature. Recent studies employing grey matter morphological networks and ML models have successfully differentiated between ET, dystonic tremor, and healthy controls. The Random Forest classifier demonstrated the best classification performance in this context, achieving a mean accuracy of 78.7% across a three-class classification task.
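As a rough illustration of how a figure like "mean accuracy across a three-class task" is obtained, the sketch below computes accuracy from a confusion matrix for each cross-validation fold and then averages across folds. The confusion matrices are made-up numbers for illustration only and do not reproduce the study's data or results.

```python
from statistics import mean


def accuracy(confusion):
    """Accuracy = correctly classified cases (diagonal) / all cases."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical per-fold confusion matrices.
# Rows are true classes, columns are predicted classes, in the order:
# ET, dystonic tremor, healthy control.
folds = [
    [[8, 1, 1],
     [2, 7, 1],
     [1, 1, 8]],
    [[9, 1, 0],
     [1, 8, 1],
     [2, 1, 7]],
]

fold_accuracies = [accuracy(cm) for cm in folds]
mean_accuracy = mean(fold_accuracies)
print(f"per-fold: {fold_accuracies}, mean: {mean_accuracy:.3f}")
```

With three classes, chance-level accuracy for balanced groups is about 33%, which is a useful reference point when reading a reported three-class accuracy.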
Genetic studies indicate a hereditary component to ET, with certain loci and polymorphisms potentially influencing its development. By leveraging historical patient data, ML algorithms can learn to identify early signs of ET. A study published in Open Life Sciences used ML to screen microRNAs as potential biomarkers for ET and highlighted genes such as APOE, SENP6, and ZNF148 as differentially expressed in ET patients.
Assessing Severity with Intelligent Devices
Traditionally, ET severity has been evaluated using clinical observation and established rating scales like the Fahn–Tolosa–Marin Tremor Rating Scale (FTM-TRS). However, ML offers a more objective, quantitative approach. Technological advancements, including smartphones and smartwatches, are now being utilized for tremor measurement. A recent study indicated that sensor data collected via smartphones could improve assessment accuracy, yielding mean absolute error reductions of 78-81% compared to linear models.
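To make the "mean absolute error reduction" figure concrete, the following sketch computes the mean absolute error (MAE) of two sets of severity predictions against clinician ratings and expresses the improvement as a relative reduction. All scores here are invented for illustration; they are not the study's data, and the function names are hypothetical.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between reference and predicted scores."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(baseline_error, model_error):
    """Fraction by which the model's error is lower than the baseline's."""
    return (baseline_error - model_error) / baseline_error


# Hypothetical tremor severity scores, for illustration only.
clinician_scores = [1, 2, 3, 4]
linear_predictions = [2, 3, 2, 5]       # assumed linear-model baseline
ml_predictions = [1.2, 2.2, 2.8, 3.8]   # assumed ML model output

baseline_mae = mean_absolute_error(clinician_scores, linear_predictions)
model_mae = mean_absolute_error(clinician_scores, ml_predictions)
reduction = relative_reduction(baseline_mae, model_mae)
print(f"baseline MAE={baseline_mae:.2f}, model MAE={model_mae:.2f}, "
      f"reduction={reduction:.0%}")
```

A reduction computed this way is relative to the baseline model's error, so the same percentage can correspond to very different absolute improvements depending on how well the baseline performs.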
Recent advancements in tremor research, including a new classification system introduced by the International Parkinson and Movement Disorder Society (IPMDS) in 2018, present ET as a syndrome rather than a single condition. This more nuanced understanding, combined with AI's capabilities in data analysis, could change how ET is understood and treated and allow for more effective patient management and intervention.
As AI continues to evolve in the realm of rare diseases, its ability to process vast datasets and identify overlooked patterns could lead to unprecedented advancements in epidemiological research. However, the integration of AI tools like EpiPipeline4RD must be approached with caution to ensure data integrity and completeness.
Swarali Tadwalkar contributed to this article.
















































