
AI Research

Classical Machine Learning Surpasses Foundation Models in Medical Classification Study

Traditional machine learning models outperformed advanced foundation models across a range of medical classification tasks, challenging the assumption that foundation models are the default choice for healthcare AI, according to recent research.

Recent research from a team of scientists, including Meet Raval of the University of Southern California, Tejul Pandit from Palo Alto Networks, and Dhvani Upadhyay from Dhirubhai Ambani University, highlights the complexities of applying artificial intelligence to medical classification. Their comprehensive evaluation compared traditional machine learning techniques against modern foundation models and found that classical methods often outperform their advanced counterparts across various medical datasets. The study used four publicly available datasets spanning text and image modalities, underscoring the need for effective adaptation strategies when deploying these powerful AI tools in critical healthcare applications.

The research revealed a surprising trend: traditional machine learning models, such as Logistic Regression and LightGBM, consistently achieved superior performance on most medical classification tasks. Classical models were especially strong on structured text-based datasets, where they exceeded both zero-shot Large Language Models (LLMs) and Parameter-Efficient Fine-Tuned (PEFT) models. The results challenge the prevailing assumption that foundation models are universally more effective: the performance of advanced models, such as those built on Gemini, is contingent on substantial fine-tuning and adaptation.
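The article does not reproduce the study's pipelines, but a classical text-classification baseline of the kind it describes can be sketched with scikit-learn. The toy clinical notes, labels, and hyperparameters below are illustrative assumptions, not the authors' configuration:

```python
# Illustrative classical baseline: TF-IDF features + Logistic Regression.
# The notes and labels here are invented for demonstration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

notes = [
    "patient reports chest pain and shortness of breath",
    "routine follow-up, no acute complaints",
    "severe headache with nausea and photophobia",
    "annual physical, labs within normal limits",
]
labels = [1, 0, 1, 0]  # 1 = urgent, 0 = routine (toy labels)

# A pipeline keeps vectorization and classification as one fit/predict unit.
clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
clf.fit(notes, labels)

print(clf.predict(["sudden chest pain radiating to left arm"]))
```

In practice, LightGBM would slot into the same pipeline in place of Logistic Regression; the appeal of either baseline is that it trains in seconds on modest hardware, with no prompt engineering or GPU fine-tuning required.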

Central to this investigation was a rigorous benchmarking system that methodically assessed performance across text and image modalities. The researchers maintained consistent data splits and evaluation criteria to ensure a fair comparison of model efficacy. By evaluating three distinct model classes for each task (classical machine learning models, prompt-based LLMs/VLMs, and fine-tuned PEFT models), the study provided a robust evaluation framework. Notably, the LoRA-tuned Gemma variants consistently underperformed, suggesting that minimal fine-tuning can harm these models' generalization.
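The "consistent data splits" requirement can be met by deriving every split from a single fixed seed and reusing the same index lists for all three model classes. A minimal stratified sketch in standard-library Python (the seed and test fraction are arbitrary choices, not the paper's):

```python
import random
from collections import defaultdict

def stratified_split(labels, test_frac=0.25, seed=42):
    """Return (train_idx, test_idx), preserving per-label ratios.

    Deterministic for a given seed, so every model class in a
    benchmark can be scored on the identical held-out examples.
    """
    by_label = defaultdict(list)
    for i, y in enumerate(labels):
        by_label[y].append(i)
    rng = random.Random(seed)
    train, test = [], []
    for _, idxs in sorted(by_label.items()):
        rng.shuffle(idxs)
        k = max(1, int(len(idxs) * test_frac))  # at least one test example per class
        test.extend(idxs[:k])
        train.extend(idxs[k:])
    return sorted(train), sorted(test)

labels = [0, 0, 0, 0, 1, 1, 1, 1]
train_idx, test_idx = stratified_split(labels)
print(train_idx, test_idx)
```

Because the function is a pure function of its seed, each of the three pipelines can call it independently and still see the same train/test partition, which is what makes cross-model comparisons fair.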

In their analysis, the zero-shot LLM/VLM pipelines built on Gemini 2.5 yielded mixed results. Although these models struggled with text-based classification, they were competitive on multiclass image categorization, at times matching the baseline established by the traditional ResNet-50 model. This split underscores the nuanced strengths and weaknesses of both classical and contemporary models, and further illustrates that established machine learning techniques remain a reliable option for medical classification tasks.
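The article does not reproduce the study's prompts. A generic zero-shot classification prompt, of the kind such pipelines send to a model like Gemini 2.5, might be assembled as follows; the template and label set are hypothetical, and the actual model API call is omitted:

```python
def build_zero_shot_prompt(text, labels):
    """Assemble a zero-shot classification prompt for an LLM.

    Constraining the model to answer with exactly one label keeps
    the response easy to parse and score against ground truth.
    """
    label_list = ", ".join(labels)
    return (
        "You are a medical text classifier.\n"
        f"Classify the note below into exactly one of: {label_list}.\n"
        "Answer with the label only.\n\n"
        f"Note: {text}"
    )

prompt = build_zero_shot_prompt(
    "patient presents with fever and productive cough",
    ["pneumonia", "not pneumonia"],
)
print(prompt)
```

No gradient updates occur in this setup, which is precisely why the study found it brittle on structured text: the model must infer the task entirely from the prompt, whereas a trained classical model has seen labeled examples.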

The design of this study is particularly noteworthy for its innovative approach to benchmarking. By addressing limitations in existing methodologies—such as inconsistent evaluation rigor and the lack of cross-modality alignment—the researchers created a detailed framework for assessing model performance. The inclusion of both binary and multiclass tasks, especially in the realm of medical imaging, broadened the evaluation’s scope beyond simpler classification challenges and revealed the traditional models’ enduring relevance in medical contexts.

As the team continues to analyze the implications of their findings, they underscore the importance of effective adaptation strategies in utilizing foundation models. Their work not only offers crucial insights for practitioners in medical AI but also highlights the necessity of thorough evaluation criteria when selecting modeling approaches for healthcare applications. The study advocates for a more nuanced understanding of the conditions under which advanced AI models can be considered viable alternatives to classical methods.

The ongoing exploration into the optimization of PEFT strategies, as well as the potential of multimodal AI in complex medical classification tasks, remains a crucial area for future research. By fostering a comprehensive understanding of how various models perform across diverse data types, the study aims to enhance diagnostic accuracy and ultimately improve patient care outcomes. This research reinforces the idea that while artificial intelligence offers exciting possibilities for healthcare, it is not a panacea, and traditional methods still play a vital role in the landscape of medical classification.

👉 More information
🗞LLM is Not All You Need: A Systematic Evaluation of ML vs. Foundation Models for text and image based Medical Classification
🧠 ArXiv: https://arxiv.org/abs/2601.16549

Staff
Written By

The AiPressa Staff team brings you comprehensive coverage of the artificial intelligence industry, including breaking news, research developments, business trends, and policy updates. Our mission is to keep you informed about the rapidly evolving world of AI technology.


© 2025 AIPressa · Part of Buzzora Media · All rights reserved.