Arslan Munir, Ph.D., co-author and associate professor in the FAU Department of Electrical Engineering and Computer Science. (Photo credit: New York University, Abu Dhabi)
As the demand for high-performance artificial intelligence (AI) systems continues to surge, researchers from the Florida Atlantic University (FAU) College of Engineering and Computer Science have made significant strides in addressing the cooling challenges faced by modern data centers. The research, prominently featured by NVIDIA, illustrates how direct-to-chip liquid cooling can markedly enhance the performance and energy efficiency of high-density GPU clusters.
Traditional air-cooling systems are increasingly struggling to meet the stringent thermal requirements of GPU clusters, which are essential for training and deploying large-scale AI models. As these systems heat up, performance can suffer, energy consumption rises, and operational costs escalate. The FAU study, titled “Comparison of Air-Cooled Versus Liquid-Cooled NVIDIA GPU Systems,” reveals an innovative solution that could change the landscape of AI infrastructure.
Utilizing NVIDIA HGX™ H100 GPU systems, the research demonstrates that liquid-cooling technology can yield up to a 17% increase in computational throughput while simultaneously reducing node-level power consumption by 16%. This translates to potential annual savings ranging from $2.25 million to $11.8 million for AI data centers with a scale of 2,000 to 5,000 GPU nodes.
“Our research shows that liquid cooling fundamentally unlocks higher sustained performance, superior energy efficiency, and the thermal headroom needed to push AI workloads to their full potential,” stated Arslan Munir, Ph.D., lead author and associate professor of Electrical Engineering and Computer Science at FAU. This innovation not only enhances performance but also offers a pathway to more sustainable AI operations, a crucial consideration given the substantial energy demands of AI systems.
Significance of Liquid Cooling in AI Infrastructure
The implications of this research emerge at a critical juncture, as U.S. investments in AI and data-center infrastructure are projected to exceed $400 billion in the coming years. These investments are aimed at establishing specialized data centers or “AI factories” that will be pivotal in sectors like healthcare, national security, transportation, and climate research. However, the environmental impact of these facilities remains a pressing concern due to their vast energy consumption.
“By directly confronting the thermal and power-efficiency bottlenecks that define today’s large-scale GPU clusters, this research is charting a clear path toward truly sustainable, high-performance computing,” emphasized Stella Batalama, Ph.D., dean of the College of Engineering and Computer Science. The findings suggest that innovative cooling solutions can not only lower operational costs but also contribute to a reduced carbon footprint in the rapidly expanding AI sector.
Key findings from the FAU-led study highlight several advantages of liquid cooling:
- Thermal Advantage: Liquid cooling maintained GPU temperatures between 46°C and 54°C, compared with 55°C to 71°C for air cooling.
- Performance Gain: Up to 17% higher computational throughput and 1.4% faster training times for large AI workloads.
- Energy Efficiency: An average reduction of 1 kilowatt per server node, translating to 15% to 20% lower facility-level energy use.
- Operational Impact: Potential annual energy-cost savings of up to $11.8 million for large AI data-center deployments.
- Sustainability: Liquid cooling effectively redistributes thermal loads, allowing for better energy management and reduced environmental impact.
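The scale of these savings follows from simple arithmetic on the per-node figure above. The sketch below estimates annual energy-cost savings from the roughly 1 kilowatt-per-node reduction reported in the study; the electricity price and the assumption of continuous operation are illustrative placeholders, not figures from the white paper, and real savings also depend on cooling overhead and local utility rates.

```python
# Back-of-the-envelope estimate of annual energy-cost savings from
# direct-to-chip liquid cooling. The ~1 kW-per-node reduction comes from
# the FAU study; the electricity price is an assumed illustrative value.

HOURS_PER_YEAR = 24 * 365  # 8,760 hours, assuming continuous operation


def annual_savings_usd(nodes, kw_saved_per_node=1.0, price_per_kwh=0.12):
    """Annual cost savings (USD) from cutting each node's draw by kw_saved_per_node."""
    kwh_saved = nodes * kw_saved_per_node * HOURS_PER_YEAR
    return kwh_saved * price_per_kwh


if __name__ == "__main__":
    # The study's deployment range: 2,000 to 5,000 GPU nodes.
    for n in (2_000, 5_000):
        print(f"{n:,} nodes: ${annual_savings_usd(n):,.0f} per year")
```

At an assumed $0.12/kWh, 2,000 nodes yield about $2.1 million per year, consistent with the low end of the $2.25 million to $11.8 million range the study reports; the upper end reflects higher rates and facility-level effects beyond this simple per-node calculation.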
This research positions FAU at the forefront of intelligent systems and sustainable computing, with Munir’s lab focusing on areas such as AI hardware acceleration and hybrid quantum-classical computing. “Energy-efficient AI infrastructure isn’t just an engineering optimization – it’s a national imperative,” Munir concluded, highlighting the broader significance of these advancements in the context of sustainability and operational efficiency.
Collaborators on this pivotal study included experts from Johnson Controls and Lawrence Berkeley National Laboratory, alongside FAU doctoral researchers Hayat Ullah and Ali Shafique from Kansas State University. Their work contributes vital insights into developing more efficient AI systems that align with both technological and environmental goals.
For those interested in detailed insights, the white paper “Comparison of Air-Cooled Versus Liquid-Cooled NVIDIA GPU Systems” is available for review.
-FAU-