Connect with us

Hi, what are you looking for?

AI Technology

AMD Advocates Integrated AI Compute with EPYC CPUs for 50% Cost Savings and Enhanced Performance

AMD’s EPYC processors enable firms like Kakao Enterprise to cut AI infrastructure costs by 50% while boosting performance by 30%, redefining compute strategies for AI.

Graphics processing units (GPUs) have emerged as the primary upgrade for companies enhancing their AI systems, particularly in the inferencing stage, where trained models produce outputs from new data. However, semiconductor firm AMD warns that solely depending on GPUs can hinder performance and escalate costs.

In a recent interview with Newsbytes.PH, AMD’s Asia Pacific general manager, Alexey Navolokin, emphasized the growing need for effective coordination among CPUs, GPUs, memory, and networking as AI workloads expand and agentic AI systems shift towards real-world applications.

“Today’s large models operate across clusters of GPUs that must work in parallel and exchange data constantly,” Navolokin explained. He noted that overall performance hinges not only on GPU speed but also on the efficiency with which data is transferred and computation is coordinated across the entire system architecture.

Navolokin pointed out a prevalent misconception that GPUs serve as the singular powerhouse for AI inferencing. He highlighted that modern AI models typically exceed the capacity of a single device, necessitating substantial support from host CPUs to facilitate data movement, synchronization, and latency-sensitive tasks. “A fast CPU keeps the GPU fully utilized, reduces overhead in the inference pipeline, and cuts end-to-end latency,” he stated, adding that even minor reductions in CPU delays can significantly enhance application responsiveness.

Tokenization, the process of converting inputs into numerical units, is heavily reliant on the interaction between CPU and GPU. “Inference runs token by token, and tasks such as tokenization, batching, and synchronization sit directly on the critical path,” Navolokin said. “Delays on the host CPU can slow the entire response.”

Beyond performance, Navolokin argued that optimizing CPU-GPU balance can lead to lower infrastructure costs by increasing GPU utilization and decreasing hardware requirements. “Higher efficiency enables teams to meet demand with fewer CPU cores or GPU instances,” he noted.

He cited a case study involving South Korean IT firm Kakao Enterprise, which reportedly reduced its total cost of ownership by 50% and its server count by 60%, while improving AI and cloud performance by 30% after deploying AMD’s EPYC processors.

The fifth-generation EPYC processors, according to Navolokin, can deliver comparable integer performance to earlier systems while using up to 86% fewer racks, effectively lowering both power consumption and software licensing requirements. He added that the demand for CPUs is exacerbated by the rise of agentic AI systems, designed to plan, reason, and act autonomously.

“These systems generate significantly more CPU-side work than traditional inference,” Navolokin explained. “Tasks such as retrieval, prompt preparation, multi-model routing, and synchronization are CPU-driven.” In these scenarios, the CPU functions as a control node across distributed resources that span data centers, cloud platforms, and edge systems.

AMD is focusing its EPYC processors as host CPUs for these demanding workloads. The latest EPYC 9005 Series boasts up to 192 cores, expanded AVX-512 execution, DDR5-6400 memory support, and PCIe Gen 5 I/O—features designed to support large-scale inferencing and GPU-accelerated systems. Navolokin mentioned that this latest generation shows a 37% improvement in instructions per cycle for machine learning and high-performance computing workloads compared to previous EPYC processors.

He also referenced Malaysian reinsurance firm Labuan Re, which anticipates reducing its insurance assessment turnaround time from weeks to less than a day after migrating to an EPYC-powered AI platform.

As AI deployments extend beyond centralized data centers, Navolokin urged organizations to rethink their infrastructure design. “The priority should not be the performance of a single compute resource, but the ability to deploy AI consistently across heterogeneous environments,” he advised. He underscored the importance of open platforms and distributed compute strategies, noting that real-time inference often runs more efficiently on edge devices or AI PCs closer to data sources.

“Success in inferencing is no longer defined solely by raw compute power,” Navolokin concluded. “It depends on latency, efficiency, and the ability to operate across data center, cloud, and edge environments.”

See also
Staff
Written By

The AiPressa Staff team brings you comprehensive coverage of the artificial intelligence industry, including breaking news, research developments, business trends, and policy updates. Our mission is to keep you informed about the rapidly evolving world of AI technology.

You May Also Like

AI Technology

AMD unveils the Ryzen AI Halo Mini-PC, boasting a 16-core Ryzen AI Max+ 395 APU and the capability to process models with up to...

AI Technology

Nebius acquires Eigen AI for $643M to enhance AI infrastructure and efficiency, integrating advanced model optimization technologies for scalable applications.

AI Technology

AMD predicts over 60% revenue growth driven by next-gen consoles and AI data center expansion, potentially elevating stock to $660 within five years

AI Technology

Intel projects Q2 revenue of up to $14.8B, driven by AI demand for its Xeon CPUs, despite a GAAP loss per share of $0.73...

AI Education

Acer's Edu Summit 2026 showcased AI innovations, uniting 12 Asian nations to drive a human-centered educational framework and enhance learning outcomes through technology.

AI Regulation

Trump administration challenges Colorado's forthcoming AI hiring law, backed by Elon Musk, amid rising scrutiny on automated employment practices.

AI Technology

Broadcom's revenue is projected to soar by 35.6%, potentially surpassing Nvidia's growth as the semiconductor market shifts towards custom AI chip solutions.

AI Generative

Revolutionizing OCT analysis, a new 3D multi-modal model enhances retinal diagnosis accuracy by 30%, promising significant advances in AMD management.

© 2025 AIPressa · Part of Buzzora Media · All rights reserved. This website provides general news and educational content for informational purposes only. While we strive for accuracy, we do not guarantee the completeness or reliability of the information presented. The content should not be considered professional advice of any kind. Readers are encouraged to verify facts and consult appropriate experts when needed. We are not responsible for any loss or inconvenience resulting from the use of information on this site. Some images used on this website are generated with artificial intelligence and are illustrative in nature. They may not accurately represent the products, people, or events described in the articles.