Connect with us

Hi, what are you looking for?

AI Technology

Meta Unveils Four New MTIA Chips with 25x Compute Gains for AI Inference

Meta unveils four new MTIA chips, achieving up to 25x compute gains for AI inference, aimed at revolutionizing its AI ecosystem and reducing Nvidia reliance.

Meta announced the development of four new generations of its in-house chips, the Meta Training and Inference Accelerator (MTIA), designed in collaboration with Broadcom. These chips, slated for deployment over the next two years, aim to enhance the company’s AI capabilities significantly. “We’ve developed a competitive strategy for MTIA by prioritizing rapid, iterative development,” Meta stated in its press release, highlighting an inference-first focus and seamless adoption through native support for industry standards.

The four models unveiled are the MTIA 300, 400, 450, and 500. The MTIA 300 is already in production for ranking and recommendations training, while the MTIA 400 is undergoing lab testing prior to its data center deployment. The MTIA 450 and 500 are positioned for AI inference, with mass deployment expected in early and late 2027, respectively. Meta’s technical blog outlines significant advancements across these models, including a 4.5 times increase in HBM bandwidth and a 25 times uplift in compute FLOPs from MTIA 300 to MTIA 500.

Meta asserts that the MTIA 450 doubles the HBM bandwidth of the MTIA 400, claiming it surpasses the capabilities of existing top-tier commercial products, including Nvidia’s H100 and H200. The MTIA 500 is expected to provide an additional 50% increase in HBM bandwidth compared to its predecessor, alongside up to 80% more HBM capacity. This distinction is crucial, as HBM bandwidth is identified as the chief bottleneck during the decode phase of transformer inference. Current mainstream GPUs are designed to maximize FLOPs for large-scale pre-training, which incurs costs and power overhead that Meta argues are superfluous for inference tasks.

The MTIA chips are characterized by distinct performance metrics. The MTIA 300 focuses on ranking and recommendations, with a thermal design power (TDP) of 800 W and an HBM bandwidth of 6.1 TB/s. The MTIA 400 has a TDP of 1,200 W and an HBM bandwidth of 9.2 TB/s, while the MTIA 450 and 500 feature TDPs of 1,400 W and 1,700 W, respectively, with HBM bandwidths of 18.4 TB/s and 27.6 TB/s. The peak performance also scales, with the MTIA 500 boasting up to 30 PFLOPS.

Moreover, Meta’s approach incorporates hardware acceleration for FlashAttention and mixture-of-experts feed-forward network computations, alongside custom low-precision data types specifically designed for inference. The MTIA 450 supports MX4 performance, delivering six times the MX4 FLOPs of FP16/BF16, effectively reducing the software overhead associated with data type conversion.

In terms of deployment, the MTIA 400, 450, and 500 will utilize a common chassis, rack, and network infrastructure. This modularity allows for easier interchangeability of chip generations, contributing to an accelerated chip cadence of approximately six months, outpacing the industry’s typical one- to two-year cycles. The software stack for these chips is compatible with popular frameworks such as PyTorch, vLLM, and Triton, permitting simultaneous deployment of production models on both GPUs and MTIA without the need for MTIA-specific adjustments.

Currently, Meta has already deployed hundreds of thousands of MTIA chips across its applications for inference on organic content and advertisements. This announcement comes shortly after the company revealed a long-term, $100 billion AI infrastructure partnership with AMD. This suggests a broader strategy to reduce reliance on Nvidia across various components of Meta’s AI ecosystem while maintaining the MTIA chips as the foundation for its inference workloads.

See also
Staff
Written By

The AiPressa Staff team brings you comprehensive coverage of the artificial intelligence industry, including breaking news, research developments, business trends, and policy updates. Our mission is to keep you informed about the rapidly evolving world of AI technology.

You May Also Like

Top Stories

Broadcom and AMD secure lucrative AI chip partnerships with Meta and Alphabet, anticipating significant revenue growth as demand surges in the sector.

Top Stories

Meta's 2026 smart glasses, praised by Wired as the best in the market, blend style with AI features, positioning the company ahead of Apple...

Top Stories

Nvidia aims for a $20 trillion valuation by 2030, driven by projected $930 billion data-center sales and a surge in AI infrastructure demand.

AI Finance

Meta partners with Broadcom to develop custom AI processors, committing over 1 gigawatt of computing capacity to enhance AI features across its platforms.

Top Stories

Mistral AI secures a $14 billion valuation by promoting sovereign, open-weight models that empower local control over data in the evolving global AI landscape

AI Technology

Tech stocks soar as Oracle, AMD, and Microsoft hit critical benchmarks, driven by booming AI infrastructure spending and robust gains in the semiconductor sector.

AI Regulation

EU Commission mandates Meta to restore third-party AI assistants' access to WhatsApp after ruling restrictions violate antitrust laws, risking competition.

AI Technology

Broadcom launches VMware Tanzu Platform for secure AI application management, expanding into AI software infrastructure post-VMware acquisition to boost enterprise adoption.

© 2025 AIPressa · Part of Buzzora Media · All rights reserved. This website provides general news and educational content for informational purposes only. While we strive for accuracy, we do not guarantee the completeness or reliability of the information presented. The content should not be considered professional advice of any kind. Readers are encouraged to verify facts and consult appropriate experts when needed. We are not responsible for any loss or inconvenience resulting from the use of information on this site. Some images used on this website are generated with artificial intelligence and are illustrative in nature. They may not accurately represent the products, people, or events described in the articles.