
The new MTIA roadmap emphasizes inference workloads: running trained AI models to make predictions or recommendations. These tasks tend to have more predictable computational patterns than AI training, the process of building the models themselves, which is currently dominated by Nvidia's powerful Graphics Processing Units (GPUs). The distinction is crucial. Yee Jiun Song, Meta's Vice President of Engineering, explained to CNBC that custom chips allow Meta to "squeeze more price per performance" across its data center fleet.
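To make that distinction concrete, here is a minimal sketch in PyTorch (the framework Meta itself created), contrasting the two workloads. The toy model and data are hypothetical and purely illustrative: inference is a fixed forward pass with no gradient bookkeeping, which is why its compute pattern is so predictable, while training adds backpropagation and weight updates on top.

```python
import torch
import torch.nn as nn

# A toy recommendation-style scoring model (hypothetical, for illustration).
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))

# --- Inference: the workload MTIA targets first ---
# A fixed forward pass with no gradients tracked: the same ops run in the
# same order for every request, so the compute pattern is highly predictable.
model.eval()
with torch.no_grad():
    features = torch.randn(32, 128)   # a batch of 32 user/item feature vectors
    scores = model(features)          # predictions / recommendation scores

# --- Training: the workload still dominated by GPUs ---
# Adds a backward pass and an optimizer step, with extra memory needed for
# gradients and optimizer state on top of the forward pass.
model.train()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
features = torch.randn(32, 128)
labels = torch.randn(32, 1)
loss = nn.functional.mse_loss(model(features), labels)
optimizer.zero_grad()
loss.backward()    # backpropagation: compute gradients
optimizer.step()   # update the model's weights
```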
Meta has already deployed its MTIA 300 chip, which is used to train ranking and recommendation models within its systems. The upcoming MTIA 400, 450, and 500 chips are designed to handle a broader range of workloads, though Meta's blog post indicates the company will "primarily use these chips to support GenAI inference production in the near future and into 2027." For example, a single Meta data center rack will incorporate 72 MTIA 400 chips optimized for AI inference.
While the immediate focus is on inference, Meta's long-term vision includes expanding custom chip design to encompass training models as well. Meta CFO Susan Li noted at Morgan Stanley's tech conference earlier this month that the company "eventually" plans for this expansion. This suggests a methodical approach: tackle the more predictable inference workloads first to gain expertise and cost efficiencies, then potentially take on the more demanding training chip development later.
For Developers
Expect Meta to continue optimizing its platforms for GenAI inference. If you're building applications for Meta's ecosystem, understanding the efficiency gains from these custom chips could inform your deployment strategies.
For Founders/Investors in AI Infrastructure
This move highlights the intense competition and strategic importance of specialized hardware in AI. Meta's continued investment, even alongside external deals, signals a long-term trend towards custom silicon for specific AI tasks, creating both challenges and opportunities for niche hardware providers.
For Consumers
The ultimate goal of these custom chips is to power more precise ad targeting and more engaging user experiences. You can anticipate more relevant content and potentially faster AI-driven features within Meta's apps as these chips roll out across data centers.
In short, Meta is developing its own custom AI chips, called MTIA, to handle AI inference workloads and reduce its reliance on external chipmakers like Nvidia and AMD. The company plans to deploy four new generations of these chips by the end of 2027, and while it still buys chips from Nvidia and AMD, its custom silicon is tailored for specific, high-volume tasks.

Meta scrapped its more advanced AI training chip, Olympus, due to design hurdles, but remains committed to custom silicon. By tackling the more predictable inference workloads first, the company can build expertise and cost efficiencies before potentially taking on the more demanding work of training chips, optimizing price per performance across its data center fleet.
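As a rough, back-of-envelope illustration of what "price per performance" means in practice, the sketch below compares amortized hardware cost per million inferences for two hypothetical accelerators. Every number in it is invented for illustration and reflects no actual MTIA, Nvidia, or AMD pricing or throughput.

```python
# Back-of-envelope "price per performance" comparison.
# ALL numbers below are hypothetical placeholders, not real chip specs,
# and power/operations costs are ignored for simplicity.

def cost_per_million_inferences(chip_cost_usd: float,
                                lifetime_years: float,
                                inferences_per_second: float) -> float:
    """Amortized hardware cost (USD) per one million inferences."""
    seconds = lifetime_years * 365 * 24 * 3600
    total_inferences = inferences_per_second * seconds
    return chip_cost_usd / total_inferences * 1_000_000

# Hypothetical general-purpose GPU: pricier, bought off the shelf.
gpu = cost_per_million_inferences(chip_cost_usd=30_000,
                                  lifetime_years=4,
                                  inferences_per_second=5_000)

# Hypothetical custom inference ASIC: cheaper per unit, tuned to one task.
custom = cost_per_million_inferences(chip_cost_usd=10_000,
                                     lifetime_years=4,
                                     inferences_per_second=4_000)

print(f"GPU:    ${gpu:.4f} per million inferences")     # ~$0.0476
print(f"Custom: ${custom:.4f} per million inferences")  # ~$0.0198

# Even at lower raw throughput, the custom chip can win on cost per
# inference, which is the "price per performance" lever Meta is pulling.
```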