At this year’s Next ’25 conference, Google unveiled “Ironwood”, its seventh-generation Tensor Processing Unit (TPU) — a chip that promises record-breaking performance and is purpose-built to accelerate the “thinking” processes of artificial intelligence. Now, Google Cloud has announced that Ironwood will officially enter production in the coming weeks, designed to power large-scale model training and high-throughput, low-latency AI inference workloads — addressing the surging computational demands of the Agentic AI era.
According to Google, Ironwood delivers over four times the performance of its predecessor, the sixth-generation “Trillium” TPU, in both training and inference workloads. It is the company’s most powerful and energy-efficient custom chip to date, optimized to enhance reasoning and insight generation within AI models, enabling intelligent agents to operate with unprecedented speed and sophistication.
Google Cloud also confirmed a multi-billion-dollar, multi-year partnership with Anthropic built around Ironwood. The agreement gives Anthropic access to up to one million TPUs, dedicated to training and running its Claude models, underscoring Ironwood’s role as a cornerstone of next-generation AI infrastructure.
In its technical brief, Google revealed that each Ironwood pod comprises 9,216 liquid-cooled chips interconnected via an Inter-Chip Interconnect (ICI) network, delivering 42.5 exaFLOPS of FP8 compute. Google pitches this as nearly 24 times the capacity of El Capitan, the world’s most powerful supercomputer, though El Capitan’s headline figure is measured at the far more demanding FP64 precision. Each individual chip offers a peak of 4,614 FP8 TFLOPS, making Ironwood a formidable engine for parallel AI workloads at planetary scale.
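The pod-level headline follows directly from the per-chip number; a quick back-of-the-envelope check (purely illustrative, using the figures stated above):

```python
# Back-of-the-envelope check of Google's stated Ironwood pod figures.
chips_per_pod = 9_216      # liquid-cooled chips per pod (stated)
tflops_per_chip = 4_614    # peak FP8 TFLOPS per chip (stated)

# 1 exaFLOPS = 1e6 TFLOPS (decimal SI prefixes assumed)
pod_exaflops = chips_per_pod * tflops_per_chip / 1e6
print(f"{pod_exaflops:.1f} exaFLOPS per pod")  # prints "42.5 exaFLOPS per pod"
```

The product works out to 42,522,624 TFLOPS, which rounds to the 42.5 exaFLOPS Google quotes.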
Google emphasized that TPUs are the beating heart of its “AI Hypercomputer”, an integrated supercomputing ecosystem. Ironwood represents a leap forward in scalability and system-level performance, leveraging ICI bandwidth of up to 9.6 Tb/s to eliminate traditional data bottlenecks — allowing thousands of chips to operate in concert as if they were a single, unified brain.
- Massive Shared Memory: Ironwood’s expanded design enables up to 1.77 petabytes (PB) of shared High Bandwidth Memory (HBM) across a pod, described by Google as a record-breaking “collaborative workspace” for large AI models. This vast shared pool allows even the biggest models to be loaded entirely in memory, dramatically increasing computational efficiency while reducing total cost of ownership (TCO).
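Dividing the stated pod-level HBM by the chip count gives the implied per-chip capacity; a small illustrative check (assuming decimal units, i.e. 1 PB = 1,000,000 GB):

```python
# Implied per-chip HBM from the stated pod-level figures (decimal units assumed).
pod_hbm_pb = 1.77        # shared HBM per pod, in petabytes (stated)
chips_per_pod = 9_216    # chips per pod (stated)

gb_per_chip = pod_hbm_pb * 1_000_000 / chips_per_pod
print(f"~{gb_per_chip:.0f} GB of HBM per chip")  # prints "~192 GB of HBM per chip"
```

That is, the 1.77 PB pod figure corresponds to roughly 192 GB of HBM attached to each chip.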
- Optical Circuit Switching (OCS): The inclusion of Optical Circuit Switching technology introduces a dynamic optical fabric capable of rerouting connections instantly in the event of disruption — ensuring uninterrupted operation of mission-critical AI services with enterprise-grade reliability.
Google’s supplementary materials note that a single Ironwood pod delivers 118 times the FP8 exaFLOPS of its “next closest competitor,” underscoring its lead in dedicated AI computation. Anthropic’s commitment to purchase and deploy up to one million TPUs serves as a resounding endorsement of Google Cloud’s AI infrastructure.
Google added that flagship models including its own Gemini, Veo, and Imagen, as well as Anthropic’s Claude, are all trained and served on TPUs. The announcement aligns with Google Cloud’s latest earnings report, which highlighted unprecedented demand for AI infrastructure, particularly TPUs, as a key driver of growth.
In parallel with the TPU launch, Google underscored the importance of tight orchestration between general-purpose CPUs and AI accelerators to support Agentic AI workflows. To that end, the company introduced updates to its Arm-based CPU lineup:
- N4A Instances (Axion CPU): The new Axion CPU-powered N4A instances, the fourth generation of Google’s N-series virtual machines (VMs), are now available in preview.
- Performance: N4A delivers up to twice the price-performance of comparable x86-based VMs and up to an 80% improvement in performance per watt.
- C4A Metal (Bare-Metal): The first bare-metal implementation of the Axion processor, C4A Metal, is also entering preview soon.
Google attributes its AI infrastructure success to its philosophy of “system-level co-design”: the seamless integration of model research, software development, and hardware engineering under one roof. From the first TPU a decade ago, through the debut of the Transformer architecture eight years ago, to today’s gigawatt-scale, liquid-cooled clusters targeting 99.999% uptime, Ironwood represents the latest expression of that enduring design vision.