Google's Cloud Next 2026 keynote? Fire. 🔥
The TPU is now two chips instead of one (8t for training, 8i for inference) and, more interestingly, it now comes with two scale-up networking topologies too.
Austin Lyons (Chipstrat) and Vik Sekar (Vik's Newsletter) walk through what actually changed, one day after the announcement. OCS? Yes. AECs? Yep. Copper? Yep. Optics? Yep.
We cover Virgo (Google's 47 petabit/second scale-out fabric, built entirely on OCS), Boardfly (the new scale-up topology for MoE inference that cuts hop count from 16 to 7), and the 3D torus Google still uses for training.
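If you want intuition for the hop counts we throw around: here's a minimal Python sketch (ours, not Google's) that brute-forces shortest-path hops on a 3D torus. The 4x4x4 shape is an assumption for illustration only, not the actual pod dimensions behind the 16-to-7 claim.

from itertools import product

def torus_hops(a, b, dims):
    # Shortest-path hop count between nodes a and b on a torus,
    # wrapping around in each dimension.
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))

dims = (4, 4, 4)  # toy pod shape, assumed for illustration
nodes = list(product(*[range(d) for d in dims]))
origin = (0, 0, 0)
hops = [torus_hops(origin, n, dims) for n in nodes if n != origin]
print(f"3D torus {dims}: max hops = {max(hops)}, avg = {sum(hops)/len(hops):.2f}")

A fully direct-connected scale-up domain would be one hop everywhere; flatter topologies trade switch radix and cabling for fewer hops, which is the game Boardfly is playing.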
Why is optical circuit switching the substrate of Google's data center? Why do active electrical cables still carry scale-up traffic inside racks? Why did Google split the CPU layer too, with custom ARM Axion head nodes to keep the TPUs fed?
Along the way we trace the lineage of the Dragonfly topology to a 2008 paper by John Kim, Bill Dally, Steve Scott, and Dennis Abts. Abts went on to build Groq's rack-scale interconnect before landing at Nvidia.
Chapters:
 0:00 Intro
 0:21 Two TPUs for two workloads
 2:31 HBM, SRAM, and Axion CPUs
 7:22 Why networking is the new bottleneck
 17:14 Virgo: rebuilding scale-out on optics
 25:24 3D torus Rubik's Cube scale-up for training
 34:50 Boardfly: scale-up for MoE inference
 42:07 Workload-specific everything
Follow Chipstrat:
Newsletter: https://www.chipstrat.com
X: https://x.com/austinsemis
Follow Vik:
Newsletter: https://www.viksnewsletter.com/
X: https://x.com/vikramskr