
Google will unveil a new generation of custom-designed tensor processing units (TPUs) — the company’s in-house AI accelerators, designed as an alternative to Nvidia’s GPUs — at Google Cloud Next in Las Vegas this week, accelerating its push into AI inference chips as demand from rival model developers reshapes the semiconductor market.
The launch sharpens Alphabet's challenge to Nvidia in a category where graphics processing units have dominated but are increasingly contested by specialized silicon tuned for running trained models rather than building them. By signaling a growing split between training and inference hardware, Google is betting that the next phase of AI spending favors operators of custom stacks over buyers of general-purpose GPUs.
A widening split between training and inference
Chief Scientist Jeff Dean told Bloomberg it "now becomes sensible to specialize chips more for training or more for inference workloads." Amin Vahdat, who leads Google's AI infrastructure, declined to confirm a dedicated inference chip but said more would be shared "in the relatively near future." Cloud Next messaging will center on generative AI, infrastructure, security, and agentic workflows.
Anthropic, Meta, and the supply squeeze
Momentum has built quickly. Anthropic last October expanded its agreement to access as many as 1 million TPUs, Meta signed a multibillion-dollar deal to use TPUs via Google Cloud, and Citadel Securities and Abu Dhabi's G42 are evaluating deployments. Nvidia countered late last year with an inference-focused chip built on technology from its Groq deal — a non-exclusive IP license and asset purchase that CNBC reported at around $20 billion, though neither company has confirmed the price.
Supply remains the near-term constraint, according to one startup executive cited by Bloomberg, and a tension Google must manage as it balances external customers against its own Gemini roadmap.

