Can Google Escape Nvidia’s Gravity?
If Gemini’s training run proves anything, it is that Google’s in-house silicon is no longer a science project. The bigger question for markets is whether TPUs can bend the economics of AI at scale—and, in doing so, redraw the cloud pecking order.
Google has spent a decade building towards this moment. The company that authored the “transformer” paper and catalysed the generative-AI era is now field-testing a parallel bet: a compute stack built around its own Tensor Processing Units (TPUs), not just Nvidia’s flagship accelerators. With its latest Gemini generation trained on TPUs and deployed across Google Cloud, the firm has signalled a strategic ambition that extends well beyond model releases. It wants to change the cost curve of intelligence.

Google CEO Sundar Pichai. Image: Google
For investors and enterprise buyers alike, the implications are twofold. First, if TPUs prove a cheaper, more power-efficient path to training and inference at scale, the industry’s compute inflation could finally moderate. Second, if Google can translate silicon control into cloud share gains, the hyperscale hierarchy may not be as fixed as it appears.
From Showcase to Supply Chain
AI has been running into two hard constraints: accelerator availability and total cost of ownership. Nvidia’s hardware and CUDA software stack have dominated because they deliver performance and an unrivalled developer ecosystem. That combination has conferred pricing power and made capacity the ultimate gating factor for AI roadmaps.
Google’s counter is vertical integration. By co-designing models, compilers and datacentre infrastructure with TPUs at the centre, the company argues it can deliver comparable performance at a lower unit cost—and do so predictably, because it controls much of the supply chain from datacentre design to scheduling software. For enterprises used to waiting in the queue for H-series capacity, a credible alternative is more than a bargaining chip; it is a way to keep product roadmaps on time.
Crucially, Google is selling TPUs as a cloud service, not merely an internal advantage. That positions TPUs as a relief valve for customers who care less about the badge on the chip and more about throughput per dollar and per kilowatt-hour. If those economics hold in production, TPUs become not just an anti-inflation tool for Google’s own AI spend but a market-share lever for Google Cloud.
Economics Will Decide the Winner
Hype will not unseat CUDA. Economics might. Most AI P&Ls are now dominated by two lines: compute and power. Training frontier models soaks up capital; serving them at useful latencies consumes opex and grid headroom. To “escape Nvidia’s gravity”, TPUs must demonstrate three things repeatedly and transparently: predictable availability at scale, favourable $/token for training and inference, and credible energy efficiency within real datacentre envelopes.
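To make that concrete, here is a back-of-the-envelope sketch of serving cost. Every figure below is an illustrative assumption rather than a vendor quote or benchmark, but the arithmetic shows why utilisation, as much as list price, determines the $/token that lands on an invoice.

```python
# Illustrative inference unit economics. All figures are assumptions chosen
# for the arithmetic, not quoted prices or measured benchmarks.
hourly_price_usd = 3.00        # assumed rental price per accelerator-hour
tokens_per_second = 2_500      # assumed sustained serving throughput
utilisation = 0.60             # assumed share of each hour doing billable work

tokens_per_hour = tokens_per_second * 3600 * utilisation
cost_per_million_tokens = hourly_price_usd / tokens_per_hour * 1_000_000
print(f"${cost_per_million_tokens:.2f} per million tokens served")
# ~$0.56 at these assumptions; lift utilisation to 0.90 and it falls to ~$0.37
```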
Google’s pitch is that its system-level engineering—custom interconnects, compiler optimisation and software scheduling—yields higher utilisation and, consequently, better effective economics than like-for-like accelerators. If customers see those savings on their own workloads, a portion of them will re-platform. If they do not, the centre of gravity will remain where the developers already are.
The Software Moat Is Real—but Not Immovable
Nvidia’s most durable advantage is not silicon; it is software. CUDA and its surrounding libraries are where years of engineering and community practice live. Google’s answer—principally XLA and JAX, with support for leading frameworks—has matured quickly, but enterprise AI teams are pragmatic: they migrate only when switching costs are outweighed by speed or savings.
That is why Google’s TPU strategy is as much ecosystem as engineering. Porting toolchains, reference architectures, tuned kernels and managed services that reduce the cognitive load of change are essential. So too are partnerships with high-signal model developers and systems integrators who can attest to performance and shorten buyers’ time to confidence. If Google can make “CUDA-adjacent” feel near-native, the moat narrows.
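For a sense of what “near-native” could look like in practice, consider a minimal JAX sketch. It is illustrative only: the same Python function is compiled by XLA for whichever backend the host exposes, whether TPU, GPU or CPU, with no CUDA-specific code in the source.

```python
import jax
import jax.numpy as jnp

# XLA compiles this function for whatever accelerator is attached
# (TPU, GPU or CPU); the Python source does not change per backend.
@jax.jit
def scaled_dot_product(q, k, v):
    scores = q @ k.T / jnp.sqrt(q.shape[-1])
    return jax.nn.softmax(scores, axis=-1) @ v

keys = jax.random.split(jax.random.PRNGKey(0), 3)
q, k, v = (jax.random.normal(k_, (128, 64)) for k_ in keys)

print(jax.devices())                      # lists TPU, GPU or CPU devices
print(scaled_dot_product(q, k, v).shape)  # (128, 64)
```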
Cloud Competition and Co-opetition
There is a paradox at the heart of this market. Google, Microsoft, Amazon and others all sell Nvidia capacity, even as they race to wean themselves from a single-vendor constraint with their own silicon. Expect this co-opetition to persist. In the medium term, the hyperscalers will run mixed estates: Nvidia for customers with CUDA-bound workloads or specific performance profiles; house silicon where economics or availability demand it.
For Google Cloud, TPUs are a differentiator in two segments. First, AI-native companies that care about scale, predictability and unit costs more than they care about brand loyalty. Second, large enterprises reassessing multi-cloud strategies to de-risk procurement and improve resilience. In both cases, TPU capacity and pricing can be used to win incremental share or to move strategic workloads that anchor broader platform consumption.
Training, Inference—and the Race to Power Efficiency
Even if TPU economics prove compelling for training, inference is where the market will be won. The real cost shock for enterprises is not a single blockbuster training run but millions of daily interactions that must be served at low latency and reasonable cost. Here, energy efficiency becomes decisive, especially as grid capacity tightens and jurisdictions toughen reporting on carbon intensity.
Google’s system-level approach, including networking and cooling design, aims to push more useful work through each watt. If TPUs can consistently deliver lower $/1,000 tokens with acceptable latency for mainstream tasks, CFOs will take note—and so will sustainability committees. That is particularly true for firms deploying agentic systems and retrieval-augmented applications that keep models resident and hot.
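The same illustrative arithmetic can be run for power. Again, every figure below is an assumption rather than a measurement, but it shows how watts per accelerator, throughput and datacentre overhead compound into the grid demand that capacity planners and sustainability committees must budget for.

```python
# Illustrative sketch of how serving volume turns into power demand.
# Every figure is an assumption, not a measurement of any vendor's hardware.
requests_per_day = 50_000_000       # assumed daily model interactions
tokens_per_request = 800            # assumed average tokens generated per request
tokens_per_second_per_chip = 2_500  # assumed sustained throughput per accelerator
watts_per_chip = 700                # assumed power draw under load
pue = 1.2                           # assumed datacentre power usage effectiveness
peak_to_average = 2.0               # assumed ratio of peak to average traffic

peak_tokens_per_second = requests_per_day * tokens_per_request / 86_400 * peak_to_average
chips = peak_tokens_per_second / tokens_per_second_per_chip
megawatts = chips * watts_per_chip * pue / 1_000_000
print(f"~{chips:,.0f} accelerators, ~{megawatts:.2f} MW of sustained draw")
# Halve the watts per token and the megawatt figure halves with it.
```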
What Would “Conquering Dependency” Actually Mean?
Total independence is neither likely nor necessary. “Conquering dependency” in practice would mean three things. First, Google can meet its own model roadmaps without external bottlenecks, allocating between TPUs and third-party accelerators as portfolio economics dictate. Second, Google Cloud can offer enterprise customers a credible choice that insulates them from spot shortages and price spikes. Third, TPU demand is sufficiently broad-based that continued investment in the stack is self-funding and compounding.
To get there, Google must keep doing the unglamorous work: publishing repeatable benchmarks on real workloads, expanding software tooling, hardening migration paths, and securing long-dated supply for its own datacentres. It must also demonstrate that TPUs are not a niche for a few marquee customers but a mainstream option for model training, fine-tuning and inference across industries.
Implications for Nvidia—and for Everyone Else
None of this implies Nvidia is about to be toppled. Far from it. The company’s roadmap, execution and ecosystem depth remain formidable, and demand for accelerators continues to outstrip supply across segments. In a growing market, losing share can still mean growing revenues. But pricing power is not a law of nature. If credible alternatives normalise delivery and economics, the industry moves from scarcity to choice. Margins compress at the edges, and capital allocation gets a little more rational.
For enterprises, that competition is healthy. It promises more predictable access to compute, more resilient supply chains and, over time, a gentler slope for unit costs. For investors, it shifts the question from “who owns the chip du jour?” to “who controls enough of the stack to bend the curve on utilisation and power?”.
The Verdict—For Now
Google has proved that TPUs can train and serve state-of-the-art models and can be productised as a cloud service. Whether that becomes a structural discount to the cost of intelligence—and a catalyst for cloud share gains—will be settled not in keynote demos but in procurement halls and monthly invoices.
The most likely outcome over the next few years is a hybrid equilibrium. Nvidia remains the anchor for a vast share of AI workloads; Google grows a TPU franchise that is large enough to matter and disciplined enough to compound; enterprises arbitrage availability and price between them. If Google’s integration continues to unlock meaningfully lower $/token at scale, that equilibrium tilts.
CFI.co’s take: the market is moving from a single-lane bridge to a dual carriageway. Google does not need to dethrone Nvidia to win. It needs to make AI compute less scarce, less volatile and more economically rational—for its own models and for its customers. If TPUs continue to deliver on that brief, Google will not have escaped gravity so much as rewritten it.