The Silicon Ceiling: Why AI's Biggest Bottleneck Isn't Intelligence - It's Atoms

Saturday morning, packing for another trip to Singapore, I'm halfway through a three-hour interview between Dwarkesh Patel and Dylan Patel (I know, I lead an exciting life) - founder of SemiAnalysis and arguably the most cited analyst in AI infrastructure - and I realise I've paused the video four times to type notes and WhatsApp friends. Not because I understand all the detail of semiconductor manufacturing. I absolutely do not. But because the numbers Dylan was dropping made me rethink every assumption I'd been carrying about how fast AI will actually change things. I've written recently about the gap between the capability exponential and the adoption exponential being organisational plumbing. This new set of insights adds another plumbing layer to consider: infrastructure. And it's not (all) about the energy and GPUs that dominate media commentary right now.
Now, I'm a risk person, not a chip person. I'm beginning to understand the importance of certain manufacturing concepts, but my fairly basic understanding is precisely why I wrote this piece. If the physical limits of AI matter - and they do - then the rest of us need to understand them, even imperfectly. So consider this me taking you along on the journey I'm on: starting from "I don't know what EUV stands for" and arriving at a framework for understanding how this may play out.
What You'll Learn (and Why Most People Miss It)
This piece lays out a simple framework: the three physical ceilings that will govern how fast AI actually changes your work, your industry, and your career decisions over the next three to five years.
You should care because the gap between AI hype timelines and physical delivery timelines is where fortunes get made, careers get misread, and strategies get built on sand (silicon pun intended).
Unfortunately, almost nobody outside the semiconductor industry is paying attention to this. The conversation stays at the software layer - models, benchmarks, demos - while the hardware layer quietly dictates the pace (guilty as charged, m’lud).
This one will involve a bit of geekery and a bit of maths but stick with it. I promise it’s worth it.
Here's the thesis: AI's limiting factor isn't intelligence. It's memory. Specifically, it's a type of memory chip most people have never heard of, made by three companies, in a supply chain so concentrated it makes OPEC look diversified.
Dylan dropped the data to back this up, and it should be required viewing for anyone making AI strategy decisions. The good news? You don't need a semiconductor engineering degree to grasp what's happening. You just need to follow the numbers.
And a caveat. I’ve taken some of Dylan’s data and layered in some additional maths. Any errors here are my own not his. As I opened with, this blog is to take you along the learning journey I’m on in a way that I hope is useful.
With that, let's dive in.
Three Ceilings Made of Silicon, Copper, and Concrete
For decades, the semiconductor industry followed a comfortable script. Moore's Law delivered predictable improvements. Memory got cheaper. Power consumption stayed manageable. Chip designers worked within known constraints, and the rest of us barely thought about it.
AI has blown up that script. As Patel lays out, the bottlenecks have been migrating like a rolling blackout across the supply chain: CoWoS (Chip-on-Wafer-on-Substrate – I didn’t know either) advanced packaging constraints in 2023, data centre power shortages in 2024-25, and now semiconductor fab capacity (fabrication plant – start talking about fabs and you’ll sound like a semiconductor native) and memory production emerging as the binding constraints through 2028 and beyond.
Side note: you hear the term semiconductor a lot, so I want to explain what it means. It does exactly what it says on the tin - it is a material whose ability to conduct electricity sits between that of a good conductor (like copper) and an insulator (like glass), and crucially that conductivity can be controlled. In this context the semiconductor in question is silicon, but others such as germanium are used in niche applications. It's very simple but not always obvious, so I wanted to leave no reader behind.
Now back to the point. The numbers are staggering. Building one gigawatt of AI compute capacity using NVIDIA's next-generation Rubin chips requires 55,000 wafers of 3nm logic, 6,000 wafers of 5nm chips, and 170,000 wafers of DRAM - roughly two million EUV lithography passes in total. An EUV pass is a single exposure of a wafer to extreme ultraviolet light, which etches one layer of the chip's circuitry. Each chip needs many layers built up on top of each other - a cutting-edge 3nm processor might need over twenty passes, while a memory chip needs fewer, but there are far more of them. Think of it like printing a book one colour at a time: each page goes through the press multiple times, and you can only print as fast as the press allows.

Each EUV machine processes around 220 wafers per hour, running near-continuously. In practice - once downtime, maintenance and changeovers are accounted for - each tool can handle roughly 570,000 passes per year, meaning a single gigawatt of AI capacity ties up about 3.5 EUV machines for an entire year. These machines cost $300-400 million each, and ASML, the sole manufacturer, produces just 60-70 per year. By 2030, the total accumulated stock of EUV tools across the global ecosystem will be approximately 700.
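If you want to check the arithmetic yourself, here is a back-of-envelope sketch. The wafer counts and the 570,000-passes-per-tool-per-year figure come from the paragraph above; the per-wafer EUV layer counts are my own illustrative assumptions, chosen to be roughly consistent with "over twenty passes" for 3nm logic and "fewer" for memory.

```python
# Back-of-envelope check: EUV passes needed for one gigawatt of
# Rubin-class AI compute. Wafer counts are from the article; the
# layer counts per wafer type are ASSUMPTIONS for illustration.
WAFERS = {
    "3nm logic": (55_000, 25),  # (wafers, assumed EUV layers per wafer)
    "5nm chips": (6_000, 14),
    "DRAM":      (170_000, 3),
}

total_passes = sum(w * layers for w, layers in WAFERS.values())

PASSES_PER_TOOL_PER_YEAR = 570_000  # figure quoted in the article
tools_tied_up = total_passes / PASSES_PER_TOOL_PER_YEAR

print(f"Total EUV passes per gigawatt: {total_passes:,}")    # roughly 2 million
print(f"EUV tools occupied for a year: {tools_tied_up:.1f}")  # roughly 3.5
```

Tweak the assumed layer counts and the totals move, but the conclusion is robust: one gigawatt of AI capacity monopolises a handful of $300-400 million machines for a full year.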
Stick with me here. I know this is maths heavy but the implications are important.
But importantly, those 700 machines don't belong to AI. They make chips for the entire digital economy - every iPhone, every AMD processor, every Qualcomm modem, every automotive chip. AI is muscling in on a finite manufacturing capacity that was already fully allocated. Even if the AI industry captured 25-30% of global EUV capacity - an aggressive assumption that would squeeze every other semiconductor customer - that gives you a ceiling of roughly 50-60 gigawatts. Today, OpenAI and Anthropic have about 2 or 3 gigawatts of data centre capacity each. They're targeting 10-15 gigawatts combined by end of 2026. Layer on Google, Amazon, Meta, and Microsoft's own AI ambitions, and you can see how quickly a 50-gigawatt ceiling starts to bind.
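The ceiling arithmetic is simple enough to sketch. All three inputs come from the paragraphs above; the function is just a convenience wrapper so you can try different AI shares.

```python
# How many gigawatts of AI build-out can the 2030 EUV fleet sustain
# per year? Inputs come from the article's figures.
def euv_ceiling_gw(total_tools=700, ai_share=0.27, tools_per_gw=3.5):
    """Gigawatts of annual AI capacity the fleet can support."""
    return total_tools * ai_share / tools_per_gw

low = euv_ceiling_gw(ai_share=0.25)   # 25% of EUV capacity -> 50 GW
high = euv_ceiling_gw(ai_share=0.30)  # 30% of EUV capacity -> 60 GW
print(f"Ceiling at 25-30% EUV share: {low:.0f}-{high:.0f} GW")
```

Even at an aggressive 30% share, the ceiling sits at 60 gigawatts a year - which is why 10-15 gigawatts of hyperscaler demand by end of 2026 starts to look tight rather than trivial.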
Then there's memory. High Bandwidth Memory - HBM - is the component that feeds data to AI chips fast enough for them to actually think. Each next-generation Rubin GPU needs 288 gigabytes of HBM4. A single gigawatt of AI compute capacity - about 3,300 server racks, 240,000 GPUs - requires roughly 70 petabytes of the stuff. Every major supplier - SK Hynix, Micron, Samsung - is sold out through 2026. Memory prices rose 246% in 2025 alone, with another 70% projected for 2026. The HBM market, worth $35 billion in 2025, is forecast to hit $100 billion by 2028 - two years ahead of previous estimates.
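That 70-petabyte figure is worth sanity-checking, since it drives everything that follows. Using the GPU count and per-GPU HBM from the paragraph above (and decimal units, where 1 PB = 1,000,000 GB):

```python
# Sanity check: HBM needed for one gigawatt of Rubin AI capacity.
GPUS_PER_GW = 240_000     # GPUs per gigawatt, per the article
HBM_GB_PER_GPU = 288      # GB of HBM4 per Rubin GPU

total_gb = GPUS_PER_GW * HBM_GB_PER_GPU
total_pb = total_gb / 1_000_000  # decimal petabytes

print(f"HBM per gigawatt: {total_pb:.1f} PB")  # ~69 PB, i.e. roughly 70
```

The maths holds: 240,000 GPUs at 288 GB each is just over 69 petabytes, and every one of those gigabytes has to come from three already sold-out suppliers.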

To understand why this won't resolve quickly, follow the production maths using that same gigawatt lens. The world's DRAM fabs produce roughly 40 exabytes per year - that's 40 million terabytes, for context - growing at just 10-15% annually. AI-related high-speed memory demand will consume nearly 20% of that global output in 2026, according to TrendForce. But here's the trap: each gigabyte of HBM eats four times the wafer capacity of standard DRAM. So that 20% demand in gigabytes actually consumes closer to half the available manufacturing capacity in wafers. Manufacturing that single gigawatt of Rubin AI capacity requires 170,000 DRAM wafers. The global industry starts roughly 2.25 million wafers per month. OpenAI's Stargate project alone has contracted for 900,000 wafers per month - 40% of global output, for a single customer. Meanwhile, Micron has acknowledged it can currently meet only 55-60% of core customer demand. New fabs - Micron's Boise facility, SK Hynix's Yongin plant - won't produce first wafers until mid-2027 at the earliest.
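The jump from "20% of gigabytes" to "about half of wafer capacity" deserves showing rather than asserting. If HBM is 20% of DRAM output by gigabyte and each HBM gigabyte consumes four times the wafer area of standard DRAM, the wafer split works out like this:

```python
# The "trap" in wafer terms: HBM's share of DRAM bits vs its share
# of DRAM wafer capacity. Both inputs are from the article.
hbm_bit_share = 0.20   # HBM as share of global DRAM output, in GB
wafer_multiple = 4.0   # wafers per GB of HBM vs standard DRAM

hbm_wafer_units = hbm_bit_share * wafer_multiple        # 0.8
other_wafer_units = (1 - hbm_bit_share) * 1.0           # 0.8
wafer_share = hbm_wafer_units / (hbm_wafer_units + other_wafer_units)

print(f"HBM share of DRAM wafer capacity: {wafer_share:.0%}")  # 50%
```

So a fifth of the bits really does eat half the wafers - which is why "AI takes 20% of memory output" dramatically understates the squeeze on everyone else.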
Having spent a big chunk of my career watching supposedly temporary supply-chain bottlenecks and disruptions become semi-permanent features of complex systems, I recognise this pattern. When three companies control 95% of global DRAM production and the product they're making requires four times the wafer capacity of conventional memory, "sold out through 2026" doesn't mean "available in 2027." It means every phone, laptop, and gaming console is now competing with AI for the same silicon.
Think of it this way: AI software is a Formula 1 car. HBM is the fuel. And right now, there are three petrol stations on the planet, all with queues stretching round the block.
And just when you think the supply chain might catch up, the power bill arrives. I've written about this in the context of the UK; now let's look through a global lens.
The Gap Between Ambition and Atoms
The big four hyperscalers - Google, Amazon, Meta, Microsoft - are spending a combined $600 billion on AI infrastructure this year. The total supply chain capital expenditure is approaching $1 trillion annually. Google alone is committing $180 billion; Amazon, $200 billion. Anthropic's January revenue was $4 billion. By February, it was $6 billion.
So the demand is real. But that isn't the full picture. PJM Interconnection, the largest US grid operator serving 65 million people, projects it will be six gigawatts short of reliability requirements by 2027 - roughly the electricity demand of Philadelphia. US data centre IT load could grow from 80 gigawatts in 2025 to 150 gigawatts by 2028. Power constraints are extending data centre construction timelines by 24 to 72 months.
Now layer the human dimension on top. Harvard Business Review reported in February 2026 that while 88% of companies claim regular AI use, 80% of employees harbour significant AI-related anxiety. Sixty-five percent worry about being replaced by someone more skilled with AI. Perhaps most telling: employees with the highest AI anxiety actually use AI more than their relaxed colleagues - 65% of their job is AI-assisted versus 42% - but show 2.2 times greater resistance. Fear drives compliance, not commitment. Usage isn't adoption.
As I've written before, technology adoption operates at the speed of plumbing - procurement, legal, compliance, risk et al. Human integration moves at the speed of trust - and trust doesn't scale with Moore's Law.
A Better Map: The Three-Horizon Reality Check
Instead of asking "when will AI transform everything?", try asking three better questions.
Horizon One (Now to 18 months): What can you do with the chips that already exist? The H100, despite being three years old, is worth more today than at launch because software optimisation has tripled its token throughput. The immediate opportunity is in extracting more value from what's already deployed.
Horizon Two (18 months to 3 years): Where will the bottleneck be when your plans mature? If you're building an AI strategy today that depends on abundant, cheap compute by 2028, you're building on sand. Memory prices are rising, not falling. Power connections take two to six years to provision. Plan for scarcity, not abundance.

This is where the commercial implications get interesting. Nascent compute commodity markets - platforms like Ornn are already tracking live spot prices for GPU compute across H100, H200, and B200 hardware - are beginning to offer cash-settled forward contracts that let companies lock in prices and hedge against volatility. But the market remains immature and opaque. If you're an enterprise planning to consume significant AI compute in 2028, you may need to think less like a software buyer and more like an airline hedging jet fuel: securing forward capacity with hyperscalers, negotiating multi-year commitments, or even taking positions in the emerging compute derivatives market. The companies that treat compute as a utility they can summon on demand will be the ones caught short.
Horizon Three (3-5 years): What does asymmetry look like? The organisations that navigate this well won't be the ones with the most compute. They'll be the ones that match their AI ambitions to physical reality - smaller models, better fine-tuning, edge deployment, human-AI collaboration models that don't require a gigawatt per use case.
The Bottom Line
The old framing: AI progress is limited by algorithms and data. Build better models, everything accelerates.
The new reality: AI progress is governed by atoms - silicon wafers, memory chips, copper power cables, and concrete data centres. And behind those atoms sit human beings who need more than a login and a prompt to genuinely change how they work.
I know this blog has been… intense. But this topic required me to update my “speed of plumbing” comments. I would modify my position to this: the AI revolution is real, but it will arrive at the speed of infrastructure, not the speed of imagination. And that's probably a good thing - because it gives us time to do the harder work of genuine adoption rather than performative compliance.
Until next time, you'll find me checking the memory prices before the model benchmarks.