Stop talking just about GPUs
A few weeks ago, I listened to Dwarkesh Patel’s interview with Jensen Huang. It’s a great interview and one I recommend listening to if you haven’t already. In it, Jensen said chip-side bottlenecks will be resolved in two or three years. The bottleneck he’s more worried about is energy, and specifically he calls out his concern over the shortage of plumbers and electricians.
This was on my mind when I sat down the other week with Ben Pouladian, founder of BEP Research, an independent research shop covering GPUs, memory, optical interconnects, and data center power as one converging system. His work has become required reading for hedge funds, asset managers, and the engineers building the stack, and has been cited everywhere from the WSJ to sell-side research desks.
Most of the AI infrastructure conversation right now is about GPUs. Who has them and who can get them. But GPUs are just one ingredient in a much more complex supply chain, and I don’t think they’re the most urgent constraint.
“The biggest constraints are energy or electricity, finding powered land,” Ben told me. “And then once you find that powered land, finding the people and the money to help build that data center.”
This is a physical problem, and it’s slower and more operationally complex than buying chips. But if AI is going to scale anything close to what current capex commitments imply, there’s a massive opportunity in building the physical stack underneath it and the software and hardware layers that coordinate it.
The bottleneck is a chain, not a point
PJM, the grid operator covering 67 million people across the mid-Atlantic, recently published a report on rising demand and constrained supply. In it, the operator says “we are facing a possible decade-long structural reality where demand growth will continually threaten to outpace supply additions.”
The GPU shortage is an important part of the story, but it isn’t the full story. AI infrastructure comes online as a sequence, not a single event. So the idea that the constraint is a single choke point oversimplifies how the stack actually gets built.
In fact, one of the last things that goes into the data center is the compute hardware. “First, you need to actually build the thing and make sure there’s power and it works,” Ben said.
And even once the GPUs arrive, the bottleneck just moves one layer deeper to memory architecture. Things like High Bandwidth Memory and the KV cache, which holds a model’s working context during inference, gate how much intelligence you can extract per watt. A GPU starved of memory bandwidth draws full power but delivers a fraction of the output. So the constraint is also around getting the right memory onto the silicon once it’s in the data center.
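One way to see why a bandwidth-starved GPU delivers only a fraction of its output is the standard roofline bound: attainable throughput is the lesser of peak compute and memory bandwidth times the work done per byte moved. The sketch below is illustrative only. The accelerator figures and the two workload intensities are hypothetical, not measurements of any real chip, and it models throughput only, not power draw.

```python
def attainable_flops(peak_flops: float,
                     mem_bandwidth_bytes_per_s: float,
                     flops_per_byte: float) -> float:
    """Roofline bound: capped by compute or by memory traffic, whichever is lower."""
    return min(peak_flops, mem_bandwidth_bytes_per_s * flops_per_byte)


def compute_utilization(peak_flops: float,
                        mem_bandwidth_bytes_per_s: float,
                        flops_per_byte: float) -> float:
    """Share of peak compute that the workload can actually use."""
    achieved = attainable_flops(peak_flops, mem_bandwidth_bytes_per_s, flops_per_byte)
    return achieved / peak_flops


# Hypothetical accelerator: 1 PFLOP/s of peak compute, 3 TB/s of memory bandwidth.
PEAK_FLOPS = 1.0e15
MEM_BANDWIDTH = 3.0e12

# Hypothetical workloads, described by FLOPs performed per byte read from memory.
for label, intensity in [("memory-starved", 2.0), ("compute-bound", 1000.0)]:
    share = compute_utilization(PEAK_FLOPS, MEM_BANDWIDTH, intensity)
    print(f"{label}: {share:.1%} of peak compute")
```

In this toy setup the low-intensity workload is limited entirely by memory traffic, so adding more compute would not help it. More bandwidth, or more work per byte, would.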
Working backwards, this means you need software orchestration, racks, chips, cooling and electrical, construction, community acceptance, permitting, grid interconnection, powered land, and electricity.
Each of these layers has a different production timeline. As Ben put it: “Chips are scarce this quarter. Power is scarce this decade.” Interconnection queues are years long, and grid capacity is already under pressure from electrification, EV charging, and manufacturing reshoring. On top of that, permitting is slow and depends heavily on the speed of local politics.
And then there is the issue of labor. Tradespeople take decades to train. “It’s purely physical, human, blue-collar labor,” Ben said. “You can’t spin it up like an AWS instance.” This is what Jensen meant when he said plumbers and electricians are the most challenging bottleneck right now.
Manufacturing intelligence
Ben kept using the phrases “AI factory” and “token factory,” borrowing a framing that Jensen Huang has used many times when describing the next generation of data centers.
Traditional data centers hosted software; they stored data and ran enterprise workloads. AI data centers are production plants; they turn energy and data into tokens. “The modern factory is not making metal,” Ben said. “It’s making intelligence.”
And although the output is tokens instead of parts, the production-system questions are the same as any factory: throughput, yield, energy efficiency, utilization, predictive maintenance. The units of output are metrics like tokens per watt or tokens per dollar of capex. These are financial metrics. Every watt of power and every dollar of capex now has a token-denominated yield attached to it. As inference workloads grow faster than training workloads, these questions become more acute.
The need to efficiently convert watts into intelligence becomes even more urgent.
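To make those yield metrics concrete, here is a minimal sketch of how they could be computed. Strictly speaking, "tokens per watt" means tokens per second per watt, which is the same thing as tokens per joule. The class name, field names, and every number below are hypothetical and chosen only for illustration. The capex metric assumes constant output over a fixed service life, which a real facility would not have.

```python
from dataclasses import dataclass

SECONDS_PER_YEAR = 365 * 24 * 3600
JOULES_PER_KWH = 3.6e6


@dataclass(frozen=True)
class AIFactory:
    """Hypothetical snapshot of a facility; all fields are illustrative."""
    tokens_per_second: float      # sustained token output
    facility_power_watts: float   # total power drawn from the grid
    capex_dollars: float          # up-front capital cost
    service_life_years: float     # assumed useful life of the build

    def tokens_per_joule(self) -> float:
        # (tokens / second) / (joules / second) = tokens / joule
        return self.tokens_per_second / self.facility_power_watts

    def tokens_per_kwh(self) -> float:
        return self.tokens_per_joule() * JOULES_PER_KWH

    def tokens_per_capex_dollar(self) -> float:
        # Assumes constant output for the whole service life.
        lifetime_seconds = self.service_life_years * SECONDS_PER_YEAR
        return self.tokens_per_second * lifetime_seconds / self.capex_dollars


# Hypothetical 100 MW site; none of these figures come from a real facility.
site = AIFactory(tokens_per_second=5.0e6,
                 facility_power_watts=1.0e8,
                 capex_dollars=4.0e9,
                 service_life_years=5.0)
print(f"tokens per kWh: {site.tokens_per_kwh():,.0f}")
print(f"tokens per capex dollar: {site.tokens_per_capex_dollar():,.0f}")
```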
This drives what kind of opportunities need to be built next. Traditional factories spawned entirely new categories of software and hardware. AI factories will need the equivalents, but for tokens-as-output. Almost none of that exists yet.
The opportunity is at the seams
Production systems are complex, and coordination between the layers is mostly manual or opaque. There’s a massive labor shortage on top of that. Wherever coordination is fragmented, slow, or expensive, there’s room for new companies. Software, hardware, and everything in between.
Starting at the bottom of the stack is the need to find viable powered land. And once you find it, the procurement process is often painful. Tapestry, which spun out of Alphabet’s X moonshot factory, is essentially building Google Maps for the grid. It’s a knowledge graph that helps developers and utilities operate at a much higher speed and resolution than they can today.
As an aside, there are bets trying to escape these constraints entirely by moving data centers to space. Google’s Project Suncatcher and the recent SpaceX talks are the most visible. They sidestep some of the problems (land, grid interconnection) but not all of them. Most coverage focuses on launch costs, which would need to fall by an order of magnitude before any of this is viable at scale. But the harder constraint is thermal. In a vacuum there’s no air to carry heat away (you can only radiate it), and that physics is what really gates the architecture.
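The thermal point can be made concrete with the Stefan-Boltzmann law, which says a surface radiates power in proportion to its area and the fourth power of its temperature. The sketch below is a deliberately idealized estimate: it assumes a single-sided radiator facing deep space and ignores absorbed sunlight and heat from Earth. The emissivity default and the 1 MW load are hypothetical, not figures from Project Suncatcher or any other design.

```python
STEFAN_BOLTZMANN = 5.670374419e-8  # W / (m^2 * K^4)


def radiator_area_m2(heat_watts: float,
                     radiator_temp_k: float,
                     emissivity: float = 0.9) -> float:
    """Idealized radiator area needed to reject heat_watts purely by radiation.

    Assumes one radiating side facing deep space, with no absorbed sunlight
    and no background radiation. Real designs need more margin than this.
    """
    if heat_watts < 0 or radiator_temp_k <= 0 or not 0 < emissivity <= 1:
        raise ValueError("inputs out of physical range")
    flux_w_per_m2 = emissivity * STEFAN_BOLTZMANN * radiator_temp_k ** 4
    return heat_watts / flux_w_per_m2


# Hypothetical 1 MW of waste heat at two radiator temperatures.
for temp_k in (300.0, 350.0):
    area = radiator_area_m2(1.0e6, temp_k)
    print(f"{temp_k:.0f} K radiator: about {area:,.0f} m^2 per MW")
```

Under these assumptions, a megawatt rejected at 300 K needs on the order of 2,400 square meters of radiator. Because of the fourth-power term, running the radiator hotter shrinks that area quickly, which is why radiator temperature shapes the whole design.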
Once you have powered land, the factory itself needs to be built and operated. This is the layer Jensen was pointing at when he talked about plumbers and electricians. Scheduling tradespeople, sequencing trades on site, and managing lead times are all still very manual processes. As mentioned earlier, and as I’ve written about previously in my conversation with Saman Farid, the founder of Formic, we have a massive labor shortage. Training tradespeople takes decades. We don’t have decades. There’s an opportunity to use robotics for much of the manual work that humans have historically done. Watney Robotics is an example of the kind of company I’m very excited about here.
Crusoe is an example of what a vertically integrated AI factory company looks like. They source their own energy, build their own modular data centers, manufacture them in their own facility, and run a cloud layer on top. Every layer of the stack (power, building envelope, cooling, hardware, software) is something they’re either building or coordinating.
Once the factory is running, routing power matters because every watt matters. Power routed to cooling is power not routed to compute. As power becomes the binding constraint, there’s an opportunity to optimize the thermal and electrical envelope in real time.
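The industry's usual yardstick for this trade-off is power usage effectiveness (PUE), defined as total facility power divided by the power that reaches IT equipment. Here is a minimal sketch of the arithmetic for a fixed grid connection. The 100 MW budget and the PUE values are hypothetical. PUE overhead also includes non-cooling losses such as power conversion, so the whole gap is not cooling alone.

```python
def it_power_watts(facility_power_watts: float, pue: float) -> float:
    """Power left for compute, given a fixed facility budget and a PUE."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0 by definition")
    return facility_power_watts / pue


def overhead_power_watts(facility_power_watts: float, pue: float) -> float:
    """Power consumed by cooling, conversion losses, and other non-IT loads."""
    return facility_power_watts - it_power_watts(facility_power_watts, pue)


# Hypothetical 100 MW grid connection at three illustrative PUE levels.
FACILITY_WATTS = 100.0e6
for pue in (1.5, 1.25, 1.1):
    compute_mw = it_power_watts(FACILITY_WATTS, pue) / 1e6
    overhead_mw = overhead_power_watts(FACILITY_WATTS, pue) / 1e6
    print(f"PUE {pue}: {compute_mw:.1f} MW to compute, {overhead_mw:.1f} MW overhead")
```

When the grid connection is the fixed quantity, every reduction in PUE shows up directly as additional megawatts available for compute.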
Above the silicon, the orchestration layer is just as early. GPUs sit idle for a meaningful share of their lives. AMP, an Alphabet-affiliated public benefit corporation, is pooling compute across independent AI labs to smooth utilization across the field. So when one lab is in a training run and another is in deployment mode, the aggregate demand curve is much smoother than any individual workload.
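The smoothing effect is easy to see with a toy example: when peaks do not coincide, the peak of the combined demand is smaller than the sum of the individual peaks. The sketch below uses two hypothetical labs with made-up demand profiles. It is not a description of how AMP or any real scheduler works, only the arithmetic behind pooling.

```python
def peak_capacity_needed(demands: list[list[float]], pooled: bool) -> float:
    """GPUs required to cover every time step, with or without pooling."""
    if pooled:
        # One shared fleet sized for the peak of the combined demand.
        return max(sum(step) for step in zip(*demands))
    # Each lab sizes its own fleet for its own peak.
    return sum(max(lab) for lab in demands)


def average_utilization(demands: list[list[float]], pooled: bool) -> float:
    """Average share of provisioned GPUs that are busy."""
    total_demand = sum(sum(lab) for lab in demands)
    time_steps = len(demands[0])
    return total_demand / (time_steps * peak_capacity_needed(demands, pooled))


# Hypothetical GPU demand over four periods: one lab trains while the other serves.
lab_a = [100, 100, 20, 20]
lab_b = [20, 20, 100, 100]

for pooled in (False, True):
    label = "pooled" if pooled else "separate"
    gpus = peak_capacity_needed([lab_a, lab_b], pooled)
    util = average_utilization([lab_a, lab_b], pooled)
    print(f"{label}: {gpus} GPUs provisioned, {util:.0%} average utilization")
```

The profiles here are perfectly out of phase, which is the best case. Real workloads overlap more, so the gain from pooling would be smaller than this example suggests.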
And above the orchestration layer sits the financial layer. Compute is becoming the most important commodity of the decade. As with oil, we need market infrastructure that lets buyers and sellers of compute transact in a liquid market. This means we need tooling for GPU pricing, hedging, and financing. As I’ve written about before, compute will eventually become a more liquid market where capacity is procured on demand rather than primarily through long-dated bilateral contracts. Ornn is one company I’m excited about that is building in this space.
“The investment surface around power is deep,” Ben said, “and the layers on top of it are almost entirely greenfield.”
Why this matters
This buildout isn’t going to stop next year. “We spent 15 years building regular data centers with CPUs to run regular websites,” Ben said. “This is not the same thing.”
The stack is more physical, more tightly coupled across layers, and built around producing something rather than hosting it. Some of the most important companies of the last generation were born during the cloud buildout. The next generation is getting built now.
The opportunities are in the seams between power and compute, between construction and capital, between watts and tokens.
Author’s note: An LLM was used for light copy editing only (spelling, grammar, and clarity). Content, meaning, tone, and structure remain unchanged.


