Forge · v.1 · Interactive Build Guide

From bits to atoms. Build the physical infrastructure of AI.

An interactive deep-dive into the racks, megawatts, gallons, and gigabits that turn silicon into intelligence. Based on first principles — orbit the campus, explode the rack, calculate the BOM, and trace every kilowatt.

  • GPT-4 training: ≈25k A100s · ~3 mo
  • Frontier model 2025: 100k+ GPU cluster
  • Hyperscaler '25 capex: $320B · +45% YoY
  • US DC electricity: 4.0% · 183 TWh / yr
Section 01 · Architecture

How a data center actually works

Eight subsystems, one machine. Hover any block to see the role it plays — and the first-principles constraint that determines its size, cost, and location.

Rack diagram: hot aisle / cold aisle · 8 servers · 8 GPUs/server · ~40-120 kW/rack
First Principles

GPU Racks & Servers

The compute. 1,000-1,500 racks arranged in hot-aisle/cold-aisle pairs.

An H100 DGX rack is ~40 kW. A Blackwell GB200 NVL72 rack is 120 kW — three times denser. That step change forced the entire industry to liquid cooling.

Reference benchmark: 120 kW per GB200 rack
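To make the density step concrete, here is a rough sketch of how many racks a fixed IT load implies at each density. The 100 MW load is borrowed from the campus used later in this guide; the function name is illustrative, and the calculation assumes every rack is fully loaded and ignores networking and storage racks.

```python
import math

def racks_needed(it_load_mw: float, rack_kw: float) -> int:
    """Racks required to host a given IT load, rounded up to whole racks."""
    return math.ceil(it_load_mw * 1000 / rack_kw)

IT_LOAD_MW = 100  # assumption: the 100 MW campus used elsewhere in this guide

h100_racks = racks_needed(IT_LOAD_MW, 40)    # ~40 kW per H100 DGX rack
gb200_racks = racks_needed(IT_LOAD_MW, 120)  # 120 kW per GB200 NVL72 rack

print(f"H100 racks: {h100_racks}, GB200 racks: {gb200_racks}")
```

The same load fits in roughly a third of the racks, which is why the heat per rack, not the floor area, becomes the design problem.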
Section 02 · Subsystems

The four hardest problems

Power. Water. Density. Heat. Each one binds independently — and each one drives the cost, location, and timeline of every AI data center built today.

Animated power flow

Utility → Transformer → UPS → Rack

Toggle redundancy and simulate a path failure. In a 2N topology, every component has a complete twin — kill one path and the other carries 100% of load with zero blip.

Power-flow diagram: GRID (230 kV) → TX-A / TX-B (34.5 kV) → SWG-A / SWG-B → UPS-A / UPS-B → RACK (100 MW IT), with GEN at N+1. Status: both A and B paths live · zero blip
IT Load
100 MW
Total Capacity
232 MW
Redundancy Cost Δ
+38%
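The path-failure toggle can be approximated in a few lines. This is a minimal sketch, assuming two paths that each carry the full 100 MW IT load alone and that share load evenly while both are healthy; `PowerPath` and `share_load` are hypothetical names, and the sketch does not try to reproduce the 232 MW or +38% figures shown above.

```python
from dataclasses import dataclass

@dataclass
class PowerPath:
    name: str
    capacity_mw: float
    healthy: bool = True

def share_load(paths, it_load_mw):
    """Split the IT load evenly across healthy paths; raise if the load cannot be carried."""
    live = [p for p in paths if p.healthy]
    if not live:
        raise RuntimeError("all paths down: load dropped")
    per_path = it_load_mw / len(live)
    for p in live:
        if per_path > p.capacity_mw:
            raise RuntimeError(f"path {p.name} overloaded: load dropped")
    return {p.name: per_path for p in live}

# 2N: each path is sized for 100% of the IT load (assumption stated above).
paths = [PowerPath("A", 100.0), PowerPath("B", 100.0)]

normal = share_load(paths, 100.0)         # both paths live
paths[0].healthy = False                  # simulate losing path A
after_failure = share_load(paths, 100.0)  # path B carries everything

print(normal, after_failure)
```

If either path were sized below the full load, the second call would raise instead of returning, which is the difference between 2N and a cheaper shared-capacity design.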
First principle

Power is the binding constraint

A 100 MW data center needs an interconnect that takes 4+ years in PJM today. You can pour concrete in 4 months; you can't pour a substation. That's why hyperscalers buy land based on grid headroom and transmission queues, not real estate price.

First principle

Tier rating is a systems-thinking proof

Tier III = concurrently maintainable. Tier IV = fault-tolerant (2N). Tier rating doesn't certify any single piece of equipment — it proves the architecture survives any single failure without dropping load. The proof is in the commissioning sequence.
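The "survives any single failure" claim can be checked by brute force on a toy model. The sketch below assumes each power path is a simple series chain and the load stays up if at least one chain is fully intact. Component names follow the labels in the power-flow diagram; the shared-switchgear variant is a hypothetical counterexample, and the utility grid and generators are left out.

```python
def load_served(paths, failed):
    """The load is served if at least one series path has no failed component."""
    return any(all(c not in failed for c in path) for path in paths)

def single_points_of_failure(paths):
    """Return every component whose loss alone would drop the load."""
    components = {c for path in paths for c in path}
    return sorted(c for c in components if not load_served(paths, {c}))

# Fully duplicated A and B paths, as in the 2N diagram.
two_n = [["TX-A", "SWG-A", "UPS-A"], ["TX-B", "SWG-B", "UPS-B"]]

# Hypothetical design where both paths pass through one switchgear lineup.
shared_switchgear = [["TX-A", "SWG", "UPS-A"], ["TX-B", "SWG", "UPS-B"]]

print(single_points_of_failure(two_n))
print(single_points_of_failure(shared_switchgear))
```

An empty list is the architectural property being certified; a single shared component is enough to break it, no matter how good that component is.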

First principle

Generators are ride-through, not primary

EPA rules for emergency diesel generators cap non-emergency runtime (testing and maintenance) at ~100 hours/year. The units exist to bridge utility outages until the grid recovers or the load shuts down gracefully, not to power the load day to day. If you're running gens monthly, your grid isn't fit-for-purpose.

Section 03 · BOM

Build a $5B machine, line by line

100 MW AI campus. Adjust quantities, watch the lead-time bottlenecks shift in real time. Based on Brian Potter's BOM breakdown and 2025 hyperscaler pricing.

Components

IT Equipment · $673M
  • NVIDIA H100 80GB SXM5 (bottleneck) · $28,000 / GPU · 26wk lead · $448M
  • NVIDIA GB200 NVL72 (bottleneck) · $3,000,000 / Rack-system · 40wk lead · $180M
  • Dual-socket CPU compute server · $22,000 / Server · 16wk lead · $26M
  • All-flash NVMe storage array · $380,000 / PB · 18wk lead · $19M
Networking · $50M
  • InfiniBand NDR 400G switch (Quantum-2) (bottleneck) · $75,000 / Switch · 22wk lead · $30M
  • Active optical cable 400G · $850 / Cable · 14wk lead · $20M
Electrical · $139M
  • Utility-grade transformer 230kV/34.5kV (bottleneck) · $8,500,000 / Unit · 156wk lead · $34M
  • Medium-voltage switchgear (15kV) (bottleneck) · $1,200,000 / Lineup · 60wk lead · $9.6M
  • Lithium-ion UPS system · $280,000 / MW · 32wk lead · $31M
  • 2.5 MW diesel standby generator (bottleneck) · $950,000 / Unit · 48wk lead · $46M
  • Overhead busway 800A · $480 / Linear ft · 20wk lead · $8.6M
  • Rack PDU (60A 3-phase) · $4,200 / PDU · 10wk lead · $10M
Mechanical / Cooling · $89M
  • Coolant Distribution Unit (CDU) · $380,000 / Unit · 36wk lead · $30M
  • Centrifugal chiller 1,500 ton · $1,400,000 / Unit · 52wk lead · $31M
  • Counter-flow cooling tower 2,000 ton · $680,000 / Unit · 38wk lead · $12M
  • Computer-Room Air Handler · $95,000 / Unit · 18wk lead · $5.7M
  • End-suction process pump 800 GPM · $42,000 / Unit · 22wk lead · $1.7M
  • Carbon steel chilled water pipe (8") · $320 / Linear ft · 12wk lead · $7.7M
Building Shell · $98M
  • Tilt-up concrete data hall shell · $240 / sq ft · 32wk lead · $91M
  • Raised access floor (36") · $32 / sq ft · 14wk lead · $7.0M
Site & Utilities · $17M
  • Site grading, utilities, fiber · $380,000 / acre · 26wk lead · $15M
  • Long-haul dark fiber landing · $120,000 / Route-mile · 28wk lead · $1.4M
Total CapEx · 100 MW campus
$1.1B
Benchmark band $4.2B–$7.8B · per Potter '25
  • IT Equipment · 63%
  • Networking · 5%
  • Electrical · 13%
  • Mechanical / Cooling · 8%
  • Building Shell · 9%
  • Site & Utilities · 2%
Critical-path lead times
156 weeks
Drives the program schedule. Order before the permits land.
Utility-grade transformer 230kV/34.5kV
156 weeks · $34M
Medium-voltage switchgear (15kV)
60 weeks · $9.6M
2.5 MW diesel standby generator
48 weeks · $46M
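"Order before the permits land" falls out of simple date arithmetic. The sketch below takes the three lead times listed above and works backward from a need-by date; the date itself is a hypothetical placeholder, and `latest_order_dates` is an illustrative helper, not part of the guide's calculator.

```python
from datetime import date, timedelta

# Lead times in weeks, as listed above.
LEAD_WEEKS = {
    "Utility-grade transformer 230kV/34.5kV": 156,
    "Medium-voltage switchgear (15kV)": 60,
    "2.5 MW diesel standby generator": 48,
}

def latest_order_dates(need_by, lead_weeks):
    """Latest date each item can be ordered and still arrive by need_by."""
    return {item: need_by - timedelta(weeks=w) for item, w in lead_weeks.items()}

NEED_BY = date(2028, 1, 3)  # hypothetical on-site date, for illustration only

orders = latest_order_dates(NEED_BY, LEAD_WEEKS)
critical_item = max(LEAD_WEEKS, key=LEAD_WEEKS.get)  # longest lead time

for item, order_by in sorted(orders.items(), key=lambda kv: kv[1]):
    print(f"{order_by}  {item}")
```

The item with the longest lead time sets the earliest purchase decision, which here lands about three years ahead of delivery.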
Per-MW benchmark
Industry rule-of-thumb is $8M–$15M / MW non-IT plus GPU CapEx of $32k–$42k per H100-equivalent. Your build: $11M /MW.
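The roll-up behind the totals above is plain addition and can be reproduced as a sanity check. This sketch uses the line subtotals exactly as displayed on this page (in $M), so category sums can differ from the displayed category figures by rounding; the dictionary layout and variable names are illustrative.

```python
# Line subtotals in $M, copied from the component list on this page.
BOM_USD_M = {
    "IT Equipment": [448, 180, 26, 19],
    "Networking": [30, 20],
    "Electrical": [34, 9.6, 31, 46, 8.6, 10],
    "Mechanical / Cooling": [30, 31, 12, 5.7, 1.7, 7.7],
    "Building Shell": [91, 7.0],
    "Site & Utilities": [15, 1.4],
}
IT_LOAD_MW = 100

category_totals = {name: sum(lines) for name, lines in BOM_USD_M.items()}
total = sum(category_totals.values())
shares = {name: round(100 * t / total) for name, t in category_totals.items()}
per_mw = total / IT_LOAD_MW

for name, t in category_totals.items():
    print(f"{name:22s} ${t:7.1f}M  {shares[name]:3d}%")
print(f"Total ${total / 1000:.2f}B, ${per_mw:.1f}M per MW")
```

Summing the displayed subtotals gives about $1.06B and just under $11M per MW, in line with the rounded figures shown above.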
Section 04 · Schedule

A 4-year program disguised as an 18-month build

Pour the foundations in 4 months. Wait 3 years for transformers. Welcome to AI infrastructure. Click any bar to see the systems-thinking constraints that shape it.

Phase Gantt

Phases: Power · Site · Shell · MEP · IT · Commissioning
Timeline axis (months): -36, -24, -12, +0, +6, +12, +18
Month 0 = ground-breaking · negative = pre-construction (power queue, design, long-lead orders)
Shell · months +3 to +10

Shell & Core

Tilt-up panels, steel, roof, envelope. Concurrent equipment yard build-out.

First principle

Shell is a race against the equipment delivery date. Building must be weather-tight before transformers arrive.

Risks
  • Crane availability
  • Skilled trades shortage
Duration
7 mo
Critical path
Optional
Section 05 · Geography

The US data center fleet

Where the 4,500+ facilities sit, who runs them, and the bottleneck each cluster is up against — power, water, or political welcome.

  • Operating facilities: ~4,500 · US fleet
  • Operating MW: 19 GW · 2025 nameplate
  • Planned + permitted: 312 GW · ~17× current
  • US electricity share: 4% · 183 TWh / yr
57 sites · 28 GW total
Owners
AWS
Microsoft
Google
Meta
Equinix
Digital Realty
Oracle
Bubble size ∝ MW · click to inspect
Pick a site

Click any bubble to inspect a campus — owner, capacity, build year, water-stress classification, and operating PUE where disclosed.

Top hubs
  • Northern Virginia · 4.8 GW · 11 GW future
  • Dallas–Ft. Worth · 2.4 GW · + TX queues
  • Phoenix / AZ · 2.2 GW · water-stressed
  • Iowa / Nebraska · 1.7 GW · wind-rich
  • Oregon (PNW) · 1.9 GW · free cooling
Section 06 · Economics & Politics

The second-order consequences

A 100 MW campus isn't just a building. It's a 30-year claim on a slice of the grid, the watershed, and the local political contract. Here are the trade-offs.

US data center electricity

From 4% to ~12% in five years

LBNL base case puts US data center load at 580 TWh by 2030. High case is 800 TWh — roughly the entire annual generation of Texas.
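The headline percentages can be cross-checked from the figures on this page. The sketch below assumes total US electricity consumption stays flat over the period, which is a simplification made only for this arithmetic, and treats the 183 TWh / 4% pair as the starting point of a five-year window.

```python
BASE_TWH = 183.0    # current US data center consumption, from this page
BASE_SHARE = 0.04   # current share of US electricity, from this page
TARGET_TWH = 580.0  # base-case projection quoted above
YEARS = 5           # "in five years"

# Implied total US consumption, held flat (assumption).
us_total_twh = BASE_TWH / BASE_SHARE

# Share of that total the projected load would represent.
implied_share = TARGET_TWH / us_total_twh

# Compound annual growth rate needed to get from base to target.
cagr = (TARGET_TWH / BASE_TWH) ** (1 / YEARS) - 1

print(f"US total: {us_total_twh:.0f} TWh")
print(f"Implied share: {implied_share:.1%}")
print(f"Required growth: {cagr:.1%} per year")
```

Under these assumptions the projection lands a little above 12% and requires load to grow by roughly a quarter every year.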

Local economic ledger

Who wins, who pays

Per typical 100 MW campus. Construction is short-lived. Tax base is real but often offset by abatements. Permanent jobs are few — and increasingly contested.

Systems thinking · the four binding constraints

Grid interconnect queue

4+ years

PJM queue is the longest in US history. Texas (ERCOT) is faster but its summer peak constrains AI campuses to nights or curtailment contracts.

Transformer manufacturing

80-210 weeks

Grain-oriented electrical steel + lifetime test cycles. Global capacity hasn't scaled with hyperscaler demand. Pre-buying inventory has become a strategic moat.

Water permitting

case by case

Phoenix, Dallas, Atlanta now require closed-loop. AZ Senate Bill 1393 caps new evaporative withdrawals. The trend is one-way — and irreversible.

Local political welcome

wild card

Loudoun County moratoriums, NIMBY-driven rezonings, school-impact studies. The 'data center incentive arbitrage' is closing as the local-vs-economic-development tension intensifies.

Section 07 · Mental models

The five lenses to see all of this clearly

If you've made it this far, the components are the easy part. The hard part is the worldview that lets you reason about them together. Five frames that unlock the rest.

MDA Framework

Mechanics → Dynamics → Aesthetics

Designers build mechanics (rules, components). Users experience dynamics (the play). What they feel is aesthetics (the wonder). A data center is the same: builders create mechanics (power, cooling, fabric); operators experience dynamics (load, failure, recovery); customers feel aesthetics (latency, availability, cost).

Bottleneck theory of constraints

Goldratt — find it, exploit it, subordinate to it

Every system has exactly one binding constraint at a time. For AI build-outs in 2025, it is power. Pouring money into anything else (faster GPUs, denser racks, better cooling) doesn't speed up the program until the substation lights up. Find it; exploit it; subordinate everything else.
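A toy model shows why spending on anything but the constraint does not move the finish date. The sketch below treats three tracks as running fully in parallel from the same start, which is a deliberate simplification; durations are taken from figures quoted elsewhere in this guide (4+ years of interconnect, 156 weeks of transformer lead time, a 7-month shell), and the "faster" variants are hypothetical.

```python
def program_months(tracks):
    """With parallel tracks, the program finishes when the slowest track does."""
    return max(tracks.values())

tracks = {
    "grid interconnect": 48,     # "4+ years" in PJM
    "transformer delivery": 36,  # 156 weeks
    "shell": 7,                  # 7 mo
}

baseline = program_months(tracks)
faster_shell = program_months({**tracks, "shell": 4})             # hypothetical speed-up
faster_grid = program_months({**tracks, "grid interconnect": 40}) # hypothetical speed-up

print(baseline, faster_shell, faster_grid)
```

Cutting three months from the shell changes nothing; cutting eight months from the interconnect shortens the whole program by eight months, until the next constraint takes over.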

Leverage points

Donella Meadows — small shifts that change behavior

From least to most powerful: parameters, buffers, stocks/flows, delays, feedback loops, system goals, paradigms. AI build-outs are stuck at parameters (more GPUs!) when the leverage is at paradigms (rethink the grid). PUE measurement was a paradigm-level intervention; liquid-cooling reframing is another.

First principles

What is physically required vs. inherited?

Raised floors exist because mainframes vented downward in 1965. We kept them for 40 years. Now AI racks are 120 kW — air doesn't carry that heat anymore — and the entire 'data hall' archetype is being rebuilt from physics up. Always ask: which assumptions are physics, and which are habit?
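The physics behind "air doesn't carry that heat" is the sensible-heat relation Q = ṁ · cp · ΔT. The sketch below compares the coolant flow needed to remove 120 kW with air and with water. The temperature rises (15 K for air, 10 K for water) are illustrative assumptions, and the fluid properties are approximate room-temperature values.

```python
def mass_flow_kg_s(heat_w, cp_j_per_kg_k, delta_t_k):
    """Coolant mass flow needed to carry heat_w at a given temperature rise."""
    return heat_w / (cp_j_per_kg_k * delta_t_k)

RACK_W = 120_000                        # 120 kW rack, from the text

AIR_CP, AIR_RHO = 1005.0, 1.2           # J/(kg*K), kg/m^3 (approximate)
WATER_CP, WATER_RHO = 4186.0, 1000.0    # J/(kg*K), kg/m^3 (approximate)
AIR_DT, WATER_DT = 15.0, 10.0           # K, assumed temperature rise across the rack

# Volumetric flow: cubic metres per second of air, litres per second of water.
air_m3_s = mass_flow_kg_s(RACK_W, AIR_CP, AIR_DT) / AIR_RHO
water_l_s = mass_flow_kg_s(RACK_W, WATER_CP, WATER_DT) / WATER_RHO * 1000

print(f"Air:   {air_m3_s:.1f} m^3/s")
print(f"Water: {water_l_s:.1f} L/s")
```

Under these assumptions one rack needs several cubic metres of air per second but only a few litres of water per second, a volume ratio of more than a thousand to one.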

Hyperobjects

Things distributed in time and space

A data center is a hyperobject — a 30-year contract spread across counties, watersheds, transmission rights, and labor markets. Decisions made in Year 1 commit you to Year 30 outcomes. You don't 'finish' building a data center; you join its lifecycle.

Forge synthesis

From bits to atoms

The story of AI is a story of moving information. The story of AI infrastructure is a story of moving heat — and the matter, watts, and water it takes to do it. Master both, and you'll see why the next decade of AI is a civil-engineering problem dressed in software clothes.

The bottleneck is always physical