AI Compute Wave: Blackwell GB200 Racks Shipping; Liquid-Cooling Stabilizes; What GB300 Means for 2026 Capex

The Blackwell era isn’t theoretical anymore. GB200 NVL72—NVIDIA’s rack-scale system that stitches 72 Blackwell GPUs and 36 Grace CPUs into a single NVLink domain—has crossed from pilot to shipping, with HPE, Dell, and QCT confirming deliveries. Early friction around overheating and liquid-cooling leaks has been addressed, with suppliers reporting fixes and a production ramp.

Meanwhile, GB300 NVL72 (Blackwell Ultra) is entering cloud roadmaps for 2026—pushing higher memory bandwidth and a denser NVLink Switch fabric. Here’s what’s shipping, how many kilowatts each rack swallows, why liquid cooling no longer looks fragile, and how 2026 capex may tilt to GB300.

What’s Shipping Today (and Who’s Shipping It)

Rack-scale GB200 NVL72 is in the field

  • QCT (Quanta Cloud Technology) announced first shipments of GB200 NVL72 systems on May 27, 2025—an early marker that the rack-scale pipeline was turning.
  • HPE’s Feb 13, 2025 announcement declared GB200 NVL72 “now available,” highlighting direct liquid cooling and the full 72-GPU NVLink domain. HPE’s QuickSpecs quantify the rack: ~132 kW per rack (≈115 kW liquid-cooled plus ~17 kW of air-cooled auxiliary load).
  • Dell had already shipped fully integrated, liquid-cooled IR7000 racks with GB200 NVL72 to cloud customers in late 2024; by July 3, 2025, Dell also delivered the first GB300 NVL72 to CoreWeave, foreshadowing rapid generational turnover.

Why this matters: procurement teams can move from PO to pre-integrated racks (factory-built, fluid-tested), shortening on-site bring-up and reducing leak risk. 

The Liquid-Cooling Story—From Problem Reports to Fixes

What went wrong at first

Reports in late 2024–early 2025 flagged overheating and connectivity issues in early 72-GPU racks, often tied to manifold/cold-plate tuning and installation complexity at new sites. Some hyperscalers deferred orders while vendors reworked cooling assemblies and validation. 

What’s been fixed

By mid-2025, the Financial Times reported that key suppliers—Foxconn, Inventec, Dell, and Wistron—had resolved the rack issues and increased shipments, citing fixes to leaks, software bugs, and NVLink connectivity, as well as standardized integration around the current “Bianca” layout.

Why the fixes are holding

  • Factory-integrated racks (Dell IR7000, HPE validated designs) arrive pre-plumbed and pressure-tested, minimizing field-side fluid work.
  • Direct-to-chip liquid cooling is now mainstream; thermal vendors and OEMs quantify large gains vs air for Blackwell/Hopper-class densities.
  • NVIDIA’s own positioning emphasizes NVLink Switch plus liquid-cooled NVL72 designs to hit density/perf targets with lower facility water use compared to prior air-cooled deployments.

What a GB200 NVL72 Rack Actually Is

Architecture in one paragraph

GB200 NVL72 connects 72 Blackwell GPUs and 36 Grace CPUs into one NVLink domain—a single, giant accelerator for training and real-time inference at trillion-parameter scales. Racks are fully liquid-cooled and cabled with thousands of high-speed copper links to an NVLink Switch spine. NVIDIA claims up to ~30× throughput gains over Hopper for real-time LLM inference.
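
As a rough mental model, the commonly cited NVL72 layout is 18 compute trays (each carrying two GB200 superchips, i.e. one Grace CPU plus two Blackwell GPUs) and 9 NVLink switch trays; treat the tray counts below as illustrative assumptions rather than vendor specifications. A minimal sketch that derives the headline 72-GPU/36-CPU numbers from that layout:

```python
# Minimal sketch of GB200 NVL72 rack composition.
# Tray counts and per-tray contents are illustrative assumptions,
# not vendor specifications.
from dataclasses import dataclass

@dataclass
class ComputeTray:
    superchips: int = 2             # assumed GB200 superchips per compute tray
    grace_per_superchip: int = 1    # 1 Grace CPU per superchip
    blackwell_per_superchip: int = 2  # 2 Blackwell GPUs per superchip

@dataclass
class Rack:
    compute_trays: int = 18         # assumed compute trays per NVL72 rack
    nvlink_switch_trays: int = 9    # assumed NVLink switch trays

    def totals(self) -> dict:
        tray = ComputeTray()
        superchips = self.compute_trays * tray.superchips
        return {
            "blackwell_gpus": superchips * tray.blackwell_per_superchip,
            "grace_cpus": superchips * tray.grace_per_superchip,
            "nvlink_switch_trays": self.nvlink_switch_trays,
        }

if __name__ == "__main__":
    totals = Rack().totals()
    print(totals)  # {'blackwell_gpus': 72, 'grace_cpus': 36, 'nvlink_switch_trays': 9}
    assert totals["blackwell_gpus"] == 72 and totals["grace_cpus"] == 36
```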

Power & floor space

HPE’s reference numbers (~132 kW per rack) give planners a realistic power/cooling budget. Multiply that by a 100-rack “AI hall” and you’re planning ~13 MW just for accelerators—before networking and storage. Facilities must plan manifolds, CDU capacity, water loops, and redundancy accordingly. 
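
For a back-of-the-envelope planner, the sketch below scales the ~132 kW HPE figure to a hall; the PUE and networking/storage overhead values are placeholder assumptions, not measured numbers:

```python
# Back-of-the-envelope hall power budget for GB200 NVL72 racks.
# RACK_KW comes from HPE QuickSpecs (~132 kW); the PUE and overhead
# fraction below are placeholder assumptions for illustration.
RACK_KW = 132.0                      # ~115 kW liquid-cooled + ~17 kW air-cooled auxiliary
ASSUMED_PUE = 1.2                    # assumed facility overhead for a liquid-cooled hall
ASSUMED_NET_STORAGE_FRACTION = 0.15  # assumed extra IT load for fabric + storage

def hall_budget_mw(racks: int) -> dict:
    accel_it_mw = racks * RACK_KW / 1000.0
    total_it_mw = accel_it_mw * (1 + ASSUMED_NET_STORAGE_FRACTION)
    facility_mw = total_it_mw * ASSUMED_PUE
    return {
        "accelerator_it_mw": round(accel_it_mw, 1),
        "total_it_mw": round(total_it_mw, 1),
        "facility_mw": round(facility_mw, 1),
    }

print(hall_budget_mw(100))  # ~13.2 MW of accelerator IT load for a 100-rack hall
```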

Packaging supply (the quiet bottleneck)

Blackwell relies on TSMC CoWoS and large HBM stacks; analysts projected CoWoS capacity surging 150% (2024) and >70% (2025) to feed GB200 demand—one reason delivery schedules were tight in early 2025. 
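
Reading those projections as year-over-year increases from a 2023 baseline (an interpretation on our part, since the source figures are growth rates, not absolute wafer counts), the compounding looks like this:

```python
# Compounding the projected CoWoS capacity growth rates.
# Interprets "+150% (2024)" and "+70% (2025)" as year-over-year
# increases from an arbitrary 2023 baseline of 1.0.
baseline_2023 = 1.0
capacity_2024 = baseline_2023 * (1 + 1.50)   # +150% -> 2.5x the 2023 baseline
capacity_2025 = capacity_2024 * (1 + 0.70)   # +70%  -> ~4.25x the 2023 baseline

print(f"2024: {capacity_2024:.2f}x of 2023")
print(f"2025: {capacity_2025:.2f}x of 2023")
```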

Enter GB300 NVL72—What Changes for 2026

The platform

GB300 NVL72 (Blackwell Ultra) keeps the 72-GPU/36-CPU rack-scale pattern but with higher per-GPU power and memory bandwidth, and an upgraded NVLink Switch fabric. NVIDIA describes GB300 NVL72 as optimized for test-time scaling inference and massive reasoning workloads, with 50× higher output vs Hopper-era platforms in some inference cases. 

Real deployments are starting

  • Dell → CoreWeave (July 2025): first GB300 NVL72 delivered, paired with liquid cooling and pre-integration to cut on-site time.
  • NVIDIA’s NVLink brief underscores how rack-level switching turns many GPUs into one logical accelerator—vital for the GB300 class.
  • NVIDIA’s own Blackwell blog details the Switch spine (thousands of copper cables), showing why on-rack engineering matters as much as chips.

Why GB300 reshapes 2026 capex

Third-party outlooks suggest CSP capex > US$600B in 2026, with full-rack AI systems a big line item; investment banks and tech forecasts also point to a steeper 2026 curve as GB300 and next-gen cloud services launch. Simply put: more watts, more water, more racks—and bigger tickets per cluster. 

The Upgrade Math for Infra Teams

If you have Hopper today

  • Workloads: GB200 NVL72 offers large NVLink domains with lower latency than Ethernet-only clusters; GB300 raises the ceiling again. Consider model sharding changes and checkpoint compatibility during migration.
  • Power/cooling: Moving from H100 air-cooled islands to GB200/GB300 liquid-cooled racks can cut floor space while raising power density; your constraint becomes CDUs, water loops, and utility MW (see the sizing sketch after this list).
  • Networking: GB200/GB300 racks expect NVLink inside, then scale via Quantum-X800 InfiniBand or Spectrum-X Ethernet—plan your fabric tiers accordingly.
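
As a rough comparison of what that density shift means, the sketch below sizes racks and power for a fixed GPU count; the air-cooled H100 figures (32 GPUs and ~45 kW per rack) are assumptions for illustration, while the GB200 figures reuse the ~132 kW HPE number cited earlier:

```python
# Rough rack-count and power comparison for a fixed GPU fleet.
# H100 air-cooled figures are illustrative assumptions; GB200 figures
# reuse the ~132 kW/rack HPE QuickSpecs number cited earlier.
import math

CONFIGS = {
    "H100 air-cooled (assumed)":   {"gpus_per_rack": 32, "kw_per_rack": 45.0},
    "GB200 NVL72 liquid-cooled":   {"gpus_per_rack": 72, "kw_per_rack": 132.0},
}

def footprint(total_gpus: int) -> None:
    """Print rack count and IT power for each configuration."""
    for name, cfg in CONFIGS.items():
        racks = math.ceil(total_gpus / cfg["gpus_per_rack"])
        mw = racks * cfg["kw_per_rack"] / 1000.0
        print(f"{name}: {racks} racks, {mw:.1f} MW IT load")

footprint(1152)  # e.g. a 1,152-GPU training fleet: fewer racks, higher per-rack power
```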

If you skipped a generation

  • Buy pre-integrated racks (Dell/HPE/QCT) to avoid first-time liquid-cooling pitfalls—these arrive pre-plumbed.
  • Stage a pilot hall: 4–8 racks to validate manifolds, leak detection, service SOP, and safety drills, then replicate.

Budget heuristics for 2026

  • Expect higher $/MW due to liquid-cooling plants, NVLink cabling, and substrate/HBM costs; but higher perf/W can offset TCO per token for large inference (a toy cost model follows this list).
  • Watch CoWoS/HBM availability; slot timing can dominate delivery more than rack assembly.
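
One way to frame that trade-off is a simple cost-per-million-tokens model. Every input below (throughput, electricity price, capex, amortization period) is a placeholder assumption meant to show the structure of the calculation, not a benchmark:

```python
# Toy cost-per-million-tokens model for a liquid-cooled inference rack.
# All inputs are placeholder assumptions to illustrate the structure
# of the calculation; substitute your own measured numbers.
def cost_per_million_tokens(
    tokens_per_second: float,       # assumed sustained rack throughput
    rack_kw: float = 132.0,         # HPE QuickSpecs figure for GB200 NVL72
    pue: float = 1.2,               # assumed facility overhead
    usd_per_kwh: float = 0.08,      # assumed electricity price
    rack_capex_usd: float = 3.0e6,  # assumed rack cost, amortized below
    amortization_years: float = 4.0,
) -> float:
    seconds_per_year = 365 * 24 * 3600
    tokens_per_year = tokens_per_second * seconds_per_year
    energy_cost = rack_kw * pue * 24 * 365 * usd_per_kwh   # $/year for power
    capex_cost = rack_capex_usd / amortization_years       # $/year amortized hardware
    return (energy_cost + capex_cost) / tokens_per_year * 1e6

print(f"${cost_per_million_tokens(tokens_per_second=50_000):.2f} per million tokens")
```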

Risk & Reality Check

From delays to momentum

The Reuters/The Information reporting cycle on overheating was real; so are the FT updates on fixes and the shipment ramp. Treat Q1–Q2 2025 as the debug phase, H2 2025 as the ramp, and 2026 as GB300 scale-out—subject to packaging and power availability.

Power & water

A 50-rack block of GB200 NVL72 pulls ≈6.6 MW of IT load; GB300 rises further. Facility teams need water budgets, heat rejection, and redundant CDUs sized early. HPE’s public numbers are a sound planning anchor.
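
For a first-pass check on coolant flow, the standard sensible-heat relation Q = ṁ·cp·ΔT applies; the 10 °C loop temperature rise below is an assumed design point, not a vendor spec:

```python
# First-pass coolant flow estimate for the liquid-cooled share of a rack,
# using Q = m_dot * cp * dT. The 10 degC loop temperature rise is an
# assumed design point for illustration, not a vendor specification.
LIQUID_KW_PER_RACK = 115.0   # liquid-cooled portion per HPE QuickSpecs
CP_WATER = 4.186             # kJ/(kg*K), specific heat of water
RHO_WATER = 997.0            # kg/m^3, density of water
DELTA_T = 10.0               # assumed supply/return temperature rise (degC)

def coolant_flow_lpm(racks: int) -> float:
    """Return required coolant flow in litres per minute for the given rack count."""
    heat_kw = racks * LIQUID_KW_PER_RACK
    mass_flow_kg_s = heat_kw / (CP_WATER * DELTA_T)       # kg/s of coolant
    return mass_flow_kg_s / RHO_WATER * 1000.0 * 60.0     # convert to L/min

print(f"{coolant_flow_lpm(1):.0f} L/min per rack")          # ~165 L/min at these assumptions
print(f"{coolant_flow_lpm(50):.0f} L/min for a 50-rack block")
```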

Video Credit: NVIDIA Developer

Practice With Principles

Compute, But With Conscience

A values lens of truthfulness, non-harm, and dignity, as suggested by Enlightened Sant Rampal Ji Maharaj, fits AI-factory buildouts: publish plain-English KPIs (leak incidents, downtime, energy and water per token), run independent safety drills for liquid systems, and reserve cycles for public-interest work (education, health) where feasible. For a deeper spiritual perspective on ethics in action, explore the teachings and service initiatives shared by Sant Rampal Ji Maharaj on his official site and YouTube channel.

Call to Action

For cloud & enterprise buyers

Upgrade in two beats

  • Beat-1 (now): lock GB200 NVL72 slots with factory-integrated racks; validate CDU sizing and manifold maintenance.
  • Beat-2 (’26): pilot GB300 NVL72 for inference-heavy fleets; model perf/$ vs power and water budgets.

For facility & safety teams

Make liquid routine

  • Adopt leak-detection, HSE SOPs, and annual audits; keep spare manifolds and hot-swap plans. (Partners now ship pre-plumbed.)

For policymakers & utilities

Power, water, permits

  • Align grid upgrades, recycled-water loops, and fast-track permits for AI blocks; the 2026 capex wave assumes faster utility timelines.

Read Also: 2-nm race: TSMC’s N2 mass-production timing and new fabs; India’s opening in advanced packaging

FAQs: AI compute wave

1) Are GB200 NVL72 racks really shipping now? Who’s delivering?

Yes—QCT announced first shipments (May ’25); HPE says GB200 NVL72 is available with direct liquid cooling; Dell shipped fully integrated liquid-cooled racks to cloud customers. 

2) Didn’t Blackwell racks overheat? What changed?

Early reports cited overheating/leaks and connectivity glitches. By mid-’25, FT reported vendors had fixed the issues (manifolds, software, NVLink cabling) and ramps resumed. 

3) How much power does one GB200 NVL72 rack draw?

HPE QuickSpecs put it near 132 kW per rack (≈115 kW liquid, 17 kW air). Plan CDUs, manifolds, and water loops accordingly. 

4) What’s different about GB300 NVL72? When does it land?

GB300 NVL72 (Blackwell Ultra) brings higher per-GPU power/bandwidth and an updated NVLink Switch fabric; Dell delivered the first to CoreWeave in July ’25. Broader rollouts stack into 2026. 

5) How does all this shape 2026 capex?

Analysts project CSP capex to exceed US$600B in 2026; research notes and bank forecasts see GB300/VR200 full-rack systems as major beneficiaries. 

6) Any packaging or memory constraints to watch?

Yes—CoWoS and HBM3e capacity remain gating factors; TrendForce tracked aggressive capacity adds to support Blackwell ramps. 
