Microsoft has inaugurated what may be the biggest leap in artificial-intelligence infrastructure in history: a production-scale supercomputer with more than 4,600 GPUs in NVIDIA GB300 NVL72 racks, based on the Blackwell Ultra architecture. The cluster, already operational on Azure, is purpose-built to accelerate OpenAI workloads and enable training of models with hundreds of trillions of parameters in days rather than weeks. This is the first time a system of this magnitude has been delivered into production, setting a new standard for accelerated computing and consolidating the strategic partnership between Microsoft, NVIDIA, and OpenAI.
Another first for our AI fleet… a supercomputing cluster of NVIDIA GB300s with 4600+ GPUs and featuring next gen InfiniBand.
First of many as we scale to hundreds of thousands of GB300s across our DCs, and rethink every layer of the stack across silicon, systems, and software…
– Satya Nadella (@satyanadella) October 9, 2025
NVIDIA GB300 NVL72
The heart of this supercomputer is the NVIDIA GB300 NVL72, a rack-scale platform that integrates 72 Blackwell Ultra GPUs and 36 Arm-based NVIDIA Grace CPUs into a single, fully liquid-cooled system. Each rack delivers specifications that push the limits of computational physics: 288 GB of HBM3E memory per GPU, 37 TB of total fast memory, 130 TB/s of in-rack NVLink bandwidth, and up to 1,440 petaflops of FP4 Tensor Core performance. Racks are interconnected via the next-generation NVIDIA Quantum-X800 InfiniBand network, which provides 800 Gbps of bandwidth per GPU for cross-rack scaling, double the bandwidth of the GB200 NVL72.
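A quick back-of-the-envelope check of the rack-level figures above. The GPU count and per-GPU HBM3E figure come from the article; how the 37 TB fast-memory pool splits between GPU HBM3E and Grace-attached memory is inferred here for illustration, not stated in the article:

```python
# Rack-level memory math for the GB300 NVL72 figures quoted above.
GPUS_PER_RACK = 72
HBM3E_PER_GPU_GB = 288          # GB of HBM3E per Blackwell Ultra GPU (article)
TOTAL_FAST_MEMORY_TB = 37       # rack-wide fast-memory pool (article)

# 72 GPUs x 288 GB = 20,736 GB of HBM3E per rack.
hbm_total_tb = GPUS_PER_RACK * HBM3E_PER_GPU_GB / 1000
# The remainder of the 37 TB pool presumably sits on the Grace CPUs
# (assumption: the article does not break this down).
grace_share_tb = TOTAL_FAST_MEMORY_TB - hbm_total_tb

print(f"HBM3E across the rack: {hbm_total_tb:.1f} TB")
print(f"Implied Grace-attached memory: {grace_share_tb:.1f} TB")
```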

The Blackwell Ultra architecture represents a generational advance over Hopper, with 208 billion transistors split across two dies manufactured on the TSMC 4NP process. Compared to the H100, the GB300 offers 2x faster attention-layer acceleration and 1.5x more floating-point throughput for AI computing. In inference benchmarks, the GB300 NVL72 platform demonstrated up to a 50x increase in overall AI factory output compared to Hopper-based systems, combining 10x more responsiveness per user with 5x more throughput per megawatt. For diffusion-based video generation models, the gain reaches 30x, enabling real-time video generation from foundation models like NVIDIA Cosmos.
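The headline 50x figure is simply the product of the two multipliers cited above; the sketch below just makes that arithmetic explicit:

```python
# Decomposition of the "50x AI factory output" claim into its two
# cited factors (both relative to Hopper-based systems).
responsiveness_per_user = 10   # 10x more responsiveness per user
throughput_per_megawatt = 5    # 5x more throughput per megawatt

factory_output = responsiveness_per_user * throughput_per_megawatt
print(f"Combined AI factory output gain: {factory_output}x")
```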
Infrastructure designed for frontier AI
The world’s first large-scale @nvidia GB300 NVL72 supercomputing cluster for AI workloads is now live on Microsoft Azure.
The deployment connects 4,600+ NVIDIA Blackwell Ultra GPUs using next-gen InfiniBand network—built to train and deploy advanced AI models faster than…
— Microsoft Azure (@Azure) October 9, 2025
Building a supercomputer of this magnitude required Microsoft to reimagine every layer of the infrastructure stack—compute, memory, networking, data centers, cooling, and power—as a unified system. At the rack level, NVLink and NVSwitch reduce memory and bandwidth bottlenecks by connecting 37 TB of fast memory with up to 130 TB/s of intra-rack data transfer. To scale beyond the rack, Azure implements a non-blocking full fat-tree architecture using NVIDIA Quantum-X800 InfiniBand, the fastest network fabric available today.
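To give a feel for the scale such a non-blocking fat-tree implies, here is a rough bisection-bandwidth estimate. The exact GPU count (64 racks x 72 GPUs = 4,608) is an assumption, since the article only says "4,600+"; the 800 Gbps per-GPU rate is the Quantum-X800 figure quoted earlier:

```python
# Rough cross-rack bandwidth implied by a non-blocking fat-tree.
GPU_COUNT = 64 * 72              # 4,608 GPUs; rack count is an assumption
LINK_GBPS_PER_GPU = 800          # Quantum-X800 InfiniBand per GPU (article)

# In a non-blocking topology any half of the GPUs can talk to the other
# half at full rate, so bisection bandwidth is (N / 2) * per-GPU rate.
bisection_tbps = GPU_COUNT / 2 * LINK_GBPS_PER_GPU / 1000
print(f"Theoretical bisection bandwidth: {bisection_tbps:,.1f} Tbps")
```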
Microsoft’s co-engineered stack, including custom protocols, collective libraries and in-network computing, ensures that the network is highly reliable and fully utilized by applications. Features like NVIDIA SHARP accelerate collective operations and double effective bandwidth by performing mathematical calculations directly on the switch, making large-scale training and inference more efficient. Azure’s advanced cooling systems utilize standalone heat exchanger units and facility cooling to minimize water usage while maintaining thermal stability for dense, high-performance clusters like the GB300 NVL72.
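A minimal sketch of why in-network reduction (the role SHARP plays here) roughly doubles effective allreduce bandwidth: a classic ring allreduce makes each node transmit nearly twice the message size, while switch-side reduction needs only one pass up the tree. The message size and node count below are illustrative assumptions, not figures from the article:

```python
# Per-node traffic comparison: ring allreduce vs. in-network reduction.

def ring_allreduce_mb_per_node(message_mb: float, nodes: int) -> float:
    # Classic ring allreduce: each node transmits 2*(N-1)/N of the message
    # (reduce-scatter phase plus allgather phase).
    return 2 * (nodes - 1) / nodes * message_mb

def in_network_allreduce_mb_per_node(message_mb: float) -> float:
    # With switch-side reduction, each node sends its data up the tree once;
    # the reduced result comes back down (only the send is counted here).
    return message_mb

msg_mb, nodes = 1024.0, 72   # 1 GiB gradient buffer, one NVL72 rack (assumed)
ring = ring_allreduce_mb_per_node(msg_mb, nodes)
sharp = in_network_allreduce_mb_per_node(msg_mb)
print(f"ring: {ring:.0f} MB sent per node, "
      f"in-network: {sharp:.0f} MB ({ring / sharp:.2f}x less traffic)")
```

As the node count grows, the ring's 2*(N-1)/N factor approaches 2, which is where the "double effective bandwidth" claim comes from.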
From GB200 to GB300: accelerated evolution
Previously, Azure introduced ND GB200 v6 virtual machines, accelerated by NVIDIA’s original Blackwell architecture. These VMs quickly became the backbone of some of the industry’s most demanding AI workloads, with organizations like OpenAI and Microsoft already using massive GB200 NVL72 clusters in Azure to train and deploy frontier models. Now, with ND GB300 v6 VMs, Azure raises the bar again, optimizing specifically for reasoning models, agentic AI systems, and multimodal generative AI.
The GB300 represents a focused evolution for the era of AI reasoning, with significant improvements over the GB200: 1.5x more AI compute performance, 288 GB of HBM3E memory per GPU (versus 192 GB in the GB200), 20,480 CUDA cores (versus 18,432), and a stronger focus on test-time scaling. The GB300’s total graphics power (TGP) reaches 1,400 W, up from the GB200’s 1,200 W, which requires advanced cooling and power-distribution solutions. Microsoft has developed new power-distribution models capable of supporting the high power density and dynamic load balancing required by the ND GB300 v6 class of GPU clusters.
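The generational ratios quoted above can be recomputed directly from the cited spec pairs; all values below come from the article:

```python
# GB300 vs. GB200 ratios from the spec pairs cited in the article.
specs = {                      # metric: (GB300 value, GB200 value)
    "HBM3E per GPU (GB)": (288, 192),
    "CUDA cores":         (20480, 18432),
    "TGP (W)":            (1400, 1200),
}

for name, (gb300, gb200) in specs.items():
    print(f"{name}: {gb300 / gb200:.2f}x over GB200")
```

Note that only the memory capacity grows by the full 1.5x; the CUDA core count and power envelope grow more modestly, so much of the per-GPU gain comes from the larger memory and the architectural improvements rather than raw core count.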
The context of the race for AI infrastructure
This announcement builds on Microsoft’s ambitious strategy to invest $80 billion in AI-optimized data centers by 2028 — the largest infrastructure commitment in the company’s history. The investment spans 25 new Azure regions across North America, Europe, Asia and Africa, with a focus on high-density, liquid-cooled GPU clusters, including Microsoft’s Maia 100 and Maia 200 custom chips. The company is also developing sovereign clouds and dedicated edge clusters for regulated industries, as well as sustainable energy systems with solar, wind and on-site battery storage.

In September 2025, Microsoft unveiled the world’s most powerful AI datacenter in Mount Pleasant, Wisconsin, equipped with Fairwater infrastructure that delivers 10 times more computing power. These new AI datacenters are part of a global network of Azure AI datacenters interconnected via a wide-area network (WAN), operating as a single, powerful distributed AI machine. This distributed, resilient, and scalable system lets customers harness the power of a giant AI supercomputer, with large-scale distributed training spanning multiple, geographically diverse Azure regions.
Source: https://www.hardware.com.br/noticias/microsoft-nvidia-gb300-nvl72-openai-supercomputador/
