
NVIDIA’s GB200s for up to 27 Trillion Parameter Models: Scaling Next-Gen AI Superclusters


March 21, 2025

By the I/O Fund Team

Supercomputers and cutting-edge AI data centers are fueling the artificial intelligence (AI) revolution. Large-scale systems need comprehensive builds that are increasingly integrated to meet the evolving demands of complex workloads. As AI applications become more sophisticated, the need for infrastructure that's not only incredibly powerful but also energy-efficient is growing exponentially. Innovations like NVIDIA’s GB200 are designed to deliver the scalability needed for next-generation AI superclusters.  

At the 2025 NVIDIA GPU Technology Conference (GTC), VP and Chief Architect of Systems Mike Houston and Senior Director of Applied Systems Engineering Julie Bernauer discussed large-scale systems design principles in their presentation, “Next-generation at Scale Compute in the Data Center.”

The GB200 Superchip Powers NVIDIA’s First Rack-Scale Product

The NVIDIA Grace Blackwell 200 (GB200) Superchip combines two Blackwell GPUs and one Grace CPU. It is the building block of the NVIDIA GB200 NVL72, the company’s first rack-scale product: a liquid-cooled AI computing platform purpose-built for AI training and inference, handling generative AI models of up to 27 trillion parameters. The rack includes base components such as Grace Blackwell compute trays, NVLink switches (a connector in the middle of the rack linking all GPUs) and cable cartridges (literally miles of cables in the back tying everything together). The design includes Quantum switches for InfiniBand (a high-speed network for linking clusters) and Spectrum switches for Ethernet.

AI 101: What are Clusters and Superclusters?

Clusters 101: A cluster is a network of independent computers (called nodes) connected by a high-speed network. Although the nodes are separate machines, they are configured to work together and act as a single powerful computing system, serving as a unified resource. Clusters are often used for parallel processing, which breaks a large task into smaller parts distributed across the nodes, enabling faster processing than any single computer could achieve. A key benefit of a cluster is high availability: if one node (computer) fails, the other nodes can take over its workload, keeping the system operational. High-performance computing (HPC) clusters are used for tasks like research, scientific simulations and AI training.
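The parallel-processing idea described above can be sketched in a few lines of Python, using local worker processes as stand-ins for cluster nodes (a toy illustration of divide-and-combine, not a real cluster scheduler):

```python
from multiprocessing import Pool

def partial_sum(chunk):
    # Each "node" (here, a local worker process) handles one slice of the task.
    return sum(chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    # Break the large task into 4 smaller parts, one per worker node.
    chunks = [data[i::4] for i in range(4)]
    with Pool(processes=4) as pool:
        results = pool.map(partial_sum, chunks)
    # Combine the partial results, as a cluster's head node would.
    total = sum(results)
    print(total)  # 499999500000
```

The same split-distribute-combine pattern is what AI training frameworks do at vastly larger scale, with GPUs in place of worker processes.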


Superclusters 101: A supercluster is a very large cluster that may comprise tens of thousands to hundreds of thousands of GPUs, sometimes spanning multiple data centers. For example, Elon Musk’s xAI supercomputer Colossus, powered by 100,000 NVIDIA GPUs, is a supercluster.

NVIDIA’s DGX systems started as single machines for AI but evolved into clusters for AI training. Pre-training can involve superclusters, while post-training may still use as many as 16,000 GPUs, with smaller setups for fine-tuning and for inference, where the trained model answers user queries.

NVIDIA AI and HPC platform architecture diagram featuring the GB200 NVL72 SuperPOD, designed for large-scale AI training and high-performance computing. Source: NVIDIA.


Optimizing the Benefits of Rack-Scale Architecture with GB200

NVIDIA’s GB200 NVL72 is a rack-scale system. Rack-scale design treats a whole rack as one coordinated unit rather than a collection of independent machines, integrating compute, storage and networking that would otherwise span multiple devices into a single server rack. The GB200 can replace or consolidate a large number of GPU compute servers, which provides many benefits, including:

  • Improved GPU Density: The GB200 NVL72 contains 72 Blackwell GPUs and 36 Grace CPUs interconnected with NVLink, NVIDIA’s proprietary high-speed (130 TB/s aggregate bandwidth) interconnect that enables all 72 GPUs to act as a single massive GPU. It’s designed to offer exceptional performance in AI training and inference for large language models (LLMs).
  • Performance: The GB200 delivers up to 720 petaFLOPs for AI training and 1.4 exaFLOPs for inference. Since all components are within proximity in a single rack, communication between components has much lower latency, which is especially beneficial in data-intensive tasks, reducing bottlenecks and improving data throughput.
  • Increased Efficiency: Rack-scale architecture allows for better utilization of hardware by pooling resources to optimize performance. Consolidating resources within a single rack reduces the need for separate units, saving space and power in the data center.
  • Easier Management: Centralized management of the entire rack's resources simplifies setup and maintenance, also enabling automation tools for scaling, provisioning and monitoring to reduce manual interventions.
  • Cost Efficient: Fewer servers, less storage and networking equipment, less physical space, and lower cooling and energy usage save money. As the I/O Fund discussed in its article “AI Power Consumption: Rapidly Becoming Mission-Critical,” the GB200 is “expected to consume 2,700W,” which can add dramatically to operating expenses, especially without rack-scale architecture.
  • Future Proofing: Rack-scale architecture enables the integration of evolving technologies as components can be switched out, repaired and upgraded, enabling more adaptability for future growth.
  • Unified Power and Cooling: Housing multiple components within a single rack reduces the complexity of cooling systems and improves energy efficiency to lower operational costs.
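Dividing the rack-level headline figures by GPU count gives a rough sense of per-GPU throughput (a back-of-the-envelope sketch; the training and inference totals are quoted at different numeric precisions, so they are not directly comparable):

```python
GPUS_PER_RACK = 72
TRAINING_PFLOPS = 720       # rack total for AI training, per the figures above
INFERENCE_PFLOPS = 1400     # rack total for inference (1.4 exaFLOPs)

per_gpu_training = TRAINING_PFLOPS / GPUS_PER_RACK    # 10 petaFLOPs per GPU
per_gpu_inference = INFERENCE_PFLOPS / GPUS_PER_RACK  # ~19.4 petaFLOPs per GPU
print(per_gpu_training, round(per_gpu_inference, 1))  # 10.0 19.4
```

The point of the rack-scale design, however, is that NVLink lets software treat those 72 GPUs as one pooled unit rather than 72 separate accelerators.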

Scaling Up AI Factories with DGX SuperPOD, Reference Architecture and Fabric

At GTC 2025, NVIDIA unveiled its next-generation DGX SuperPOD AI infrastructure, which Houston and Bernauer also covered in their “Next-generation at Scale Compute in the Data Center” presentation.

The SuperPOD is NVIDIA’s all-in-one HPC solution designed to handle the massive computational needs of AI models and simulations, with Grace Blackwell nodes as its building blocks. When scaling up clusters and superclusters, there are three factors to consider: reference architecture, fabric and cooling. Reference architecture comprises pre-tested system designs that serve as a blueprint for new data center deployments, ensuring optimal installation and performance and accelerating time to first token.

Fabric refers to the data center’s network infrastructure that connects all the servers and devices, enabling them to communicate seamlessly and reducing latency between components, especially GPUs. Cooling is critical in large data centers: liquid cooling is preferred for managing the heat produced by thousands of GPUs because it is far more efficient for high-density platforms. Future GPU architectures aim for higher density and more efficient connectivity to push the limits of AI computation.
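Much of what the fabric carries during training is collective communication such as all-reduce, which sums gradients across every GPU so they all stay in sync. A minimal pure-Python sketch of a ring-style all-reduce (illustrative only; real systems use libraries like NCCL over NVLink or InfiniBand and pipeline the data in chunks):

```python
def ring_allreduce(values):
    """Toy ring all-reduce: every 'node' ends up with the global sum.

    values: one number per node, e.g. each node's local gradient.
    """
    n = len(values)
    acc = list(values)   # each node's running total
    recv = list(values)  # what each node most recently received
    # Each step, every node forwards what it last received to its right
    # neighbor; after n-1 hops every contribution has visited every node.
    for _ in range(n - 1):
        recv = [recv[(i - 1) % n] for i in range(n)]  # shift around the ring
        acc = [a + r for a, r in zip(acc, recv)]
    return acc

print(ring_allreduce([1, 2, 3]))  # [6, 6, 6]
```

Every hop is a neighbor-to-neighbor transfer over the fabric, so the latency and bandwidth of the interconnect directly bound how fast each training step can complete, which is why tightly coupled links like NVLink matter so much at rack scale.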

The I/O Fund recently entered five new small and mid-cap positions that we believe will be beneficiaries of this AI spending war. We discuss entries, exits, and what to expect from the broad market every Thursday at 4:30 p.m. in our 1-hour webinar. For a limited time, get $110 off an Annual Pro plan with code PRO110OFF [Learn more here.]

Disclaimer: This is not financial advice. Please consult with your financial advisor in regards to any stocks you buy.

