AI Power Consumption: Rapidly Becoming Mission-Critical
June 24, 2024
Beth Kindig
Lead Tech Analyst
This article was originally published on Forbes on Jun 20, 2024, 04:13pm EDT
Big Tech is spending tens of billions of dollars quarterly on AI accelerators, which has led to an exponential increase in power consumption. Over the past few months, multiple forecasts and data points have pointed to soaring data center electricity demand. The rise of generative AI and surging GPU shipments are causing data centers to scale from tens of thousands to 100,000-plus accelerators, making power a mission-critical problem to solve.
Increasing Power Consumption Per Chip
As Nvidia, AMD, and soon Intel begin to roll out their next generation of AI accelerators, attention is shifting toward power consumption per chip, where it had previously been primarily on compute and memory. As each new generation boosts computing performance, it also consumes more power than its predecessor, meaning that as shipment volumes rise, so does total power demand.
Nvidia’s A100 has a maximum power consumption of 250W with PCIe and 400W with SXM (Server PCIe Express Module), and the H100’s power consumption is up to 75% higher than the A100’s. With PCIe, the H100 consumes 300-350W, and with SXM, up to 700W. This 75% increase in GPU power consumption happened rapidly, within two years and across a single generation of GPUs.
Looking at other GPUs on the market today, AMD’s MI250 accelerators draw 500W of power, up to 560W at peak, while the MI300X consumes 750W at peak, an increase of up to 50%. Intel’s Gaudi 2 accelerator consumes 600W, and its successor, the Gaudi 3, consumes 900W, another 50% increase over the previous generation. Intel’s upcoming hybrid AI processor, codenamed Falcon Shores, is expected to consume a whopping 1,500W of power per chip, the highest on the market.
Nvidia’s upcoming Blackwell generation boosts power consumption even further, with the B200 consuming up to 1,200W, and the GB200 (which combines two B200 GPUs and one Grace CPU) expected to consume 2,700W. This represents close to a 300% increase in power consumption across one generation (from up to 700W for the H100 to 2,700W for the GB200), with multi-chip AI systems increasing power consumption at a higher rate than individual GPUs. SXM allows GPUs to operate beyond the PCIe bus restrictions, offering higher memory bandwidth, data throughput, and speeds for maximal HPC and AI performance, and thus drawing more power.
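As a rough check on the generational jumps quoted above, the percentage increases can be recomputed from the peak wattages in the text. The sketch below is illustrative only: the dictionary and function names are hypothetical, the wattages are the ones cited in this article, and the GB200 row compares a two-GPU-plus-CPU superchip against a single H100, so it is not a like-for-like chip comparison.

```python
# Peak power figures in watts, as quoted in the article.
# All names here are illustrative, not from any vendor tool.
PEAK_WATTS = {
    "A100 SXM": 400,
    "H100 SXM": 700,
    "B200": 1200,
    "GB200": 2700,
    "MI250": 500,
    "MI300X": 750,
    "Gaudi 2": 600,
    "Gaudi 3": 900,
}

# (previous generation, next generation) pairs to compare.
GENERATION_PAIRS = [
    ("A100 SXM", "H100 SXM"),
    ("MI250", "MI300X"),
    ("Gaudi 2", "Gaudi 3"),
    ("H100 SXM", "B200"),
    ("H100 SXM", "GB200"),  # superchip vs. single GPU: not like-for-like
]


def pct_increase(old: float, new: float) -> float:
    """Return the percentage increase going from old to new."""
    return (new - old) / old * 100.0


def generational_increases() -> dict[str, float]:
    """Map 'old -> new' labels to the percentage increase in peak power."""
    return {
        f"{old} -> {new}": round(pct_increase(PEAK_WATTS[old], PEAK_WATTS[new]), 1)
        for old, new in GENERATION_PAIRS
    }


if __name__ == "__main__":
    for label, pct in generational_increases().items():
        print(f"{label}: +{pct}%")
```

On these inputs, the single-GPU step from the H100 to the B200 is about 71%, while the step to the GB200 system is about 286%.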
It’s important to note that each subsequent generation is likely to be more power-efficient than the last. The H100, for example, reportedly boasts 3x better performance-per-watt than the A100, meaning it can deliver more TFLOPS per watt and complete more work for the same power consumption. However, GPUs are also becoming more powerful in order to support large language models with a trillion-plus parameters. The result is that AI requires more power with each new generation of AI accelerators.
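To see how efficiency and total draw can rise together, a simple back-of-the-envelope model multiplies the performance-per-watt gain by the ratio of peak power. This is a sketch under a strong assumption: that the reported 3x performance-per-watt figure holds at each chip's peak power, which real workloads may not match. The function name is hypothetical.

```python
def relative_throughput(perf_per_watt_gain: float,
                        old_watts: float,
                        new_watts: float) -> float:
    """Estimate how much more work the new chip does at peak power.

    Assumes throughput = performance-per-watt x watts, with the
    efficiency gain applying at peak power (a simplification).
    """
    return perf_per_watt_gain * (new_watts / old_watts)


if __name__ == "__main__":
    # Article figures: H100 at ~3x perf-per-watt, 700W vs. the A100's 400W (SXM).
    gain = relative_throughput(3.0, old_watts=400, new_watts=700)
    print(f"Estimated throughput vs. A100: {gain:.2f}x at 1.75x the power")
```

Under these assumptions the newer chip does several times more work per unit, yet each unit still draws 75% more power, which is why total demand climbs as shipments grow.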
Big Tech’s AI Ambitions Lead to Surging GPU Shipments
From Big Tech’s perspective, we’re still in the early stages of this AI capex cycle. Most recently, we covered how Big Tech is boosting capex by more than 35% YoY in 2024, likely upwards of $200 billion to $210 billion, predominantly for AI infrastructure. The majority is flowing to GPU purchases and custom silicon, to power AI training, model development, and to meet elevated demand in the cloud.
2023 was a breakout year for Nvidia’s data center GPUs, with reports placing annual shipments at 3.76 million, an increase of more than 1.1 million units YoY. One report stated that at a peak of 700W and ~61% annual utilization, each GPU would draw 3.74 MWh per year; this means that Nvidia’s 3.76 million GPU shipments could consume as much as 14,384 GWh (14.38 TWh) annually. A separate report estimated that with 3.5 million H100 shipments through 2023 and 2024, the H100 alone could see total power consumption of 13.1 TWh annually.
The 14.4 TWh is equivalent to the annual power needs of more than 1.3 million households in the US. This also does not include AMD, Intel, or any of Big Tech’s custom silicon, nor does it take into account existing GPUs deployed or upcoming Blackwell shipments in 2024 and 2025. As such, the total energy consumption is likely to be far higher by the end of the year as Nvidia’s Blackwell generation comes online in larger quantities.
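The per-GPU and fleet-level estimates above follow from a simple product: peak power times hours in a year times utilization, then times the number of units. The sketch below reproduces that arithmetic from the figures quoted in the text; the function names are hypothetical, and the 8,760-hour year is our assumption.

```python
HOURS_PER_YEAR = 8760  # assumption: non-leap year, running around the clock


def annual_mwh_per_gpu(peak_watts: float, utilization: float) -> float:
    """Annual energy per GPU in MWh at the given peak power and utilization."""
    watt_hours = peak_watts * HOURS_PER_YEAR * utilization
    return watt_hours / 1e6  # Wh -> MWh


def fleet_twh(units: float, mwh_per_gpu: float) -> float:
    """Annual energy for a fleet of GPUs in TWh."""
    return units * mwh_per_gpu / 1e6  # MWh -> TWh


if __name__ == "__main__":
    per_gpu = annual_mwh_per_gpu(peak_watts=700, utilization=0.61)
    print(f"Per GPU: {per_gpu:.2f} MWh/year")
    print(f"3.76M GPUs: {fleet_twh(3.76e6, per_gpu):.2f} TWh/year")
    print(f"3.5M H100s: {fleet_twh(3.5e6, per_gpu):.2f} TWh/year")
```

With these rounded inputs, the per-GPU figure matches the 3.74 MWh cited and the 3.5 million H100 case lands on about 13.1 TWh. The plain product for 3.76 million units comes to roughly 14.1 TWh, slightly below the 14.38 TWh cited, which suggests the report used inputs that differ a little from the rounded ones quoted here.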
To read more about Nvidia’s upcoming Blackwell architecture, reference our previous analysis: Nvidia Q1 Earnings Preview: Blackwell and the $200B Data Center.
A Path to Million GPU Scale
Nvidia and other industry executives have laid out a path for GPU clusters in data centers to scale from the tens of thousands of GPUs per cluster to the hundred-thousand-plus range, even up to the millions of GPUs by 2027 and beyond. We’re already seeing signs of strong demand for Nvidia’s Blackwell platform, but overall, the million-plus GPU data center target is still years away.
Oracle’s Chairman Larry Ellison sees this creating secular tailwinds for data center construction, due to both rising GPU demand and increased power requirements driving a shift to liquid cooling:
“This AI race is going to go on for a long time. It's not a matter of getting ahead, just simply getting ahead in AI, but you also have to keep your model current. And that's going to take larger and larger data centers. … The data centers we are building include the power plants and the transmission of the power directly into the data center and liquid cooling. And because these modern data centers are moving from air cooled to liquid cooled, and you have to engineer them from scratch. And that's what we've been doing for some time. And that's what we'll continue to do.”
As the industry progresses towards that million-GPU scale, this puts more emphasis on future generations of AI accelerators to focus on power consumption and efficiency while delivering increasing levels of compute. Data centers are expected to adopt liquid cooling technologies to meet the cooling requirements to house these increasingly large GPU clusters.
AI Electricity Demand Forecast to Surge
As a result of booming demand for generative AI and for GPUs, AI’s electricity demand is forecast to surge, especially in the data center. We have a handful of different viewpoints and analyst projections that, while differing slightly in the timelines, all point to that same conclusion.
For example, Morgan Stanley is estimating that global data center power use for generative AI will triple this year, from ~15 TWh in 2023 to ~46 TWh in 2024. This coincides with the ramp of Nvidia’s Blackwell chip later in the year, full utilization of its deployed Hopper GPUs, increased shipments from AMD, and custom silicon ramps from Big Tech.
Morgan Stanley also projects generative AI power demand may exceed 2022’s data center power usage by 2027 if GPU utilization rates are high, at ~90% on average; however, their base case still calls for a nearly 5x increase in power demand over the next three years.
Morgan Stanley calls for a nearly 5x increase in generative AI power demand over the next three years in their base case scenario. Source: I/O Fund
Wells Fargo is projecting AI power demand to surge 550% by 2026, from 8 TWh in 2024 to 52 TWh, before rising another 1,150% to 652 TWh by 2030. This is a remarkable 8,050% growth from their 2024 projected level. AI training is expected to drive the bulk of this demand, at 40 TWh in 2026 and 402 TWh by 2030, with inference’s power demand accelerating at the end of the decade. In this model, the 652 TWh projection is more than 16% of the current total electricity demand in the US.
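The growth rates in the Wells Fargo projection can be verified directly from the TWh figures quoted above. The sketch below recomputes each stage, the share attributed to training, and the implied compound annual growth rate from 2024 to 2030; the annualized rate is our own derived figure, not one from the report, and all names are hypothetical.

```python
# Projected AI power demand in TWh, as quoted in the article.
AI_DEMAND_TWH = {2024: 8, 2026: 52, 2030: 652}
TRAINING_TWH = {2026: 40, 2030: 402}


def pct_increase(old: float, new: float) -> float:
    """Return the percentage increase going from old to new."""
    return (new - old) / old * 100.0


def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate as a fraction (1.0 means doubling yearly)."""
    return (end / start) ** (1.0 / years) - 1.0


def training_share(year: int) -> float:
    """Training's share of total projected AI demand, in percent."""
    return TRAINING_TWH[year] / AI_DEMAND_TWH[year] * 100.0


if __name__ == "__main__":
    print(f"2024 -> 2026: +{pct_increase(8, 52):.0f}%")
    print(f"2026 -> 2030: +{pct_increase(52, 652):.0f}%")
    print(f"2024 -> 2030: +{pct_increase(8, 652):.0f}%")
    print(f"Training share 2026: {training_share(2026):.1f}%")
    print(f"Training share 2030: {training_share(2030):.1f}%")
    print(f"Implied CAGR 2024-2030: {cagr(8, 652, 6) * 100:.0f}%")
```

The stage-by-stage figures line up with the 550%, roughly 1,150%, and 8,050% quoted, and the implied annual rate works out to slightly more than a doubling every year over the six-year span.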
The Electric Power Research Institute forecasts that data centers may see their electricity consumption more than double by 2030, reaching 9% of total electricity demand in the US. The IEA is projecting global electricity demand from AI, data centers and crypto to rise to 800 TWh in 2026 in its base case scenario, a nearly 75% increase from 460 TWh in 2022. The agency’s high case scenario calls for demand to more than double to 1,050 TWh.
Arm’s executives also see data center demand rising significantly: CEO Rene Haas said that without improvements in efficiency, "by the end of the decade, AI data centers could consume as much as 20% to 25% of U.S. power requirements. Today that’s probably 4% or less." CMO Ami Badani reiterated Haas’ view that data centers could account for 25% of US power consumption by 2030 based on surging demand for AI chatbots and AI training.
How the Supply Chain is Addressing Power Requirements
Taiwan Semiconductor (TSMC) is an example of a supply chain company that plays a crucial role here. Its most advanced nodes tout lower power consumption and increased performance, which is why AI accelerators will soon shift from primarily being produced on the 5nm node to the 3nm node, and eventually to 2nm.
Here’s what we said previously in our free newsletter about TSMC:
“At the foundry level, the 3nm process offers 15% better performance than the 5nm process when power level and transistors are equal. TSMC also states the 3nm process can lower power consumption by as much as 30%. The die sizes are also an estimated 42% smaller than the 5nm. …
N3E is the baseline for IP design with 18% increased performance and 34% power reduction, N3P has higher performance and lower power consumption, whereas the N3X will offer high-performance computing with very high performance but with up to 250% power leakage.
The 2nm will be the first node to use gate-all-around field-effect transistors (GAAFETs), which will increase chip density. The GAA nanosheet transistors have channels surrounded by gates on all sides to reduce leakage, yet will also uniquely widen the channels to provide a performance boost. There will be another option to narrow the channels to optimize power cost. The goal is to increase the performance-per-watt to enable higher levels of output and efficiency. The N2 node is expected to be faster while requiring less power with an increase of performance by 10%-15% and lower power consumption of 25%-30%.”
CEO C.C. Wei noted in Q1’s call that TSMC’s “customers are working with TSMC for the next node. Even for the next, next node, they have to move fast because, as I said, the power consumption has to be considered in the AI data center. So the energy-efficient is fairly important. So our 3-nanometer is much better than the 5-nanometer. And again, it will be improved in the 2-nanometer. So all I can say is all my customers are working on this kind of a trend from 4-nanometer to 3 to 2.”
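To get a feel for what successive node shrinks could mean for power, the quoted reductions can be compounded. This is a rough sketch under assumptions the quotes do not confirm: that both reductions are measured at equal performance, and that the N2 figure is relative to a 3nm-class node. The function name is hypothetical.

```python
def remaining_power_fraction(reductions: list[float]) -> float:
    """Fraction of the original power left after applying each
    fractional reduction in sequence (e.g. 0.30 means 30% lower)."""
    remaining = 1.0
    for reduction in reductions:
        remaining *= 1.0 - reduction
    return remaining


if __name__ == "__main__":
    # Quoted figures: 3nm up to 30% lower power than 5nm; N2 a further 25%-30%.
    low = remaining_power_fraction([0.30, 0.25])
    high = remaining_power_fraction([0.30, 0.30])
    print(f"Power vs. 5nm after moving to 2nm: {high:.1%} to {low:.1%}")
```

Under those assumptions, the same work on 2nm would need roughly half the power it needs on 5nm, which illustrates why node transitions are central to the efficiency argument even as per-chip power budgets keep growing.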
The power problem is being addressed throughout the supply chain, from TSMC’s process technology to renewable energy power agreements for Big Tech’s data centers. It will likely require the industry to move in tandem, given the sheer pace of GPU upgrades from Nvidia, soon AMD, and possibly Intel.
Conclusion
AI power demand is forecast to rise at a rapid rate. GPU demand is showing no signs of slowing as Big Tech continues to spend billions on AI infrastructure, with each GPU generation seeing higher peak power consumption. The industry is quickly taking steps to address this, and power consumption, or more specifically, power efficiency per chip, looks to be emerging as the third realm of competition.
We’ve covered the first two realms of competition, raw computing power and memory, extensively in previous analysis, including “Here’s Why Nvidia will Reach $10 Trillion in Market Cap.” We think it’s important to keep a keen eye on this space, as new winners will emerge as AI power consumption becomes mission-critical.