DeepSeek Creates Buying Opportunity for Nvidia Stock
January 31, 2025
Beth Kindig
Lead Tech Analyst
Monday’s DeepSeek-driven rout of AI stocks and Nvidia, which at one point during the day had erased more than $600 billion of Nvidia’s market value, boiled down to one mission-critical question – did the model’s release fundamentally rewrite the AI capex story?
Referring to DeepSeek as AI’s “Sputnik” moment is catchy for headlines, yet the most important takeaway for investors from this analogy is that Sputnik spurred massive investments. It was not the final destination; rather, it was the beginning of a multi-decade space race. The Sputnik satellite cost $15 to $20 million, or roughly $33 million adjusted for inflation, yet the United States would spend an estimated $1 trillion over a sixty-year period in response. As you can see, the United States takes staying in the lead quite seriously.
The sheer noise around DeepSeek, followed by the 17% drop in the world’s most valuable company, is enough to make any investor panic. Humans are driven by sentiment, and for this reason, it has been my firm’s stance since we launched in 2019 that every tech investor should use a combination of technicals and fundamentals to beat the Street.
By now, I probably do not have to remind you that my firm has the world’s best track record on Nvidia’s stock. But what I would like to remind my readers is that we are determined to continue having the best track record on this stock. This includes telling you when we are not buying, such as when the stock was trading in the $140s, and telling you when we plan to buy, which included an entry at $126 and could include an entry as low as $90 to $100. You will find dozens (perhaps hundreds) of articles on DeepSeek at this point; yet you will be hard pressed to find one other person helping investors navigate Nvidia’s stock at this granular level.
Below, I provide evidence that DeepSeek is not the black swan that killed Nvidia overnight – in fact, driving down the costs of AI development has always been the plan. To illustrate, Nvidia launched a $3,000 supercomputer dubbed Project Digits at CES three weeks ago to help lower the costs of AI development for researchers and scientists.
I also touch on what investors should keep an eye on price-wise moving forward for the GPU juggernaut. As always, I hope you find the information below noise-free and helpful.
DeepSeek’s DualPipe Algorithm
The market’s reaction can be distilled into two major concerns – is Big Tech spending too much to train models, and separately, has the era of AI software arrived such that it will now commoditize hardware? We will start with the second question: whether AI software has officially begun to commoditize hardware.
DeepSeek’s DualPipe algorithm optimized pipeline parallelism, which essentially reduces inefficiencies in how GPU nodes communicate and how mixture of experts (MoE) is leveraged. MoE refers to distributing a computational load across “multiple experts” (or neural networks) to train across thousands of GPUs using what is called model and pipeline parallelism. This enables more compute-efficient training, yet the parameters still need to be loaded in RAM, so the memory requirements remain high.
By allowing the routing of tokens to experts and the aggregation of results to be handled in parallel, with key kernels written in PTX (Parallel Thread Execution), Nvidia’s low-level GPU instruction format, DualPipe helped drive down costs. The software essentially optimized the hardware. The company also capped cross-node routing at a maximum of four nodes per token to reduce traffic, allowing for a more efficient communication framework.
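To make the routing mechanics concrete, below is a minimal top-2 mixture-of-experts layer in PyTorch. This is an illustrative sketch of expert routing in general – not DeepSeek’s DualPipe scheduling or its PTX-level kernels – and the module name, expert sizes and top-k value are hypothetical.

```python
# Minimal top-2 mixture-of-experts (MoE) routing sketch, for illustration only.
# Each token is scored by a gating network, dispatched to its top-k experts,
# and the expert outputs are recombined weighted by the gate scores.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):                               # hypothetical toy module
    def __init__(self, d_model=64, n_experts=8, k=2):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts)       # routing scores per token
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])
        self.k = k

    def forward(self, x):                               # x: [num_tokens, d_model]
        gate_probs = F.softmax(self.gate(x), dim=-1)    # [num_tokens, n_experts]
        topk_probs, topk_idx = gate_probs.topk(self.k, dim=-1)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            token_ids, slot = (topk_idx == e).nonzero(as_tuple=True)
            if token_ids.numel() == 0:                  # expert received no tokens
                continue
            weight = topk_probs[token_ids, slot].unsqueeze(-1)
            out[token_ids] += weight * expert(x[token_ids])
        return out

tokens = torch.randn(16, 64)                            # 16 tokens, d_model = 64
print(TinyMoE()(tokens).shape)                          # torch.Size([16, 64])
```

Only a fraction of the experts run per token, which is why MoE models can be trained and served with far less compute per token than a dense model of the same total parameter count.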
Tom’s Hardware wrote a prescient article on this topic a month ago, called quite literally: “Chinese AI company says breakthroughs enabled creating a leading-edge AI model with 11X less compute — DeepSeek's optimizations could highlight limits of US sanctions.” The article stated: “The DualPipe algorithm minimized training bottlenecks, particularly for the cross-node expert parallelism required by the MoE architecture, and this optimization allowed the cluster to process 14.8 trillion tokens during pre-training with near-zero communication overhead.”
This may seem like a moment where AI software is triumphant, yet we are at the end of the Hopper generation, with the H100 (and H800) GPUs having been available for two years now. Two years is an eternity in the AI arms race, and the fact that Hopper may be commoditized at the very moment Blackwell is shipping is not a shocking new problem – rather, it’s the point of keeping a fast-paced product road map.
Here are the differences between the H100 and GB200 NVL72 systems on mixture of experts (MoE) throughput (for inference) and training speeds. Per Nvidia’s Computex keynote, from Pascal to Blackwell, its AI systems will deliver a “1,000 times increase in AI compute” while simultaneously decreasing the “energy per token by 45,000X.” Therefore, the market is a bit confused to think the 11X compute-efficiency gain from software optimizations is going to catch Nvidia off guard.
DeepSeek acknowledged that the limitations it faced in deploying the model are “expected to be naturally addressed with the development of more advanced hardware.” Note, they are not saying with the development of more advanced software.
I made the point nearly a year ago that Nvidia is competing with Nvidia with its one-year product release road map stating: “The product road map is the single most important thing investors should be focused on. A good chunk of the AI accelerator story is understood at this point. What is not understood is how aggressive Nvidia is becoming by speeding up to a one-year release cycle for its next generation of GPUs instead of a two-year release cycle."
DeepSeek used “cold-start data” to jump-start the pre-training process, meaning it used data from prior models rather than needing a new, proprietary large data set. This shifts reliance away from large data sets for training models and toward compute, for features such as two-model training processes that use reasoning traces.
In addition, by open sourcing the model, more developers can build new AI capabilities. As stated in a Predibase analysis, there were 500 derivative models of DeepSeek created in a few days’ time. Nvidia was early to this eventual outcome with the launch of Project Digits, a $3,000 supercomputer that can run 200B-parameter models. The readthrough is that Nvidia seeks the proliferation of its GPUs – much like Apple’s iPhone – whereas companies like OpenAI are the ones most challenged by an open source LLM whose input and output token costs are 27X less expensive than OpenAI’s o1 model.
Blackwell Inches the United States Toward Artificial General Intelligence (AGI)
The reason that software has not officially begun to commoditize hardware, and we could be as much as 5-10 years away from this moment, is that AI development is incredibly nascent. Blackwell and future generations of GPUs are a necessity for AI development to inch closer to the start of artificial general intelligence (AGI).
There have been discussions questioning if it is possible to reach AGI with reinforcement learning: “artificial general intelligence can be achieved if an agent tries to maximize a reward in a complex environment because the complexity of the environment will force the agent to learn complex abilities like; social intelligence, language, etc.”
Reinforcement learning is an ML method where an agent or model learns to make decisions through interactions with its environment via rewards or punishments. The agent interacts with the environment, receives a positive or negative reward, and adjusts its decisions and actions based on the feedback it has received.
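As a simple illustration of that feedback loop – not DeepSeek’s R1 training pipeline, which applies reinforcement learning to reasoning tasks at a vastly larger scale – here is a tabular Q-learning sketch on a made-up five-cell world. The environment, reward values and hyperparameters are hypothetical.

```python
# Minimal reinforcement-learning loop (tabular Q-learning on a toy 1-D world).
# Illustrative only: the agent acts, receives a positive or negative reward,
# and updates its value estimates based on that feedback.
import random

n_states, n_actions = 5, 2             # positions 0..4; actions: 0 = left, 1 = right
q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, epsilon = 0.1, 0.9, 0.1  # learning rate, discount, exploration rate

for episode in range(500):
    state = 0
    while state != n_states - 1:       # reaching the rightmost cell ends the episode
        # epsilon-greedy action selection
        if random.random() < epsilon:
            action = random.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: q[state][a])
        next_state = max(0, min(n_states - 1, state + (1 if action == 1 else -1)))
        reward = 1.0 if next_state == n_states - 1 else -0.01   # positive/negative feedback
        # adjust the estimate for (state, action) based on the reward received
        best_next = max(q[next_state])
        q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
        state = next_state

print(q)   # learned values favor moving right, toward the reward
```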
AGI refers to the creation of a machine that is capable of performing intellectual tasks on par with humans, with the ability to understand, learn and apply knowledge across a wide range of domains. Both RL and AGI involve learning from interactions with the environment, though RL is typically focused on specific tasks or environments, whereas AGI aims to be ‘all-encompassing.’
If software efficiencies from China are comparable to Sputnik, then the arrival of AGI will be the moment we land on the moon. AGI requires an order of magnitude larger models – a minimum of 1 trillion parameters, up to 10 trillion or more. Reinforcement learning is certainly a step in the right direction, yet trillion-plus parameter models are inevitable. It’ll require Nvidia and other AI accelerators to get there.
It took only two years from the advent of Nvidia’s Hopper GPUs for China to pass the US, though US firms are likely already building on Blackwell and will soon surpass any Hopper-based breakthroughs.
Mixed Floating-Point Precision and Multi-Head Latent Attention Lower Memory Usage
DeepSeek’s success is also found in lowering memory usage, with multi-head latent attention (MLA) reducing memory consumption to roughly 5% to 13% of what standard multi-head attention requires. As pointed out by ML engineer Zain ul Abideen, “MLA achieves superior performance than MHA, as well as significantly reduces KV-cache boosting inference efficiency.” This ultimately reduces memory requirements during inference when processing long sequences of text.
It’s important to remember that Nvidia’s compute machines are expensive, yet within these systems, memory is increasing in dollar content even faster than compute. Memory is an expensive component and Hopper is known for its memory limitations; therefore, reducing memory usage is one path to making Hopper GPUs more efficient. It has been estimated that HBM3e’s component costs in Hopper GPUs could be as much as 25% higher than HBM3-equipped GPUs, with HBM4 adding more costs due to its technical complexities.
DeepSeek’s success primarily stems from its pioneering approach to model architecture. The company introduced a novel MLA (multi-head latent attention) method that lowers memory usage to just 5–13% of what the more common MHA architecture consumes.
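To see why a smaller attention cache matters, here is a back-of-the-envelope KV-cache sizing sketch. The layer count, hidden size, sequence length and compressed-latent width below are illustrative assumptions, not DeepSeek’s published architecture; the point is simply that shrinking the per-token cache entry is what collapses inference memory.

```python
# Rough KV-cache sizing: standard multi-head attention (MHA) caches a full key
# and value vector per layer per token, while a latent-attention scheme caches
# a much smaller compressed latent per token. All shapes below are hypothetical.
def kv_cache_gb(n_layers, seq_len, floats_per_token, bytes_per_float=2):
    return n_layers * seq_len * floats_per_token * bytes_per_float / 1e9

n_layers, seq_len, d_model = 60, 32_768, 8_192   # assumed model shape
mha_per_token = 2 * d_model                      # full K and V vectors per layer
latent_per_token = 1_536                         # assumed compressed latent width

mha_gb = kv_cache_gb(n_layers, seq_len, mha_per_token)
mla_gb = kv_cache_gb(n_layers, seq_len, latent_per_token)
print(f"MHA cache:    {mha_gb:.1f} GB")          # ~64.4 GB
print(f"Latent cache: {mla_gb:.1f} GB")          # ~6.0 GB
print(f"Latent cache is {100 * mla_gb / mha_gb:.1f}% of the MHA cache")
```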
Nvidia’s hardware excellence stands out in the Hopper generation of GPUs with the Transformer Engine. Two years ago, Hopper’s Transformer Engine helped bring about ChatGPT’s big moment, as transformer models like OpenAI’s no longer need hand-labeled datasets to find patterns between elements, which opens up which datasets can be used and how quickly they can be trained on.
The H100s also leverage the Transformer Engine to switch between floating-point precisions, such as FP8, FP16 or FP32, depending on the workload. Nvidia architected the ability to switch between precisions in order to require less memory. Here is what Nvidia states:
“There are numerous benefits to using numerical formats with lower precision than 32-bit floating point. First, they require less memory, enabling the training and deployment of larger neural networks. Second, they require less memory bandwidth which speeds up data transfer operations. Third, math operations run much faster in reduced precision, especially on GPUs with Tensor Core support for that precision. Mixed precision training achieves all these benefits while ensuring that no task-specific accuracy is lost compared to full precision training. It does so by identifying the steps that require full precision and using 32-bit floating point for only those steps while using 16-bit floating point everywhere else.”
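As a hedged illustration of that recipe, here is a minimal mixed-precision training step using PyTorch’s generic autocast and loss scaling (FP16/BF16). This shows the general pattern Nvidia describes above, not DeepSeek’s custom FP8 pipeline, which relies on specialized FP8 kernels; the toy model, data and hyperparameters are made up.

```python
# Minimal mixed-precision training step: forward/backward math runs in reduced
# precision where safe, while master weights and the optimizer step stay in FP32.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.bfloat16

model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 10)).to(device)
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))  # loss scaling for FP16

x = torch.randn(32, 256, device=device)
y = torch.randint(0, 10, (32,), device=device)

for step in range(3):
    opt.zero_grad()
    with torch.autocast(device_type=device, dtype=dtype):       # reduced-precision compute
        loss = nn.functional.cross_entropy(model(x), y)
    scaler.scale(loss).backward()                                # scaled gradients
    scaler.step(opt)                                             # FP32 optimizer update
    scaler.update()
    print(step, loss.item())
```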
DeepSeek says that FP8 allowed it to “achieve both accelerated training and reduced GPU memory usage,” as it validated FP8’s usage for training large-scale models at a fraction of the cost. A majority of the “most compute-density operations are conducted in FP8, while a few key operations are strategically maintained in their original data formats,” such as those that require higher precision due to sensitivity.
Though lower-precision training has often been “limited by the presence of outliers in activations, weights, and gradients,” and tests have shown that FP8 training was prone to higher instability and more frequent loss spikes, it is now emerging as a solution for efficient training due to hardware advancements (i.e., Hopper bringing powerful FP8 support, Blackwell bringing FP4).
DeepSeek also provided recommendations for future chips to accommodate low-precision training and replicate this at scale, suggesting chip designs should “increase accumulation precision in Tensor Cores to support full-precision accumulation, or select an appropriate accumulation bit-width according to the accuracy requirements of training and inference algorithms.”
This is what Blackwell was designed to address, with new precisions in Tensor Cores, FP4 precision, increased SM count, and more CUDA cores than Hopper. Blackwell also packs 208 billion transistors to provide up to 20 petaflops of FP4 compute, compared to the H100’s 4 petaflops of FP8. The B200 features a second-generation Transformer Engine supporting 4-bit floating point (FP4), with the goal of doubling the performance and the size of models the memory can support while maintaining accuracy.
To simply recreate DeepSeek’s training efficiencies and develop large-scale models, Hopper GPUs are a requirement due to support for FP8, with Blackwell bringing FP4 to power real-time inference and supercharged training for trillion parameter models.
I’ll stick my neck out and say that Nvidia is not caught off guard that DeepSeek used its GPUs for mixed floating-point precision to drive down memory costs, and thus reduced the overall training time. Rather, Nvidia is likely to be quite pleased, as the impetus to cut costs in the AI community will help drive the need for FP4 in the Blackwell generation. Per my previous write-up: “B200 will have a second-generation transformer engine that supports 4-bit floating point (FP4) with the goal of doubling the performance and size of models the memory can support while maintaining accuracy.”
MoE models like DeepSeek’s can provide numerous benefits, and this is what DeepSeek is showing – an ability to train larger models at a lower cost, with much faster pre-training, faster inference and decreased first-token latency. However, MoE can also require higher VRAM to store all experts simultaneously and can face challenges in fine-tuning.
Understanding the nuances of Nvidia’s hardware is the reason that I first called out Nvidia’s AI GPU thesis and CUDA moat in late 2018, and in 2019, Volta’s AI capabilities prompted me to say on my premium stock research site: “I believe Nvidia will be one of the world’s most valuable companies by 2030.” This has led to gains of over 4,000% for our free readers.
DeepSeek is Cheap
There is no denying that DeepSeek is cheap. DeepSeek V3 was trained on 14.8 trillion tokens in pre-training, with each 1 trillion tokens taking 180K H800 GPU hours (3.7 days) on its 2,048-GPU H800 cluster. Compare this to Meta’s 405-billion-parameter Llama 3.1 model, which was trained in 54 days (30.8M GPU hours) on a 16,384 H100 GPU cluster, estimated to cost ~$80 million at a $2.60/hour rental price.
The estimated $6 million cost for DeepSeek (notably, this cost is hotly debated) would also be pennies on the dollar compared to estimated training costs for OpenAI’s GPT-4 and Alphabet’s Gemini Ultra.
Pre-training accounted for a majority of the costs at $5.328 million, assuming a $2/hour rental price for the H800s (this is about in line with long-term contract rates at Lambda for the H100, but ~20% lower than short-term costs from other startup cloud providers).
While the training costs were estimated using rental prices (and are up for debate), what’s important to note is that the ~$5.6 million total cost excluded only the “costs associated with prior research and ablation experiments on architectures, algorithms, or data.”
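For readers who want to trace the arithmetic, the figures above reduce to a few lines. The $2/hour and $2.60/hour rental rates are the assumptions already cited in this section, not audited costs.

```python
# Back-of-the-envelope check of the training-cost figures cited above.
tokens_trillions = 14.8            # DeepSeek V3 pre-training tokens
gpu_hours_per_trillion = 180_000   # H800 GPU hours per 1T tokens (as reported)
rental_rate = 2.00                 # assumed USD per H800 GPU-hour
cluster_size = 2_048               # H800 GPUs in the cluster

pretrain_hours = tokens_trillions * gpu_hours_per_trillion               # 2,664,000 GPU hours
print(f"Pre-training cost: ${pretrain_hours * rental_rate / 1e6:.3f}M")  # ~$5.328M
print(f"Days per 1T tokens: {gpu_hours_per_trillion / cluster_size / 24:.1f}")  # ~3.7 days

# Meta's Llama 3.1 405B comparison from the same section:
llama_hours, llama_rate = 30.8e6, 2.60
print(f"Llama 3.1 405B cost: ~${llama_hours * llama_rate / 1e6:.0f}M")   # ~$80M
```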
Democratization of AI Helps Nvidia
Nvidia’s goal is not only to sell GPUs to the fortresses of Big Tech. Rather, all tech companies seek large addressable markets, and the democratization of AI will assist Nvidia in reaching worldwide AI device ubiquity.
In the meantime, Blackwell is sold out, so investors have nothing to worry about in the near term. Microsoft stated this week that it is still supply-constrained in the cloud and that Azure needs more supply to grow. Meta doubled down on its planned capex for 2025 and signaled a willingness to spend “hundreds of billions” on AI infrastructure in the long run.
In the medium term, Nvidia will allocate supply to enterprises and edge AI, as lower costs will facilitate enterprise AI and edge computing. The market grew concerned that there would not be ROI on these large AI investments, and on the other hand, the market is now panicking when costs are lowered to the point that ROI can be achieved.
This has always been Nvidia’s goal, with CEO Jensen Huang and other executives declaring at high profile stages such as CES and GTC that the cost of computing will go down with each generation of GPUs. At GTC 2024, Huang explained that Nvidia “accelerated algorithms so quickly that the marginal cost of computing has declined so tremendously over the last decade that it enabled” generative AI to emerge. He further explained that reducing the cost of computing and accelerating computing is what Nvidia “does for a living at its core,” and that the “pricing that we create always starts from TCO.”
Nvidia VP Ian Buck corroborated this at BofA GTC Conference in June 2024: “The opportunity here is to help [customers] get the maximum performance through a fixed megawatt data center and at the best possible cost and optimized for cost.”
The Street has the takeaway all wrong: you want more development in AI, and Big Tech is racing to make sure its stack is chosen by startups. We are now closer to startups and enterprises being able to afford AI, and if anything, the DeepSeek release will show that spending grows in times of ‘war.’ This relates back to Sputnik – it has been the strategy of the United States to spend a lot for dominance and ask questions later.
What to Monitor for Nvidia’s Earnings Report
There seems to be no end to the number of Blackwell supply rumors, with yet another one circulating in January 2025. According to Taiwan Semiconductor’s management team: “Rick, as you said, there's a lot of rumor. That's a rumor, I assure you. We are working very hard to meet the requirement of my customers' demand. So cut the order, that won't happen. Actually continue to increase.”
Here are the additional points that cause us to believe Blackwell revenue will show up soon enough:
1) Nvidia CEO Jensen Huang at CES reconfirmed that Blackwell is in full production, adding that “every single cloud service provider now has systems up and running."
2) CFO Colette Kress in Q3’s earnings: “Blackwell demand is staggering and we are racing to scale supply to meet the incredible demand customers are placing on us. Customers are gearing up to deploy Blackwell at scale.”
3) CFO Colette Kress in Q3: “While demand is greatly exceeding supply, we are on track to exceed our previous Blackwell revenue estimate of several billion dollars as our visibility into supply continues to increase.”
4) CFO Colette Kress at UBS’ Global Technology Conference: “What we see in terms of our Blackwell, which will be here this quarter is also probably a supply constraint that is going to take us well into our next fiscal year for several quarters from now. So, no, we don't see a slowdown. We continue to see tremendous demand and interest, particularly for our new architecture that's coming out.” UBS Analyst Tim Arcuri added, “you are actually shipping more Blackwell than you thought you would three months ago.”
5) CFO Colette Kress at CES: “We are able to increase our demand and increase our revenue each quarter as well. … When we think about the demand that is in front of us, it is absolutely a growth year.”
Big Tech’s Capex
Two weeks ago, my firm covered how Big Tech capex (AI spending) came in significantly above expectations in 2024, and is on track to repeat that in 2025. To recap, analysts had initially estimated capex of ~$200 billion for Big Tech in 2024, an increase of ~30% YoY. However, our latest checks suggest that Big Tech is on track to spend at least $236 billion in capex in 2024, 18% higher than analyst estimates and representing YoY growth of more than 52%.
For 2025, Big Tech is on track to spend $300 billion or more on capex, based on initial commitments from Microsoft and Meta totaling at least $140 billion combined. This is already tracking ~7% higher than ~$280 billion estimated by analysts, with Microsoft’s $80 billion commitment 40% higher than estimated and Meta’s $60-65 billion more than 18% higher than estimated.
AI capex and ROI are likely to be front and center in Big Tech’s upcoming earnings reports, and it will be a telling sign if executives double down on these forecasts or begin to rein in spending. I believe it’s likely to be the former, as DeepSeek has highlighted one major tailwind that has been overlooked – if AI models can now be trained quicker and cheaper, it’s likely to catalyze demand for GPU instances in the cloud from leading providers, as GPUs can be rented much faster than new infrastructure can be set up. In order to meet elevated demand, hyperscalers will need to continue purchasing, or even accelerate purchasing, Nvidia’s GPUs to prevent chip constraints from impeding revenue growth.
Analyst Revisions for Nvidia’s Stock Remain Unchanged
In June 2024, in the analysis Here's Why Nvidia Stock Will Reach $10 Trillion Market Cap By 2030, I discussed the importance of intra-quarter analyst revisions supporting Nvidia’s massive run, as data center revenue continued to blow past expectations.
Seen below, Nvidia’s revenue for FY25 was estimated at $120 billion in June, being revised 31.5% higher over the six months prior. In dollar terms, that represented a nearly $29 billion increase from $91.3 billion. For FY26, revenue was estimated at $157.5 billion in June, a 44.8% increase from $108.8 billion, a nearly $50 billion increase in six months.
Compare that to today:
FY25 revisions are up just 7.6% (or $9 billion) since July to $129.2 billion as the fiscal year comes to a close, but FY26 revenue estimates are 20% higher at $196.5 billion. That’s another $39.5 billion (or over a full quarter at the current run rate) added in the past seven months.
Intra-quarter revisions will be where you will first see any potential hiccups or impacts on revenue, and as of yet, that has not materialized.
FY26 quarterly revisions have come down slightly over the past month, with Q4 seeing the largest revision at -1.3%, as the market digests numerous rumors of order cuts, delays and overheating for Blackwell. So far, there has been no impact from DeepSeek’s V3 release – in fact, estimates inched marginally higher on January 28, with Q4 FY26 revenue rising from $55.49 billion to $55.51 billion.
The bigger picture – revenue estimates are still 14% to 24% higher than they were six months ago. At this scale, that’s $6 billion to $10-billion-plus higher.
Where Nvidia’s Stock Price Goes Next
In the write-up “Where I Plan to Buy Nvidia Stock Next,” we stated that Nvidia still needs at least one more push higher to complete the current uptrend. This would make the volatility that started in June of 2024 a correction within a larger uptrend.
We still believe that NVDA’s uptrend is not over. We stand by the two scenarios outlined in the last report; both are still in play.
“Blue – The final 5th wave is playing out as an ending diagonal pattern, which is common for 5th waves. This type of pattern is a 5-wave pattern in itself that is characterized by large swings in both directions. Our target zone for the bottom on this 4th wave is $126 - $116. If Nvidia can push over $140.75, then the odds favor this scenario.
Red – Nvidia is in a much more complex 4th wave. If this is playing out, NVDA would see the $116 level break, which opens the door to a potential low at $101, $90, or $78.”
While the $116 support level still holds from our last analysis, the resistance worth monitoring is now $132. If Nvidia can break out over the $132 resistance, it will shift the odds to the more immediately bullish count in blue. This would see a push into the $170 - $190 region over the coming months.
On the other hand, if Nvidia breaks below the $116 support level, it will signal that the more immediately bearish count in red is playing out. This would see us make a meaningful low in the $102 - $83 range over the coming months.
We began executing on our current buy plan, which is to layer into Nvidia at key levels. The $126 - $116 region was our first target. If the $116 region breaks, we will then target the $102 - $83 region to complete our buying. Considering the blue count is looking for a final 5th wave higher, we will likely not chase a breakout over $132.
Conclusion
If DeepSeek’s breakthroughs are truly the key to ushering in a new paradigm of AI training and ultimately AI democratization from cost reductions, it may not be a death sentence for Nvidia; in fact, quite the opposite.
This is Jevons paradox – where the technological advancements of Hopper and Blackwell translate into significant efficiency gains and cost reductions for AI training, which in turn drive near-ubiquity for AI services – and thus GPUs in the data center, across enterprises and at the edge.
The market’s readthrough is that Big Tech has been overspending on AI. However, the I/O Fund believes that readthrough is wrong: it’s not that the United States is overspending; it’s that we will accelerate spending to stay ahead. The I/O Fund recently entered new positions in a handful of small- and mid-cap names that we believe will be beneficiaries of this AI spending war. We discuss entries, exits and what to expect from the broad market every Thursday at 4:30 p.m. in our 1-hour webinar. Learn more here.
Disclaimer: This is not financial advice. Please consult with your financial advisor in regards to any stocks you buy.