What can a nineteenth-century economist teach a twenty-first-century Chief Executive Officer (CEO) at one of the largest technology firms? More to the point, why is Microsoft’s CEO, Satya Nadella, quoting William Stanley Jevons? Jevons was a dominant figure in British economic thought during the second half of the nineteenth century, but he is rarely invoked today.
His work populates the early chapters of textbooks on the history of economic thought, where he is credited as a pioneer of the Neoclassical School of Economics.
Yet, on January 27, Nadella posted a statement on social media: “Jevons paradox strikes again! As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can’t get enough of.”
Why is Microsoft’s CEO posting this statement, and why at that moment? While he surely warmed the heart of whoever taught him economics as a student, that could not have been his primary purpose.
Indeed, why? This column provides an answer. Understood in context, Nadella is making a profound economic remark about the Chinese firm DeepSeek and the future of Artificial Intelligence (AI). To be sure, the profound parts are not obvious. This will take some explaining.
Running out of Coal
There are many parallels between Jevons’ era and today’s. Today, there is a public debate about the short supply of Graphics Processing Units (GPUs), the high-performance chips that are the preferred platform for executing the frontier algorithms used for estimation. In Jevons’ lifetime, public debate focused on a potential shortage of coal, the fuel for steam engines. Just as today, the debate featured technological optimists and economists, and they came to different conclusions despite starting from the same facts.
More precisely, Britain had a lot of steam engines and abundant coal, but by the latter half of the nineteenth century, many observers worried that Britain would soon lack enough coal to fuel the fires of all its steam engines. Ironically, in retrospect, the technological optimists did not forecast how technological improvements would actually change the situation, namely, that large-scale oil refining and the invention of internal combustion engines would alter society’s energy usage. Instead, they argued that more efficient steam engines would save on coal use.
Restated in modern terms, the rate of technological improvement in the previous half-century had been miraculous, with gains in steam engine efficiency that look equivalent to today’s Moore’s Law. These optimists projected that steam engine technology would continue to improve at the same rate. Like analysts today who forecast by drawing lines on a chart into future decades, the optimists forecasted another half-century of rapid gains in efficiency and argued this would alleviate any coal shortage.
While the optimistic argument sounds plausible, Jevons pointed out the flaw in its economic reasoning. A more efficient use of coal would lower the cost of producing a wheel turn from an engine. As the cost per wheel turn declined, industrial firms would find more uses for a turning wheel, limited only by their imagination for additional applications. More applications would, in turn, lead to more demand for engines, and thus for more coal to fuel them.
In other words, under any scenario, increasing the use of engines would work against the efficiency gains, producing not nearly as significant a decline in coal demand as the optimists forecast. Jevons went further and made a second point using the same economic reasoning. Namely, the demand for steam engines could plausibly grow so much that it would result in more coal use overall. In that case, more efficiency would not save any coal at all.
Let’s rephrase Jevons’ insight to heighten its generality. Jevons argued that a more efficient input technology reduces the per-unit cost of output. That translates into lower prices, generating more demand for the final output. There is no reason why the decline in input demand from greater efficiency must be larger than the increase in input demand from greater output demand. Net input usage could plausibly go up or down.
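To make that logic concrete, here is a minimal toy calculation. It is a sketch of my own, not anything Jevons or Nadella wrote, and it assumes a constant-elasticity demand curve for the final output plus full pass-through of input-cost savings into prices; the numbers are purely hypothetical. The only point is that the direction of the net change in input usage depends on the price elasticity of demand.

```python
# Toy illustration of the Jevons-paradox logic.
# Assumptions (hypothetical, not from the column): constant-elasticity demand for
# the final output, and full pass-through of per-unit cost savings into prices.

def net_input_usage(efficiency_gain: float, demand_elasticity: float,
                    baseline_output: float = 100.0,
                    input_per_output: float = 1.0) -> float:
    """Total input usage after an efficiency gain.

    efficiency_gain: fraction by which input needed per unit of output falls
                     (0.30 means 30% less input per unit of output).
    demand_elasticity: price elasticity of demand for the final output.
    """
    new_input_per_output = input_per_output * (1 - efficiency_gain)
    # With full pass-through, the output price falls by the same fraction.
    price_ratio = 1 - efficiency_gain
    # Constant-elasticity demand: quantity scales as price_ratio ** (-elasticity).
    new_output = baseline_output * price_ratio ** (-demand_elasticity)
    return new_input_per_output * new_output

baseline_usage = 100.0  # input usage before the efficiency gain
for elasticity in (0.5, 1.0, 2.0):
    after = net_input_usage(efficiency_gain=0.30, demand_elasticity=elasticity)
    print(f"elasticity {elasticity}: input usage {baseline_usage:.0f} -> {after:.1f}")
# Prints roughly 83.7, 100.0, and 142.9.
# Inelastic demand (0.5): usage falls. Elastic demand (2.0): usage rises.
# The "paradox" is simply the elastic case.
```

In the elastic case, the 30% efficiency gain more than pays for itself in new demand, so total input usage rises. That is the scenario Jevons warned about for coal, and the one Nadella invokes for GPUs.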
As an aside, please forgive a clarifying but pedantic semantic observation. Though “paradox” is a memorable label for Jevons’ insight, it is misleading. A paradox is defined as a seemingly self-contradictory statement. Jevons did not reason his way to a paradoxical insight. It is more accurately labeled a counterintuitive conclusion.
AI and Coal
Now for the metaphorical payoff. What do coal usage and efficient steam engines have to do with AI?
Nadella wants his readers to use the same economic reasoning but replace the nouns. Replace “coal” with “GPUs.” Replace “execute more wheel turns” with “execute more algorithms.” Replace “more efficient use of coal in steam engines” with “more efficient use of GPUs in data centers.”
In other words, more efficient use of GPUs will lead to more AI algorithms being executed, but that does not necessarily mean it will lead to less demand for GPUs. It could lead to more.
That substitution suggests some general lessons. If a clever firm such as DeepSeek invented more efficient AI software, would it lead to less use of GPUs? What if a dozen firms show up next year with imitations of DeepSeek’s clever approaches? Would that lead to less use of GPUs? Think a bit before answering.
Taking a lesson from Jevons, do not presume demand for GPUs will decline with advances in efficiency. More generally, do not be too quick to conclude that more efficient execution of estimation algorithms on GPUs will lead to lower GPU demand in the future.
Seen in context, Nadella was also arguing something subtle. By alluding to Jevons’ paradox, he wants his audience to stop predicting with certainty that GPU usage will decline significantly due to innovations by firms such as DeepSeek and all the imitators that will soon follow. Even if GPU usage declines, it should plausibly fall only a little, and it might not fall at all. More to the point, the value of GPUs might still be high.
What is going on?
Jevons’ paradox tells us not to overreact to efficiency gains. That is what motivated Nadella’s post. Why did he post it on January 27?
On January 20, DeepSeek revealed its second Large Language Model (LLM), R1, after revealing another called V3 in the prior month. V3 and R1 both met some audacious benchmarks. Moreover, DeepSeek claimed to have built both at remarkably low cost. By the time of Nadella’s post, DeepSeek’s achievement had begun morphing into a Rashomon example, in which every observer could view another angle and comment on another aspect of the AI market. Panic had set in across financial markets.
Against that cacophony of comments, Microsoft’s CEO wanted events interpreted in favor of his firm. Microsoft has invested tens of billions of dollars into frontier infrastructure to support LLMs, both for itself and as part of a partnership with OpenAI. Yet, DeepSeek’s achievement led many to panic that these investments no longer had as much value.
There were at least two related but separate panics. The first panic concerned the merits of the announcements from Microsoft, Google, Meta, Apple, Amazon, Oracle, and other large firms, not to mention the Venture Capital community. All have spent vast sums on data center infrastructure, and all plan to spend more. Should the assessment of the value of these plans be revised downward?
The second panic concerned the financial valuations of the firms that supply frontier AI infrastructure for data centers, principally the silicon stack. Should the forecasts for sales of Nvidia, TSMC, ARM, and ASML be revised downward?
DeepSeek’s achievements upended many market analysts’ assumptions by showing that more efficient use of hardware was possible when estimating an algorithm that imitates a frontier algorithm. Nadella wanted them to rethink their logic. More efficiency does not necessarily lead to fewer GPU sales.
Conclusion
I am always surprised by panic among experienced technology market analysts. They have lived through massive changes in the last half-century, brought about by the revolutions linked to the deployment of the PC, Local Area Networks, Internet, Web, and smartphones.
By now, experienced analysts should know better than to panic, no matter the news. They should keep those forecast wheels turning, going full steam ahead.
As it happened, since Nadella’s post, a few additional factors have come into focus. These addenda further reduced the panic. Let’s enumerate three.
First, GPUs are helpful for executing algorithms for estimation, but they are also essential for inference. If estimation becomes more efficient, then more uses will be found for the algorithms, and more GPUs will be deployed for inference. If anything, that increases demand for GPUs for inference, even if not for estimation.
Second, DeepSeek allowed users to see the algorithm’s reasoning, an appealing and innovative product design feature. That feature is easy to imitate and should generate many variants among the established players. New features should raise the appeal of these algorithms in the future.
Third, DeepSeek delivers censored results, reflecting the Chinese government’s preferences for an information product. Many users do not care, but some have tried DeepSeek, experienced the results, and declared they are uninterested in using such a censored product. That will dampen demand outside of China, blunting some of the anticipated competition with the established players.
More to the point, concerns about innovation and economics are at least a century and a half old. Costs always come down eventually, albeit unpredictably. Sometimes, innovation arrives slowly, and sometimes in bunches. Occasionally, somebody comes along and makes massive leaps in achievements. The timing cannot be gamed.
A voice from the past could teach the present to look beyond most short-run events and recognize additional long-run effects linked to economic factors. That is the essential lesson of the Jevons paradox.
Copyright held by IEEE Micro
April 2025