I’m using this post to track updates to my posts on ChatGPT and the environment:
Why using ChatGPT is not bad for the environment - a cheat sheet
Replies to criticisms of my posts on ChatGPT & the environment
People occasionally correct me, and I circle back and update the posts. I’ll log any updates here. If you think I’m getting anything wrong, please email me at AndyMasley@Gmail.com or comment below! Let’s figure out what’s up together.
Things I got wrong
I’ll add to this as I correct each post!
I now think GPT-4 has probably been used far less than my original rough estimate. This BOTEC implies it’s closer to 50 billion prompts in total (I originally guessed around 200 billion), which means the amortized cost of training (from 50 GWh) becomes 1 Wh/prompt, not 0.3 Wh/prompt. So instead of raising the energy cost per prompt by 10%, amortizing the cost of training raises it by 33%. That’s a significant increase, but it doesn’t really make me reconsider my use of chatbots: 3 Wh → 4 Wh is in the same order of magnitude as 3 Wh → 3.3 Wh. It’s important to note that this only counts ChatGPT, not all the other apps GPT-4 is being used for. If we include those, it seems likely the model has been used at least 200 billion times, so the right number depends on what you consider relevant for the ChatGPT point.
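If you want to check the amortization arithmetic yourself, here’s a minimal sketch. The 50 GWh training figure and the 3 Wh/prompt inference estimate come from the posts; the two prompt counts are just the rough guesses discussed above, and the exact percentages depend on rounding.

```python
# Back-of-the-envelope check of amortized training energy per prompt.
TRAINING_ENERGY_WH = 50e9          # ~50 GWh for training, in Wh
INFERENCE_WH_PER_PROMPT = 3        # rough upper-bound estimate per prompt

for total_prompts in (200e9, 50e9):  # original guess vs. revised guess
    amortized = TRAINING_ENERGY_WH / total_prompts
    total = INFERENCE_WH_PER_PROMPT + amortized
    increase = amortized / INFERENCE_WH_PER_PROMPT
    print(f"{total_prompts:.0e} prompts: +{amortized:.2f} Wh amortized "
          f"({increase:.0%} increase, {total:.1f} Wh total)")

# 2e+11 prompts: +0.25 Wh amortized (8% increase, 3.2 Wh total)
#   (the post rounds this to ~0.3 Wh, i.e. ~10%)
# 5e+10 prompts: +1.00 Wh amortized (33% increase, 4.0 Wh total)
```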
I’m much less sure of how much water ChatGPT is currently using. The water studies that exist were done on GPT-3, and I used to think it made sense to generalize them to GPT-4 and 4o. Here’s my current best guess and justification:
We’re largely in the dark on GPT-4’s water use. A single GPT-3 prompt likely consumed between 10 and 50 mL of water, if you include the water used in the data center, the water cost of training the model spread out over its lifetime, and the water used to generate the electricity. One commonly cited study implying that GPT-4 used a whole 16 oz (500 mL) water bottle per prompt is likely wrong. The GPT-3 study assumed 4 Wh per prompt and a much longer-than-average prompt (≤800 words of input and 150–300 words of output), and since we’re keeping 3 Wh as an upper bound for the average prompt, I’ll stick with the 10–50 mL range for GPT-4 as well. These numbers have large error bars, but even if we’re an order of magnitude off, they still don’t significantly add to our water footprint. If each prompt costs 10–50 mL of water, then every single day the average American uses enough water for 12,000–60,000 ChatGPT prompts.
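Here’s a quick sense-check of that last comparison. The 10–50 mL/prompt range is from the post; the ~600 L/day figure for average American water use is back-solved from the post’s 12,000–60,000 prompt range rather than taken from an independent source, so treat it as an assumption.

```python
# Rough sense-check: how many prompts' worth of water does one person use per day?
DAILY_WATER_ML = 600_000  # assumed average American daily water use (~600 L)

for ml_per_prompt in (10, 50):
    prompts = DAILY_WATER_ML / ml_per_prompt
    print(f"At {ml_per_prompt} mL/prompt: one day of water ≈ {prompts:,.0f} prompts")

# At 10 mL/prompt: one day of water ≈ 60,000 prompts
# At 50 mL/prompt: one day of water ≈ 12,000 prompts
```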
I removed a section on how thinking hard about a topic uses about as much energy as 200 ChatGPT searches per hour. The data on this seems too murky to justify the comparison.