The simple core reason I think AI is valuable
Why I don't think every watt-hour of electricity or liter of water we spend on AI is wasted
A lot of conversations about AI and the environment ultimately turn back to the question of whether AI is worth spending any resources on at all. I’m writing this to have a place to point people to for my answer. It’s pretty simple.
First, 2 definitions:
Heuristics vs rules: Rules are absolute, fixed principles that must be followed to guarantee a specific outcome. Heuristics are "rules of thumb": flexible, practical shortcuts or educated guesses used for efficient problem-solving and decision-making, though they aren't guaranteed to be optimal, perfect, or even correct.
Deep learning: The specific type of AI that’s become dominant over the past decade, where instead of humans writing out explicit rules, we build large neural networks that learn their own internal rules or heuristics from data. These models stack many layers of simple mathematical functions, allowing them to automatically extract patterns from raw inputs (like pixels, audio, or text) and make increasingly complex decisions. Basically all debates about “AI” right now are debates about deep learning specifically.
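To make these two definitions concrete, here is a minimal Python sketch. It's my own toy illustration rather than anything from a real system: the leap-year function is a rule we can fully write down, while the tiny two-layer network below it is the deep learning pattern in miniature, with made-up layer sizes and random weights standing in for the values training would learn.

```python
import numpy as np

# A rule: explicit logic a human wrote down, guaranteed to follow its definition.
def is_leap_year(year: int) -> bool:
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

# A deep learning model: stacked layers of simple math whose "rules" live in the
# weight matrices. In practice those weights are learned from data during training;
# random values stand in for them here just to show the shape of the computation.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(784, 128)), np.zeros(128)  # e.g. a 28x28 image flattened to 784 numbers
W2, b2 = rng.normal(size=(128, 10)), np.zeros(10)    # 10 possible output classes

def predict(pixels: np.ndarray) -> np.ndarray:
    hidden = np.maximum(0, pixels @ W1 + b1)  # layer 1: linear map plus a simple nonlinearity
    scores = hidden @ W2 + b2                 # layer 2: another linear map
    return scores  # training nudges W1, b1, W2, b2 until these scores match labeled examples
```

Everything the trained network "knows" ends up living in those weight matrices, which is why its heuristics are so hard to read back out as rules.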
Deep learning is a way to get a machine to build its own soup of internal heuristics for how to handle complex situations. This soup of heuristics both succeeds and fails in surprising ways. We don’t design the heuristics themselves, and don’t even know what they are or how they work, in the same way we don’t know how a lot of the heuristics in our own brains work. There’s a whole field of AI interpretability attempting to understand AI models’ internal heuristics. It’s hard!
Could you write a clear step-by-step algorithm to help a machine determine when a song feels nostalgic and when it doesn’t? Can you identify when a song feels nostalgic when you hear it, even if you don’t have the emotional reaction of feeling the nostalgia itself in the moment? If you can identify nostalgic songs, there are heuristics in your brain that you can’t articulate, and that maybe aren’t really articulable using our current language or methods of consciously thinking about the world.
Having some way to get a machine to build up similar inarticulable intuitions that lead to good results like this seems useful. The main benefit of deep learning is that before it, basically everything we used machines for had to be articulable in clear language and logic. If there were situations that we could build up tacit knowledge for how to handle, but couldn’t possibly write that knowledge down as clear instructions, machines couldn’t join us in navigating them. Now (with many bumps and failure points) they can.
A funny example of this is chicken sexing. There are no clear written rules for identifying whether a newly hatched chick is male or female; the average person looking at one would just take a wild guess. Something odd happens, though: if humans train for months, they can achieve 97% accuracy at looking at a newly hatched chick and identifying whether it’s male or female, but no one who can do this can articulate how they’re doing it, so they can’t just write an instruction manual that lets new people skip those months of training. This is a classic example of an internal soup of heuristics. Our brains develop some complex internal rules for what male vs. female newly hatched chicks look like, but these rules aren’t actually available to our conscious mind. Until deep learning, we couldn’t build machines to do this task for us, because we couldn’t articulate what we were doing when writing the machine’s code or working on its physical design.
If you train a deep learning model on enough images of chicks, it can also pick up on whatever opaque heuristics the humans are using. It can’t articulate what those heuristics are either, but whatever internal rules it develops are good enough that it can distinguish male from female chicks at 98.5% accuracy, beating out the humans.
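To give a sense of what "train a deep learning model on enough images" looks like in practice, here is a rough sketch of a training loop in PyTorch. Everything specific is an assumption for illustration: the data/chicks image folder, the choice of a pretrained ResNet-18, and the hyperparameters are placeholders, not details of how the actual chick-sexing result was produced.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models

# Hypothetical folder of labeled chick photos: data/chicks/{male,female}/*.jpg
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
dataset = datasets.ImageFolder("data/chicks", transform=transform)
loader = DataLoader(dataset, batch_size=32, shuffle=True)

# Start from a pretrained vision model and swap its final layer for a 2-class head.
model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, 2)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)  # how wrong were the guesses on this batch?
        loss.backward()                        # the "heuristics" live in the weights nudged here
        optimizer.step()
```

Whatever "male vs. female" heuristics the model ends up with are encoded in millions of adjusted weights; nobody, including the model itself, can state them as a list of rules.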
To get a sense of why and how deep learning models can develop these inarticulable heuristics, I really cannot emphasize enough that you should watch the 3Blue1Brown series on neural nets. It’s the best popularization of how deep learning works anywhere.
If a tool can develop heuristics that are inarticulable to itself and us, this implies 2 things:
The number of possible use cases for it is gigantic. Computers took off so fast because they were so uniquely useful at solving problems that had clear logical steps we could articulate to them. This type of problem is very common. Deep learning models can help us with the other type of problem humans work on: places where we can’t articulate perfect rules to follow, but we know solutions when we see them, and experienced humans can develop tacit knowledge about a problem and solve it using intuition. There are a lot of examples of this type of problem as well.
There will be a lot of weird ways these machines fail, and they can’t be completely relied on in the way we could rely on computer programs with clear step-by-step instructions we could refine and test. This is the first time we’ve been consistently using machines with complex webs of rules and heuristics that we ourselves didn’t program into them. Instead of saying ChatGPT was programmed, it probably makes more sense to say something like it was “grown” in the lab during training. The potential for failures here is high (which is why AI models hallucinate and lie more consistently than normal computer programs), but the fact that they can develop their own heuristics at all makes them valuable even with a higher chance of messing up.
This is why many uses of deep learning (including ChatGPT) seem valuable to me, even though they can make a lot of mistakes.