Resource recs on AI catastrophic risk
If nothing magic is happening in the human brain, things could get weird
I’m using this post to collect resources on AI catastrophic risk that I think are especially good and informative. An obvious flag on my bias here is that I’ve been into effective altruism for 15 years, run the DC EA professional network, and talk with people working on and thinking about AI risk from an EA perspective pretty regularly.
Some popular written AI x-risk arguments are way too overconfident. The authors seem to be coasting on the social status that their confidence about a new technology gives them. Writing like this gives the impression that all arguments for AI risk rest on pie-in-the-sky inferences that grant the speaker social sanction to override the normal rules of debate, and maybe of life more broadly. I think the core AI risk case is pretty simple and convincing, to the point that I don’t really understand how people can just write it off, but I also think the case is often presented badly, and it’s understandable that a lot of people won’t consider it given the sometimes goofy behavior of some of its loudest representatives. This list is my attempt to separate the goofier takes from the better arguments for what I think is a real and serious risk.
I put the odds of an AI catastrophe in my lifetime as low, but not low enough to ignore. Anything above a 1% chance of human extinction seems high enough that most people should engage with the basic arguments.
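To make that threshold concrete, here’s a quick sketch converting a lifetime risk into an implied annual rate (a minimal illustration in Python; the 50-year horizon and constant-hazard assumption are mine, not the post’s):

```python
# Convert a lifetime extinction risk into an implied annual risk,
# assuming a constant hazard rate over the remaining lifetime.
# The 1% figure is from the post; the 50-year horizon is illustrative.
lifetime_risk = 0.01
years = 50

annual_risk = 1 - (1 - lifetime_risk) ** (1 / years)
print(f"Implied annual risk: {annual_risk:.2e}")  # ~2.01e-04
```

Even spread evenly over 50 years, a 1% lifetime risk is an annual risk of roughly 1 in 5,000.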
Even if you don’t buy the risk case, writing on AI risk often brings in enough interesting ideas about technology, history, nature, evolution, and the future to be worth reading.
I’m very aware that this is all very speculative, which is why I’ve made it a separate post. I don’t want to imply that anything I’ve written about chatbots and the environment, or about how AI works, should give me any credibility on risks from advanced AI. This is basically a completely separate field of inquiry that I’m truly just some guy in. My only authority here is that I think I have good taste in reading material, and I’ve been reading, talking, and thinking about AI risk for over a decade.
Right now this list is pretty short; I’ll be adding to it as I pull together other materials. I’m trying to be picky. I’ve listed the resources in the order I’d consume them.
Articles
On those undefeatable arguments for AI doom
I’m starting with this one because it throws cold water on a specific way of thinking about AI catastrophe: fuzzy inferences that lead to drastic overconfidence.
The 80,000 Hours profile on preventing an AI-related catastrophe
In my opinion, the single best introduction to AI catastrophic risk.
Matt Reardon’s The bone-simple case for AI x-risk
Michael Nielsen’s How to be a wise optimist about science and technology? and Notes on existential risk from artificial superintelligence
Joe Carlsmith’s Existential risks from power-seeking AI
Jacob Steinhardt’s series More is different for AI
Holden Karnofsky’s Cold Takes series, which goes back and forth between takes I think are straightforward and convincing and takes I think are off the wall and goofy goober.
Katja Grace’s Counterarguments to the Basic AI X-risk case
Classic texts
Not AI, but related: Von Neumann’s Can we survive technology?
Comments
Thanks for sharing, Andy!
I guess the risk of human extinction over the next 10 years is like 10^-7. A typical mammal species lasts about 1 million years (which suggests an extinction risk of 10^-5 over 10 years), and I think humans are much more resilient. Mammals usually go extinct due to competition from other species or climate change, and I believe both of these are way less likely to drive humans extinct. Species living in larger areas are also less likely to go extinct, and humans live all across the globe.
I am open to updating to a much higher risk than suggested by the above priors, but I would need much stronger evidence. For example, a catastrophe caused by AI killing 1 million people in 1 year, or a detailed quantitative model outputting a high risk of extinction with inputs informed as much as possible by empirical data.
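A quick sketch of the arithmetic behind these priors (the 1-million-year species lifetime is the commenter’s figure; the constant-hazard model is an assumption I’m adding to make it precise):

```python
import math

# Base rate from the comment: a typical mammal species lasts ~1 million years.
species_lifetime_years = 1_000_000
window_years = 10

# Modeling extinction as a constant-hazard process with mean lifetime L,
# the risk over a window t is 1 - exp(-t / L), approximately t / L for t << L.
exact_risk = 1 - math.exp(-window_years / species_lifetime_years)
approx_risk = window_years / species_lifetime_years

print(f"10-year extinction risk (exact):  {exact_risk:.2e}")   # ~1.00e-05
print(f"10-year extinction risk (approx): {approx_risk:.2e}")  # 1.00e-05

# The commenter then adjusts two orders of magnitude downward, to 1e-07,
# for human resilience and geographic spread.
```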
The magic is that the human body runs on 2,000-2,500 kcal of food energy per day, which works out to roughly 100-120 W of continuous power, whereas a single GPU draws 400-800 W.
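Calories measure energy and watts measure power, so the two figures only become comparable after a conversion. A quick sketch using 1 kcal = 4184 J and an 86,400-second day:

```python
# Convert a daily food-energy budget (kcal/day) into average power (watts).
KCAL_TO_JOULES = 4184
SECONDS_PER_DAY = 86_400

def kcal_per_day_to_watts(kcal_per_day: float) -> float:
    return kcal_per_day * KCAL_TO_JOULES / SECONDS_PER_DAY

for kcal in (2000, 2500):
    print(f"{kcal} kcal/day ~ {kcal_per_day_to_watts(kcal):.0f} W")
# 2000 kcal/day ~ 97 W; 2500 kcal/day ~ 121 W,
# versus roughly 400-800 W for a single modern datacenter GPU.
```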