Jim talks with Nate Soares about the ideas in his and Eliezer Yudkowsky’s book If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All. They discuss the book’s claim that mitigating existential AI risk should be a top global priority, the idea that LLMs are grown, the opacity of deep learning networks, the Golden Gate activation vector, whether our understanding of deep learning networks might improve enough to prevent catastrophe, goodness as a narrow target, the alignment problem, the problem of pointing minds, whether LLMs are just stochastic parrots, why predicting a corpus often requires more mental machinery than creating a corpus, depth & generalization of skills, wanting as an effective strategy, goal orientation, limitations of training goal pursuit, transient limitations of current AI, protein folding and AlphaFold, the riskiness of automating alignment research, the correlation between capability and more coherent drives, why the authors anchored their argument on transformers & LLMs, the inversion of Moravec’s paradox, the geopolitical multipolar trap, making world leaders aware of the issues, a treaty to ban the race to superintelligence, the specific terms of the proposed treaty, a comparison with banning uranium enrichment, why Jim tentatively thinks this proposal is a mistake, a priesthood of the power supply, whether attention is a zero-sum game, and much more.
- Episode Transcript
- “Psyop or Insanity or …? Peter Thiel, the Antichrist, and Our Collapsing Epistemic Commons,” by Jim Rutt
- “On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback,” by Marcus Williams et al.
- “Attention Sinks and Compression Valleys in LLMs are Two Sides of the Same Coin,” by Enrique Queipo-de-Llano et al.
- JRS EP 217 – Ben Goertzel on a New Framework for AGI
- “A Tentative Draft of a Treaty, With Annotations”
Nate Soares is the President of the Machine Intelligence Research Institute. He has worked in the field of AI alignment for over a decade, following earlier roles at Microsoft and Google. Soares is the author of a large body of technical and semi-technical writing on AI alignment, including foundational work on value learning, decision theory, and power-seeking incentives in smarter-than-human AIs.