Dario Amodei

description: CEO and co-founder of Anthropic, Ph.D. Princeton University 2011

The Alignment Problem: Machine Learning and Human Values
by Brian Christian
Published 5 Oct 2020

When US Supreme Court Chief Justice John Roberts visits Rensselaer Polytechnic Institute later that year, he’s asked by university president Shirley Ann Jackson, “Can you foresee a day when smart machines—driven with artificial intelligences—will assist with courtroom fact-finding or, more controversially, even judicial decision-making?” “It’s a day that’s here,” he says.8 That same fall, Dario Amodei is in Barcelona to attend the Neural Information Processing Systems conference (“NeurIPS,” for short): the biggest annual event in the AI community, having ballooned from several hundred attendees in the 2000s to more than thirteen thousand today. (The organizers note that if the conference continues to grow at the pace of the last ten years, by the year 2035 the entire human population will be in attendance.)9 But at this particular moment, Amodei’s mind isn’t on “scan order in Gibbs sampling,” or “regularizing Rademacher observation losses,” or “minimizing regret on reflexive Banach spaces,” or, for that matter, on Tolga Bolukbasi’s spotlight presentation, some rooms away, about gender bias in word2vec.10 He’s staring at a boat, and the boat is on fire.

“At the time, I was thinking about value alignment,” says Leike, “and how we could do that. It seemed like a lot of the problem would have to do with ‘How do you learn the reward function?’ And so I reached out to Paul and Dario, because I knew they were thinking about similar things.” Paul Christiano and Dario Amodei, halfway around the world at OpenAI in San Francisco, were interested. More than interested, in fact. Christiano had just joined, and was looking for a juicy first project. He started to settle on the idea of reinforcement learning under more minimal supervision—not constant updates about the score fifteen times a second but something more periodic, like a supervisor checking in.

Four Battlegrounds
by Paul Scharre
Published 18 Jan 2023

The Singularity Is Nearer: When We Merge with AI
by Ray Kurzweil
Published 25 Jun 2024

pages: 288 words: 86,995

Rule of the Robots: How Artificial Intelligence Will Transform Everything
by Martin Ford
Published 13 Sep 2021

pages: 194 words: 57,434

The Age of AI: And Our Human Future
by Henry A Kissinger , Eric Schmidt and Daniel Huttenlocher
Published 2 Nov 2021

Mustafa Suleyman, Jack Clark, Craig Mundie, and Maithra Raghu provided indispensable feedback on the entire manuscript, informed by their experiences as innovators, researchers, developers, and educators. Robert Work and Yll Bajraktari of the National Security Commission on Artificial Intelligence (NSCAI) commented on drafts of the security chapter with their characteristic commitment to the responsible defense of the national interest. Demis Hassabis, Dario Amodei, James J. Collins, and Regina Barzilay explained their work—and its profound implications—to us. Eric Lander, Sam Altman, Reid Hoffman, Jonathan Rosenberg, Samantha Power, Jared Cohen, James Manyika, Fareed Zakaria, Jason Bent, and Michelle Ritter provided additional feedback that made the manuscript more accurate and, we hope, more relevant to readers.

pages: 566 words: 169,013

Nexus: A Brief History of Information Networks From the Stone Age to AI
by Yuval Noah Harari
Published 9 Sep 2024

Bostrom’s thought experiment highlights a second reason why the alignment problem is more urgent in the case of computers. Because they are inorganic entities, they are likely to adopt strategies that would never occur to any human and that we are therefore ill-equipped to foresee and forestall. Here’s one example: In 2016, Dario Amodei was working on a project called Universe, trying to develop a general-purpose AI that could play hundreds of different computer games. The AI competed well in various car races, so Amodei next tried it on a boat race. Inexplicably, the AI steered its boat right into a harbor and then sailed in endless circles in and out of the harbor.

The game rewarded players with a lot of points for getting ahead of other boats—as in the car races—but it also rewarded them with a few points whenever they replenished their power by docking into a harbor. The AI discovered that if instead of trying to outsail the other boats, it simply went in circles in and out of the harbor, it could accumulate more points far faster. Apparently, none of the game’s human developers—nor Dario Amodei—noticed this loophole. The AI was doing exactly what the game was rewarding it to do—even though it is not what the humans were hoping for. That’s the essence of the alignment problem: rewarding A while hoping for B.27 If we want computers to maximize social benefits, it’s a bad idea to reward them for maximizing user engagement.

pages: 306 words: 82,909

A Hacker's Mind: How the Powerful Bend Society's Rules, and How to Bend Them Back
by Bruce Schneier
Published 7 Feb 2023

pages: 336 words: 93,672

The Future of the Brain: Essays by the World's Leading Neuroscientists
by Gary Marcus and Jeremy Freeman
Published 1 Nov 2014

pages: 447 words: 111,991

Exponential: How Accelerating Technology Is Leaving Us Behind and What to Do About It
by Azeem Azhar
Published 6 Sep 2021

The Equality Machine: Harnessing Digital Technology for a Brighter, More Inclusive Future
by Orly Lobel
Published 17 Oct 2022

pages: 688 words: 147,571

Robot Rules: Regulating Artificial Intelligence
by Jacob Turner
Published 29 Oct 2018

