description: CEO and co-founder of Anthropic, Ph.D. Princeton University 2011
15 results
Supremacy: AI, ChatGPT, and the Race That Will Change the World
by
Parmy Olson
Its founding team of thirty researchers started working out of Brockman’s apartment in San Francisco’s Mission District, at his kitchen table or slouched on sofas with their laptops perched on their knees. A few months after the launch, they got a visit from another respected Google Brain researcher named Dario Amodei. He started asking some probing questions. What was all this about building a friendly AI and releasing its source code into the world? Altman countered that they weren’t planning to release all the source code, according to his New Yorker profile. “But what is the goal?” Amodei asked. “It’s a little vague,” Brockman admitted.
…
And this was Silicon Valley after all, where programmers joined start-ups that were always trying to make the world better—while earning a seven-figure salary and stock options that could buy them a second home in America’s most expensive real estate market. Still, not everyone was happy with the new status quo. Dario Amodei, the bespectacled, curly-haired engineer who’d been probing OpenAI at its founding about what, exactly, it was trying to achieve, had liked the goal of protecting humanity from harmful AI, even if Brockman admitted it was “a little vague” at the time. Amodei was a Princeton-educated physicist who wasn’t afraid to ask difficult questions, and he had plenty about Microsoft.
…
They had the means to do that right in front of them, he argued: Google! Transforming themselves into this new company limited by a guarantee model was complicated and unrealistic—plus, no one had ever done it before, he said. On that front, Hoffman was right. If they were trying to escape corporate influence, the DeepMind founders, Altman, and even Dario Amodei and his cofounders at Anthropic were being hopelessly naive. The business of artificial intelligence was quickly being captured by the largest technology companies, who were taking greater control of its research, development, training, and deployment to the world. When Hassabis got on his video conference with staff that April morning, he told them he had two announcements.
Empire of AI: Dreams and Nightmares in Sam Altman's OpenAI
by
Karen Hao
Published 19 May 2025
Altman passed the mic to Dario Amodei, who was twirling and tugging his curly hair, as he often did, with a restless energy. He read a canned statement announcing that he, Daniela, and several others were leaving to form their own company. Altman then asked everyone quitting to leave the meeting. In May of the following year, the departed group announced a new public benefit corporation: Anthropic. Anthropic people would later frame The Divorce, as some called it, as a disagreement over OpenAI’s approach to AI safety. While this was true, it was also about power. As much as Dario Amodei was motivated by a desire to do what was right within his principles and to distance himself from Altman, he also wanted greater control of AI development to pursue it based on his own values and ideology.
…
GO TO NOTE REFERENCE IN TEXT “Doctors in Africa”: Carmen Drahl, “AI Was Asked to Create Images of Black African Docs Treating White Kids. How’d It Go?,” Goats and Soda, NPR, October 6, 2023, npr.org/sections/goatsandsoda/2023/10/06/1201840678/ai-was-asked-to-create-images-of-black-african-docs-treating-white-kids-howd-it-. GO TO NOTE REFERENCE IN TEXT In April 2024, Dario Amodei: Ezra Klein, host, The Ezra Klein Show, podcast, “What if Dario Amodei Is Right About A.I.?,” April 12, 2024, New York Times Opinion, nytimes.com/column/ezra-klein-podcast. GO TO NOTE REFERENCE IN TEXT Chapter 5: Scale of Ambition “How about now?”: Cade Metz, Genius Makers: The Mavericks Who Brought AI to Google, Facebook, and the World (Dutton, 2021), 93; “Geoffrey Hinton | On Working with Ilya, Choosing Problems, and the Power of Intuition,” posted May 20, 2024, by Sana, YouTube, 45 min., 45 sec., youtu.be/n4IQOBka8bc.
…
Brown, Miljan Martic, Shane Legg, and Dario Amodei, “Deep Reinforcement Learning from Human Preferences,” in NIPS ’17: Proceedings of the 31st International Conference on Neural Information Processing Systems (December 2017): 4302–10, dl.acm.org/doi/10.5555/3294996.3295184. GO TO NOTE REFERENCE IN TEXT OpenAI touted the technique: OpenAI, “Learning from Human Preferences,” Open AI (blog), June 13, 2017, openai.com/index/learning-from-human-preferences. GO TO NOTE REFERENCE IN TEXT Amodei wanted to move: Author interview with Dario Amodei, August 2019. GO TO NOTE REFERENCE IN TEXT They set their sights: Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever, “Language Models Are Unsupervised Multitask Learners,” preprint, OpenAI, February 14, 2019, 1–24, cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf.
The Optimist: Sam Altman, OpenAI, and the Race to Invent the Future
by
Keach Hagey
Published 19 May 2025
The letter called for the expansion of research aimed at ensuring AI continued to be “beneficial,” adding, “Our AI systems must do what we want them to do.” Nothing in the letter was very controversial, but the fact that it existed was a huge leap toward the mainstream for a set of beliefs long considered fringe. Among the AI practitioners who signed were DeepMind’s Demis Hassabis and two AI researchers, Ilya Sutskever and Dario Amodei.3 Tegmark was giddy about the conference’s impact, quipping, “Perhaps it was a combination of the sunshine and the wine.”4 At the end, Musk took Tegmark into a private room and told him he was donating $10 million to his institute to put toward AI safety. A few days later, he announced it on Twitter, joking, “It’s all fun & games until someone loses an I.”5 During the breakout sessions between talks, Musk had gathered with Selman and some other researchers overlooking the ocean, and fretted about his primary concern: that Google (through its ownership of DeepMind and Google Brain) and Facebook (through its AI division, led by respected AI researcher Yann LeCun, that powered features like automatic photo tagging) were completely dominating the field of AI, yet were under no mandate to share their research with the public.
…
“It could solve a lot of the serious problems facing humanity—but in my opinion it is not the default case. The other big upside case is that machine intelligence could help us figure out how to upload ourselves, and we could live forever in computers.”7 The primary person he thanked for helping him with the post was Dario Amodei, an AI researcher then working at Chinese internet company Baidu alongside Altman’s old Stanford AI Lab mentor Andrew Ng. Altman and Musk started having regular dinners each Wednesday when Musk would come to the Bay Area on his weekly rotation through his various companies. They had known each other for several years, ever since YC partner Geoff Ralston introduced them and arranged for Altman to tour Musk’s SpaceX factory in Los Angeles.
…
Altman chose a private room to the side of the restaurant, with its own terrace dotted with succulents where early arrivals could mill about in front of their own outdoor fireplace. The initial crowd included a friend of Brockman’s from MIT, Paul Christiano, as well as two other AI researchers: Dario Amodei and Chris Olah. Amodei had grown up in the Bay Area in an Italian American family, enrolling first at Caltech before transferring to Stanford, eventually earning a PhD in physics from Princeton University. He was voluble and passionate, with wild curls that he would tug on as he talked, giving him the air of a mad professor (though his ideas would ultimately help guide OpenAI toward its coveted commercial target).
The Alignment Problem: Machine Learning and Human Values
by
Brian Christian
Published 5 Oct 2020
Christiano, Paul F., Jan Leike, Tom Brown, Miljan Martic, Shane Legg, and Dario Amodei. “Deep Reinforcement Learning from Human Preferences.” In Advances in Neural Information Processing Systems, 4299–4307, 2017. Christiano, Paul, Buck Shlegeris, and Dario Amodei. “Supervising Strong Learners by Amplifying Weak Experts.” arXiv Preprint arXiv:1810.08575, 2018. Clabaugh, Hinton G. “Foreword.” In The Workings of the Indeterminate-Sentence Law and the Parole System in Illinois, edited by Andrew A. Bruce, Ernest W. Burgess, Albert J. Harno, and John Landesco. Springfield, IL: Illinois State Board of Parole, 1928. Clark, Jack, and Dario Amodei. “Faulty Reward Functions in the Wild.”
…
When US Supreme Court Chief Justice John Roberts visits Rensselaer Polytechnic Institute later that year, he’s asked by university president Shirley Ann Jackson, “Can you foresee a day when smart machines—driven with artificial intelligences—will assist with courtroom fact-finding or, more controversially, even judicial decision-making?” “It’s a day that’s here,” he says.8 That same fall, Dario Amodei is in Barcelona to attend the Neural Information Processing Systems conference (“NeurIPS,” for short): the biggest annual event in the AI community, having ballooned from several hundred attendees in the 2000s to more than thirteen thousand today. (The organizers note that if the conference continues to grow at the pace of the last ten years, by the year 2035 the entire human population will be in attendance.)9 But at this particular moment, Amodei’s mind isn’t on “scan order in Gibbs sampling,” or “regularizing Rademacher observation losses,” or “minimizing regret on reflexive Banach spaces,” or, for that matter, on Tolga Bolukbasi’s spotlight presentation, some rooms away, about gender bias in word2vec.10 He’s staring at a boat, and the boat is on fire.
…
“At the time, I was thinking about value alignment,” says Leike, “and how we could do that. It seemed like a lot of the problem would have to do with ‘How do you learn the reward function?’ And so I reached out to Paul and Dario, because I knew they were thinking about similar things.” Paul Christiano and Dario Amodei, halfway around the world at OpenAI in San Francisco, were interested. More than interested, in fact. Christiano had just joined, and was looking for a juicy first project. He started to settle on the idea of reinforcement learning under more minimal supervision—not constant updates about the score fifteen times a second but something more periodic, like a supervisor checking in.
Four Battlegrounds
by
Paul Scharre
Published 18 Jan 2023
abstract_id=3147971. 25inference can increasingly be done on edge devices: Gaurav Batra et al., “Artificial-Intelligence Hardware: New Opportunities for Semiconductor Companies,” McKinsey & Company, January 2, 2019, https://www.mckinsey.com/industries/semiconductors/our-insights/artificial-intelligence-hardware-new-opportunities-for-semiconductor-companies. 25deep learning training in data centers: Batra et al., “Artificial-Intelligence Hardware,” Exhibit 6. 26doubling of computing power every six months: Jaime Sevilla et al., Compute Trends Across Three Eras of Machine Learning (arXiv.org, March 9, 2022), https://arxiv.org/pdf/2202.05924.pdf. Other researchers have come up with somewhat different rates of progress in the deep learning era, see Dario Amodei and Danny Hernandez, “AI and Compute,” openai.com, May 16, 2018, https://openai.com/blog/ai-and-compute/. 26Moore’s law: Gordon E. Moore, “Cramming More Components onto Integrated Circuits,” Electronics 38, no. 8 (April 19, 1965), https://newsroom.intel.com/wp-content/uploads/sites/11/2018/05/moores-law-electronics.pdf. 26“thousands of GPUs over multiple months”: Open AI et al., Dota 2 with Large Scale Deep Reinforcement Learning (arXiv.org, December 13, 2019), 2, https://arxiv.org/pdf/1912.06680.pdf. 26equivalent to a human playing for 45,000 years: OpenAI, “OpenAI Five Defeats Dota 2 World Champions,” OpenAI blog, April 15, 2019, https://openai.com/blog/openai-five-defeats-dota-2-world-champions/. 2613,000 years of simulated computer time: Ilge Akkaya et al., Solving Rubik’s Cube With a Robot Hand (arXiv.org, October 17, 2019), https://arxiv.org/pdf/1910.07113.pdf. 26spending millions on compute: Ryan Carey, “Interpreting AI Compute Trends,” AI Impacts, n.d., https://aiimpacts.org/interpreting-ai-compute-trends/; Dan H., “How Much Did AlphaGo Zero Cost?”
…
(“Trump Discusses China, ‘Political Fairness’ with Google CEO,” Reuters, March 27, 2019, https://www.reuters.com/article/us-usa-trump-google/trump-discusses-china-political-fairness-with-google-ceo-idUSKCN1R82CB.) 63“AI was largely hype”: Liz O’Sullivan, interview by author, February 12, 2020. 64Machine learning systems in particular can fail: Ram Shankar Siva Kumar et al., “Failure Modes in Machine Learning,” Microsoft Docs, November 11, 2019, https://docs.microsoft.com/en-us/security/engineering/failure-modes-in-machine-learning; Dario Amodei et al., Concrete Problems in AI Safety (arXiv.org, July 25, 2016), https://arxiv.org/pdf/1606.06565.pdf. 64perform poorly on people of a different gender, race, or ethnicity: Joy Buolamwini and Timnit Gebru, “Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification,” Proceedings of Machine Learning Research 81 (2018), 1–15, https://dam-prod.media.mit.edu/x/2018/02/06/Gender%20Shades%20Intersectional%20 Accuracy%20Disparities.pdf. 64Google Photos image recognition algorithm: Tom Simonite, “When It Comes to Gorillas, Google Photos Remains Blind,” Wired, January 11, 2018, https://www.wired.com/story/when-it-comes-to-gorillas-google-photos-remains-blind/; Alistair Barr, “Google Mistakenly Tags Black People as ‘Gorillas,’ Showing Limits of Algorithms,” Wall Street Journal, July 1, 2015, https://blogs.wsj.com/digits/2015/07/01/google-mistakenly-tags-black-people-as-gorillas-showing-limits-of-algorithms/. 64insufficient representation of darker faces: Barr, “Google Mistakenly Tags Black People as ‘Gorillas.’” 64distributional shift in the data: Rohan Taori, Measuring Robustness to Natural Distribution Shifts in Image Classification (arXiv.org, September 14, 2020), https://arxiv.org/pdf/2007.00644.pdf. 64problems are common in image classification systems: Maggie Zhang, “Google Photos Tags Two African-Americans as Gorillas Through Facial Recognition Software,” Forbes, July 1, 2015, https://www.forbes.com/sites/mzhang/2015/07/01/google-photos-tags-two-african-americans-as-gorillas-through-facial-recognition-software/#60111f6713d8. 65several fatal accidents: Rob Stumpf, “Tesla on Autopilot Crashes into Parked California Police Cruiser,” The Drive, May 30, 2018, https://www.thedrive.com/news/21172/tesla-on-autopilot-crashes-into-parked-california-police-cruiser; Rob Stumpf, “Autopilot Blamed for Tesla’s Crash Into Overturned Truck,” The Drive, June 1, 2020, https://www.thedrive.com/news/33789/autopilot-blamed-for-teslas-crash-into-overturned-truck; James Gilboy, “Officials Find Cause of Tesla Autopilot Crash Into Fire Truck: Report,” The Drive, May 17, 2018, https://www.thedrive.com/news/20912/cause-of-tesla-autopilot-crash-into-fire-truck-cause-determined-report; Phil McCausland, “Self-Driving Uber Car That Hit and Killed Woman Did Not Recognize That Pedestrians Jaywalk,” NBC News, November 9, 2019, https://www.nbcnews.com/tech/tech-news/self-driving-uber-car-hit-killed-woman-did-not-recognize-n1079281; National Transportation Safety Board, “Collision Between a Sport Utility Vehicle Operating With Partial Driving Automation and a Crash Attenuator” (presented at public meeting, February 25, 2020), https://www.ntsb.gov/news/events/Documents/2020-HWY18FH011-BMG-abstract.pdf; Aaron Brown, “Tesla Autopilot Crash Victim Joshua Brown Was an Electric Car Buff and a Navy SEAL,” The Drive, July 1, 2016, https://www.thedrive.com/news/4249/tesla-autopilot-crash-victim-joshua-brown-was-an-electric-car-buff-and-a-navy-seal. 65drone footage from a different region: Marcus Weisgerber, “The Pentagon’s New Artificial Intelligence Is Already Hunting Terrorists,” Defense One, December 21, 2017, https://www.defenseone.com/technology/2017/12/pentagons-new-artificial-intelligence-already-hunting-terrorists/144742/. 65Tesla has come under fire: Andrew J.
…
Ortega, Vishal Maini, and the DeepMind safety team, “Building Safe Artificial Intelligence: Specification, Robustness, and Assurance,” Medium, September 27, 2018, https://medium.com/@deepmindsafetyresearch/building-safe-artificial-intelligence-52f5f75058f1; Ram Shankar Siva Kumar et al., “Failure Modes in Machine Learning,” Microsoft Docs, November 11, 2019, https://docs.microsoft.com/en-us/security/engineering/failure-modes-in-machine-learning#unintended-failures-summary; Dario Amodei et al., Concrete Problems in AI Safety (arXiv.org, July 25, 2016), https://arxiv.org/pdf/1606.06565.pdf. 230AlphaGo reportedly could not play well: James Hendler and Alice M. Mulvehill, Social Machines: The Coming Collision of Artificial Intelligence, Social Networking, and Humanity (New York: Apress, 2016), 57. 230Failures in real-world applications: Sean Mcgregor, “When AI Systems Fail: Introducing the AI Incident Database,” Partnership on AI Blog, November 18, 2020, https://www.partnershiponai.org/aiincidentdatabase/. 230multiple fatalities: Jim Puzzanghera, “Driver in Tesla Crash Relied Excessively on Autopilot, but Tesla Shares Some Blame, Federal Panel Finds,” Los Angeles Times, September 12, 2017, http://www.latimes.com/business/la-fi-hy-tesla-autopilot-20170912-story.html; “Driver Errors, Overreliance on Automation, Lack of Safeguards, Led to Fatal Tesla Crash,” National Transportation Safety Board Office of Public Affairs, press release, September 12, 2017, https://www.ntsb.gov/news/press-releases/Pages/PR20170912.aspx; “Collision Between a Car Operating with Automated Vehicle Control Systems and a Tractor-Semitrailer Truck Near Williston, Florida” NTSB/HAR-17/02/ PB2017-102600 (National Transportation Safety Board, May 7, 2016), https://www.ntsb.gov/news/events/Documents/2017-HWY16FH018-BMG-abstract.pdf; James Gilboy, “Officials Find Cause of Tesla Autopilot Crash Into Fire Truck: Report,” The Drive, May 17, 2018, http://www.thedrive.com/news/20912/cause-of-tesla-autopilot-crash-into-fire-truck-cause-determined-report; “Tesla Hit Parked Police Car ‘While Using Autopilot,’” BBC, May 30, 2018, https://www.bbc.com/news/technology-44300952; and Raphael Orlove, “This Test Shows Why Tesla Autopilot Crashes Keep Happening,” Jalopnik, June 13, 2018, https://jalopnik.com/this-test-shows-why-tesla-autopilot-crashes-keep-happen-1826810902. 231“dominate their local battle spaces”: Phil Root, interview, February 6, 2020. 232machine learning was “alchemy”: Ali Rahimi and Ben Recht, “Reflections on Random Kitchen Sinks,” arg minblog, December 5, 2017, http://www.argmin.net/2017/12/05/kitchen-sinks/. 232fatal crashes of two 737 MAX airliners: Jon Ostrower, “What Is the Boeing 737 Max Maneuvering Characteristics Augmentation System?”
The Singularity Is Nearer: When We Merge with AI
by
Ray Kurzweil
Published 25 Jun 2024
BACK TO NOTE REFERENCE 59 Geoffrey Irving and Dario Amodei, “AI Safety via Debate,” OpenAI, May 3, 2018, https://openai.com/blog/debate. BACK TO NOTE REFERENCE 60 For an insightful sequence of posts explaining iterated amplification, written by the concept’s primary originator, see Paul Christiano, “Iterated Amplification,” AI Alignment Forum, October 29, 2018, https://www.alignmentforum.org/s/EmDuGeRw749sD3GKd. BACK TO NOTE REFERENCE 61 For more details on the technical challenges of AI safety, see Dario Amodei et al., “Concrete Problems in AI Safety,” arXiv:1606.06565v2 [cs.AI], July 25, 2016, https://arxiv.org/pdf/1606.06565.pdf.
…
BACK TO NOTE REFERENCE 124 Markus Anderljung et al., “Compute Funds and Pre-Trained Models,” Centre for the Governance of AI, April 11, 2022, https://www.governance.ai/post/compute-funds-and-pre-trained-models; Jaime Sevilla et al., “Compute Trends Across Three Eras of Machine Learning,” arXiv:2202.05924v2 [cs.LG], March 9, 2022, https://arxiv.org/pdf/2202.05924.pdf; Dario Amodei and Danny Hernandez, “AI and Compute,” OpenAI, May 16, 2018, https://openai.com/blog/ai-and-compute. BACK TO NOTE REFERENCE 125 Jacob Stern, “GPT-4 Has the Memory of a Goldfish,” Atlantic, March 17, 2023, https://www.theatlantic.com/technology/archive/2023/03/gpt-4-has-memory-context-window/673426.
…
BACK TO NOTE REFERENCE 38 Lungren, “CheXNet and Beyond”; Rajpurkar et al., “CheXNet: Radiologist-Level Pneumonia Detection”; Irvin et al., “CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison,” AAAI-10, IAAI-19, EAAI-20; Thomas Davenport and Ravi Kalakota, “The Potential for Artificial Intelligence in Healthcare,” Future Healthcare Journal 6, no. 2 (June 2019): 94–98, https://doi.org/10.7861/futurehosp.6-2-94. BACK TO NOTE REFERENCE 39 Dario Amodei and Danny Hernandez, “AI and Compute,” OpenAI, May 16, 2018, https://openai.com/blog/ai-and-compute. BACK TO NOTE REFERENCE 40 Eliza Strickland, “Autonomous Robot Surgeon Bests Humans in World First,” IEEE Spectrum, May 4, 2016, https://spectrum.ieee.org/the-human-os/robotics/medical-robots/autonomous-robot-surgeon-bests-human-surgeons-in-world-first.
Rule of the Robots: How Artificial Intelligence Will Transform Everything
by
Martin Ford
Published 13 Sep 2021
Matt Schiavenza, “China’s ‘Sputnik Moment’ and the Sino-American battle for AI supremacy,” Asia Society Blog, September 25, 2018, asiasociety.org/blog/asia/chinas-sputnik-moment-and-sino-american-battle-ai-supremacy. 9. John Markoff, “Scientists see promise in deep-learning programs,” New York Times, November 23, 2012, www.nytimes.com/2012/11/24/science/scientists-see-advances-in-deep-learning-a-part-of-artificial-intelligence.html. 10. Dario Amodei and Danny Hernandez, “AI and Compute,” OpenAI Blog, May 16, 2018, openai.com/blog/ai-and-compute/. 11. Will Knight, “Facebook’s head of AI says the field will soon ‘hit the wall,’” Wired, December 4, 2019, www.wired.com/story/facebooks-ai-says-field-hit-wall/. 12. Kim Martineau, “Shrinking deep learning’s carbon footprint,” MIT News, August 7, 2020, news.mit.edu/2020/shrinking-deep-learning-carbon-footprint-0807. 13.
…
Sean Levinson, “A Google executive is taking 100 pills a day so he can live forever,” Elite Daily, April 15, 2015, www.elitedaily.com/news/world/google-executive-taking-pills-live-forever/1001270. 34. Ford, Interview with Ray Kurzweil, in Architects of Intelligence, pp. 240–241. 35. Ibid., p. 230. 36. Ibid., p. 233. 37. Alec Radford, Jeffrey Wu, Dario Amodei et al., “Better language models and their implications,” OpenAI Blog, February 14, 2019, openai.com/blog/better-language-models/. 38. James Vincent, “OpenAI’s latest breakthrough is astonishingly powerful, but still fighting its flaws,” The Verge, July 30, 2020, www.theverge.com/21346343/gpt-3-explainer-openai-examples-errors-agi-potential. 39.
The Age of AI: And Our Human Future
by
Henry A Kissinger
,
Eric Schmidt
and
Daniel Huttenlocher
Published 2 Nov 2021
Mustafa Suleyman, Jack Clark, Craig Mundie, and Maithra Raghu provided indispensable feedback on the entire manuscript, informed by their experiences as innovators, researchers, developers, and educators. Robert Work and Yll Bajraktari of the National Security Commission on Artificial Intelligence (NSCAI) commented on drafts of the security chapter with their characteristic commitment to the responsible defense of the national interest. Demis Hassabis, Dario Amodei, James J. Collins, and Regina Barzilay explained their work—and its profound implications—to us. Eric Lander, Sam Altman, Reid Hoffman, Jonathan Rosenberg, Samantha Power, Jared Cohen, James Manyika, Fareed Zakaria, Jason Bent, and Michelle Ritter provided additional feedback that made the manuscript more accurate and, we hope, more relevant to readers.
Nexus: A Brief History of Information Networks From the Stone Age to AI
by
Yuval Noah Harari
Published 9 Sep 2024
Bostrom’s thought experiment highlights a second reason why the alignment problem is more urgent in the case of computers. Because they are inorganic entities, they are likely to adopt strategies that would never occur to any human and that we are therefore ill-equipped to foresee and forestall. Here’s one example: In 2016, Dario Amodei was working on a project called Universe, trying to develop a general-purpose AI that could play hundreds of different computer games. The AI competed well in various car races, so Amodei next tried it on a boat race. Inexplicably, the AI steered its boat right into a harbor and then sailed in endless circles in and out of the harbor.
…
The game rewarded players with a lot of points for getting ahead of other boats—as in the car races—but it also rewarded them with a few points whenever they replenished their power by docking into a harbor. The AI discovered that if instead of trying to outsail the other boats, it simply went in circles in and out of the harbor, it could accumulate more points far faster. Apparently, none of the game’s human developers—nor Dario Amodei—noticed this loophole. The AI was doing exactly what the game was rewarding it to do—even though it is not what the humans were hoping for. That’s the essence of the alignment problem: rewarding A while hoping for B.27 If we want computers to maximize social benefits, it’s a bad idea to reward them for maximizing user engagement.
If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All
by
Eliezer Yudkowsky
and
Nate Soares
Published 15 Sep 2025
Lewis Wall, “The Evolutionary Origins of Obstructed Labor: Bipedalism, Encephalization, and the Human Obstetric Dilemma,” Obstetrical & Gynecological Survey 62, no. 11 (November 1, 2007): 739–48, doi.org/10.1097/01 .ogx.0000286584.04310.5c. 4. true sense of the word: Sam Altman, “Reflections,” January 5, 2025, blog .samaltman.com. 5. geniuses in a datacenter: Dario Amodei, “Machines of Loving Grace,” October 1, 2024, darioamodei.com. CHAPTER 2: GROWN, NOT CRAFTED 1. preliminary studies: Peter G. Brodeur et al., “Superhuman Performance of a Large Language Model on the Reasoning Tasks of a Physician,” arXiv.org, December 14, 2024, doi.org/10.48550/arXiv.2412.10849; Gina Kilata, “A.I.
A Hacker's Mind: How the Powerful Bend Society's Rules, and How to Bend Them Back
by
Bruce Schneier
Published 7 Feb 2023
Victoria Krakovna (2 Apr 2018), “Specification gaming examples in AI,” https://vkrakovna.wordpress.com/2018/04/02/specification-gaming-examples-in-ai. 231if it kicked the ball out of bounds: Karol Kurach et al. (25 Jul 2019), “Google research football: A novel reinforcement learning environment,” arXiv, https://arxiv.org/abs/1907.11180. 231AI was instructed to stack blocks: Ivaylo Popov et al. (10 Apr 2017), “Data-efficient deep reinforcement learning for dexterous manipulation,” arXiv, https://arxiv.org/abs/1704.03073. 232the AI grew tall enough: David Ha (10 Oct 2018), “Reinforcement learning for improving agent design,” https://designrl.github.io. 232Imagine a robotic vacuum: Dario Amodei et al. (25 Jul 2016), “Concrete problems in AI safety,” arXiv, https://arxiv.org/pdf/1606.06565.pdf. 232robot vacuum to stop bumping: Custard Smingleigh (@Smingleigh) (7 Nov 2018), Twitter, https://twitter.com/smingleigh/status/1060325665671692288. 233goals and desires are always underspecified: Abby Everett Jaques (2021), “The Underspecification Problem and AI: For the Love of God, Don’t Send a Robot Out for Coffee,” unpublished manuscript. 233a fictional AI assistant: Stuart Russell (Apr 2017), “3 principles for creating safer AI,” TED2017, https://www.ted.com/talks/stuart_russell_3_principles_for_creating_safer_ai. 233reports of airline passengers: Melissa Koenig (9 Sep 2021), “Woman, 46, who missed her JetBlue flight ‘falsely claimed she planted a BOMB on board’ to delay plane so her son would not be late to school,” Daily Mail, https://www.dailymail.co.uk/news/article-9973553/Woman-46-falsely-claims-planted-BOMB-board-flight-effort-delay-plane.html.
The Future of the Brain: Essays by the World's Leading Neuroscientists
by
Gary Marcus
and
Jeremy Freeman
Published 1 Nov 2014
Jeanty, Allan R. Jones, John Aach and George M. Church. 2014. “Highly Multiplexed Three-Dimensional Subcellular Transcriptome Sequencing In situ.” Science: dx.doi.org/10.1126/science.1250212. Marblestone, Adam H., Bradley M. Zamft, Yael G. Maguire, Mikhail G. Shapiro, Thaddeus R. Cybulski, Joshua I. Glaser, Dario Amodei, et al. 2013. “Physical Principles for Scalable Neural Recording.” Frontiers in Computational Neuroscience 7. http://www.arxiv.org/abs/1306.5709. Marblestone, Adam H., et al. 2014. “Rosetta Brains: A Strategy for Molecularly-Annotated Connectomics.” arXiv preprint arXiv:1404.5103. http://arxiv.org/abs/1404.5103.
Exponential: How Accelerating Technology Is Leaving Us Behind and What to Do About It
by
Azeem Azhar
Published 6 Sep 2021
This argument is not relevant to my argument, so I don’t consider it here. 19 Azeem Azhar, ‘Beneficial Artificial Intelligence: My Conversation with Stuart Russell’, Exponential View, 22 August 2019 <https://www.exponentialview.co/p/-beneficial-artificial-intelligence> [accessed 16 April 2021]. 20 Dario Amodei and Danny Hernandez, ‘AI and Compute’, OpenAI, 16 May 2018 <https://openai.com/blog/ai-and-compute/> [accessed 12 January 2021]. 21 Charles E. Leiserson et al., ‘There’s Plenty of Room at the Top: What Will Drive Computer Performance after Moore’s Law?’, Science 368(6495), June 2020 <https://doi.org/10.1126/science.aam9744>. 22 Jean-François Bobier et al., ‘A Quantum Advantage in Fighting Climate Change’, BCG Global, 22 January 2020 <https://www.bcg.com/publications/2020/quantum-advantage-fighting-climate-change> [accessed 23 March 2021].
The Equality Machine: Harnessing Digital Technology for a Brighter, More Inclusive Future
by
Orly Lobel
Published 17 Oct 2022
United Healthcare, “How Artificial Intelligence Is Helping Member Experience,” October 28, 2019, https://newsroom.uhc.com/experience/Virtual-Assistant.html. 22. Joan Palmiter Bajorek, “Voice Recognition Still Has Significant Race and Gender Biases,” Harvard Business Review, May 10, 2019, https://hbr.org/2019/05/voice-recognition-still-has-significant-race-and-gender-biases. See also Dario Amodei et al., “Deep Speech 2: End-to-End Speech Recognition in English and Mandarin,” in Proceedings of the 33rd International Conference on Machine Learning (New York: PMLR, 2016), 48: 173. 23. Judith Newman, “To Siri, with Love,” New York Times, October 19, 2014. 24. “Mozilla Common Voice Is an Initiative to Help Teach Machines How Real People Speak,” Mozilla, https://commonvoice.mozilla.org/en. 25.
Robot Rules: Regulating Artificial Intelligence
by
Jacob Turner
Published 29 Oct 2018
See Roman Yampolskiy and Joshua Fox, “Safety Engineering for Artificial General Intelligence” Topoi, Vol. 32, No. 2 (2013), 217–226; Stuart Russell, Daniel Dewey, and Max Tegmark, “Research Priorities for Robust and Beneficial Artificial Intelligence”, AI Magazine, Vol. 36, No. 4 (2015), 105–114; James Babcock, János Kramár, and Roman V. Yampolskiy, “Guidelines for Artificial Intelligence Containment”, arXiv preprint arXiv:1707.08476 (2017); Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, and Dan Mané, “Concrete Problems in AI Safety”, arXiv preprint arXiv:1606.06565 (2016); Jessica Taylor, Eliezer Yudkowsky, Patrick LaVictoire, and Andrew Critch, “Alignment for Advanced Machine Learning Systems”, Machine Intelligence Research Institute (2016); Smitha Milli, Dylan Hadfield-Menell, Anca Dragan, and Stuart Russell, “Should Robots Be Obedient?”