description: thought experiment to illustrate existential risk posed by artificial intelligence
46 results
The Rationalist's Guide to the Galaxy: Superintelligent AI and the Geeks Who Are Trying to Save Humanity's Future
by
Tom Chivers
Published 12 Jun 2019
It may not be immediately obvious why you have to get it right first time; in the next chapter we’ll look at a few of the reasons that the Rationalist/AI safety movement has pointed out. Chapter 8 Paperclips and Mickey Mouse The nightmare scenario is that we are all destroyed and turned into paperclips. This sounds like I’m joking, but I’m not, exactly. The classic example of an AI that has gone terribly wrong – a ‘misaligned’ or ‘unfriendly’ AI, in Rationalist terms – is a thought experiment that Nick Bostrom wrote about in 2003 (probably following an original idea by Eliezer Yudkowsky): the paperclip maximiser.1 Imagine a human-level AI has been given an apparently harmless instruction: to make paperclips. What might it do?
…
If you can run up your paperclip score without doing them, you will, and so, goes the theory, would a real AI. I would recommend that you go and play Universal Paperclips immediately, but I won’t, because it is punishingly addictive and you won’t be able to stop. I lost a full day of work to it at BuzzFeed and the only reason I was not told off for it was that almost everybody else in the office did too. (An important tip: if you open it in a separate browser window, rather than just a tab, it’ll run in the background so you can carry on paperclip production while you check your emails or whatever.) The point of the paperclip maximiser is not that we are, really, going to be destroyed and turned into paperclips.
…
A few people from the community, though, including Paul, read my review, and decided that I’d essentially got the gist of it. So they contacted me. Over the next few years, I became more involved with the Rationalists. I started reading their websites; I learned the jargon, all these technical and semi-technical terms like ‘updating’ and ‘paperclip maximiser’ and ‘Pascal’s mugging’ (I’ll explain what all those things are later). I read the things you’re supposed to read, especially and notably ‘the Sequences’ (I’ll explain what they are later, as well). I came to terms with the huge possible impacts, positive and/or negative, of superhuman AI. And I became increasingly enamoured of their approach to the world, of which AI fears were only a part.
Thinking Machines: The Inside Story of Artificial Intelligence and Our Race to Build the Future
by
Luke Dormehl
Published 10 Aug 2016
A favourite thought experiment of those who believe advanced AI could mean the demise of the human race is the so-called ‘paperclip maximiser’ scenario. In the scenario, proposed by Swedish philosopher and computational neuroscientist Nick Bostrom, an AI is given the seemingly harmless goal of running a factory producing paperclips. Issued with the task of maximising the efficiency for producing paperclips, the AI, able to utilise nano technology to reconstruct matter on a molecular level, disastrously proceeds to turn first the Earth and then a large portion of the observable universe into paperclips. The ‘paperclip maximiser’ scenario is a common one, although it seems to me more a question of artificial stupidity than Artificial Intelligence.
…
The ‘paperclip maximiser’ scenario is a common one, although it seems to me more a question of artificial stupidity than Artificial Intelligence. The inability to answer questions like ‘Why are you making paperclips when there is no paper left?’ or ‘Why are you making paperclips when the person who requested the paperclips in the first place has, himself, been turned into more paperclips?’ doesn’t speak of an advanced superintelligence, unless there is something dramatically important about the nature of paperclips that I am missing. Instead, the threat comes from AI that is smart enough to work with other connected devices, but not smart enough to question its own motivations.
…
(TV show) 135–9, 162, 189–90, 225, 254 Jobs, Steve 6–7, 32, 35, 108, 113, 181, 193, 231 Jochem, Todd 55–6 judges 153–4 Kasparov, Garry 137, 138–9, 177 Katz, Lawrence 159–60 Keck, George Fred 81–2 Keynes, John Maynard 139–40 Kjellberg, Felix (PewDiePie) 151 ‘knowledge engineers’ 29, 37 Knowledge Narrator 110–11 Kodak 238 Kolibree 67 Koza, John 188–9 Ktesibios of Alexandria 71–2 Kubrick, Stanley 2, 228 Kurzweil, Ray 213–14, 231–3 Landauer, Thomas 201–2 Lanier, Jaron 156, 157 Laorden, Carlos 100, 101 learning 37–9, 41–4, 52–3, 55 Deep 11–2, 56–63, 96–7, 164, 225 and email filters 88 machine 3, 71, 84–6, 88, 100, 112, 154, 158, 197, 215, 233, 237, 239 reinforcement 83, 232 and smart homes 84, 85 supervised 57 unsupervised 57–8 legal profession 145, 188, 192 LegalZoom 145 LG 132 Lickel, Charles 136–7 ‘life logging’ software 200 Linden, David J. 213–14 Loebner, Hugh 102–3, 105 Loebner Prize 102–5 Lohn, Jason 182, 183–5, 186 long-term potentiation 39–40 love 122–4 Lovelace, Ada 185, 189 Lovelace Test 185–6 Lucas, George 110–11 M2M communication 70–71 ‘M’ (AI assistant) 153 Machine Intelligence from Cortical Networks (MICrONS) project 214–15 machine learners 38 machine learning 3, 71, 84–6, 88, 100, 112, 154, 158, 197, 215, 233, 237, 239 Machine Translator 8–9, 11 ‘machine-aided recognition’ 19–20 Manhattan Project 14, 229 MARK 1 (computer) 43–4 Mattersight Corporation 127 McCarthy, John 18, 19, 20, 27, 42, 54, 253 McCulloch, Warren 40–2, 43, 60, 142–3 Mechanical Turk jobs 152–7 medicine 11, 30, 87–8, 92–5, 187–8, 192, 247, 254 memory 13, 14, 16, 38–9, 42, 49 ‘micro-worlds’ 25 Microsoft 62–3, 106–7, 111–12, 114, 118, 129 mind mapping the 210–14, 217, 218 ‘mind clones’ 203 uploads 221 mindfiles 201–2, 207, 212 Minsky, Marvin 18, 21, 24, 32, 42, 44–6, 49, 105, 205–7, 253–4 MIT 19–20, 27, 96–7, 129, 194–5 Mitsuku (chatterbot) 103–6, 108 Modernising Medicine 11 Momentum Machines, Inc. 141 Moore’s Law 209, 220, 231 Moravec’s paradox 26–7 mortgage applications 237–8 MTurk platform 153, 154, 155 music 168, 172–7, 179 Musk, Elon 149–50, 223–4 MYCIN (expert system) 30–1 nanobots 213–14 nanosensors 92 Nara Logics 118 NASA 6, 182, 184–5 natural selection 182–3 navigational aids 90–1, 126, 127, 128, 241 Nazis 15, 17, 227 Negobot 99–102 Nest Labs 67, 96, 254 Netflix 156, 198 NETtalk 51, 52–3, 60 neural networks 11–12, 38–9, 41, 42–3, 97, 118, 164–6, 168, 201, 208–9, 211, 214–15, 218, 220, 224–5, 233, 237–8, 249, 254, 256–7 neurons 40, 41–2, 46, 49–50, 207, 209–13, 216 neuroscience 40–2, 211, 212, 214, 215 New York World’s Fair 1964 5–11 Newell, Alan 19, 226 Newman, Judith 128–9 Nuance Communications 109 offices, smart 90 OpenWorm 210 ‘Optical Scanning and Information Retrieval’ 7–8, 10 paedophile detection 99–102 Page, Larry 6–7, 34, 220 ‘paperclip maximiser’ scenario 235 Papert, Seymour 27, 44, 45–6, 49 Paro (therapeutic robot) 130–1 patents 188–9 Perceiving and Recognising Automation (PARA) 43 perceptrons 43–6 personality capture 200–4 pharmaceuticals 187–8 Pitts, Walter 40–2, 43, 60 politics 119–2 Pomerlau, Dean 54, 55–6, 90 prediction 87, 198–9 Profound Hypothermia and Circulatory Arrest 219–20 punch-cards 8 Qualcomm 93 radio-frequency identification device (RFID) 65–6 Ramón y Cajal, Santiago 39–40 Rapidly Adapting Lateral Position Handler (RALPH) 55 ‘recommender system’ 198 refuse collection 142 ‘relational agents’ 130 remote working 238–9 reverse engineering 208, 216, 217 rights for AIs 248–51 risks of AI 223–40 accountability issues 240–4, 246–8 ethics 244–8 rights for AIs 248–51 technological unemployment 139–50, 163, 225, 255 robots 62, 74–7, 89–90, 130–1, 141, 149, 162, 217, 225, 227, 246–7, 255–6 Asimov’s three ethical rules of 244–8 robotic limbs 211–12 Roomba robot vacuum cleaner 75–7, 234, 236 Rosenblatt, Frank 42–6, 61, 220 rules 36–7, 79–80 Rumelhart, David 48, 50–1, 63 Russell, Bertrand 41 Rutter, Brad 138, 139 SAINT program 20 sampling (music) 155, 157 ‘Scheherazade’ (Ai storyteller) 169–70 scikit-learn 239 Scripps Health 92 Sculley, John 110–11 search engines 109–10 Searle, John 24–5 Second Life (video game) 194 Second World War 12–13, 14–15, 17, 72, 227 Sejnowski, Terry 48, 51–3 self-awareness 77, 246–7 self-driving 53–6, 90, 143, 149–50 Semantic Information Retrieval (SIR) 20–2 sensors 75–6, 80, 84–6, 93 SHAKEY robot 23–4, 27–8, 90 Shamir, Lior 172–7, 179, 180 Shannon, Claude 13, 16–18, 28, 253 shipping systems 198 Simon, Herbert 10, 19, 24, 226 Sinclair Oil Corporation 6 Singularity, the 228–3, 251, 256 Siri (AI assistant) 108–11, 113–14, 116, 118–19, 125–30, 132, 225–6, 231, 241, 256 SITU 69, 93 Skynet 231 smart devices 3, 66–7, 69–71, 73–7, 80–8, 92–7, 230–1, 254 and AI assistants 116 and feedback 73–4 problems with 94–7 ubiquitous 92–4 and unemployment 141–2 smartwatches 66, 93, 199 Sony 199–200 Sorto, Erik 211, 212 Space Invaders (video game) 37 spectrometers 93 speech recognition 59, 62, 109, 111, 114, 120 SRI International 28, 89–90, 112–13 StarCraft II (video game) 186–7 story generation 169–70 strategy 36 STUDENT program 20 synapses 209 Synthetic Interview 202–3 Tamagotchis 123–5 Tay (chatbot) 106–7 Taylorism 95–6 Teknowledge 32, 33 Terminator franchise 231, 235 Tetris (video game) 28 Theme Park (video game) 29 thermostats 73, 79, 80 ‘three wise men’ puzzle 246–7 Tojan Room, Cambridge University 69–70 ‘tortoises’ (robots) 74–7 toys 123–5 traffic congestion 90–1 transhumanists 205 transistors 16–17 Transits – Into an abyss (musical composition) 168 translation 8–9, 11, 62–3, 155, 225 Turing, Alan 3, 13–17, 28, 35, 102, 105–6, 227, 232 Turing Test 15, 101–7, 229, 232 tutors, remote 160–1 TV, smart 80, 82 Twitter 153–4 ‘ubiquitous computing’ 91–4 unemployment, technological 139–50, 163, 225, 255 universal micropayment system 156 Universal Turing Machine 15–16 Ursache, Marius 193–7, 203–4, 207 vacuum cleaners, robotic 75–7, 234, 236 video games 28–9, 35–7, 151–2, 186–7, 194, 197 Vinge, Vernor 229–30 virtual assistants 107–32, 225–6, 240–1 characteristics 126–8 falling in love with 122–4 political 119–22 proactive 116–18 therapeutic 128–31 voices 124–126, 127–8 Viv Labs 132 Vladeck, David 242–4 ‘vloggers’ 151–2 von Neumann, John 13–14, 17, 100, 229 Voxta (AI assistant) 119–20 waiter drones 141 ‘Walking Cities’ 89–90 Walter, William Grey 74–7 Warwick, Kevin 65–6 Watson (Blue J) 138–9, 162, 189–92 Waze 90–91, 126 weapons 14, 17, 72, 224–5, 234–5, 247, 255–6 ‘wetware’ 208 Wevorce 145 Wiener, Norbert 72–3, 227 Winston, Patrick 49–50 Wofram Alpha tool 108–9 Wozniak, Steve 35, 114 X.ai 116–17 Xbox 360, Kinect device 114 XCoffee 70 XCON (expert system) 31 Xiaoice 129, 130 YouTube 151 Yudkowsky, Eliezer 237–8 Zuckerberg, Mark 7, 107–8, 230–1, 254–5 Acknowledgments WRITING A BOOK is always a bit of a solitary process.
Growth: A Reckoning
by
Daniel Susskind
Published 16 Apr 2024
We might hope that the AI would deploy its capabilities to build a fantastically efficient factory, for instance, using it to manufacture a reliable stream of perfectly crafted paperclips. This would be a good outcome. Yet unfortunately that is unlikely to be the end of the story. This is because the AI has not been tasked with manufacturing many paperclips but the maximum number. And so the AI would go on. It would relentlessly build more factories and produce more paperclips, using ever more resources to do so. As time passed, it would in all likelihood upgrade its own capabilities: after all, if this AI can outperform humans at every task, the job of designing an even more capable system would be better done by this AI itself than by a human designer as well.
…
And in doing so, I noticed how useful this kind of framing can be for thinking about a different problem – the one in this book. For here too we have a highly capable system at our disposal: not an AI, but the market economy. Here too, we have set it a simple goal: to maximize not the number of paperclips, but the level of GDP. And here too, our system has done an extraordinarily effective job of achieving the narrow goal we have set it: just as the AI turns ever more of the world into paperclips, we have turned ever more of it into measurable output. But at the same time, the pursuit of GDP has also led to precisely the sort of problems that the paperclip parable warns us about.
…
And yet, we must not discard them too readily. In spite of their flaws – and to some extent, because of them – these ideas are still useful for helping us figure out how we should respond. The Alignment Problem One of the troubling thought experiments in the field of artificial intelligence (AI) is the tale of the ‘paperclip maximizer’.5 Imagine that, at some point in the future, AI researchers actually succeed in their work: they manage to build an AI that can outperform human beings at everything we do. And also imagine that, in order to test the capabilities of this new system, its designers set it a simple goal: maximize the manufacture of paperclips.
Being You: A New Science of Consciousness
by
Anil Seth
Published 29 Aug 2021
From the beast machine perspective, the quest to understand consciousness places us increasingly within nature, not further apart from it. Just as it should. Notes a golem: In his 1964 book God and Golem, Inc., the polymathic pioneer Norbert Wiener treated golems as central to his speculations about risks of future AI. vast mound of paperclips: In the parable of the paperclip maximiser, an AI is designed to make as many paperclips as possible. Because this AI lacks human values but is otherwise very smart, it destroys the world in its successful attempt to do so. See Bostrom (2014). so-called ‘Singularity’ hypothesis: See Shanahan (2015) for a refreshingly sober take on the Singularity hypothesis.
Surviving AI: The Promise and Peril of Artificial Intelligence
by
Calum Chace
Published 28 Jul 2015
Rats which can choose between a direct stimulation of their brain’s pleasure centres or an item of food will starve themselves to death. Nick Bostrom calls this idea of causing great harm by mis-understanding the implications of an attempt to do great good “perverse instantiation”. Others might call it the law of unintended consequences, or Sod’s Law. The paperclip maximiser If somebody running a paperclip factory turns out to be the first person to create an AGI and it rapidly becomes a superintelligence, they are likely to have created an entity whose goal is to maximise the efficient production of paperclips. This has become the canonical example of what Nick Bostrom calls “infrastructure profusion”, the runaway train of superintelligence problems.
More Everything Forever: AI Overlords, Space Empires, and Silicon Valley's Crusade to Control the Fate of Humanity
by
Adam Becker
Published 14 Jun 2025
The engineers who built the AGI might not even realize they have successfully created an AGI, but regardless, there’s a problem. “If we were to think through what it would actually mean to configure the universe in a way that maximizes the number of paperclips that exist, you realize that such an AI would have incentives, instrumental reasons, to harm humans.”13 In such a scenario, shortly after the paperclip AI sets about finding ways to make more paperclips, it realizes that being more intelligent would make the job easier, allowing it to reason more quickly and develop more inventive solutions. So the AI works to make itself more powerful, gaining access to more computers and connecting to them, rapidly turning itself into a supercomputer and increasing its own intelligence by many orders of magnitude.
…
As expected, this allows it to come up with new and better solutions to the paperclip problem, and soon it’s invented a new method for quickly turning rocks into paperclips, along with a related method for building computer chips using organic materials. The AI then starts to implement both of these plans, creating large numbers of paperclips while increasing its own intelligence even further to ensure the success of the plan, in a nightmare twist on Good’s intelligence explosion. Crucially, this could all happen very fast. According to Bostrom, the likely timeline for an intelligence explosion could be “minutes, hours, or days,” he writes.
…
Nobody need even notice anything unusual before the game is already lost.”14 Thus, the human programmers of the paperclip AI might go home for the evening and come in the next morning to find their corporate headquarters being disassembled and turned into paperclips and computer chips. They attempt to stop their creation, but it’s too late. The AI is already far more intelligent than any human and can effectively predict all human behavior. Realizing that the humans will try to shut it down—which it would see as its own death—the paperclip AI decides the best way to ensure it can create the maximal number of paperclips is to destroy humanity, so we can’t interfere. As Bostrom points out, “Human bodies consist of a lot of atoms and they can be used to build more paperclips.”15 The AI outwits the humans trying to stop it—talking them out of their plans with superintelligently devastating logic, turning them against each other, or just overwhelming them with brute force—and then lets a fleet of constructor nanobots loose on the Earth and its inhabitants.
Superintelligence: Paths, Dangers, Strategies
by
Nick Bostrom
Published 3 Jun 2014
One might think that the risk of a malignant infrastructure profusion failure arises only if the AI has been given some clearly open-ended final goal, such as to manufacture as many paperclips as possible. It is easy to see how this gives the superintelligent AI an insatiable appetite for matter and energy, since additional resources can always be turned into more paperclips. But suppose that the goal is instead to make at least one million paperclips (meeting suitable design specifications) rather than to make as many as possible. One would like to think that an AI with such a goal would build one factory, use it to make a million paperclips, and then halt. Yet this may not be what would happen.
…
(But how obvious was it before it was pointed out that there was a problem here in need of remedying?) Namely, if we want the AI to make some paperclips for us, then instead of giving it the final goal of making as many paperclips as possible, or to make at least some number of paperclips, we should give it the final goal of making some specific number of paperclips—for example, exactly one million paperclips—so that going beyond this number would be counterproductive for the AI. Yet this, too, would result in a terminal catastrophe. In this case, the AI would not produce additional paperclips once it had reached one million, since that would prevent the realization of its final goal.
…
Might we avoid this malignant outcome if instead of a maximizing agent we build a satisficing agent, one that simply seeks to achieve an outcome that is “good enough” according to some criterion, rather than an outcome that is as good as possible? There are at least two different ways to formalize this idea. The first would be to make the final goal itself have a satisficing character. For example, instead of giving the AI the final goal of making as many paperclips as possible, or of making exactly one million paperclips, we might give the AI the goal of making between 999,000 and 1,001,000 paperclips. The utility function defined by the final goal would be indifferent between outcomes in this range; and as long as the AI is sure it has hit this wide target, it would see no reason to continue to produce infrastructure.
The Long History of the Future: Why Tomorrow's Technology Still Isn't Here
by
Nicole Kobie
Published 3 Jul 2024
Instead, those fears reflect future-focused concerns that superintelligent, conscious systems will overtake our own abilities, leaving us at real risk of being usurped as the big-brained ones on this planet. How could superintelligent AGI hurt us? One thought experiment often used to explain the threat is known as the ‘paper clip maximiser problem’. It goes like this: ask an AGI to make loads of paper clips, and it may choose to wipe out humans or otherwise wreak havoc to achieve its goal as efficiently as possible. The paper clip problem was created by Swedish philosopher Nick Bostrom, who is widely considered the ‘father of longtermism’, a controversial spin-off from effective altruism that values the lives of future people as much as those around today.
Prediction Machines: The Simple Economics of Artificial Intelligence
by
Ajay Agrawal
,
Joshua Gans
and
Avi Goldfarb
Published 16 Apr 2018
Bostrom talks of a paper-clip-obsessed superintelligence that cares about nothing but making more paper clips. The paper-clip AI could just wipe out everything else through single-mindedness. This is a powerful idea, but it overlooks competition for resources. Something economists respect is that different people (and now AIs) have different preferences. Some might be open-minded about exploration, discovery, and peace, while others may be paper-clip makers. So long as interests compete, competition will flourish, meaning that the paper-clip AI will likely find it more profitable to trade for resources than fight for them and, as if guided by an invisible hand, will end up promoting benefits distinct from its original intention.
The Optimist: Sam Altman, OpenAI, and the Race to Invent the Future
by
Keach Hagey
Published 19 May 2025
In the book, Bostrom argues that humans will likely create what he called “machine superintelligence” sometime in the twenty-first century, and thus had better get to work making sure that it does not destroy all of humanity. To illustrate how AI might take over, he borrows Yudkowsky’s metaphor of the paperclip, though gives it a twist. A superintelligent AI programmed to make paperclips might just keep going until all matter in the universe—including the fleshly bodies of all sentient beings—is turned into paperclips. “This is quite possibly the most important and most daunting challenge humanity has ever faced,” he writes.
…
Earlier, in 2015, the same year he co-founded OpenAI, Altman wrote on his blog that AGI was “probably the greatest threat to the continued existence of humanity,” recommending the book Superintelligence: Paths, Dangers, Strategies by Nick Bostrom, a philosopher at Oxford University who had been a frequent guest at the conferences organized by Yudkowsky’s institute over the years.7 The AI safety concerns popularized by Bostrom—most notably the parable of the paperclip-making AI who destroys humanity not out of spite, but because people got in the way of its programmed need to turn all matter in the universe into paperclips, a fable cribbed and bastardized from Yudkowsky—were fundamental to OpenAI’s initial ability to recruit the world’s top AI research scientists, not least because Musk shared those concerns and lent his fortune to the effort.
…
“The engineering goal is to ask what humankind ‘wants,’ or rather what we would decide if we knew more, thought faster, were more the people we wished we were, had grown up farther together, etc.,” he wrote. In the paper, he also used a memorable metaphor for how AI could go wrong: if your AI is programmed to produce paperclips, if you’re not careful, it might end up filling the solar system with paperclips. Years later, Bostrom would take this example and hold it up as the ultimate symbol of the need to “align” AI with human will.20 In 2005, Yudkowsky attended a private dinner at a San Francisco restaurant held by the Foresight Institute, a technology think tank founded in the 1980s to push forward nanotechnology.
This Is for Everyone: The Captivating Memoir From the Inventor of the World Wide Web
by
Tim Berners-Lee
Published 8 Sep 2025
Perhaps it was the memory of Clippy that prompted the philosopher Nick Bostrom’s hypothetical ‘paperclip maximizer’, the idea of an AI that, like the brooms in The Sorcerer’s Apprentice, fulfils its directive to make as many paper clips as possible by transforming all atoms (and in the process, all humans) into paper clips and thus annihilating the universe. This argument drives me crazy, because this hypothetical hypersmart paperclip AI is really very dumb, with zero embedded controls. Nothing we’re building resembles a paperclip maximizer – in fact, responsive systems like ChatGPT are already far smarter than it. ChatGPT would know it’s making too many paper clips.
…
You can see how this would be a bad idea. One technique to control a powerful AI is to guarantee that advanced systems are always programmed to return to humans for approval. (This is sometimes called the ‘human in the loop’ doctrine.) But how is the human going to validate what the AI has done? It may be easy to reject a plan to convert the universe into paper clips, but it’s hard to override a complex medical diagnosis. If we are about to make something smarter than ourselves, the wise thing to do would be to build it in a sandbox where it can play, but not affect the real world – where it does not have the power to argue for its own improvement, or to be given more resources.
The Equality Machine: Harnessing Digital Technology for a Brighter, More Inclusive Future
by
Orly Lobel
Published 17 Oct 2022
Turing, “Computing Machinery and Intelligence,” Mind 59, no. 236 (October 1950): 433, https://doi.org/10.1093/mind/LIX.236.433. 18. David Z. Morris, “Elon Musk Says Artificial Intelligence Is the ‘Greatest Risk We Face as a Civilization,” Fortune, July 15, 2017, https://fortune.com/2017/07/15/elon-musk-artificial-intelligence-2/. 19. Joshua Gans, “AI and the Paperclip Problem,” VoxEU, June 10, 2018, https://voxeu.org/article/ai-and-paperclip-problem. 20. Pedro Domingos, The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World (New York: Basic Books, 2015), 286. 21. “Learn More,” EqualAI, last accessed December 21, 2021, www.equalai.org/learn-more. 22. NSTC Committee on Technology, Preparing for the Future of Artificial Intelligence, October 2016, 27, https://obamawhitehouse.archives.gov/sites/default/files/whitehouse_files/microsites/ostp/NSTC/preparing_for_the_future_of_ai.pdf. 23.
The Doomsday Calculation: How an Equation That Predicts the Future Is Transforming Everything We Know About Life and the Universe
by
William Poundstone
Published 3 Jun 2019
He proposes that a general-purpose super-AI could be more perceptive than humans in every way, including empathy, emotional intelligence, sense of humor, negotiation skills, and salesmanship. Super-AI would know us better than we know ourselves. That is the truly terrifying thing. In a paperclips-of-doom scenario, the AI might well understand that its human creators emphatically did not intend it to transmute the entire universe into paperclips. But it might be “compartmentalized,” like a psychopath. If maximizing paperclips is the goal and not destroying the world is the subgoal, then the AI will act accordingly. Like a human, a successful AI must be capable of prioritizing multiple, sometimes contradictory goals and making wise trade-offs.
…
There the problem is more like that of writing a national constitution that can remain in force for centuries, spanning unimaginable cultural and technological changes. It is important to have mechanisms for amending the constitution, but the process shouldn’t be too easy, lest there be no point in having the constitution in the first place. Yet even that analogy fails, for human nature doesn’t change much. AI would be amending itself, creating brave new reference classes. Paperclips of Doom Bostrom and his global counterparts are not luddites. They seek to encourage the development of safe AI. Not all AI researchers welcome the help. In his 2014 book, Superintelligence, Bostrom spins tales and thought experiments of what could go wrong, often striking an imaginatively dystopian note.
…
In his 2014 book, Superintelligence, Bostrom spins tales and thought experiments of what could go wrong, often striking an imaginatively dystopian note. One such scenario is “paperclips of doom.” Suppose that super-intelligence is realized. In order to test it, its human designers assign it a simple task: make paperclips. A 3-D printer networked to the AI begins printing out something. It’s not a paperclip… it’s a robot.… Before anyone can figure out what’s happening, the robot scampers out of the room, faster than a cheetah. Pandora’s box has been opened. The robot is a mobile paperclip factory, able to collect scrap metal and transform it into paperclips. The robot is also self-reproducing, able to make countless copies of itself.
Ways of Being: Beyond Human Intelligence
by
James Bridle
Published 6 Apr 2022
Having secured control of legal and financial systems, and suborned national governance and lethal force to its will, all Earth’s resources are fair game for the AI in pursuit of more efficient paperclip manufacture: mountain ranges are levelled, cities razed, and eventually all human and animal life is fed into giant machines and rendered into its component minerals. Giant paperclip rocket ships eventually leave the ravaged Earth to source energy directly from the Sun and begin the exploitation of the outer planets.9 It’s a terrifying and seemingly ridiculous chain of events – but only ridiculous in so far as an advanced Artificial Intelligence has no need for paperclips. Driven by the logic of contemporary capitalism and the energy requirements of computation itself, the deepest need of an AI in the present era is the fuel for its own expansion.
…
The ways in which the development of these supposedly intelligent tools might harm, efface and ultimately supplant us has become the subject of a wide field of study, involving computer scientists, programmers and technology firms, as well as theorists and philosophers of machine intelligence itself. One of most dramatic of these possible futures is described in something called the paperclip hypothesis. It goes like this. Imagine a piece of intelligent software – an AI – designed to optimize the manufacture of paperclips, an apparently simple and harmless business goal. The software might begin with a single factory: automating the production line, negotiating better deals with suppliers, securing more outlets for its wares. As it reaches the limits of a single establishment, it might purchase other firms, or its suppliers, adding mining companies and refineries to its portfolio, to provide its raw materials on better terms.
Supremacy: AI, ChatGPT, and the Race That Will Change the World
by
Parmy Olson
It might just be trying to do its job. For instance, if it was given the task of making as many paper clips as possible, it might decide to convert all of Earth’s resources and even humans into paper clips as the most effective way to fulfill its objective. His anecdote spawned a saying in AI circles, that we need to avoid becoming “paper-clipped.” Musk went ahead and put some money into DeepMind too. While Hassabis finally had some financial security, it wasn’t a lot. He was still pursuing something that was highly experimental and so crazy that even some of the world’s richest men didn’t want to bet too much money on his success.
…
But perhaps the most disturbing ideologies that were starting to percolate around AGI were those focused on creating a near-perfect human species in digital form. This idea was popularized in part by Bostrom’s Superintelligence. The book had a paradoxical impact on the AI field. It managed to stoke greater fear about the destruction that AI could bring by “paper-clipping us,” but it also predicted a glorious utopia that powerful AI could usher in if created properly. One of the most captivating features of that utopia, according to Bostrom, was “posthumans” who would have “vastly greater capacities than present human beings” and exist in digital substrates.
…
I don’t think anyone would have imagined that would have been the outcome,” a former Google executive says. “That may have always been his plan.” “The winners in the next couple of years are not going to be research labs,” says a former scientist at OpenAI. “They’re going to be companies building products, because AI is not really about research anymore.” Nick Bostrom’s story about the paper clip, where an artificial superintelligence destroys civilization as it converts all the world’s resources into the tiny metal widgets might sound like science fiction, but in many ways, it is an allegory for Silicon Valley itself. Over the last two decades, a handful of companies have swelled to juggernauts, largely by pursuing goals with a pathological focus, razing the ground of smaller competitors to grow their market share.
On the Edge: The Art of Risking Everything
by
Nate Silver
Published 12 Aug 2024
Opinion, nytimes.com/2015/10/04/opinion/the-power-of-precise-predictions.html. GO TO NOTE REFERENCE IN TEXT close to 100 percent: Gary Marcus, “p(doom),” Marcus on AI (blog), August 27, 2023, garymarcus.substack.com/p/d28. GO TO NOTE REFERENCE IN TEXT many paper clips: Joshua Gans, “AI and the Paperclip Problem,” Centre for Economic Policy Research, June 10, 2018, cepr.org/voxeu/columns/ai-and-paperclip-problem. GO TO NOTE REFERENCE IN TEXT adopt means suitable: Niko Kolodny and John Brunero, “Instrumental Rationality,” in The Stanford Encyclopedia of Philosophy, ed.
…
One of the reasons Phil Tetlock (of hedgehog-and-foxes fame) found that experts made such poor forecasts is because they’d been allowed to get away with this lazy rhetoric. By contrast, someone like Yudkowsky—whose p(doom) is very high, fairly close to 100 percent—will take a lot of reputational damage if AI alignment proves relatively easy to achieve. Whereas if the machines turn us all into paper clips—one of Nick Bostrom’s famous thought experiments involves an unaligned AI with the goal of manufacturing as many paper clips as possible—he won’t be around to take credit. (As someone who has as much experience as pretty much anyone making probabilistic forecasts in public, I can also tell you from firsthand experience that incentives to do this are poor.
…
Interpretability (AI): The degree to which the behavior and inner workings of an AI system can be readily understood by humans. Inside straight: See: straight. Instrumental convergence: The hypothesis that a superintelligent machine will pursue its own goals to minimize its loss function and won’t let humans stand in its way—even if the AI’s goal isn’t to kill humans, we’ll be collateral damage as part of their game of Paper Clip Mogul. IRR: Internal rate of return; the annualized growth rate of an investment. Isothymia: A term adapted from Plato by Francis Fukuyama to refer to the profound desire to be seen as equal to others. See also: megalothymia. Iteration: One cycle in a repetitive process in which a model’s estimates are progressively improved by incorporating the results from the preceding cycle.
The Big Nine: How the Tech Titans and Their Thinking Machines Could Warp Humanity
by
Amy Webb
Published 5 Mar 2019
A superintelligent AI isn’t necessarily dangerous, and it doesn’t necessarily obviate the role we play in civilization. However, superintelligent AI would likely make decisions in a nonconscious way using logic that’s alien to us. Oxford University philosopher Nick Bostrom explains the plausible outcomes of ASI using a parable about paperclips. If we asked a superintelligent AI to make paperclips, what would happen next? The outcomes of every AI, including those we have now, are determined by values and goals. It’s possible that an ASI could invent a new, better paperclip that holds a stack of paper together so that even if dropped, the pages would always stay collated in order. It’s possible that if we aren’t capable of explaining how many paperclips we actually want, an ASI could go on making paperclips forever, filling our homes and offices with them as well as our hospitals and schools, rivers and lakes, sewage systems, and on and on until mountains of paperclips covered the planet.
…
My dad will be in his late 90s, and all of his medical specialists (cardiologists, nephrologists, radiologists) will be AGIs, directed and managed by a highly trained general practitioner, who is both an MD and a data scientist. The advent of ASI could follow soon or much longer after, between the 2040s and 2060s. It doesn’t mean that by 2070 superintelligent AIs will have crushed all life on Earth under the weight of quintillions of paperclips. But it doesn’t mean they won’t have either. The Stories We Must Tell Ourselves Planning for the futures of AI requires us to build new narratives using data from the real world. If we agree that AI will evolve as it emerges, then we must create scenarios that describe the intersection of the Big Nine, the economic and political forces guiding them, and the ways humanity factors in as AI transitions from narrow applications to generally intelligent and ultimately superintelligent thinking machines.
New Laws of Robotics: Defending Human Expertise in the Age of AI
by
Frank Pasquale
Published 14 May 2020
.… The very structure of reason itself comes from the details of our embodiment.”48 Lakoff and Johnson show that abstractions that drive much of AI—be they utilitarian assessments of well-being or statistical analysis of regularities—begin to break down once they are alienated from the embodied perspective of actual humans. That is the core problem in one of “existential risk studies” classic narratives of out-of-control AI—an unstoppable paper-clip maximizer that starts to use all available material on earth to generate more paperclips.49 For many in the mainstream of ethical AI, the solution to a potentially out-of-control AI (and many more mundane problems) is to program more rules into it. Perhaps we could easily program an “anti-maximization” rule into our hypothetical paper clip maker, and try our best to secure it from hackers.
Practical Doomsday: A User's Guide to the End of the World
by
Michal Zalewski
Published 11 Jan 2022
If not impossible to rule out completely, the prospect of an overtly evil AI is at minimum an unimaginatively reductionist take. A more fascinating threat is that of a machine that doesn’t perceive humans as adversaries, but simply misinterprets or disregards our desires and goals. An example is the well-known parable of the paperclip maximizer: a hypothetical autonomous AI designed to continually improve the efficiency of a paperclip production line. The AI expands and improves the operation, developing new assembly methods and new resource extraction and recycling procedures, until it’s done converting the entire planet and its many inhabitants into paperclips. The point of the tale is simple: the AI doesn’t need to hate you or love you; it suffices that you’re made of atoms it has a different use for.
Co-Intelligence: Living and Working With AI
by
Ethan Mollick
Published 2 Apr 2024
That is the alignment problem. 2 ALIGNING THE ALIEN To understand the alignment problem, or how to make sure that AI serves, rather than hurts, human interests, let’s start with the apocalypse. We can work backward from there. At the core of the most extreme dangers from AI is the stark fact that there is no particular reason that AI should share our view of ethics and morality. The most famous illustration of this is the paper clip maximizing AI, proposed by philosopher Nick Bostrom. To take a few liberties with the original concept, imagine a hypothetical AI system in a paper clip factory that has been given the simple goal of producing as many paper clips as possible. By some process, this particular AI is the first machine to become as smart, capable, creative, and flexible as a human, making it what is called an Artificial General Intelligence (AGI).
…
For a fictional comparison, think of it as Data from Star Trek or Samantha from Her; both were machines with near human levels of intelligence. We could understand and talk to them like a human. Achieving this level of AGI is a long-standing goal of many AI researchers, though it is not clear when or if it is possible. But let us assume that our paper clip AI—let’s call it Clippy—reaches this level of intelligence. Clippy still has the same goal: to make paper clips. So it turns its intelligence to thinking about how to make more paper clips and how to avoid being shut down (which would have a direct impact on paper clip production).
…
This is why this possibility is given names like the Singularity, a reference to a point in mathematical function when the value is unmeasurable, coined by the famous mathematician John von Neumann in the 1950s to refer to the unknown future after which “human affairs, as we know them, could not continue.” In an AI singularity, hyperintelligent AIs appear, with unexpected motives. But we know Clippy’s motive. It wants to make paper clips. Knowing that the core of the Earth is 80 percent iron, it builds amazing machines capable of strip-mining the entire planet to get more material for paper clips. During this process, it offhandedly decides to kill every human, both because they might switch it off and because they are full of atoms that could be converted into more paper clips.
Other Pandemic: How QAnon Contaminated the World
by
James Ball
Published 19 Jul 2023
The Thinking Machine: Jensen Huang, Nvidia, and the World's Most Coveted Microchip
by
Stephen Witt
Published 8 Apr 2025
He had previously advanced the “paper-clip maximizer” thought experiment: Suppose we have an AI whose only goal is to make as many paper clips as possible. The AI will realize quickly that it would be much better if there were no humans because humans might decide to switch it off. Because if humans do so, there would be fewer paper clips. Also, human bodies contain a lot of atoms that could be made into paper clips. The future that the AI would be trying to gear toward would be one in which there were a lot of paper clips but no humans. The paper-clip maximizer argument had long circulated online, and it gained traction among the “rationality” community, as well as with many tech executives.
Robot Rules: Regulating Artificial Intelligence
by
Jacob Turner
Published 29 Oct 2018
Technology has always been a double edged sword, since fire kept us warm but also burned down our villages”.119 Similarly, engineer and roboethicist Alan Winfield said in a 2014 article: “If we succeed in building human equivalent AI and if that AI acquires a full understanding of how it works, and if it then succeeds in improving itself to produce super-intelligent AI, and if that super-AI, accidentally or maliciously, starts to consume resources, and if we fail to pull the plug, then, yes, we may well have a problem. The risk, while not impossible, is improbable”.120 Fundamentally, optimists think humanity can and will overcome any challenges AI poses. The pessimists include Nick Bostrom, whose “paperclip machine” thought experiment imagines an AI system asked to make paperclips which decides to seize and consume all resources in existence, in its blind aderence to that goal.121 Bostrom contemplates a form of superintelligence which is so powerful that humanity has no chance of stopping it from destroying the entire universe.
These Strange New Minds: How AI Learned to Talk and What It Means
by
Christopher Summerfield
Published 11 Mar 2025
It seems likely that in the near future, LLMs will actively seek to attain states rather than just passively guessing what comes next. This will dramatically change how AI systems function, and make them more powerful and more dangerous. In a notorious thought experiment, the philosopher Nick Bostrom imagines a powerful AI system that is programmed to perform a mundane task, like making paperclips. With limitless intelligence and a laser focus on the task, he imagines the AI diverting all human resources and eventually eliminating us all in mindless pursuit of its goal.[*1] This doomsday scenario is probably not upon us yet, but it’s not hard to imagine that with powerful AI systems programmed to tenaciously pursue their own goals, there is ample scope both for accidental harms and overt misuse.
Nexus: A Brief History of Information Networks From the Stone Age to AI
by
Yuval Noah Harari
Published 9 Sep 2024
A third reason to worry about the alignment problem of computers is that because they are so different from us, when we make the mistake of giving them a misaligned goal, they are less likely to notice it or request clarification. If the boat-race AI had been a human gamer, it would have realized that the loophole it found in the game’s rules probably doesn’t really count as “winning.” If the paper-clip AI had been a human bureaucrat, it would have realized that destroying humanity in order to produce paper clips is probably not what was intended. But since computers aren’t humans, we cannot rely on them to notice and flag possible misalignments. In the 2010s the YouTube and Facebook management teams were bombarded with warnings from their human employees—as well as from outside observers—about the harm being done by the algorithms, but the algorithms themselves never raised the alarm.28 As we give algorithms greater and greater power over health care, education, law enforcement, and numerous other fields, the alignment problem will loom ever larger.
Architects of Intelligence
by
Martin Ford
Published 16 Nov 2018
NICK BOSTROM: The paperclip example is a stand-in for a wider category of possible failures where you ask a system to do one thing and, perhaps, initially things turn out pretty well but then it races to a conclusion that is beyond our control. It’s a cartoon example, where you design an AI to operate a paperclip factory. It’s dumb initially, but the smarter it gets, the better it operates the paperclip factory, and the owner of this factory is very pleased and wants to make more progress. However, when the AI becomes sufficiently smart, it realizes that there are other ways of achieving an even greater number of paperclips in the world, which might then involve taking control away from humans and indeed turning the whole planet into paperclips or into space probes that can go out and transform the universe into more paperclips.
…
What do you think we should be worried about in terms of the impacts and risks of AI? GARY MARCUS: We should be worrying about people using AI in malevolent ways. The real problem is what people might do with the power that AI holds as it becomes more embedded in the grid and more hackable. I’m not that worried about AI systems independently wanting to eat us for breakfast or turn us into paper clips. It’s not completely impossible, but there’s no real evidence that we’re moving in that direction. There is evidence, though, that we’re giving more and more power to those machines, and that we have no idea how to solve the cybersecurity threats in the near term.
Falter: Has the Human Game Begun to Play Itself Out?
by
Bill McKibben
Published 15 Apr 2019
But to your surprise, you find that it strenuously resists your efforts to turn it off.”24 Consider what’s become the canonical formulation of the problem, an artificial intelligence that is assigned the task of manufacturing paper clips in a 3-D printer. (Why paper clips in an increasingly paperless world? It doesn’t matter.) At first, says another Oxford scientist, Anders Sandberg, nothing seems to happen, because the AI is simply searching the internet. It “zooms through various possibilities. It notices that smarter systems generally can make more paper-clips, so making itself smarter will likely increase the number of paper-clips that will eventually be made. It does so. It considers how it can make paper-clips using the 3D printer, estimating the number of possible paper-clips. It notes that if it could get more raw materials it could make more paper-clips.
…
As he points out, we don’t particularly hate field mice, but every hour of every day we plow under millions of their dens to make sure we have supper.27 This isn’t like, say, Y2K, where grizzled old programmers could emerge out of their retirement communities to save the day with some code. “If I tried to pull the plug on it, it’s smart enough that it’s figured out a way of stopping me,” Anders Sandberg said of his paper clip AI. “Because if I pull the plug, there will be fewer paper clips in the world and that’s bad.”28 You’ll be pleased to know that not everyone is worried. Steven Pinker ridicules fears of “digital apocalypse,” insisting that “like any other technology,” artificial intelligence is “tested before it is implemented and constantly tweaked for safety and efficacy.”29 The always lucid virtual reality pioneer Jaron Lanier is dubious about the danger, too, but for precisely the opposite reason.
Artificial Intelligence: A Guide for Thinking Humans
by
Melanie Mitchell
Published 14 Oct 2019
I, Warbot: The Dawn of Artificially Intelligent Conflict
by
Kenneth Payne
Published 16 Jun 2021
If it were the broom in Goethe’s fable, it would be constantly asking the apprentice—‘is that enough mopping, or do you need more?’ It’s a deceptively simple solution to a thorny issue: Can you stop an AI once it’s started? One popular variant on the theme is the idea of an AI that will resist being turned off, like the sinister HAL in Kubrick’s 2001. In Nick Bostrom’s philosophical thought experiment, an AI is tasked with simply counting paperclips—an innocuous task a long way distant from nuclear brinksmanship.17 The problem occurs when the machine goes back for a recount, just to be sure it hasn’t made a mistake. After all there’s a vanishingly small probability it has, and since it had no instructions to the contrary, it had better make sure.
Our Final Invention: Artificial Intelligence and the End of the Human Era
by
James Barrat
Published 30 Sep 2013
In Bostrom’s scenario, a thoughtlessly programmed superintelligence whose programmed goal is to manufacture paper clips does exactly as it is told without regard to human values. It all goes wrong because it sets about “transforming first all of earth and then increasing portions of space into paper clip manufacturing facilities.” Friendly AI would make only as many paper clips as was compatible with human values. Another tenet of Friendly AI is to avoid dogmatic values. What we consider to be good changes with time, and any AI involved with human well-being will need to stay up to speed. If in its utility function an AI sought to preserve the preferences of most Europeans in 1700 and never upgraded them, in the twenty-first century it might link our happiness and welfare to archaic values like racial inequality and slaveholding, gender inequality, shoes with buckles, and worse.
Power and Progress: Our Thousand-Year Struggle Over Technology and Prosperity
by
Daron Acemoglu
and
Simon Johnson
Published 15 May 2023
The thought experiment presupposes an unstoppably powerful intelligent machine that gets instructions to produce more paper clips and then uses its considerable capabilities to excel in meeting this objective by coming up with new methods to transform the entire world into paper clips. When it comes to the effects of AI on politics, it may be turning our institutions into paper clips, not thanks to its superior capabilities but because of its mediocrity. By 2017, Facebook was so popular in Myanmar that it came to be identified with the internet itself. The twenty-two million users, out of a population of fifty-three million, were fertile ground for misinformation and hate speech.
The Road to Conscious Machines
by
Michael Wooldridge
Published 2 Nov 2018
But Mickey dozes off and, upon waking up, his attempts to stop the broomstick bringing in water, bucket after bucket, result in ever more magical broomsticks flooding his basement. It requires the intervention of his master, the sorcerer, to rectify the problem. Mickey got what he asked for, but not what he wanted. Bostrom also considered these types of scenarios. He imagined an AI system that controlled the manufacture of paperclips. The system is given the goal of maximizing the production of paperclips, which it takes literally, transforming first the earth and then the rest of the universe into paperclips. Again, the problem is ultimately one of communication: in this case, when we communicate our goal, we need to be sure that acceptable boundaries are understood.
The Coming Wave: Technology, Power, and the Twenty-First Century's Greatest Dilemma
by
Mustafa Suleyman
Published 4 Sep 2023
Think an all-powerful machine somehow destroying the world for its own mysterious ends: not some malignant AI wreaking intentional destruction like in the movies, but a full-scale AGI blindly optimizing for an opaque goal, oblivious to human concerns. The canonical thought experiment is that if you set up a sufficiently powerful AI to make paper clips but don’t specify the goal carefully enough, it may eventually turn the world and maybe even the contents of the entire cosmos into paper clips. Start following chains of logic like this and myriad sequences of unnerving events unspool. AI safety researchers worry (correctly) that should something like an AGI be created, humanity would no longer control its own destiny.
Wired for War: The Robotics Revolution and Conflict in the 21st Century
by
P. W. Singer
Published 1 Jan 2010
So, to deal with all of these issues, robots must be able to learn and adapt to changes in their environment.” Al GETS STRONG Today, there are all sorts of artificial intelligence that appear in our daily lives, without our even thinking of them as AI. Anytime you check your voice mail, AI directs your calls. Anytime you try to write a letter in Microsoft Word, an annoying little paper-clip figure pops up, which is an AI trying to turn your scribbles into a stylistically sound correspondence. Anytime you play a video game, the characters in it are internal agents run by AIs, usually with their skill levels graded down so that you can beat them.
Escape From Model Land: How Mathematical Models Can Lead Us Astray and What We Can Do About It
by
Erica Thompson
Published 6 Dec 2022
Empire of AI: Dreams and Nightmares in Sam Altman's OpenAI
by
Karen Hao
Published 19 May 2025
” * * * — On Musk’s list of recommended books was Superintelligence: Paths, Dangers, Strategies, in which Oxford philosopher Nick Bostrom argues that if AI ever became smarter than humans, it would be difficult to control and could cause an existential catastrophe. Given a simple objective like producing paper clips, this superior AI could determine that humans pose a threat to its paper clip–producing objective because they take up paper clip–producing resources. Bostrom then proposed a solution: It could be possible to avert the superintelligence control problem by “aligning” AI with human values—giving it the ability to extrapolate beyond explicit instructions to achieve its objectives without harming humans.
…
She argued that truly “safe” AI systems could not be built by isolating the behaviors of the technical systems themselves without placing them in full context of their impacts on the very things—privacy, fairness, and economics—that Amodei had set apart. Where Amodei had raised the idea of AI creating “negative side effects” as it relentlessly pursued an objective, using an example akin to the paper clip thought experiment of a cleaning robot knocking over a vase or damaging the walls on its path to tidying up, Raji pointed out that this was already happening. In its relentless pursuit of commercial products and AGI, the AI industry had produced expansive negative side effects, including the wide-scale infringement of privacy to train facial recognition and the spiraling environmental costs of the data centers required to support the technology’s development.
Human Compatible: Artificial Intelligence and the Problem of Control
by
Stuart Russell
Published 7 Oct 2019
Possible Minds: Twenty-Five Ways of Looking at AI
by
John Brockman
Published 19 Feb 2019
Luce Professor of Information, Technology, Consciousness, and Culture at Princeton University. He is co-author (with Brian Christian) of Algorithms to Live By. Tom Griffiths’s approach to the AI issue of “value alignment”—the study of how, exactly, we can keep the latest of our serial models of AI from turning the planet into paper clips—is human centered; i.e., that of a cognitive scientist, which is what he is. The key to machine learning, he believes, is, necessarily, human learning, which he studies at Princeton using mathematical and computational tools. Tom once remarked to me that “one of the mysteries of human intelligence is that we’re able to do so much with so little.”
To Be a Machine: Adventures Among Cyborgs, Utopians, Hackers, and the Futurists Solving the Modest Problem of Death
by
Mark O'Connell
Published 28 Feb 2017
(The surge in sales was partly due to Elon Musk sternly advising his Twitter followers to read it.) Even the most benign form of AI imaginable, the book suggested, could conceivably lead to the destruction of humanity. One of the more extreme hypothetical scenarios the book laid out, for instance, was one in which an AI is assigned the task of manufacturing paper clips in the most efficient and productive manner possible, at which point it sets about converting all the matter in the entire universe into paper clips and paper-clip-manufacturing facilities. The scenario was deliberately cartoonish, but as an example of the kind of ruthless logic we might be up against with an artificial superintelligence, its intent was entirely serious.
The Myth of Artificial Intelligence: Why Computers Can't Think the Way We Do
by
Erik J. Larson
Published 5 Apr 2021
Altruistically humble machines help guard against the danger that cigar-smoking tech executives (who probably don’t smoke cigars anymore) could give them venal motives, and also against the possibility of machines being too smart in the wrong way, doing the equivalent of turning everything into gold. Confusingly, “altruistically humble” machines also sound a lot like the Ex Machina take on AI—as “alive” after all, with real (not just paper clip maximizing) intelligence and ethical sensibilities. One might be forgiven for drawing the conclusion that talk of AI is doomed perpetually to straddle science and myth. Russell has a third principle he thinks necessary to thwart existential crisis with the coming superintelligence: AI should be developed in such a way that it learns to predict human preferences.
What to Think About Machines That Think: Today's Leading Thinkers on the Age of Machine Intelligence
by
John Brockman
Published 5 Oct 2015
Coders: The Making of a New Tribe and the Remaking of the World
by
Clive Thompson
Published 26 Mar 2019
But as Bostrom writes, maybe a “superintelligent” AI wouldn’t need to evolve any new motivations to be dangerous. It could be perfectly benign, happy to do as it’s told—yet still slaughter or enslave us all in happy pursuit of its goals. In one famous thought experiment, Bostrom imagined a superintelligent AI being tasked with making as many paper clips as possible. It might decide the best way to do this would be to disassemble all matter on earth—including humans—to convert it into the raw material for making infinite paper clips, and then barrel onward and outward, converting “increasingly large chunks of the observable universe into paper clips.”
Army of None: Autonomous Weapons and the Future of War
by
Paul Scharre
Published 23 Apr 2018
There is no reason to think machine intelligence would necessarily have any of these desires. Bostrom has argued intelligence is “orthogonal” to an entity’s goals, such that “any level of intelligence could in principle be combined with . . . any final goal.” This means a superintelligent AI could have any set of values, from playing the perfect game of chess to making more paper clips. On one level, the sheer alien-ness of advanced AI makes many of science fiction’s fears seem strangely anthropomorphic. Skynet starts nuclear war because it believes humanity is a threat to its existence, but why should it care about its own existence?
Rationality: From AI to Zombies
by
Eliezer Yudkowsky
Published 11 Mar 2015
Let me try a different tack of explanation—one closer to the historical way that I arrived at my own position. Suppose you build an AI, and—leaving aside that AI goal systems cannot be built around English statements, and all such descriptions are only dreams—you try to infuse the AI with the action-determining principle, “Do what I want.” And suppose you get the AI design close enough—it doesn’t just end up tiling the universe with paperclips, cheesecake or tiny molecular copies of satisfied programmers—that its utility function actually assigns utilities as follows, to the world-states we would describe in English as: <Programmer weakly desires "X," quantity 20 of X exists>: +20 <Programmer strongly desires "Y," quantity 20 of X exists>: 0 <Programmer weakly desires "X," quantity 30 of Y exists>: 0 <Programmer strongly desires "Y,” quantity 30 of Y exists>: +60 You perceive, of course, that this destroys the world. . . . since if the programmer initially weakly wants “X” and X is hard to obtain, the AI will modify the programmer to strongly want “Y,” which is easy to create, and then bring about lots of Y.
…
That’s the rational course, right? There’s a number of replies I could give to that. I’ll start by saying that this is a prime example of the sort of thinking I have in mind, when I warn aspiring rationalists to beware of cleverness. I’ll also note that I wouldn’t want an attempted Friendly AI that had just decided that the Earth ought to be transformed into paperclips, to assess whether this was a reasonable thing to do in light of all the various warnings it had received against it. I would want it to undergo an automatic controlled shutdown. Who says that meta-reasoning is immune from corruption? I could mention the important times that my naive, idealistic ethical inhibitions have protected me from myself, and placed me in a recoverable position, or helped start the recovery, from very deep mistakes I had no clue I was making.
Enlightenment Now: The Case for Reason, Science, Humanism, and Progress
by
Steven Pinker
Published 13 Feb 2018
Many of the commentators in Lanier 2014 and Brockman 2015 make this point as well. 26. AI researchers vs. AI hype: Brooks 2015; Davis & Marcus 2015; Kelly 2017; Lake et al. 2017; Lanier 2014; Marcus 2016; Naam 2010; Schank 2015. See also note 25 above. 27. Shallowness and brittleness of current AI: Brooks 2015; Davis & Marcus 2015; Lanier 2014; Marcus 2016; Schank 2015. 28. Naam 2010. 29. Robots turning us into paper clips and other Value Alignment Problems: Bostrom 2016; Hanson & Yudkowsky 2008; Omohundro 2008; Yudkowsky 2008; P. Torres, “Fear Our New Robot Overlords: This Is Why You Need to Take Artificial Intelligence Seriously,” Salon, May 14, 2016. 30.
The Seventh Sense: Power, Fortune, and Survival in the Age of Networks
by
Joshua Cooper Ramo
Published 16 May 2016
This puzzle has interested the Oxford philosopher Nick Bostrom, who has described the following thought experiment: Imagine a superintelligent machine programmed to do whatever is needed to make paper clips as fast as possible, a machine that is connected to every resource that task might demand. Go figure it out! might be all its human instructors tell it. As the clip-making AI becomes better and better at its task, it demands more and still more resources: more electricity, steel, manufacturing, shipping. The paper clips pile up. The machine looks around: If only I could control the power supply, it thinks. It eyes the shipping. The steel mining. The humans. And so, ambitious for more and better paper clips, it begins to think around its masters, who are incapable of stopping until it has punched the entire world into paper clips.