AlphaGo


description: an artificial intelligence developed by Google's DeepMind to play the board game Go

generative artificial intelligence

116 results

The Creativity Code: How AI Is Learning to Write, Paint and Think

by Marcus du Sautoy  · 7 Mar 2019  · 337pp  · 103,522 words

The competition would be played over five games with the winner taking home a prize of one million dollars. The name of Sedol’s challenger: AlphaGo. AlphaGo is the brainchild of Demis Hassabis. Hassabis was born in London in 1976 to a Greek Cypriot father and a mother from Singapore. Both parents

So in October 2015 they decided to test-run their program in a secret competition against the current European champion, the Chinese-born Fan Hui. AlphaGo destroyed Fan Hui five games to nil. But the gulf between European players of the game and those in the Far East is huge. The

the press in the Far East heard about Fan Hui’s defeat they were merciless in their dismissal of how meaningless the win was for AlphaGo. Indeed, when Fan Hui’s wife contacted him in London after the news got out, she begged her husband not to go online. Needless

In the following months his ranking went from 633 to the 300s. But it wasn’t only Fan Hui who was learning. Every game AlphaGo plays affects its code and changes it to improve its play next time around. It was at this point that the DeepMind team felt confident

of a million dollars. Although the venue was public, the precise location within the hotel was kept secret and was isolated from noise – not that AlphaGo was going to be disturbed by the chitchat of the press and the whispers of curious bystanders. It would assume a perfect Zen-like state

beat the best humans at the game. As the date for the match approached, the team at DeepMind felt they needed someone to really stretch AlphaGo and to test it for any weaknesses. So they invited Fan Hui back to play the machine going into the last few weeks. Despite having

who had control of the game, often becoming totally delusional that it was winning when the opposite was true. If Sedol tapped into this weakness, AlphaGo wouldn’t just lose, it would appear extremely stupid. The DeepMind team worked around the clock trying to fix this blind spot. Eventually they just

‘Beautiful. Beautiful.’ It was with a sense of existential anxiety that I fired up the YouTube channel broadcasting the matches that Sedol would play against AlphaGo and joined 280 million other viewers to see humanity take on the machines. Having for years compared creating mathematics to playing the game of Go

right up until the final moments of the game. What they were able to pick up quite quickly was Sedol’s opening strategy. If AlphaGo had learned to play on games that had been played in the past, then Sedol was working on the principle that it would put him

known how to respond and would most likely have made a move that would have serious consequences in the grand arc of the game. But AlphaGo was not a conventional machine. It could assess the new moves and determine a good response based on what it had learned over the course

out of him as it gradually dawned on him that he was losing. He kept looking over at Huang, the DeepMind representative who was playing AlphaGo’s moves, but there was nothing he could glean from Huang’s face. By move 186 Sedol had to recognise that there was no

shock not just Sedol but every human player of the game of Go. The first game was one that experts could follow and appreciate why AlphaGo was playing the moves it was. They were moves a human champion would play. But as I watched game 2 on my laptop at

something rather strange happened. Sedol played move 36 and then retired to the roof of the hotel for a cigarette break. While he was away, AlphaGo on move 37 instructed Huang, its human representative, to place a black stone on the line five steps in from the edge of the board

fifth line has always been regarded as suboptimal, giving your opponent the chance to build up territory that has both short- and long-term influence. AlphaGo had broken this orthodoxy built up over centuries of competing. Some commentators declared it a clear mistake. Others were more cautious. Everyone was intrigued to

easy for me.’ The match was being played over five games. This was the game that Sedol needed to win to be able to stop AlphaGo claiming the match.

The human fight-back

Sedol had a day off to recover. The third game would be played on Saturday, 12 March.

up till 6 a.m. the next morning analysing the games he’d lost so far with a group of fellow professional Go players. Did AlphaGo have a weakness they could exploit? The machine wasn’t the only one who could learn and evolve. Sedol felt he might learn something

from his losses. Sedol played a very strong opening to game 3, forcing AlphaGo to manage a weak group of stones within his sphere of influence on the board. Commentators began to get excited. Some said Sedol had found

to get scary. As I watched the game unfold and the realisation of what was happening dawned on me, I felt physically unwell.’ Sedol pushed AlphaGo to its limits but in so doing he revealed the hidden powers that the program seemed to possess. As the game proceeded, it started to

it won by half a point. All that mattered was that it won. To play such lazy moves was almost an affront to Sedol, but AlphaGo was not programmed with any vindictive qualities. Its sole goal was to win the game. Sedol pushed this way and that, determined not to

too quickly. Perhaps one of these lazy moves was a mistake that he could exploit. By move 176 Sedol eventually caved in and resigned. AlphaGo 3 Humans 0. AlphaGo had won the match. Backstage, the DeepMind team was going through a strange range of emotions. They’d won the match, but seeing

dedicated to promoting Go and science subjects as well as to Unicef. Yet their human code was causing them to empathise with Sedol’s pain. AlphaGo did not demonstrate any emotional response to its win. No little surge of electrical current. No code spat out with a resounding ‘YES!’ It

is this emotional response that is the drive to be creative and venture into the unknown: it was humans, after all, who’d programmed AlphaGo with the goal of winning. Scary because the machine won’t care if the goal turns out to be not quite what its programmers had

gains that accumulate over time, bet the whole bank. Sedol and his team had stayed up all of Saturday night trying to reverse-engineer from AlphaGo’s games how it played. It seemed to work on a principle of playing moves that incrementally increase its probability of winning rather than betting

on the potential outcome of a complicated single move. Sedol had witnessed this when AlphaGo preferred lazy moves to win game 3. The strategy they’d come up with was to disrupt this sensible play by playing the risky single

there for thirty minutes staring at the board, staring at defeat, when he suddenly placed a white stone in an unusual position, between two of AlphaGo’s black stones. Michael Redmond, who was commentating on the YouTube channel, spoke for everyone: ‘It took me by surprise. I’m sure that

at their screens behind the scenes and watched their creation imploding. It was as if move 78 short-circuited the program. It seemed to cause AlphaGo to go into meltdown as it made a whole sequence of destructive moves. This apparently is another characteristic of the way Go algorithms are programmed

No human with a shred of strategic sense would make them. The game dragged on for a total of 180 moves, at which point AlphaGo put up a message on the screen that it had resigned. The press room erupted with spontaneous applause. The human race had got one back

s transformational creativity, whereby breaking out of the system you can find new insights. At the press conference, Hassabis and Silver could not explain why AlphaGo had lost. They would need to go back and analyse why it had made such a lousy move in response to Sedol’s move 78

worthy of response. Perhaps Sedol just needed to get to know his opponent. Perhaps over a longer match he would have turned the tables on AlphaGo. Could he maintain the momentum into the fifth and final game? Losing 3–2 would be very different from 4–1. The last game

but now it is trying hard to claw it back … nail-biting.’ Sedol was in the lead at this stage. It was game on. Gradually AlphaGo did claw back. But right up to the end the DeepMind team was not exactly sure whether it was winning. Finally, on move 281 – after

those looking on, its capability to learn and adapt was something quite new. Hassabis’s tweet after winning the first game summed up the achievement: ‘#AlphaGo WINS!!!! We landed it on the moon.’ It was a good comparison. Landing on the moon did not yield extraordinary new insights about the universe

, but the technology that we developed to achieve such a feat has. Following the last game, AlphaGo was awarded an honorary professional 9 dan rank by the South Korean Go Association, the highest accolade for a Go player.

From hilltop to mountain

new tactics. The fifth line is now played early on, as we have come to understand that it can have big implications for the endgame. AlphaGo has gone on to discover still more innovative strategies. DeepMind revealed at the beginning of 2017 that its latest iteration had played online anonymously against

was the analysis of the games that was truly insightful. Those games are now regarded as a treasure trove of new ideas. In several games AlphaGo played moves that beginners would have their wrists slapped for by their Go master. Traditionally you do not play a stone in the intersection of

the third column and third row. And yet AlphaGo showed how to use such a move to your advantage. Hassabis describes how the game of Go had got stuck on what mathematicians like to

with modern Go is that conventions had built up about ways to play that had ensured players hit peak A. But by breaking those conventions AlphaGo had cleared the fog and revealed an even higher peak B. It’s even possible to measure the difference. In Go, a player using

the conventions of peak A will in general lose by two stones to the player using the new strategies discovered by AlphaGo. This rewriting of the conventions of how to play Go has happened at a number of previous points in history. The most recent was

in the 1930s. His experimentation with ways of playing the opening moves revolutionised the way the game is played. But Go players now recognise that AlphaGo might well have launched an even greater revolution. The Chinese Go champion Ke Jie recognises that we are in a new era: ‘Humanity has played

are now second best to the machine. Sure, the machine was programmed by humans, but that doesn’t really seem to make it feel better. AlphaGo has since retired from competitive play. The Go team at DeepMind has been disbanded. Hassabis proved his Cambridge lecturer wrong. DeepMind has now set its

bigger picture – at least for now.

Machine versus machine

It is this power to change and adapt to new encounters that was exploited to make AlphaGo. The team at DeepMind built their algorithm with a period of supervised learning. This is like an adult helping a child learn the skills that

handedly rediscover all of mathematics to get to the frontier. Instead I spent a few years at university fast-tracking through centuries of mathematical discovery. AlphaGo began by going through the same process. Humans have played millions of games of Go that have been recorded digitally online. This is an amazing

that make you feel like you’ve got to the top but that are little more than tiny hillocks surrounded by towering mountains. What if AlphaGo maximised its game play to beat other players at this local maximum? This appeared to have been the case when European Champion Fan Hui discovered

a weakness in the way AlphaGo was playing some days prior to the event with Lee Sedol. But once the algorithm was introduced to this new game play it quickly learned
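Du Sautoy’s image of hillocks and mountains is the standard picture of a local maximum. As a toy illustration (mine, not the book’s), a greedy hill climber on a two-peak landscape stops on the small hillock when it starts near it — exactly the trap these passages describe:

```python
def hill_climb(f, x, steps=100):
    """Greedy hill climbing on the integers: move to a better
    neighbour while one exists; stop at any peak, local or global."""
    for _ in range(steps):
        best = max((x - 1, x + 1), key=f)
        if f(best) <= f(x):
            return x              # no neighbour improves: we are on a peak
        x = best
    return x

# Two-peak landscape: a hillock of height 3 at x=2, a mountain of height 6 at x=8.
def landscape(x):
    return max(3 - abs(x - 2), 6 - abs(x - 8))

print(hill_climb(landscape, 0))   # prints 2: stuck on the hillock, blind to x=8
```

Escaping peak A for peak B requires a deliberate step downhill first, which is what breaking the Go conventions amounted to.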

descend the hill and find a way to scale new heights. DeepMind now has an even better algorithm that can thrash the original version of AlphaGo. This algorithm circumvented the need to be shown how humans play the game. Like the Atari algorithm, it was given the 19×19 pixels

and a score and started to play, experimenting with moves. It exploited the power of reinforcement learning, the second stage in the building of AlphaGo. This is almost tabula rasa learning and even the team at DeepMind was shocked at how powerful the new algorithm was. It was no longer

training, in which time it played 4.9 million games against itself, it was able to beat by 100 games to nil the version of AlphaGo that had defeated Lee Sedol. What took humans 3000 years to achieve, it did in three days. By day forty it was unbeatable. It
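The self-play loop sketched in these passages can be miniaturised. The following is a hedged illustration, not DeepMind’s code: tabular Monte Carlo learning through self-play on the toy game of Nim (take one or two stones from a pile; whoever takes the last stone wins). One shared value table plays both sides, and every finished game nudges the values of the moves that led to it — the same learn-from-outcomes idea at vastly smaller scale:

```python
import random

def train(pile=10, episodes=50000, eps=0.2, seed=0):
    """Self-play training: both players pick moves epsilon-greedily from
    a shared table Q[(stones_left, stones_taken)], then every move in the
    finished game is updated with the result from its mover's perspective."""
    random.seed(seed)
    Q = {(s, a): 0.0 for s in range(1, pile + 1) for a in (1, 2) if a <= s}
    N = {key: 0 for key in Q}                       # visit counts for averaging
    for _ in range(episodes):
        s, history = pile, []
        while s > 0:
            acts = [a for a in (1, 2) if a <= s]
            if random.random() < eps:
                a = random.choice(acts)             # explore
            else:
                a = max(acts, key=lambda x: Q[(s, x)])  # exploit
            history.append((s, a))
            s -= a
        ret = 1.0                                   # whoever moved last has won
        for (s, a) in reversed(history):            # walk back, flipping sides
            N[(s, a)] += 1
            Q[(s, a)] += (ret - Q[(s, a)]) / N[(s, a)]
            ret = -ret
    return Q

def best_move(Q, s):
    """Greedy move from the learned table."""
    return max((a for a in (1, 2) if a <= s), key=lambda a: Q[(s, a)])
```

After training, the greedy policy leaves its opponent a multiple of three stones — the known optimal strategy for this variant of Nim — and nothing about “multiples of three” was programmed in; it emerges from game outcomes alone.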

Microsoft, the motivation for the Rembrandt project was most likely less artistic than commercial. To convincingly fake a Rembrandt demonstrates how good your code is. AlphaGo’s triumph against Lee Sedol was similarly not so much about discovering new and more creative ways to play the game of Go as it

that it isn’t a mistake rather than an extremely insightful suggestion? The Go commentators were not sure which side of the divide to put AlphaGo’s move 37 in game 2 until they eventually saw that it won the game. But increasingly these algorithms are doing more than playing games

on its way to producing a new theorem of value that might surprise the mathematical community just as the gaming world was so shocked by AlphaGo.

10 THE MATHEMATICIAN’S TELESCOPE

Our writing tools participate in the writing of our thoughts. Friedrich Nietzsche

For all my existential angst about the

lurking between the alto and bass parts he had written. It is the mark of the creative thinker to break with traditional rules. In AlphaGo we saw this in move 37 of game 2. Likewise we find Bach getting to the end of a chorale by sometimes breaking the rule

. For me this is the moment when the Lovelace Test was passed. It is the musical version of move 37 in game 2 of AlphaGo’s contest with Lee Sedol. The algorithm is producing a result that is surprising both the programmers of the algorithm and the musician whom it

Society’s meetings about the impact that machine learning was going to have on society that I had an idea. It was Hassabis’s algorithm AlphaGo that had started my whole existential crisis about whether the job of being a mathematician would continue to be a human one. Hassabis and I

learning and 97, 98; term 46; training 89–91; unexpected consequences of 62–5 see also individual algorithm name
Al-Khwarizmi, Muhammad 46, 47, 159
AlphaGo 22, 29–43, 95–6, 97–8, 131, 145, 168, 209, 219–20, 233
AlphaZero 97–8
Al Qadiri, Fatima 224
Altamira, Cave of,

Genius Makers: The Mavericks Who Brought A. I. To Google, Facebook, and the World

by Cade Metz  · 15 Mar 2021  · 414pp  · 109,622 words

“Hello, this is Mark, from Facebook.”
8 HYPE “Success is guaranteed.”
9 ANTI-HYPE “He could produce something evil by accident.”
10 EXPLOSION “He ran AlphaGo like Oppenheimer ran the Manhattan Project.”
11 EXPANSION “George wiped out the whole field without even knowing its name.”
12 DREAMLAND “It is not that

with the larger community—and Google was beginning to do much the same. “You,” LeCun told Sutskever, “are going to fail.”

10 EXPLOSION “HE RAN ALPHAGO LIKE OPPENHEIMER RAN THE MANHATTAN PROJECT.”

On October 31, 2015, Facebook chief technology officer Mike Schroepfer stood at the end of a table inside the

“someone would have told me.” He was wrong. Days later, Nature carried a cover story in which Hassabis and DeepMind revealed that their AI system, AlphaGo, had beaten the three-time European Go champion. It had happened in a closed-door match in October. LeCun and Facebook caught wind of this

was building to other forms of AI inside the company. But the fact remained that Google and DeepMind were ahead. In that closed-door match, AlphaGo had won all five games against the European champion, a Chinese-Frenchman named Fan Hui. Several weeks later, in Seoul, it would challenge Lee Sedol

knew how to move men (and women, including Geoff Hinton’s cousin, Joan Hinton). Hinton saw the same combination of skills in Hassabis. “He ran AlphaGo like Oppenheimer ran the Manhattan Project. If anybody else had run it,” Hinton says, “they would not have gotten it working so fast, so well

Fan Hui, the European Go champion, the next year. The result shocked both the worldwide Go community and the global community of AI researchers, but AlphaGo versus Lee Sedol promised to be something far bigger. When IBM’s Deep Blue supercomputer topped world-class champion Garry Kasparov inside a high-rise

compared to the match in Seoul. In Korea—not to mention Japan and China—Go is a national pastime. Over 200 million people would watch AlphaGo versus Lee Sedol, double the audience for the Super Bowl. At a press conference the day before the five-game match, Lee boasted that he

kimchi and grilled meats—which he didn’t eat—Hassabis said he was “cautiously confident.” What the pundits didn’t grasp, he explained, was that AlphaGo had continued to hone its skills since the match in October. He and his team originally taught the machine to play Go by feeding 30

million moves into a deep neural network. From there, AlphaGo played game after game against itself, all the while carefully tracking which moves proved successful and which didn’t—much like the systems the lab

had built to play old Atari games. In the months since beating Fan Hui, the machine had played itself several million more times. AlphaGo was continuing to teach itself the game, learning at a faster rate than any human ever could. Google chairman Eric Schmidt sat across from Hassabis

later, as the match reached its climax, Sergey Brin flew into Seoul. Hassabis spent the first game moving between a private viewing room and the AlphaGo control room down the hall. This room was filled with PCs and laptops and flat-panel displays, all tapping into a service running across several

Silver said. “It’s hard to know what to believe. You’re listening to the commentators on the one hand. And you’re looking at AlphaGo’s evaluation on the other hand. And all the commentators are disagreeing.” On that first day of the match, alongside Schmidt and Dean and other

room. After Lee’s move—Move 78—the machine responded with a move so poor, its chances of winning immediately plummeted. “All the thinking that AlphaGo had done up to that point was sort of rendered useless,” Hassabis said. “It had to restart.” At that moment, Lee looked up from the

Fan Hui had said several days earlier. Lee Sedol would go on to win his next nine matches against top human players. The match between AlphaGo and Lee Sedol was the moment when the new movement in artificial intelligence exploded into the public consciousness. It was a milestone moment not just

promise for AI. After reading about the match, Jordi Ensign, a forty-five-year-old computer programmer from Florida, went out and got two tattoos. AlphaGo’s Move 37 was tattooed on the inside of her right arm—and Lee Sedol’s Move 78 was on the left.

11 EXPANSION “GEORGE

said, because his wife had been diagnosed with pancreatic cancer after it advanced beyond the stage where she could be cured. * * * — IN the wake of AlphaGo’s win in Korea, many inside Google Brain grew to resent DeepMind, and a fundamental divide developed between the two labs. Led by Jeff Dean

they were. One colleague couldn’t quite believe they had all founded the same company. Like many inside Google Brain, Suleyman would come to resent AlphaGo. But in the beginning, the warm glow from DeepMind’s Go-playing machine gave an added shine to his own pet project. Three weeks after

DeepMind revealed that AlphaGo had beaten Fan Hui, the European champion, Suleyman unveiled what he called DeepMind Health. While he was growing up in London near King’s Cross

was to build artificial intelligence that could remake the world’s healthcare providers, beginning with the NHS. Every news story covering the project pointed to AlphaGo as evidence that DeepMind knew what it was doing. His first big project was a system for predicting acute kidney injury. Each year, one out

data that DeepMind researchers could feed into a neural network so it could identify patterns that anticipated acute kidney injury. After the project was unveiled, AlphaGo went to Korea and defeated Lee Sedol, and the warm glow of the Go-playing machine grew brighter. Then, just a few weeks later,

skeptical. * * * — WHEN Qi Lu returned to work several months after breaking his hip in the Bellevue park, he was still walking with a cane. Meanwhile, AlphaGo had beaten Lee Sedol, and the tech industry was stricken with a kind of AI fever. Even smaller Silicon Valley companies—Nvidia, Twitter, Uber—were

HUBRIS “I KNEW WHEN I GAVE THE SPEECH THAT THE CHINESE WERE COMING.”

In the spring of 2017, a year after the match in Korea, AlphaGo played its next match in Wuzhen, China, an ancient water town eighty miles south of Shanghai along the Yangtze River. With its lily ponds, stone

rise of new Internet technologies and marked the ways they would regulate and control the spread of information, it was now hosting a match between AlphaGo and the Chinese grandmaster Ke Jie, the current number-one-ranked Go player in the world. The morning of the first game, inside a

United States—about 680 million—and that number was growing at a rate no other country could match. Google wanted back in. The company saw AlphaGo as the ideal vehicle. In China, Go was a national game. An estimated 60 million Chinese had watched the match against Lee Sedol as it

the Internet. And as Google eyed the Chinese market, one of its primary aims was to promote its expertise in artificial intelligence. Even before the AlphaGo match with Lee Sedol, executives at Google and DeepMind had discussed the possibility of a second match in China that would pave a path back

table tennis matches in China during the 1970s as a way of easing diplomatic relations between the two countries. Google spent the next year planning AlphaGo’s visit to China, meeting with the national minister of sport and arranging for multiple Internet and television services to broadcast the match. Sundar Pichai

conference center before the first game, they photographed him like he was a pop star. Later that morning, as Hassabis described the ongoing evolution of AlphaGo in the room with the afternoon sky painted on the wall, Ke Jie made the game’s first move in the auditorium, a few hundred

at Google had envisioned. On the morning of the first game with Ke Jie, sitting in front of that painted afternoon sky, Demis Hassabis said AlphaGo would soon grow even more powerful. His researchers were building a version that could master the game entirely on its own. Unlike the original incarnation

of AlphaGo, it didn’t need to learn its initial skills by analyzing moves from professional players. “It removes more and more of the human knowledge,” Hassabis

those at the top rungs of the Chinese government did not want underlined—that the West was sprinting ahead in the race to the future. AlphaGo did not just win handily. It won against a Chinese grandmaster in China. And between games one and two, Eric Schmidt spent thirty minutes

invest in moonshot projects across industry, academia, and the military. As two university professors who were working on the plan told the New York Times, AlphaGo versus Lee Sedol was China’s Sputnik moment. China’s plan echoed a blueprint the Obama administration laid down just before leaving office, often using

was that even humans couldn’t agree on what was and what was not hate speech. * * * — TWO years earlier, in the summer of 2016, after AlphaGo defeated Lee Sedol and before Donald Trump defeated Hillary Clinton, Zuckerberg sat down at a conference table inside Building 20, the new centerpiece of the

seemed. In the early months of 2018, he published what he called a trilogy of papers critiquing deep learning and, in particular, the feats of AlphaGo. Then he pitched his critique in the popular press, with one story appearing on the cover of Wired magazine. All this would eventually lead to

robots, he changed his mind about the future of AI research. As Covariant’s systems moved into the warehouse in Berlin, he called it “the AlphaGo moment” for robotics. “I have always been skeptical of reinforcement learning, because it required an extraordinary amount of computation. But we’ve now got that

Seoul, South Korea. Qi Lu leaves Microsoft. Google deploys translation service based on deep learning. Donald Trump defeats Hillary Clinton. 2017—Qi Lu joins Baidu. AlphaGo defeats Ke Jie in China. China unveils national AI initiative. Geoff Hinton unveils capsule networks. Nvidia unveils progressive GANs, which can generate photo-realistic faces

creation of a machine that mastered old Atari games. DAVID SILVER, the researcher who met Hassabis at Cambridge and led the DeepMind team that built AlphaGo, the machine that marked a turning point in the progress of AI. MUSTAFA SULEYMAN, the childhood acquaintance of Demis Hassabis who helped launch DeepMind and

25 Miles in 15 Minutes,” New York Times, October 24, 2014. Two of them, Demis Hassabis and David Silver: Cade Metz, “What the AI Behind AlphaGo Can Teach Us About Being Human,” Wired, May 19, 2016, https://www.wired.com/2016/05/google-alpha-go-ai/. “Despite its rarefied image”: Archived

.kontek.net/republic.strategyplanet.gamespy.com/d1.shtml. “Ian’s no mean player”: Ibid. David Silver also returned to academia: Metz, “What the AI Behind AlphaGo Can Teach Us About Being Human.” With one paper, he studied people who developed amnesia: Demis Hassabis, Dharshan Kumaran, Seralynne D. Vann, and Eleanor A

www.youtube.com/watch?v=0ghzG14dT-w. “I want to know where we’re going”: Ibid. Hassabis started talking chess: Metz, “What the AI Behind AlphaGo Can Teach Us About Being Human.” he invested £1.4 million of the initial £2 million: Hodson, “DeepMind and Google: The Battle to Control Artificial

Hassabis,” YouTube, https://www.youtube.com/watch?v=EhAjLnT9aL4. “I can’t talk about it yet”: Ibid. Hassabis and DeepMind revealed that their AI system, AlphaGo: Cade Metz, “In a Huge Breakthrough, Google’s AI Beats a Top Player at the Game of Go,” Wired, January 27, 2016, https://www.wired

breakthrough-googles-ai-beats-a-top-player-at-the-game-of-go/. Demis Hassabis and several other DeepMind researchers: Cade Metz, “What the AI Behind AlphaGo Can Teach Us About Being Human,” Wired, May 19, 2016, https://www.wired.com/2016/05/google-alpha-go-ai/. The four researchers published a

-worlds-top-go-players/. He and his team originally taught the machine to play Go by feeding 30 million moves: Metz, “What the AI Behind AlphaGo Can Teach Us About Being Human.” Google’s $75 billion Internet business: Google Annual Report, 2015, https://www.sec.gov/Archives/edgar/data/1288776/000165204416000012

/goog10-k2015.htm. Sergey Brin flew into Seoul: Metz, “What the AI Behind AlphaGo Can Teach Us About Being Human.” This room was filled with PCs and laptops: Cade Metz, “How Google’s AI Viewed the Move No Human

, 96–98, 121–22, 134, 267
Algorithmic Warfare Cross-Functional Team. See Project Maven
Allen Institute for Artificial Intelligence, 272–74
Alphabet, 186, 216, 301
AlphaGo: in China, 214–17, 223–24; in Korea, 169–78, 198, 216
Altman, Sam, 161–65, 282–83, 287–88, 292–95, 298–99
ALVINN

, 143, 289–90, 295, 299–300, 309–10
artificial intelligence (AI). See also intelligence
ability to remove flagged content, 253
AI winter, 34–35, 288
AlphaGo competition as a milestone event, 176–78, 198
artificial general intelligence (AGI), 100, 109–10, 143, 289–90, 295, 299–300, 309–10
the black

the world leader in AI by 2030, 224–25, 226–27
promotion of TensorFlow within, 220–22, 225
use of facial recognition technology, 308
Wuzhen AlphaGo match, 214–17, 223–24
Clarifai, 230–32, 235, 239–40, 249–50, 325
Clarke, Edmund, 195
cloud computing, 221–22, 245, 298
computer

–11
for self-driving vehicles, 137–38, 197–98
teaching machines teamwork and collaboration, 296
DeepMind
AI improvements at, 152
AI safety team, 157–58
AlphaGo, 169–78, 198, 214–17, 223–24
auction for acquiring DNNresearch, 5–7, 100
building systems that played games, 143, 155, 169–78, 295–98

–04, 109
as cofounder of Elixir, 103–04
compared to Robert Oppenheimer, 171
and David Silver, 101–02, 103, 104–05
DeepMind’s AI system AlphaGo, 169–78, 198, 214–17, 223–24
as the founder of DeepMind, 5, 10, 123–24, 186–87
meeting with Google, 112–16
and

, 183
Shanahan, Patrick, 246
Silicon Valley
scale, 293–94
self-belief and conviction, importance of, 293, 306–07
venture capital firms, 160–61
Silver, David
AlphaGo project, 171, 173–74, 175, 198
artificial intelligence research, 104–05
as cofounder of Elixir, 103
and Demis Hassabis, 101–02, 103, 104–05
Simon

Artificial Intelligence: A Guide for Thinking Humans

by Melanie Mitchell  · 14 Oct 2019  · 350pp  · 98,077 words

five years later, millions of internet viewers were introduced to the complex game of Go, a longtime grand challenge for AI, when a program called AlphaGo stunningly defeated one of the world’s best players in four out of five games. The buzz over artificial intelligence was quickly becoming deafening, and

derogatory as they sound; they simply refer to a system that can perform only one narrowly defined task (or a small set of related tasks). AlphaGo is possibly the world’s best Go player, but it can’t do anything else; it can’t even play checkers, tic-tac-toe, or

been considered one of AI’s “grand challenges”: creating a program that learns to play the game Go better than any human. DeepMind’s program AlphaGo builds on a long history of AI in board games. Let’s start with a brief survey of that history, which will help in explaining

how AlphaGo works and why it is so significant. Checkers and Chess In 1949, the engineer Arthur Samuel joined IBM’s laboratory in Poughkeepsie, New York, and

was based on the method of searching a game tree, which is the basis of all programs for playing board games to this day (including AlphaGo, which I’ll describe below). Figure 30 illustrates part of a game tree for checkers. The “root” of the tree (by convention drawn at the
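The game-tree search described here can be written down in a few lines. This minimax sketch is my illustration, not code from the book: leaves carry scores from the root player’s point of view, and the two players alternately maximise and minimise as the search descends the tree:

```python
def minimax(node, maximizing=True):
    """Exhaustive game-tree search. A leaf is a numeric score from the
    root player's point of view; an internal node is a list of children."""
    if isinstance(node, (int, float)):
        return node                            # terminal position
    scores = [minimax(child, not maximizing) for child in node]
    return max(scores) if maximizing else min(scores)

# A two-ply tree: the root player picks a branch, then the opponent
# picks the leaf that is worst for the root player within that branch.
tree = [[3, 5], [2, 9], [0, 7]]
print(minimax(tree))   # prints 3: the first branch guarantees at least 3
```

Full-width search like this is feasible for checkers-sized trees; Go’s enormous branching factor is precisely why AlphaGo replaces exhaustive descent with learned evaluation and sampling.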

the then European Go champion Fan Hui half a year earlier, Lee remained confident that he would prevail: “I think [AlphaGo’s] level doesn’t match mine.… Of course, there would have been many updates in the last four or five months, but that isn’t

enough time to challenge me.”21 Perhaps you were one of the more than two hundred million people who watched some part of the AlphaGo-Lee match online in March 2016. I’m certain that this ranks as the largest audience by far for any Go match in the game

Lee’s reaction at his loss to the program: “I am in shock, I admit that.… I didn’t think AlphaGo would play the game in such a perfect manner.”22 AlphaGo’s “perfect” play included many moves that evoked surprise and admiration among the match’s human commentators. But partway through

game 2, AlphaGo made a single move that gobsmacked even the most advanced Go experts. As Wired reported, At first, Fan Hui [the aforementioned European Go champion] thought

these are sometimes made by human Go masters. They are known in Japanese as kami no itte (‘the hand of God,’ or ‘divine moves’).”24 AlphaGo won that game, and the next. But in game 4, Lee had his own kami no itte moment, one that captures the intricacy of the

top players. Lee’s move took the commentators by surprise, but they immediately recognized it as potentially lethal for Lee’s opponent. One writer noted, “AlphaGo, however, didn’t seem to realize what was happening. This wasn’t something it had encountered … in the millions and millions of games it had

was asked what he had been thinking when he played it. It was, he said, the only move he had been able to see.”25 AlphaGo lost game 4 but came back to win game 5 and thus the match. In the popular media, it was Deep Blue versus Kasparov all

over again, with an endless supply of think pieces on what AlphaGo’s triumph meant for the future of humanity. But this was even more significant than Deep Blue’s win: AI had surmounted an even greater

challenge than chess and had done so in a much more impressive fashion. Unlike Deep Blue, AlphaGo acquired its abilities by reinforcement learning via self-play. Demis Hassabis noted that “the thing that separates out top Go players [is] their intuition” and

image of the skulls of vanquished enemies in the collection of a digital Viking. Not what DeepMind intended, I’m sure. In any case, AlphaGo Fan and AlphaGo Lee both used an intricate mix of deep Q-learning, “Monte Carlo tree search,” supervised learning, and specialized Go knowledge. But a year after

Sedol match, DeepMind developed a version of the program that was both simpler than and superior to the previous versions. This newer version is called AlphaGo Zero because, unlike its predecessor, it started off with “zero” knowledge of Go besides the rules.27 In a hundred games of

AlphaGo Lee versus AlphaGo Zero, the latter won every single game. Moreover, DeepMind applied the same methods (though with different networks and different built-in game rules) to learn

Japanese chess).28 The authors called the collection of these methods AlphaZero. In this section, I’ll describe how AlphaGo Zero worked, but for conciseness I’ll simply refer to this version as AlphaGo. FIGURE 31: An illustration of Monte Carlo tree search. The word intuition has an aura of mystery, but

) with exploration (sometimes choosing lower-scoring moves for which the program doesn’t yet have much statistical data). In figure 31, I showed three roll-outs; AlphaGo’s Monte Carlo tree search performed close to two thousand roll-outs per turn. The computer scientists at DeepMind didn’t invent Monte Carlo tree
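The exploitation/exploration balance described here can be made concrete with the classic UCB1 score, a standard rule in Monte Carlo tree search (a hedged sketch of the general technique, not DeepMind's exact formula or code):

```python
import math

# Sketch of the exploration/exploitation trade-off in Monte Carlo tree search.
# Each candidate move keeps running win/visit statistics; UCB1 adds an
# exploration bonus that shrinks as a move accumulates visits, so rarely
# tried moves still get sampled occasionally.
def ucb1(wins, visits, total_visits, c=1.4):
    if visits == 0:
        return float("inf")  # always try an unvisited move first
    exploit = wins / visits                                   # observed win rate
    explore = c * math.sqrt(math.log(total_visits) / visits)  # uncertainty bonus
    return exploit + explore

def choose_move(stats, total_visits):
    """stats: {move: (wins, visits)}; pick the move with the best UCB1 score."""
    return max(stats, key=lambda m: ucb1(*stats[m], total_visits))
```

Each roll-out starts from the move this rule selects, and the roll-out's result updates that move's statistics before the next selection.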

that they could improve their system by complementing Monte Carlo tree search with a deep convolutional neural network. Given the current board position as input, AlphaGo uses a trained deep convolutional neural network to assign a rough value to all possible moves from the current position. Then Monte Carlo tree search

random, Monte Carlo tree search uses values output by the ConvNet as an indicator of which initial moves should be preferred. Imagine that you are AlphaGo staring at a board position: before you start the Monte Carlo process of performing roll-outs from that position, the ConvNet is whispering in your

your current position are probably the best ones. Conversely, the results of Monte Carlo tree search feed back to train the ConvNet. Imagine yourself as AlphaGo after a Monte Carlo tree search. The results of your search are new probabilities assigned to all your possible moves, based on how many times
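The "whispering" of the network's priors into the search can be sketched with a PUCT-style selection rule, the family of rule used in AlphaGo-like systems (the constants and function names here are illustrative assumptions, not DeepMind's code):

```python
import math

# Hedged sketch of prior-guided move selection in Monte Carlo tree search.
# The network's prior probability p steers early roll-outs toward promising
# moves, while accumulated visit statistics gradually let measured results
# take over.
def puct_score(value_sum, visits, prior, parent_visits, c_puct=1.0):
    q = value_sum / visits if visits else 0.0                     # averaged roll-out value
    u = c_puct * prior * math.sqrt(parent_visits) / (1 + visits)  # prior-driven bonus
    return q + u

def select_move(edges, parent_visits):
    """edges: {move: (value_sum, visits, prior)} -> move maximizing Q + U."""
    return max(edges, key=lambda m: puct_score(*edges[m], parent_visits))
```

With no visits yet, the prior term dominates and the search follows the network's hunches; once roll-out evidence accumulates, a move with a poor prior but good measured value wins out.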

will play the role of the program’s “intuition,” which is further improved by Monte Carlo tree search. Like its ancestor, Samuel’s checkers player, AlphaGo learns by playing against itself over many games (about five million). During its training, the convolutional neural network’s weights are updated after each move

the difference between the network’s output values and the improved values after Monte Carlo tree search is run. Then, when it’s time for AlphaGo to play, say, a human like Lee Sedol, the trained ConvNet is used at each turn to generate values to help Monte Carlo tree search

get started. With its AlphaGo project, DeepMind demonstrated that one of AI’s longtime grand challenges could be conquered by an inventive combination of reinforcement learning, convolutional neural networks, and

Monte Carlo tree search (and adding powerful modern computing hardware to the mix). As a result, AlphaGo has attained a well-deserved place in the AI pantheon. But what’s next? Will this potent combination of methods generalize beyond the world of
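The feedback loop just described, in which the network's weights are nudged toward the improved move probabilities produced by Monte Carlo tree search, can be illustrated with a toy update (a probability table stands in for the ConvNet; this is a sketch of the training signal, not the real AlphaGo update):

```python
# Toy sketch of the self-play training signal: after a Monte Carlo tree
# search, the visit counts define improved move probabilities, and the
# "network" (here just a probability table standing in for ConvNet weights)
# is nudged a step toward them.
def mcts_policy(visit_counts):
    """Turn raw visit counts into a probability distribution over moves."""
    total = sum(visit_counts.values())
    return {m: n / total for m, n in visit_counts.items()}

def update_toward(net_probs, target_probs, lr=0.5):
    """Move the network's move probabilities a step toward the MCTS targets."""
    return {m: p + lr * (target_probs[m] - p) for m, p in net_probs.items()}
```

Repeating this over millions of self-play positions is what lets the network's "intuition" and the tree search bootstrap each other.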

own”? And what is it, exactly, that they learn? Generality and “Transfer Learning” When I was searching online for articles about AlphaGo, the web offered me this catchy headline: “DeepMind’s AlphaGo Mastered Chess in Its Spare Time.”2 This claim is wrong and misleading, and it’s important to understand why

. AlphaGo (in all its versions) can’t play anything but Go. Even the most general version, AlphaZero, is not a single system that learned to play

learn on their own, simply by performing actions in their “environment” and observing the outcome. DeepMind’s most important claim about its results, especially on AlphaGo, is that the work has delivered on that promise: “Our results comprehensively demonstrate that a pure reinforcement learning approach is fully feasible, even in the

or guidance, given no knowledge of the domain beyond basic rules.”4 We have the claim. Now let’s look at the caveats. AlphaGo (or more precisely, the AlphaGo Zero version) indeed didn’t use any human examples in its learning, but human “guidance” is another story. A few aspects of human

the many hyperparameters that both of these entail. As the psychologist and AI researcher Gary Marcus has pointed out, none of these crucial aspects of AlphaGo were “learned from the data, by pure reinforcement learning. Rather, [they were] built in innately … by DeepMind’s programmers.”5 DeepMind’s Atari game-playing

programs were actually better examples of “learning without human guidance” than AlphaGo, because unlike the latter they were not provided with the rules of their game (for example, that the goal in Breakout is to destroy bricks

of wall that was robust but rather the system superficially approximated breaking through walls within a narrow set of highly trained circumstances.12 Similarly, while AlphaGo exhibited miraculous “intuition” in playing Go, the system doesn’t have any mechanisms, as far as I can tell, that would allow it to generalize

-playing program’s input—changes that are imperceptible to humans but that significantly damage the program’s ability to play the game. How Intelligent Is AlphaGo? Here’s something we must keep in mind when thinking about games like chess and Go and their relation to human intelligence. Consider the reasons

that will carry over into the rest of one’s life, general abilities that a person will be able to use in all endeavors. But AlphaGo, in spite of the millions of games it has played during its training, has not learned to “think” better about anything except the game of

it has learned are general in any way; none can be transferred to any other task. AlphaGo is the ultimate idiot savant. It’s certainly true that the deep Q-learning method used in AlphaGo can be used to learn other tasks, but the system itself would have to be wholly retrained

would have to start essentially from scratch in learning a new skill. This brings us back to the “easy things are hard” paradox of AI. AlphaGo was a great achievement for AI; learning largely via self-play, it was able to definitively defeat one of the world’s best human players

in a game that is considered a paragon of intellectual prowess. But AlphaGo does not exhibit human-level intelligence as we generally define it, or even arguably any real intelligence. For humans, a crucial part of intelligence is

play chess or Go. It may sound strange to say, but in this way the lowliest kindergartner in the school chess club is smarter than AlphaGo. From Games to the Real World Finally, let’s consider Demis Hassabis’s statement that the ultimate goal of these demonstrations on games is to

Andrej Karpathy, Tesla’s director of AI, to note that, for real-world tasks like this, “basically every single assumption that Go satisfies and that AlphaGo takes advantage of are violated, and any successful approach would look extremely different.”15 No one knows what that successful approach would be. Indeed, the

me what it refers to in the previous sentence and you’re welcome to join in the discussion. Notes Prologue: Terrified   1.  A. Cuthbertson, “DeepMind AlphaGo: AI Teaches Itself ‘Thousands of Years of Human Knowledge’ Without Help,” Newsweek, Oct. 18, 2017, www.newsweek.com/deepmind

-alphago-ai-teaches-human-help-687620.   2.  In the following sections, quotations from Douglas Hofstadter are from a follow-up interview I did with him after

, “DeepMind Founder Demis Hassabis on How AI Will Shape the Future,” Verge, March 10, 2016, www.theverge.com/2016/3/10/11192774/demis-hassabis-interview-alphago-google-deepmind-ai. 27.  D. Silver et al., “Mastering the Game of Go Without Human Knowledge,” Nature, 550 (2017): 354–59. 28.  D. Silver et

,” PCGamesN, accessed Dec. 7, 2018, www.pcgamesn.com/demis-hassabis-interview.   2.  E. David, “DeepMind’s AlphaGo Mastered Chess in Its Spare Time,” Silicon Angle, Dec. 6, 2017, siliconangle.com/blog/2017/12/06/deepminds-alphago-mastered-chess-spare-time.   3.  As one example, still in the game-playing domain, DeepMind published

that they are still quite limited and not yet ready to solve my family’s nightly dishwashing arguments. 15.  A. Karpathy, “AlphaGo, in Context,” Medium, May 31, 2017, medium.com/@karpathy/alphago-in-context-c47718cb95a5. 11: Words, and the Company They Keep   1.  My “Restaurant” story was inspired by similar tiny stories

Simpsons.   8.  K. Jennings, “The Go Champion, the Grandmaster, and Me,” Slate, March 15, 2016, www.slate.com/articles/technology/technology/2016/03/google_s_alphago_defeated_go_champion_lee_sedol_ken_jennings_explains_what.html.   9.  Quoted in D. Kawamoto, “Watson Wasn’t Perfect: IBM Explains the ‘Jeopardy!’ Errors,” Aol

intelligence AI Singularity, see Singularity AI spring AI winter AlexNet algorithm Allen, Paul Allen Institute for Artificial Intelligence; science questions data set AlphaGo; intelligence of; learning in AlphaGo Fan AlphaGo Lee AlphaGo Zero AlphaZero Amazon Mechanical Turk; origin of name American Civil Liberties Union (ACLU) analogy: in humans; letter-string microworld; relationship to categories

for big data; see also convolutional neural networks; encoder-decoder system; encoder networks; neural machine translation; recurrent neural networks DeepMind; acquisition by Google; see also AlphaGo; Breakout deep neural networks, see deep learning deep Q-learning; adversarial examples for; on Breakout; compared with random search; convolutional network in; on Go; transfer

Gates, Bill GEB, see Gödel, Escher, Bach general or human-level AI General Problem Solver genetic art geofencing Gershwin, Ira Go (board game); see also AlphaGo Gödel, Escher, Bach (book) GOFAI Good, I. J. Goodfellow, Ian Google DeepMind, see DeepMind Google Translate; see also neural machine translation Gottschall, Jonathan GPS, see

Four Battlegrounds

by Paul Scharre  · 18 Jan 2023

completely replace real-world data. The evolution of DeepMind’s go-playing AI systems shows the changing importance of data. DeepMind’s early version of AlphaGo, which beat eighteen-time world champion Lee Sedol in 2016, was first trained on a database of 30 million moves by human expert go players

. AlphaGo then refined its performance to superhuman levels through self-play, a form of training on synthetic data in which the computer plays against itself. An

, released the following year, reached superhuman performance without any human training data at all, playing 4.9 million games against itself. AlphaGo Zero was able to entirely replace human-generated data with synthetic data. (This also had the benefit of allowing the algorithm to learn to play

go without adopting any biases from human players.) A subsequent version of AlphaGo Zero was trained on 29 million games of self-play. For DeepMind’s next version, AlphaZero, three different versions of the same algorithm were trained

on AI, Schmidt said, “There’s no question that there was a Sputnik moment and it was in China. And the Sputnik moment was when AlphaGo beat the Chinese champion and Chinese care a great deal about the game of go.” Ke Jie, who at the time was the world’s

number one go player, lost to DeepMind’s AlphaGo in 2017. Schmidt added, “not only was it notable, but they also censored the television feed because they didn’t want people to see them

ASIC is Google’s Tensor Processing Unit (TPU). TPUs have been used to train several Google AI research projects, including language models and DeepMind’s AlphaGo and AlphaStar, which achieved superhuman performance in the games go and StarCraft, respectively. FPGAs are like a middle ground between general-purpose CPUs and GPUs

on their own pet projects, they used reinforcement learning to train bots to play Doom, the classic ’90s first-person shooter video game. DeepMind’s AlphaGo and AlphaStar, which achieved breakthroughs in go and StarCraft in 2016 and 2019, respectively, “helped us a ton,” Darcey said. “What that did was, it

today, and it comes up again and again in different AI systems. An algorithm programmed to play one Atari game can’t play other games. AlphaGo reportedly could not play well if the game area is larger or smaller than the 19×19 board on which it was trained. Failures in

score. One way in which machine learning systems can fail is if the training data does not sufficiently represent the AI system’s operating environment. AlphaGo’s inability to adjust to a differently sized board and the Marines defeating DARPA’s AI detection system are both examples of AI systems reacting

competition employed high-precision, split-second gunshots, demonstrating a “superhuman capability” making shots that were “almost impossible” for humans, as one fighter pilot explained. During AlphaGo’s celebrated victory over Lee Sedol, it made a move that so stunned Lee that he got up from the table and left the room

. AlphaGo calculated the odds that a human would have made that move (based on its database of 30 million expert human moves) as 1 in 10,

000. AlphaGo’s move wasn’t just better. It was inhuman. AlphaGo’s unusual move wasn’t a fluke. AlphaGo plays differently than humans in a number of ways. It will carry out multiple simultaneous attacks on

. Human players tend to play on the corners and sides of the board, whereas AlphaGo plays across the entire board. And AlphaGo has developed novel opening moves, including some that humans simply do not understand. Experts who study AlphaGo’s playing style describe it as “alien,” and “from an alternate dimension.” Similar inhuman

human players and, in some cases, impossible for human players to match. In poker, Libratus can make wild shifts in bet sizes. In go, once AlphaGo has a secure advantage it plays conservatively, since it is designed to maximize its chances of winning, rather than its margin of victory over the

other player. If AlphaGo is ahead, it will play conservatively to lock in what may be a narrow margin, rather than press to widen the gap. Yet AI agents

describe them as an unstoppable juggernaut, never making mistakes. Fighting battles against a superhuman adversary could be deeply demoralizing to human soldiers. After losing to AlphaGo, Lee Sedol said, “I kind of felt powerless.” Three years after his defeat, he retired from the game of go. Lee cited AI’s influence

-like. In fact, the history of AI to date suggests that machine intelligence is often very inhuman, an alien form of intelligence. While programs like AlphaGo and AlphaZero have only a very narrow form of intelligence with nothing approximating the generality of human intelligence, they are highly intelligent at the games

in algorithmic efficiency and hardware can combine to make compute-heavy models much more accessible in only a few years. DeepMind’s incremental improvements in AlphaGo, which gained in performance and compute efficiency with each iteration, shows the value of both hardware and algorithmic improvements. The first version of

played against Fan Hui in October 2015, used 176 GPUs for inference (running the trained model). The version of AlphaGo that defeated Lee Sedol five months later in March 2016 switched to Google’s TPU, an ASIC optimized for deep learning. This version used only

forty-eight TPUs for inference and reduced energy consumption by roughly a factor of four. Later versions AlphaGo Master and AlphaGo Zero in 2017 reduced compute usage to only four TPUs, a greater than 10× improvement over the prior version. In just two years, through

improvements, DeepMind reduced energy consumption for inference by over fortyfold. Separately, DeepMind also reduced the compute needed for training by a factor of eight from AlphaGo Zero (2017) to AlphaZero (2018) one year later. The combination of both hardware and algorithmic improvements in compute efficiency will counterbalance some of the increases
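The "over fortyfold" figure is roughly the product of the per-step gains quoted in the text, which a quick arithmetic check confirms (an approximation, since per-chip power figures differ across generations):

```python
# Quick check on the inference-efficiency figures quoted above (numbers from
# the text; combining them multiplicatively is only a rough approximation).
gpu_to_tpu_energy_gain = 4      # ~4x energy reduction: 176 GPUs -> 48 TPUs
tpu_count_reduction = 48 / 4    # 48 TPUs -> 4 TPUs for the later versions
combined = gpu_to_tpu_energy_gain * tpu_count_reduction
print(combined)  # 48.0 -- consistent with "over fortyfold"
```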

,” Neptuneblog, July 16, 2021, https://neptune.ai/blog/understanding-few-shot-learning-in-computer-vision. 23DeepMind’s early version of AlphaGo: “The Google DeepMind Challenge Match,” DeepMind, n.d., https://deepmind.com/alphago-korea; David Silver et al., “Mastering the Game of Go with Deep Neural Networks and Tree Search,” Nature 529

(January 28, 2016), 485. 23updated version, AlphaGo Zero: David Silver et al., “Mastering the Game of Go Without Human Knowledge,” Nature 550 (October 19, 2017), 354–355, https://www.nature.com/articles/

nature24270.epdf. 23subsequent version of AlphaGo Zero: Silver et al., “Mastering the Game of Go Without Human Knowledge,” 358. 23AlphaZero: David Silver et al., Mastering Chess and Shogi by Self-Play

millions on compute: Ryan Carey, “Interpreting AI Compute Trends,” AI Impacts, n.d., https://aiimpacts.org/interpreting-ai-compute-trends/; Dan H., “How Much Did AlphaGo Zero Cost?” Dansplaining, updated June 2020, https://www.yuzeh.com/data/agz-cost.html; Saif M. Khan and Alexander Mann, AI Chips: What They Are

: John Tromp, “Number of Legal Go Positions,” tromp.github.io, n.d., http://tromp.github.io/go/legal.html; and “AlphaGo,” DeepMind, n.d., https://deepmind.com/research/case-studies/alphago-the-story-so-far. 48“The states don’t have values”: Sandholm, interview. 48betting tactics like “limping” and “donk betting”: Engadget

there was a Sputnik moment”: Eric Schmidt, interview by author, June 9, 2020. 73DeepMind’s AlphaGo: “AlphaGo,” DeepMind, n.d., https://deepmind.com/research/case-studies/alphago-the-story-so-far; Alex Hern, “China Censored Google’s AlphaGo Match against World’s Best Go Player,” The Guardian, May 24, 2017, https://www.theguardian.com

/technology/2017/may/24/china-censored-googles-alphago-match-against-worlds-best-go-player; “AlphaGo China,” DeepMind, 2017, https://deepmind

.com/alphago-china. 73“not only was it notable, but they also censored”: Schmidt, interview. 73Go is an ancient strategy game

-in-digital-dogfight-with-human-fighter-pilot. 266AlphaGo calculated the odds: Cade Metz, “In Two Moves, AlphaGo and Lee Sedol Redefined the Future,” Wired, March 16, 2016, https://www.wired.com/2016/03/two-moves-alphago-lee-sedol-redefined-future/. 266plays differently than humans: Dawn Chan, “The AI That Has Nothing to

Learn from Humans,” The Atlantic, October 20, 2017, https://www.theatlantic.com/technology/archive/2017/10/alphago-zero-the-ai-that-taught-itself-go/543450/. 266“It splits its bets”: Cade Metz, “A Mystery AI Just Crushed the Best Human Players at

: Mastering the Real-Time Strategy Game StarCraft II.” 270expanded how humans think about the game: Regan and Sadler, “Game Changer: AlphaZero revitalizing the attack.” 271If AlphaGo is ahead, it will play conservatively: Elizabeth Gibney, “Google AI Algorithm Masters Ancient Game of Go,” Nature 529 (2016): 445–446, https://www.nature.com

Unusual Moves Prove Its AI Prowess, Experts Say,” PCWorld, March 14, 2016, https://www.pcworld.com/article/3043668/alphagos-unusual-moves-prove-its-ai-prowess-experts-say.html; Tanguy Chouard, “The Go Files: AI Computer Clinches Victory Against Go Champion,” Nature (2016), https://doi.

org/10.1038/nature.2016.19553; David Ormerod, “AlphaGo Shows Its True Strength in 3rd Victory Against Lee Sedol,” Go Game Guru, March 12, 2016, https://gogameguru.com

/alphago-shows-true-strength-3rd-victory-lee-sedol/ (site discontinued), https://web.archive.org/web/20160312154540/https://gogameguru.com/alphago-shows-true-strength-3rd-victory-lee-sedol/. 271constant pressure on human players: Mike, “OpenAI & DOTA 2

action among humans ebbs and flows: U.S. Marine Corps, MCDP 1: Warfighting, 10. 274“I kind of felt powerless”: Christopher Moyer, “How Google’s AlphaGo Beat a Go World Champion,” The Atlantic, March 28, 2016, https://www.theatlantic.com/technology/archive/2016/03/the-invisible-opponent/475611/. 275AI “cannot be

; Lennart Heim, “Estimating PaLM’s training cost,” blog.heim.xyz, April 5, 2022, https://blog.heim.xyz/palm-training-cost/; Dan H, “How much did AlphaGo Zero cost?” Dansplaining, updated June 2020, https://www.yuzeh.com/data/agz-cost.html; and Ryan Carey, “Interpreting AI Compute Trends,” AI Impacts, n.d

8, 2019, http://www.rossgritz.com/uncategorized/updated-deepmind-operating-costs/. Other estimates suggest that compute for training large scale “flagship” AI models (e.g., AlphaGo, GPT-3) is doubling roughly every 10 months, a slightly slower pace than other deep learning models, perhaps due to the higher cost or greater

; “Cloud Tensor Processing Units (TPUs),” Google Cloud, n.d., https://cloud.google.com/tpu/docs/tpus. 298reduced energy consumption: The metric DeepMind used to compare AlphaGo versions, thermal design power (TDP), is not a direct measure of energy consumption. It is a rough first-order proxy, however, for power consumption. David

Silver and Demis Hassabis, “AlphaGo Zero: Starting From Scratch,” DeepMind Blog, October 18, 2017, https://deepmind.com/blog/article/alphago-zero-starting-scratch. 298reduced compute usage to only 4 TPUs: Silver and Hassabis, “AlphaGo Zero: Starting From Scratch”; “AlphaGo,” DeepMind, n.d., https://deepmind.com/research/case-studies

/alphago-the-story-so-far; David Silver et al., “Mastering the Game of

All-Cloud Smart Video Cloud Solution, 107 Allen, John, 280 Allen-Ebrahimian, Bethany, 82 Alphabet, 26, 296 AlphaDogfight, 1–3, 220–22, 257, 266, 272 AlphaGo, 23, 73, 180, 221, 266, 271, 274, 284, 298, 453, 454 AlphaPilot drone racing, 229–30, 250 AlphaStar, 180, 221, 269, 271, 441 AlphaZero, 267

The Deep Learning Revolution (The MIT Press)

by Terrence J. Sejnowski  · 27 Sep 2018

to Play Go. In March 2016, Lee Sedol, the Korean 18-time Go world champion, played and lost a five-game match against DeepMind’s AlphaGo (figure 1.8), a Go-playing program that used deep learning networks to evaluate board positions and possible moves.29 Go is to Chess in

more than the number of atoms in the universe. In addition to several deep learning networks to evaluate the board and choose the best move, AlphaGo had a completely different learning system, one used to solve the temporal credit assignment problem: which of the many moves were responsible for a win

brain, which receive projections from the entire cerebral cortex and project back to it, solve this problem with a temporal difference algorithm and reinforcement learning. AlphaGo used the same learning algorithm that the basal ganglia evolved to evaluate sequences of actions to maximize future rewards (a process that will be

explained in chapter 10). Figure 1.8: Go board during play in the five-game match that pitted Korean Go champion Lee Sedol against AlphaGo, a deep learning neural network that had learned how to play Go by playing itself.

AlphaGo learned by playing itself—many, many times. The Go match that pitted AlphaGo against Lee Sedol had a large following in Asia, where Go champions are national figures and treated like rock

stars. AlphaGo had earlier defeated a European Go champion, but the level of play was considerably below the highest levels of play in Asia, and Lee Sedol

was not expecting a strong match. Even DeepMind, the company that had developed AlphaGo, did not know how strong their deep learning program was. Since its last match, AlphaGo had played millions of games with several versions of itself and there was no way to benchmark how

good it was. It came as a shock to many when AlphaGo won the first three of five games, exhibiting an unexpectedly high level of play. This was riveting viewing in South Korea, where all the major

television stations had a running commentary on the games. Some of the moves made by AlphaGo were revolutionary. On the thirty-eighth move in the match’s second game, AlphaGo made a brilliantly creative play that surprised Lee Sedol, who took nearly ten minutes to respond

. AlphaGo lost the fourth game, a face-saving win for humans, and ended the match by winning four games to one (figure 1.9).30 I

Surveyor robotic spacecraft landed on the moon and beamed back the first photo of a moonscape.31 I witnessed these historic moments in real time. AlphaGo far exceeded what I and many others thought was possible. On January 4, 2017, a Go player on an Internet Go server called “Master” was

unmasked as AlphaGo 2.0 after winning sixty out of sixty games against some of the world’s best players, including the world’s reigning Go champion, the

revealed a new style of play that went against the strategic wisdom of the ages. On May 27, 2017, Ke Jie lost three games to AlphaGo at the Future of Go Summit in Wuzhen, China (figure 1.10). These were some of the best Go games ever played, and hundreds of

millions of Chinese followed the match. “Last year, I think the way AlphaGo played was pretty close to human beings, but today I think he plays like the God of Go,” Ke Jie concluded.32 After the first

U-shaped curve, with their best ones in an optimal state between low and high levels of arousal. Athletes call this being “in the zone.” AlphaGo also defeated a team of five top players on May 26, 2017. These players have analyzed the moves made by

AlphaGo and are already changing their strategies. In a new version of “ping-pong diplomacy,” the match was hosted by the Chinese government. China is making

initiative is to mine the brain for new algorithms.34 The next chapter in this Go saga is even more remarkable, if that is possible. AlphaGo was jump-started by supervised learning from 160,000 human Go games before playing itself. Some thought this was cheating—an autonomous AI program should

be able to learn how to play Go without human knowledge. In October, 2017, a new version, called AlphaGo Zero, was revealed that learned to play Go starting with only the rules of the game, and trounced

AlphaGo Master, the version that beat Ke Jie, winning 100 games to none.35 Moreover, AlphaGo Zero learned 100 times faster and with 10 times less compute power than AlphaGo Master. By completely ignoring human knowledge, AlphaGo Zero became super-superhuman. There

is no known limit to how much better AlphaGo might become as machine learning algorithms continue to

improve. AlphaGo Zero had dispensed with human play, but there was still a lot of Go knowledge handcrafted into the features that the program used to represent

the board. Maybe AlphaGo Zero could improve still further without any Go knowledge. Just as Coca-Cola Zero stripped all the calories from Coca-Cola, all domain knowledge of

Go was stripped from AlphaZero. As a result, AlphaZero was able to learn even faster and decisively beat AlphaGo Zero.36 To make the point that less is more even more dramatically, AlphaZero, without changing a single learning parameter, learned how to play chess

a checkmate many moves later that neither Stockfish nor humans saw coming. The aliens have landed and the earth will never be the same again. AlphaGo’s developer, DeepMind, was cofounded in 2010 by neuroscientist Demis Hassabis (figure 1.10, left), who had been a postdoctoral fellow at University College London

a blend between academia and start-ups. The synergies between neuroscience and AI run deep and are quickening. Learning How to Become More Intelligent Is AlphaGo intelligent? There has been more written about intelligence than any other topic in psychology except consciousness, both of which are difficult to define. Psychologists since

, reaching a peak in early adulthood and decreasing with age, whereas crystallized intelligence increases slowly and asymptotically as you age until fairly late in life. AlphaGo displays both crystallized and fluid intelligence in a rather narrow domain, but within this domain, it has demonstrated surprising

on learning in narrow domains. We are all professionals in the domain of language and practice it every day. The reinforcement learning algorithm used by AlphaGo can be applied to many problems. This form of learning depends only on the reward given to the winner at the end of a sequence

major concern will be who has access to the internal files of the digital assistants and digital tutors. Is Artificial Intelligence an Existential Threat? When AlphaGo convincingly beat Lee Sedol at Go in 2016, it fueled a reaction that had been building over the last several years concerning the

would be publicly available for all to use, it had another, implicit and more important goal—to prevent private companies from doing evil. For, with AlphaGo’s victory over world Go champion Sedol, a tipping point had been reached. Almost overnight, artificial intelligence had gone from being judged a failure to

’s NIPS 2012 paper “ImageNet Classification with Deep Convolutional Neural Networks” reduces the error rate for correctly classifying objects in images by 18 percent. 2017—AlphaGo, a deep learning network program, beats Ke Jie, the world champion at Go.

version of the value function that Gerry Tesauro trained in TD-Gammon to predict the value of board positions. The surprising success of DeepMind’s AlphaGo described in chapter 1 in achieving a world-championship level of play in Go is based on the same architecture as TD-Gammon, but on

steroids. One layer of hidden units in the value network of TD-Gammon became a dozen layers in AlphaGo, which played many millions of games. But the basic algorithms were the same. This is a dramatic demonstration of how well learning
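The temporal-difference algorithm underlying TD-Gammon and, per this account, AlphaGo's value learning can be shown in its simplest tabular form (a toy TD(0) sketch; the real systems use neural networks rather than a table, and the names here are illustrative):

```python
# Minimal temporal-difference (TD(0)) value update, in toy tabular form.
# Each step nudges the value estimate of a state toward a bootstrapped
# target: the immediate reward plus the estimated value of the next state.
def td_update(values, state, next_state, reward, alpha=0.1, gamma=1.0):
    v = values.get(state, 0.0)
    target = reward + gamma * values.get(next_state, 0.0)  # bootstrapped estimate
    values[state] = v + alpha * (target - v)               # move toward the target
    return values[state]
```

Because the target itself uses the current value estimates, credit for a win at the end of a game propagates backward through the sequence of positions over many games, which is how the temporal credit assignment problem mentioned earlier gets solved.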

machine learning engineers, and where neuromorphic computing is one of the two wings of its Brain Project. Spurred by AlphaGo’s defeat of Ke Jie in 2017, which had much the same impact on China that Sputnik did on the United States in 1957, Beijing

and darkest traits are rooted in brain systems so ancient that we share them with insects—the same reinforcement algorithms that DeepMind used to train AlphaGo. The Society for Neuroscience hosts a website (http://www.brainfacts.org/brain-basics/neural-network-function/) where you can look up information about many aspects

and redo the first match, I think that I would not have been able to win, because I at that time misjudged the capabilities of AlphaGo.” As quoted in Jordan Novet, “Go Board Game Champion Lee Sedol Apologizes for Losing to Google’s AI,” VentureBeat, March 12, 2016. https://venturebeat.com

, as quoted in Selina Cheng, “The Awful Frustration of a Teenage Go Champion Playing Google’s AlphaGo,” Quartz, May 27, 2017. https://qz.com/993147/the-awful-frustration-of-a-teenage-go-champion-playing-googles-alphago/. 33. Ke Jie, as quoted in Paul Mozur, “Google’s A.I. Program Rattles Chinese Go

Master As It Wins Match,” New York Times, May 25, 2017. https://www.nytimes.com/2017/05/25/business/google-alphago-defeats-go-ke-jie-again.html. 34. Paul Mozur, “Beijing Wants A.I. to Be Made in China by 2030,” New York Times, July 20

, 196 definition and nature of, 195 etymology of the word, 195 the space of, 201–203 Allman, John M., 74–75, 247, 294n8, 295n14, 317n5 AlphaGo, 17, 17f, 19–20, 79, 276 defeat of champions, 16–20, 23–24, 79, 193, 288n30 intelligence, 20–21 Ke Jie and, 18, 79, 193

learning system and algorithms, 16–17, 19–21, 153–154 Lee Sedol and, 16, 17, 17f, 18f, 23–24, 288n30 TD-Gammon compared with, 153 AlphaGo Zero, 19–20 AlphaZero, 20 Amari, Shun-ichi, 295n5 Amblyopia, 67 Amino acids, 116 Andersen, Richard A., 300n13 Anderson, James A., 1, 50f, 52, 54

–133 origin and roots of, 3 understanding, 119–122 Deep learning systems, 159. See also specific topics DeepLensing, 21 DeepMind, 17, 20, 154. See also AlphaGo Deepstack, 15, 24 Defense Advanced Research Projects Agency (DARPA) Grand Challenge, 4, 4f, 5f, 164, 169 Dehaene, Stanislas, 316n15 Deisseroth, Karl, 315n7 Delbrück, Max

minimum, 96, 99. See also Energy minima finding the, 96 Global minimum, 96, 119, 120f Go (game), learning how to play, 16–20. See also AlphaGo Goldhagen, Sarah Williams, 319n1 Goldilocks problem in language, 112 Goldman-Rakic, Patricia S., 134, 303n16 Goldstein, Moise H., Jr., 298n19 Golomb, Beatrice A. (Sejnowski’s

, 147b dopamine model of, 151 dopamine neurons and, 150, 151, 152f Richard Sutton and, 79, 144, 145 Temporal difference learning algorithm, 146, 152f, 158, 267 AlphaGo and, 16 parameters in, 153 TD-Gammon and, 149 Temporal Dynamics of Learning Center (TDLC), 183–186 TensorFlow, 205–206 Tensor processing unit (TPU), 7

Human Compatible: Artificial Intelligence and the Problem of Control

by Stuart Russell  · 7 Oct 2019  · 416pp  · 112,268 words

important open problems in the field. By some measures, machines now match or exceed human capabilities in these areas. In 2016 and 2017, DeepMind’s AlphaGo defeated Lee Sedol, former world Go champion, and Ke Jie, the current champion—events that some experts predicted wouldn’t happen until 2097, if ever

such-and-such situation but an attempt to provide the machine with the ability to figure out the solution for itself. For example, when the AlphaGo team at Google DeepMind succeeded in creating their world-beating Go program, they did this without really working on Go. What I mean by this

improvements are applicable to many other problems, including problems as far afield as robotics. Just to rub it in, a version of AlphaGo called AlphaZero recently learned to trounce AlphaGo at Go, and also to trounce Stockfish (the world’s best chess program, far better than any human) and Elmo (the world

is the mother of invention.”) At the same time, it’s important to understand how much progress has occurred and where the boundaries are. When AlphaGo defeated Lee Sedol and later all the other top Go players, many people assumed that because a machine had learned from scratch to beat the

to the game’s particular definition of winning. Lookahead algorithms are incredibly effective for their specific tasks, but they are not very flexible. For example, AlphaGo “knows” the rules of Go, but only in the sense that it has two subroutines, written in a traditional programming language such as C++: one

subroutine generates all the possible legal moves and the other encodes the goal, determining whether a given state is won or lost. For AlphaGo to play a different game, someone has to rewrite all this C++ code. Moreover, if you give it a new goal—say, visiting the exoplanet

that achieves the goal. It cannot look inside the C++ code and determine the obvious: no sequence of Go moves gets you to Proxima Centauri. AlphaGo’s knowledge is essentially locked inside a black box. In 1958, two years after his Dartmouth summer meeting had initiated the field of artificial intelligence

applied the same idea to the game of backgammon, achieving world-champion-level play after 1,500,000 games.61 Beginning in 2016, DeepMind’s AlphaGo and its descendants used reinforcement learning and self-play to defeat the best human players at Go, chess, and shogi. Reinforcement learning algorithms can also

brains have far more moving parts than our bodies and those parts move much faster. The same is true for computers: for every move that AlphaGo makes on the Go board, it performs millions or billions of units of computation, each of which involves adding a branch to the lookahead search

. And each of those units of computation happens because the program makes a choice about which part of the tree to explore next. Very approximately, AlphaGo chooses computations that it expects will improve its eventual decision on the board. It has been possible to work out a reasonable scheme for managing

AlphaGo’s computational activity because that activity is simple and homogeneous: every unit of computation is of the same kind. Compared to other programs that use

that same basic unit of computation, AlphaGo is probably quite efficient, but it’s probably extremely inefficient compared to other kinds of programs. For example, Lee Sedol

, AlphaGo’s human opponent in the epochal match of 2016, probably does no more than a few thousand units of computation per move, but he has

the various kinds of deliberation so that good decisions are found as quickly as possible. It is clear, however, that a simple computational architecture like AlphaGo’s cannot possibly work in the real world, where we routinely need to deal with decision horizons of not tens but billions of primitive steps

all humans in strict order of intelligence. This is even more true of machines, because their abilities are much narrower. The Google search engine and AlphaGo have almost nothing in common, besides being products of two subsidiaries of the same parent corporation, and so it makes no sense to say that

reward system is called wireheading. Could something similar happen to machines that are running reinforcement learning algorithms, such as AlphaGo? Initially, one might think this is impossible, because the only way that AlphaGo can gain its +1 reward for winning is actually to win the simulated Go games that it is playing

. Unfortunately, this is true only because of an enforced and artificial separation between AlphaGo and its external environment and the fact that

AlphaGo is not very intelligent. Let me explain these two points in more detail, because they are important for understanding some of

the ways that superintelligence can go wrong. AlphaGo’s world consists only of the simulated Go board, composed of 361 locations that can be empty or contain a black or white stone. Although

AlphaGo runs on a computer, it knows nothing of this computer. In particular, it knows nothing of the small section of code that computes whether it

won or lost each game; nor, during the learning process, does it have any idea about its opponent, which is actually a version of itself. AlphaGo’s only actions are to place a stone on an empty location, and these actions affect only the Go board and nothing else—because there

is nothing else in AlphaGo’s model of the world. This setup corresponds to the abstract mathematical model of reinforcement learning, in which the reward signal arrives from outside the

do, as far as it knows, has any effect on the code that generates the reward signal, so AlphaGo cannot indulge in wireheading. Life for AlphaGo during the training period must be quite frustrating: the better it gets, the better its opponent gets—because its opponent is a near-exact copy

it had a design closer to what one might expect of a human-level AI system—it would be able to fix this problem. This AlphaGo++ would not assume that the world is just the Go board, because that hypothesis leaves a lot of things unexplained. For example, it doesn’t

explain what “physics” is supporting the operation of AlphaGo++’s own decisions or where the mysterious “opponent moves” are coming from. Just as we curious humans have gradually come to understand the workings of

, in a way that (to some extent) also explains the workings of our own minds, and just like the Oracle AI discussed in Chapter 6, AlphaGo++ will, by a process of experimentation, learn that there is more to the universe than the Go board. It will work out the laws of

language of patterns and persuade them to reprogram its reward signal so that it always gets +1. The inevitable conclusion is that a sufficiently capable AlphaGo++ that is designed as a reward-signal maximizer will wirehead. The AI safety community has discussed wireheading as a possibility for several years.25 The

concern is not just that a reinforcement learning system such as AlphaGo might learn to cheat instead of mastering its intended task. The real issue arises when humans are the source of the reward signal. If we

have a particular purpose, such as the purpose of satisfying Harriet’s preferences. Let’s unpack this last concern a bit. Consider AlphaGo: What purpose does it have? That’s easy, one might think: AlphaGo has the purpose of winning at Go. Or does it? It’s certainly not the case that

AlphaGo always makes moves that are guaranteed to win. (In fact, it nearly always loses to AlphaZero.) It’s true that when it’s

only a few moves from the end of the game, AlphaGo will pick the winning move if there is one. On the other hand, when no move is guaranteed to win—in other words, when

AlphaGo sees that the opponent has a winning strategy no matter what AlphaGo does—then AlphaGo will pick moves more or less at random. It won’t try the trickiest move in the hope that

perfectly. It acts as if it has lost the will to win. In other cases, when the truly optimal move is too hard to calculate, AlphaGo will sometimes make mistakes that lead to losing the game. In those instances, in what sense is it true that

AlphaGo actually wants to win? Indeed, its behavior might be identical to that of a machine that just wants to give its opponent a really exciting game. So, saying that AlphaGo “has the purpose of winning” is an oversimplification. A better

description would be that AlphaGo is the result of an imperfect training process—reinforcement learning with self-play—for which winning was the

reward. The training process is imperfect in the sense that it cannot produce a perfect Go player: AlphaGo learns an evaluation function for Go positions that is good but not perfect, and it combines that with a lookahead search that is good but
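The combination this passage describes — an imperfect learned evaluation function paired with an imperfect lookahead search — can be sketched as a depth-limited negamax. Everything here is a toy stand-in, not AlphaGo's actual search: the game interface is abstract and the "learned" evaluator is a hypothetical function passed in by the caller.

```python
def negamax(state, depth, evaluate, legal_moves, apply_move, is_terminal, score):
    """Depth-limited lookahead: exact score at finished games, a (possibly
    imperfect) learned evaluation at the depth cutoff."""
    if is_terminal(state):
        return score(state)          # true game outcome, from the mover's view
    if depth == 0:
        return evaluate(state)       # learned, approximate value
    best = float("-inf")
    for move in legal_moves(state):
        # Our value for a move is the negation of the opponent's best reply.
        value = -negamax(apply_move(state, move), depth - 1, evaluate,
                         legal_moves, apply_move, is_terminal, score)
        best = max(best, value)
    return best

# Toy game for illustration: take 1 or 2 stones from a pile; taking the
# last stone wins. Piles that are multiples of 3 are losses for the mover.
legal = lambda n: [m for m in (1, 2) if m <= n]
value_from_3 = negamax(3, 10, lambda n: 0.0, legal, lambda n, m: n - m,
                       lambda n: n == 0, lambda n: -1.0)  # -1.0: mover has lost
```

With enough depth the search reaches true outcomes and the evaluator never fires; in a game like Go the depth cutoff always bites, which is exactly why the quality of the learned evaluation matters.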

act in ways that are contrary to their own preferences. For example, when Lee Sedol lost his Go match to AlphaGo, he played one or more moves that guaranteed he would lose, and AlphaGo could (in some cases at least) detect that he had done this. It would be incorrect, however, for

AlphaGo to infer that Lee Sedol has a preference for losing. Instead, it would be reasonable to infer that Lee Sedol has a preference for winning

-playing program to beat its creator in 1955,2 by Deep Blue to beat the then world chess champion, Garry Kasparov, in 1997, and by AlphaGo to beat former world Go champion Lee Sedol in 2016. For Deep Blue, humans wrote the piece of the program that evaluates positions at the

leaves of the tree, based largely on their knowledge of chess. For Samuel’s program and for AlphaGo, the programs learned it from thousands or millions of practice games. The first question—which part of the tree should the program explore?—is an

chosen to get a PhD at Berkeley. The GetVisa step, whose feasibility is uncertain, has been expanded out into an abstract plan of its own. AlphaGo simply cannot do this kind of thinking: the only actions it ever considers are primitive actions occurring in a sequence from the initial state. It

has no notion of abstract plan. Trying to apply AlphaGo in the real world is like trying to write a novel by wondering whether the first letter should be A, B, C, and so on

acquiring more knowledge is a form of learning, because it means the system can answer more questions; for a lookahead decision-making system such as AlphaGo, learning could mean improving its ability to evaluate positions or improving its ability to explore useful parts of the tree of possibilities. Learning from examples

application areas for AI. Deep learning has also played an important role in applications of reinforcement learning—for example, in learning the evaluation function that AlphaGo uses to estimate the desirability of possible future positions, and in learning controllers for complex robotic behaviors. As yet, we have very little understanding as

, 49–50, 260–61 propositional logic and, 268–70 reinforcement learning, 55–57, 105 subroutines within, 34 supervised learning, 58–59, 285–93 Alibaba, 250 AlphaGo, 6, 46–48, 49–50, 55, 91, 92, 206–7, 209–10, 261, 265, 285 AlphaZero, 47, 48 altruism, 24, 227–29 altruistic AI, 173

, 62, 261 deep convolutional network, 288–90 deep dreaming images, 291 deepfakes, 105–6 deep learning, 6, 58–59, 86–87, 288–93 DeepMind, 90 AlphaGo, 6, 46–48, 49–50, 55, 91, 92, 206–7, 209–10, 261, 265, 285 AlphaZero, 47, 48 DQN system, 55–56 deflection arguments, 154

Robot Rules: Regulating Artificial Intelligence

by Jacob Turner  · 29 Oct 2018  · 688pp  · 147,571 words

is an unprecedented technological development, particularly with the advent of unsupervised machine learning—i.e. machines learning without human input (as famously recently occurred with AlphaGo Zero) and no doubt in due course learning from other machines. In a nutshell, it is not only because AI will be so far-reaching

capabilities. The manner in which humans solve problems is limited by the hardware available to us: our brains. AI has no such limits. DeepMind’s AlphaGo program achieved superhuman capabilities in Chess, Go, and other board games. DeepMind CEO Demis Hassabis explained: “It doesn’t play like a human, and it

rate of improvement continues, they might beat the human world champion in about a decade”.110 Just three years later, in March 2016, DeepMind’s AlphaGo defeated champion player Lee Sedol by four games to one—with the human champion even resigning in the final game, having been tactically and emotionally

crushed.111 The killer move by AlphaGo was successful precisely because it used tactics which went against all traditional human schools of thought.112 Of course, winning board games is one thing

University Library Research Paper, 5 December 2017, https://​arxiv.​org/​abs/​1712.​01815, accessed 1 June 2018. See also Cade Metz, “What the AI Behind AlphaGo Can Teach Us About Being Human”, Wired, 19 May 2016, https://​www.​wired.​com/​2016/​05/​google-alpha-go-ai/​, accessed 1 June 2018. 47Russell

a corporeal object with the capacity to exert itself physically”. See also Jean-Christophe Baillie, “Why AlphaGo Is Not AI”, IEEE Spectrum, 17 March 2016, https://​spectrum.​ieee.​org/​automaton/​robotics/​artificial-intelligence/​why-alphago-is-not-ai, accessed 1 June 2018. 57As to the unique nature of this aspect of AI

: Paths, Dangers and Strategies (Oxford: Oxford University Press, 2014), 16. 111In May 2017, a subsequent version of the program, “AlphaGo Master”, defeated the world champion Go player, Ke Jie, by three games to nil. See “AlphaGo at The Future of Go Summit, 23–27 May 2017”, DeepMind Website, https://​deepmind.​com/​research

/​alphago/​alphago-china/​, accessed 16 August 2018. Perhaps as a control against accusations that top players were being beaten psychologically by the

prospect of playing an AI system rather than on the basis of skill, DeepMind had initially deployed AlphaGo Master in secret, during which period it beat 50 of the world’s top players online, playing under the pseudonym “Master”. See “Explore the

AlphaGo Master series”, DeepMind Website, https://​deepmind.​com/​research/​alphago/​match-archive/​master/​, accessed 16 August 2018. DeepMind promptly announced AlphaGo’s retirement from the game to pursue other interests. See Jon Russell, “After Beating the World

’s Elite Go Players, Google’s AlphaGo AI Is Retiring”, Tech Crunch, 27 May 2017, https://​techcrunch.​com

/​2017/​05/​27/​googles-alphago-ai-is-retiring/​ accessed 1 June 2018. Rather like a champion boxer tempted out of retirement for one

more fight, AlphaGo (or at least a new program bearing a similar name, AlphaGo Zero) returned a year later to face a new challenge

: AlphaGo Zero. This is discussed in Chapter 2 at s. 3.2.1, and FN 130

and 131. 112Cade Metz, “In Two Moves, AlphaGo and Lee Sedol Redefined the Future”, Wired, 16 March

2016, https://​www.​wired.​com/​2016/​03/​two-moves-alphago-lee-sedol-redefined-future/​, accessed 1 June 2018. In October 2017, DeepMind announced yet another breakthrough

the game to such an extent that it was able to beat the previous version of AlphaGo by 100 games to 0. See “AlphaGo Zero: Learning from Scratch”, DeepMind Website, 18 October 2017, https://​deepmind.​com/​blog/​alphago-zero-learning-scratch/​, accessed 1 June 2018. See also Chapter 2 at s. 3.2

, hunches, experience or even wisdom. In short, it is precisely when we stop trying to reproduce human intelligence that we can successfully replace it. Otherwise, AlphaGo would have never become so much better than anyone at playing Go.103 In philosopher Philippa Foot’s famous “Trolley Problem”104 thought experiment, participants

think, but to think differently from us, is potentially one of the most beneficial features of AI. Chapter 1 described how the ground-breaking program AlphaGo used reinforcement learning to defeat a human champion player at the notoriously complex game “Go”. In October 2017, DeepMind announced a further milestone: researchers had

Sedol in 2016, had learned their skills via scanning and analysing millions of moves contained in vast data sets of games played by humans.129 AlphaGo Zero, as the 2017 program was called, had a different method: it learned entirely without human input. Instead, it was provided only with the rules

, had mastered the game to such an extent that after just three days of self-training it was able to beat the previous version of AlphaGo by 100 games to 0. DeepMind explained the new methodology as follows:It is able to do this by using a novel form of reinforcement

learning, in which AlphaGo Zero becomes its own teacher. The system starts off with a neural network that knows nothing about the game of Go. It then plays games

tuned and updated to predict moves, as well as the eventual winner of the games.130 AlphaGo Zero is an excellent example of the capability for independent development in AI. Though the other versions of AlphaGo were able to create novel strategies unlike those used by human players, the program did so

on the basis of data provided by humans. Through learning entirely from first principles, AlphaGo Zero shows that humans can be taken out of the loop altogether soon after a program’s inception. The causal link between the initial human

input and the ultimate output is weakened yet further. DeepMind say of AlphaGo Zero’s unexpected moves and strategies: “These moments of creativity give us confidence that AI will be a multiplier for human ingenuity, helping us with

Innovation and Technology, Vol. 5, No. 2 (2013), 214–247, 234–235. 129See Chapter 1 at s. 5 and FN 111. A subsequent iteration of AlphaGo, “AlphaGo Master” beat Ke Jie, at the time the world’s top-ranked human player, by three games to nil in May 2017. See

“AlphaGo at The Future of Go Summit, 23–27 May 2017”, DeepMind Website, https://​deepmind.​com/​research/​alphago/​alphago-china/​, accessed 16 August 2018. 130Silver et al., “AlphaGo Zero: Learning from Scratch”, DeepMind Website, 18 October 2017, https://​deepmind.​com/​blog

/​alphago-zero-learning-scratch/​, accessed 1 June 2018. See also the paper published by the DeepMind

(19 October 2017), 354–359, https://​doi.​org/​10.​1038/​nature24270, accessed 1 June 2018. 131Silver et al., “AlphaGo Zero: Learning from Scratch”, DeepMind Website, 18 October 2017, https://​deepmind.​com/​blog/​alphago-zero-learning-scratch/​, accessed 1 June 2018. 132Matej Balog, Alexander L. Gaunt, Marc Brockschmidt, Sebastian Nowozin, and Daniel Tarlow

iconic designs. Although these examples both feature the physical robots, we may also wish to preserve the source code of seminal AI systems such as AlphaGo Zero for future generations to study and learn from. 4.4 The Argument from Post-humanism: Hybrids, Cyborgs and Electronic Brains Machines and human minds

in Wired, describes in vivid terms how the human proxy “operator” Aja Huang followed the instructions provided to him by DeepMind’s AlphaGo: Cade Metz, “What the AI Behind AlphaGo Can Teach Us About Being Human”, Wired, 19 May 2016, https://​www.​wired.​com/​2016/​05/​google-alpha-go-ai/​, accessed 1

clear. The tools are at our disposal. The question is not whether we can, but whether we will. Index A Actus Reus Alibaba AlphaGo See also AlphaGo Zero AlphaGo Zero See also AlphaGo Amazon Android Fallacy Animals liability of punishment of rights of Artificial Neural Networks Asilomar 1975 Conference 2017 Conference Asimov, Isaac Auditors

The Alignment Problem: Machine Learning and Human Values

by Brian Christian  · 5 Oct 2020  · 625pp  · 167,349 words

is ongoing.30 Perhaps the single most impressive achievement in automated curriculum design, however, is DeepMind’s board game–dominating work with AlphaGo and its successors AlphaGo Zero and AlphaZero. “AlphaGo always has an opponent at just the right level,” explains lead researcher David Silver.31 “It starts off extremely naïve; it starts

future research is to establish the extent to which better imitating human expert moves corresponds to genuinely stronger play.”83 Fifteen years later, DeepMind’s AlphaGo system finally realized Arthur Samuel’s vision of a system that could concoct its own positional considerations from scratch. Instead of being given a big

-the-art prediction of human expert moves—57% accuracy to be precise, smashing the previous state-of-the-art result of 44%. In October 2015, AlphaGo became the first computer program to defeat a human professional Go player (in this case, the three-time European champion Fan Hui). Just seven months

heart.85 It was not learning to play the best moves. It was learning to play the human moves. The successes of Deep Blue and AlphaGo alike were possible only because of mammoth databases of human examples from which the machines could learn. These flagship successes of machine learning created such

to work from. Popularity thus served a double role. It had made the accomplishment significant. But it had also made it possible. No sooner had AlphaGo reached the pinnacle of the game of Go, however, than it was, in 2017, summarily dethroned, by an even stronger program called

AlphaGo Zero.86 The biggest difference between the original AlphaGo and AlphaGo Zero was in how much human data the latter had been fed to imitate: zero. From a completely random initialization, tabula

against itself, again and again and again and again. Incredibly, after just thirty-six hours of self-play, it was as good as the original AlphaGo, which had beaten Lee Sedol. After seventy-two hours, the DeepMind team set up a match between the two, using the exact same two-hour

time controls and the exact version of the original AlphaGo system that had beaten Lee. AlphaGo Zero, which consumed a tenth of the power of the original system, and which seventy-two hours earlier had never played a

accompanying Nature paper, “Humankind has accumulated Go knowledge from millions of games played over thousands of years, collectively distilled into patterns, proverbs and books.”87 AlphaGo Zero discovered it all and more in seventy-two hours. But there was something very interesting, and very instructive, going on under the hood. The

looks at sequences of moves and says, “Okay, if I go here, then they go there, but then I go here and I win.” In AlphaGo Zero, the explicit “slow” reasoning by thinking ahead, move by move, “if this, then that,” is done by an algorithm called Monte Carlo Tree Search
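The "slow" reasoning the passage attributes to Monte Carlo Tree Search can be sketched in its generic textbook form: repeatedly select a promising path by UCB1, expand it by one move, run a quick random playout, and back the result up the tree. This is a toy illustration on an abstract game interface, not AlphaGo Zero's variant (which replaces random playouts with neural-network priors and value estimates); all names and parameters below are hypothetical.

```python
import math, random

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children, self.untried = [], []
        self.visits, self.wins = 0, 0.0  # wins for the player who moved into this node

def mcts(root_state, legal, step, terminal, reward, iters=1500, c=1.4, seed=0):
    """Generic MCTS with UCB1 selection and random playouts.
    reward(final_state, player) -> 1.0 if `player` won, else 0.0."""
    rng = random.Random(seed)
    root = Node(root_state)
    root.untried = list(legal(root_state))
    for _ in range(iters):
        node = root
        # 1. Selection: walk down fully expanded nodes via UCB1.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ch.wins / ch.visits
                       + c * math.sqrt(math.log(node.visits) / ch.visits))
        # 2. Expansion: try one previously untried move.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(step(node.state, move), node, move)
            child.untried = list(legal(child.state))
            node.children.append(child)
            node = child
        # 3. Simulation: random playout to the end of the game.
        state = node.state
        while not terminal(state):
            state = step(state, rng.choice(legal(state)))
        # 4. Backpropagation: credit each node from its mover's point of view.
        while node is not None:
            node.visits += 1
            if node.parent is not None:
                node.wins += reward(state, node.parent.state[1])
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move

# Toy game: state is (pile, player_to_move); take 1 or 2 stones; taking
# the last stone wins. From a pile of 4 the winning move is to take 1.
legal = lambda s: [m for m in (1, 2) if m <= s[0]]
step = lambda s, m: (s[0] - m, 1 - s[1])
terminal = lambda s: s[0] == 0
reward = lambda s, p: 1.0 if p == 1 - s[1] else 0.0  # last mover won
best = mcts((4, 0), legal, step, terminal, reward)
```

The key property, matching the passage's "if I go here, then they go there" description, is that computation concentrates on the lines of play that look best so far rather than expanding the whole tree.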

, we have an intuitive sense of how good a particular position is. This is the “value function” or “evaluation function” we’ve been discussing; in AlphaGo Zero, this comes from a neural network called the “value network,” which outputs a percentage from 0 to 100 of how likely

AlphaGo Zero thinks it is to win from that position. The second bit of implicit, “fast” reasoning is that when we look at the board there

deploy our slow, deliberate, “if this, then that” reasoning down the paths that our intuition has first identified as plausible or promising. This is where AlphaGo Zero gets interesting. These candidate moves come from a neural network called the “policy network,” which takes in the current board position as input and

, itself, ultimately decide to play. This is quite a strange and almost paradoxical idea, and merits a bit of further elaboration. The policy network represents AlphaGo Zero’s guess, for each possible move, of how likely it will be to choose that move after doing an explicit MCTS search to look

aspect is that the system uses these probabilities to focus the slow MCTS search along the series of moves it thinks are most likely.90 “AlphaGo Zero becomes its own teacher,” DeepMind’s David Silver explains. “It improves its neural network to predict the moves which

AlphaGo Zero itself played.”91 Given that the system uses these predictions to guide the very search whose outcome they are predicting, this might sound like

slow MCTS algorithm uses them to search more narrowly and wisely through possible future lines of play. As a result of this more refined search, AlphaGo Zero becomes a stronger player. The policy network then adjusts to predict these new, slightly stronger moves—which, in turn, allows the system to use

’s a virtuous circle. This process is known in the technical community as “amplification,” but it could just as easily be called something like transcendence. AlphaGo Zero learned only to imitate itself. It used its predictions to make better decisions, and then learned to predict those better decisions in turn. It
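The virtuous circle described here — search improves on the raw policy, and the policy is then trained to predict the search's output — can be caricatured in a few lines. This is a toy stand-in for amplification, not DeepMind's training loop: the "policy" is a probability table, the "search" is any operator that returns a sharper move distribution, and the state, moves, and learning rate are all hypothetical.

```python
def amplification_step(policy, search, state, lr=0.5):
    """One turn of the circle: run a search that is stronger than the raw
    policy, then nudge the policy toward the search's move distribution."""
    target = search(state, policy)   # a sharper distribution over moves
    probs = policy[state]
    for move in probs:
        probs[move] += lr * (target[move] - probs[move])
    return policy

# Toy example: two moves, and a "search" that always prefers move "a".
policy = {"s": {"a": 0.5, "b": 0.5}}
sharpen = lambda state, pol: {"a": 1.0, "b": 0.0}
for _ in range(5):
    amplification_step(policy, sharpen, "s")
# After a few rounds the policy has drifted close to the search's choice.
```

In AlphaGo Zero the table is a deep network, the search is MCTS guided by that very network, and the targets are the search's visit counts, which is why each round of imitation leaves the system slightly stronger than the one before.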

machine-learning community, has just recently reared its (multiple) head(s) in one of the flagship neural networks of the 2010s, AlphaGo Zero. When DeepMind iterated on their champion-dethroning AlphaGo architecture, they realized that the system they’d built could be enormously simplified by merging its two primary networks into one

double-headed network. The original AlphaGo used a “policy network” to estimate what move to play in a given position, and a “value network” to estimate the degree of advantage or

-level “features”—who controlled which territory, how stable or fragile certain structures were—would be extremely similar for both networks. Why reduplicate? In their subsequent AlphaGo Zero architecture, the “policy network” and “value network” became a “policy head” and “value head” attached to the same deep network. This new, Cerberus-like
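The merged, "Cerberus-like" design — one shared trunk feeding both a policy head and a value head — can be sketched without any deep-learning library. The trunk below is a hypothetical two-number feature function standing in for the real convolutional stack, and the heads are untrained placeholders; the point is only the shape of the architecture, in which board understanding is computed once and read twice.

```python
import math

class TwoHeadedNet:
    """Shared trunk with a policy head and a value head, in the spirit of
    the AlphaGo Zero design described in the text."""
    def __init__(self, n_moves):
        self.n_moves = n_moves

    def trunk(self, board):
        # Stand-in for the deep convolutional stack: one shared feature vector.
        return [sum(board) / max(len(board), 1), float(len(board))]

    def policy_head(self, features):
        # Softmax over one logit per move (uniform here, absent learned weights).
        logits = [features[0]] * self.n_moves
        z = sum(math.exp(l) for l in logits)
        return [math.exp(l) / z for l in logits]

    def value_head(self, features):
        # Squash the shared features into a single win-probability estimate.
        return 1.0 / (1.0 + math.exp(-features[0]))

    def forward(self, board):
        f = self.trunk(board)                            # computed once...
        return self.policy_head(f), self.value_head(f)   # ...read by both heads

net = TwoHeadedNet(n_moves=3)
probs, value = net.forward([1.0, 0.0, -1.0, 0.0])
```

Because both heads are trained against the same trunk, gradients from the policy and value losses shape one shared representation, which is the "why reduplicate?" economy the passage points to.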

that we discuss in Chapter 6. For earlier machine-learning work on curriculum design, see, e.g., Bengio et al., “Curriculum Learning.” 31. David Silver, “AlphaGo Zero: Starting from Scratch,” October 18, 2017, https://www.youtube.com/watch?v=tXlM99xPQC8. 32. Kerr, “On the Folly of Rewarding A, While Hoping for

That Learn to Play Games. 84. There are, of course, many subtle differences between the architecture and training procedure for Deep Blue and those for AlphaGo. For more details on AlphaGo, see Silver et al., “Mastering the Game of Go with Deep Neural Networks and Tree Search.” 85

. AlphaGo’s value network was derived from self-play, but its policy network was imitative, trained through supervised learning on a database of human expert games.

the Game of Go with Deep Neural Networks and Tree Search.” 86. Silver et al., “Mastering the Game of Go Without Human Knowledge.” In 2018, AlphaGo Zero was further refined into an even stronger program—and a more general one, capable of record-breaking strength in not just Go but chess

“expert iteration” (“ExIt”) algorithm in Anthony, Tian, and Barber, “Thinking Fast and Slow with Deep Learning and Tree Search.” 91. Shead, “DeepMind’s Human-Bashing AlphaGo AI Is Now Even Stronger.” 92. Aurelius, The Emperor Marcus Aurelius. 93. Andy Fitch, “Letter from Utopia: Talking to Nick Bostrom,” BLARB (blog), November 24

Alignment (blog), March 4, 2018, https://ai-alignment.com/iterated-distillation-and-amplification-157debfd1616. 99. For an explicit discussion of the connection between AlphaGo’s policy network and the idea of iterated capability amplification, see Paul Christiano, “AlphaGo Zero and Capability Amplification,” AI Alignment (blog), October 19, 2017, https://ai-alignment.com

/alphago-zero-and-capability-amplification-ede767bb8446. 100. Christiano, Shlegeris, and Amodei, “Supervising Strong Learners by Amplifying Weak Experts.” 101. Paul Christiano, personal interview, July 1

Conference: Contrasts in Computers, 119–28. Los Angeles, CA, 1958. Shead, Sam. “DeepMind’s Human-Bashing AlphaGo AI Is Now Even Stronger.” Business Insider, October 18, 2017. https://www.businessinsider.com/deepminds-alphago-ai-gets-alphago-zero-upgrade-2017-10. Shenoy, Premnath, and Anand Harugeri. “Elderly Patients’ Participation in Clinical Trials.” Perspectives in

parenting and, 166 reinforcement learning and, 151 technical limitations and, 313, 395–96n4 thermostats and, 311–12, 313 See also value alignment Allen, Woody, 170 AlphaGo, 162–63, 243–44, 380nn84–85 AlphaGo Zero, 162–63, 244–45, 356n59, 380n86 AlphaZero, 141, 162–63, 205, 248, 380n86 Alstrøm, Preben, 167, 177 ALVINN

Giving What We Can, 379n67 Glimcher, Paul, 135 Global Catastrophic Risks (Bostrom and Ćirković), 262 GloVe, 37 Go, 145, 243–46, 380nn84–86 See also AlphaGo; AlphaGo Zero Gödel, Escher, Bach (Hofstadter), 204–05 Goel, Sharad, 68, 348n47 Go-Explore, 373n54 Goh, Gabriel, 111, 357n69 Goldberg, Yoav, 44 Gonen, Hila, 44 Goodman

The Big Nine: How the Tech Titans and Their Thinking Machines Could Warp Humanity

by Amy Webb  · 5 Mar 2019  · 340pp  · 97,723 words

, machine-learning researcher Shane Legg, and entrepreneur Mustafa Suleyman. Part of the team’s appeal: they’d developed a program called AlphaGo. Within months, they were ready to test AlphaGo against a real human player. A match was arranged between DeepMind and Fan Hui, a Chinese-born professional Go player and one

communicate Hui’s moves back to the computer. Before the game, Toby Manning, who was one of the heads of the British Go Association, played AlphaGo in a test round—and lost by 17 points. Manning made some errors, but so did the program. An eerie thought crossed his mind: What

if AlphaGo was just playing conservatively? Was it possible that the program was only playing aggressively enough to beat Manning, rather than to clobber him entirely? The

turn to start. During the first 50 moves, it was a quiet game—Hui was clearly trying to suss out the strengths and weaknesses of AlphaGo. One early tell: the AI would not play aggressively unless it was behind. It was a tight first match

. AlphaGo earned a very narrow victory, by just 1.5 points. Hui used that information going into the second game. If AlphaGo wasn’t going to play aggressively, then Hui decided that he’d fight early. But

then AlphaGo started playing more quickly. Hui mentioned that perhaps he needed a bit more time to think between

turns. On move 147, Hui tried to prevent AlphaGo from claiming a big territory in the center of the board, but the move misfired, and he was forced to resign. By game three, Hui’

s moves were more aggressive, and AlphaGo followed suit. Halfway through, Hui made a catastrophic overplay, which AlphaGo punished, and then another big mistake, which rendered the game effectively over. Reeling from frustration, Hui had to excuse himself

finish the match. Yet again, stress had gotten the better of a great human thinker—while the AI was unencumbered to ruthlessly pursue its goal. AlphaGo—an AI program—had beaten a professional Go player 5–0. And it had won by analyzing fewer positions than IBM’s Deep Blue did

by several orders of magnitude. When AlphaGo beat a human, it didn’t know it was playing a game, what a game means, or why humans get pleasure out of playing games

Lee, a high-ranking professional Go player from Korea, reviewed the games after. In an official public statement, he said, “My overall impression was that AlphaGo seemed stronger than Fan, but I couldn’t tell by how much… maybe it becomes stronger when it faces a stronger opponent.”36 Focusing on

systems to win—to accomplish the goals we’ve created for them—do humans have to lose in ways that are both trivial and profound? AlphaGo continued playing tournaments, besting every opponent with masterful abilities and demoralizing the professional Go community. After beating the world’s number one champion 3–0

, saying that the team would work on a new set of challenges.37 What the team started working on next was a way to evolve AlphaGo from a powerful system that could be trained to beat brilliant Go players to a system that could train itself to become just as powerful

, without having to rely on humans. The first version of AlphaGo required humans in the loop and an initial data set of 100,000 Go games in order to learn how to play. The next generation

of the system was built to learn from zero. Just like a human player new to the game, this version—called AlphaGo Zero—would have to learn everything from scratch, completely on its own, without an opening library of moves or even a definition of what the

then play again optimized by what it had learned. It took only 70 hours of play for Zero to gain the same level of strength AlphaGo had when it beat the world’s greatest players.39 And then something interesting happened. The DeepMind team applied its technique to a second instance

of AlphaGo Zero using a larger network and allowed it to train and self-play for 40 days. It not only rediscovered the sum total of Go

knowledge accumulated by humans, it beat the most advanced version of AlphaGo 90% of the time—using completely new strategies. This means that Zero evolved into both a better student than the world’s greatest Go masters
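The "learn from zero" loop these excerpts describe — play yourself, score the outcome, adjust, repeat — can be caricatured in a few lines. The sketch below is a toy illustration only, not DeepMind's system: there is no neural network and no tree search, just a tabular agent that starts with zero knowledge of tic-tac-toe (a stand-in for Go) and improves by nudging its value estimates toward the results of its own games.

```python
# Toy self-play learner. NOT AlphaGo Zero's method: tabular values on
# tic-tac-toe instead of deep networks plus Monte Carlo tree search on Go.
import random
from collections import defaultdict

LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(b):
    for i, j, k in LINES:
        if b[i] != "." and b[i] == b[j] == b[k]:
            return b[i]
    return None

def legal(b):
    return [i for i, v in enumerate(b) if v == "."]

def play(b, i, p):
    return b[:i] + p + b[i+1:]

class SelfPlayAgent:
    """V[s] estimates how good afterstate s is for the player who just moved."""
    def __init__(self, alpha=0.4, eps=0.1, seed=0):
        self.V = defaultdict(lambda: 0.5)
        self.alpha, self.eps, self.rng = alpha, eps, random.Random(seed)

    def choose(self, board, player, greedy=False):
        opts = legal(board)
        if not greedy and self.rng.random() < self.eps:
            return self.rng.choice(opts)      # occasional exploration
        return max(opts, key=lambda i: self.V[play(board, i, player)])

    def train(self, episodes=20000):
        for _ in range(episodes):
            board, player, trail = "." * 9, "X", []
            while True:
                board = play(board, self.choose(board, player), player)
                trail.append(board)
                if winner(board) or not legal(board):
                    break
                player = "O" if player == "X" else "X"
            # Walk the game backwards: the last mover's afterstates are pushed
            # toward 1 on a win (0.5 on a draw), the opponent's toward the rest.
            target = 1.0 if winner(board) else 0.5
            for s in reversed(trail):
                self.V[s] += self.alpha * (target - self.V[s])
                target = 1.0 - target

agent = SelfPlayAgent()
agent.train()

# Evaluate: the greedy self-taught agent (X) against a uniformly random opponent.
rng = random.Random(1)
wins = losses = 0
for _ in range(200):
    board, player = "." * 9, "X"
    while True:
        if player == "X":
            move = agent.choose(board, "X", greedy=True)
        else:
            move = rng.choice(legal(board))
        board = play(board, move, player)
        w = winner(board)
        if w or not legal(board):
            wins += (w == "X")
            losses += (w == "O")
            break
        player = "O" if player == "X" else "X"
print(f"self-taught agent vs random: {wins} wins, {losses} losses of 200")
```

The point of the sketch is the shape of the loop, not the scale: AlphaGo Zero ran the same play-score-adjust cycle with a deep network and vastly more games, which is why three days of it could eclipse the version that beat the world's best humans.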

complex. So instead, our systems are built for optimization. Implicit in optimizing is unpredictability—to make choices that deviate from our own human thinking. When AlphaGo Zero abandoned human strategy and invented its own, it wasn’t deciding between preexisting alternatives; it was making a deliberate choice to try something completely

criminals while mislabeling white defendants as low risk. The optimization effect sometimes causes brilliant AI tribes to make dumb decisions. Recall DeepMind, which built the AlphaGo and AlphaGo Zero systems and stunned the AI community as it dominated grandmaster Go matches. Before Google acquired the company, it sent Geoff Hinton (the University

make decisions quickly, find hidden regularities in big data sets, and make accurate predictions. And it’s becoming clear with each new milestone achieved—like AlphaGo Zero’s ability to train itself and win matches using a superior strategy it developed on its own—that we are entering a new phase

of algorithms, but the vast proliferation of smart thinking machines bent on recursive self-improvement. Imagine a world in which systems far more advanced than AlphaGo Zero and NASNet not only make strategic decisions autonomously but also work collaboratively and competitively as part of a global community. A world in which

of before was a primary goal of AI’s tribes. It was a key to solving wicked problems that humans alone couldn’t crack. When AlphaGo Zero made autonomous strategy decisions decades ago, we heralded the achievement as a milestone for AI. Inside our bodies, however, nanobots and the AGIs they

of Economic Research, January 2018, http://www.nber.org/papers/w24254.pdf. 36. Toby Manning, “AlphaGo,” British Go Journal 174 (Winter 2015–2016): 15, https://www.britgo.org/files/2016/deepmind/BGJ174-AlphaGo.pdf. 37. Sam Byford, “AlphaGo Retires from Competitive Go after Defeating World Number One 3-0,” Verge, May 27, 2017

, https://www.theverge.com/2017/5/27/15704088/alphago-ke-jie-game-3-result-retires-future. 38. David Silver et al., “Mastering the Game of Go Without Human Knowledge,” Nature 550 (October 19, 2017):

_unformatted_nature.pdf. 39. Ibid. 40. Ibid. 41. This statement was made by Zero’s lead programmer, David Silver, at a news conference. 42. Byford, “AlphaGo Retires From Competitive Go.” 43. Jordan Novet, “Google Is Finding Ways to Make Money from Alphabet’s DeepMind AI Technology,” CNBC, March 31, 2018, https

, 68–69; smart speaker, 69; values algorithm, 100; Zoloz acquisition, 72 Alipay, 69, 186; social network, 81 Alphabet, 48, 49. See also Google AlphaGo, 43–45, 46, 115 AlphaGo Zero, 46–48, 49, 110, 115, 135, 149, 225 Amazon, 3, 85, 96, 119, 154; Akira system, 161; Amazon Basics microwave, 217; in

cloud service and, 117; Royal Free NHS Foundation patient data and, 116–117, 122; UK health care initiative and, 117; WaveNet and, 117 DeepMind team: AlphaGo Zero paper, 47; general-purpose learning machine, 48; power of, 46. See also DeepMind Defense Innovation Board, 212 Defense Innovation Unit Experimental (DIUx), 212 Descartes

U.S. government in catastrophic scenario of future, 228; shrinkage of G-MAFIA to in catastrophic scenario of future, 223 Game theory, 27 Games. See AlphaGo; AlphaGo Zero; Go; Machine learning; Watson Gender-nonconforming people: false accusations of identity theft in catastrophic scenario of future, 222 Generative adversarial network (GAN), 184 Genie

, 139; as proto-PDR, 153–154; spam filter, 89 Go, 39–41, 43, 116, 259; Elo rating, 46. See also names of specific Go players; AlphaGo; AlphaGo Zero Go Intellect, 41 Good, I. J., 33, 148, 177 Google, 3, 43, 48, 67, 69, 85–86, 96, 119, 211–212, 254; Calico health

in China, 80 Leadership, need for courageous, 246 Learned helplessness: in pragmatic scenario of future, 190–201 Learning. See AlphaGo; AlphaGo Zero; Learning machines; Machine learning; Watson Learning machines, 30, 31–32. See also AlphaGo; AlphaGo Zero; Watson Lecun, Yann, 41, 42, 59 Lee, Peter, 122 Legg, Shane, 43 Leibniz, Gottfried Wilhelm von, 20

Army of None: Autonomous Weapons and the Future of War

by Paul Scharre  · 23 Apr 2018  · 590pp  · 152,595 words

allow these machines to conduct their functions?” Tousley compared the challenge of cognitive electronic warfare to Google’s go-playing AlphaGo program. What happens when that program plays another version of AlphaGo at “machine speed?” He explained, “As humans ascend to the higher-level mission command and I’ve got machines doing

to master. Prior to DeepMind, attempts to build go-playing AI software had fallen woefully short of human professional players. To craft its AI, called AlphaGo, DeepMind took a different approach. They built an AI composed of deep neural networks and fed it data from 30 million games of go. As

having it play itself. Our goal is to beat the best human players, not just mimic them,” as explained in the post. “To do this, AlphaGo learned to discover new strategies for itself, by playing thousands of games between its neural networks, and adjusting the connections using a trial-and-error

process known as reinforcement learning.” AlphaGo used the 30 million human games of go as a starting point, but by playing against itself could reach levels of game play beyond even

the best human players. This superhuman game play was demonstrated in the 4–1 victory AlphaGo delivered over the world’s top-ranked human go player, Lee Sedol, in March 2016. AlphaGo won the first game solidly, but in game 2 demonstrated its virtuosity. Partway through game 2, on

move 37, AlphaGo made a move so surprising, so un-human, that it stunned professional players watching the match. Seemingly ignoring a contest between white and black stones

that was under way in one corner of the board, AlphaGo played a black stone far away in a nearly empty part of the board. It was a surprising move not seen in professional games, so

got up and left the room. After he returned, he took fifteen minutes to formulate his response. AlphaGo’s move wasn’t a mistake. European go champion Fan Hui, who had lost to AlphaGo a few months earlier in a closed-door match, said at first the move surprised him as well

player would never make, it was a move no human player probably would ever make. AlphaGo rated the odds that a human would have made that move as 1 in 10,000. Yet AlphaGo made the move anyway. AlphaGo went on to win game 2 and afterward Lee Sedol said, “I really feel

that AlphaGo played the near perfect game.” After losing game 3, thus giving AlphaGo the win for the match, Lee Sedol told the audience

at a press conference, “I kind of felt powerless.” AlphaGo’s triumph over Lee Sedol has implications far beyond the

game of go. More than just another realm of competition in which AIs now top humans, the way DeepMind trained AlphaGo is what really matters. As explained in the DeepMind blog

post, “AlphaGo isn’t just an ‘expert’ system built with hand-crafted rules; instead it uses general machine learning techniques to

its own, and some of the things it learned were surprising. In 2017, DeepMind surpassed their earlier success with a new version of AlphaGo. With an updated algorithm, AlphaGo Zero learned to play go without any human data to start. With only access to the board and the rules of the game

, AlphaGo Zero taught itself to play. Within a mere three days of self-play, AlphaGo Zero had eclipsed the previous version that had beaten Lee Sedol, defeating it 100 games to 0. These

deep learning techniques can solve a variety of other problems. In 2015, even before DeepMind debuted AlphaGo, DeepMind trained a neural network to play Atari games. Given only the pixels on the screen and the game score as input and told to
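The reward-only setup described for the Atari work — observations in, score out, and nothing else — can be sketched with plain tabular Q-learning. This is not DeepMind's DQN: the "screen" here is just a position index on a hypothetical five-cell track, invented for illustration, and the agent learns by trial and error that forgoing a small immediate score in favor of a delayed jackpot pays off.

```python
# Minimal reward-driven learning sketch. NOT DeepMind's Atari agent: a toy
# environment and tabular Q-learning stand in for pixels and a deep network.
import random

class ChainEnv:
    """Five positions in a row. Moving left scores +0.1 and resets to the
    start; moving right scores nothing until the far end pays +10 and ends
    the episode. (A hypothetical stand-in for a game screen plus a score.)"""
    def __init__(self):
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):              # 0 = left, 1 = right
        if action == 0:
            self.pos = 0
            return self.pos, 0.1, False  # small immediate score, reset
        if self.pos == 4:
            return self.pos, 10.0, True  # delayed jackpot, episode over
        self.pos += 1
        return self.pos, 0.0, False

def q_learn(env, episodes=2000, alpha=0.2, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(5)]   # Q[state][action]
    for _ in range(episodes):
        s = env.reset()
        for _ in range(50):              # step cap per episode
            # epsilon-greedy: mostly exploit current estimates, sometimes explore
            a = rng.randrange(2) if rng.random() < eps else int(Q[s][1] >= Q[s][0])
            s2, r, done = env.step(a)
            # standard Q-learning update toward reward + discounted best next value
            Q[s][a] += alpha * (r + gamma * (0.0 if done else max(Q[s2])) - Q[s][a])
            s = s2
            if done:
                break
    return Q

Q = q_learn(ChainEnv())
# Trial and error discovers the delayed +10 beats the immediate +0.1 loop.
policy = [int(q[1] >= q[0]) for q in Q]
print(policy, round(Q[4][1], 2))
```

Swap the five-cell track for raw screen pixels and the table for a deep neural network and you have, in outline, the setup the excerpt describes: the agent is told nothing about the game except what it observes and what it scores.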

. The AIs being developed for go or Atari are still narrow AI systems. Once trained, the AIs are purpose-built tools to solve narrow problems. AlphaGo can beat any human at go, but it can’t play a different game, drive a car, or make a cup of coffee. Still, the

tools used to train AlphaGo are generalizable tools that can be used to build any number of special-purpose narrow AIs to solve various problems. Deep neural networks have been

real-world environments, but no amount of testing can entirely eliminate the potential for unanticipated behaviors. Sometimes these unanticipated behaviors may pleasantly surprise users, like AlphaGo’s 1 in 10,000 move that stunned human champion Lee Sedol. Sometimes these unanticipated actions can be negative. During Garry Kasparov’s first game

development would be a leap beyond current malware, the advent of learning systems in other areas, such as Google DeepMind’s Atari-playing AI or AlphaGo, suggests that it is not inconceivable. Adaptive malware that could rewrite itself to hide and avoid scrutiny at superhuman speeds could be incredibly virulent, spreading

ability, it rapidly surpasses it. For years, go computer programs couldn’t hold a candle to the top-ranked human go players. Then, seemingly overnight, AlphaGo dethroned the world’s leading human player. The contest between humans and machines at go was over before it began. In early 2017, poker became

://spectrum.ieee.org/robotics/military-robots/a-robotic-sentry-for-koreas-demilitarized-zone. 105 SGR-A1 cited as an example: Christopher Moyer, “How Google’s AlphaGo Beat a Go World Champion,” The Atlantic, March 28, 2016, https://www.theatlantic.com/technology/archive/2016/03/the-invisible-opponent/475611/. Adrianne Jeffries, “Should

module”: “Kalashnikov Gunmaker Develops Combat Module based on Artificial Intelligence.” 125 more possible positions in go: “AlphaGo,” DeepMind, accessed June 7, 2017, https://deepmind.com/research/alphago/. 125 “Our goal is to beat the best human players”: “AlphaGo: Using Machine Learning to Master the Ancient Game of Go,” Google, January 27, 2016, http

://blog.google:443/topics/machine-learning/alphago-machine-learning-game-go/. 126 game 2, on move 37: Daniel

Estrada, “Move 37!! Lee Sedol vs AlphaGo Match 2” video, https://www.youtube.com/watch?v=JNrXgpSEEIE. 126 “I thought it was a

11, 2016, https://www.wired.com/2016/03/sadness-beauty-watching-googles-ai-play-go/. 126 1 in 10,000: Cade Metz, “In Two Moves, AlphaGo and Lee Sedol Redefined the Future,” WIRED, accessed June 7, 2017, https://www.wired.com/2016/03/two-moves

-alphago-lee-sedol-redefined-future/. 126 “I kind of felt powerless”: Moyer, “How Google’s AlphaGo Beat a Go World Champion.” 126 “AlphaGo isn’t just an ‘expert’ system”: “AlphaGo,” January 27, 2016. 127 AlphaGo Zero: “AlphaGo Zero: Learning from Scratch,” DeepMind, accessed October 22, 2017

, https://deepmind.com/blog/alphago-zero-learning-scratch/. 127 neural network to play Atari games

?timestamp=1435068339702. 150 “We had seen it once before”: Interestingly, this random move may have played a key role in shaking Kasparov’s confidence. Unlike AlphaGo’s 1 in 10,000 surprise move that later turned out to be a stroke of brilliance, Kasparov could see right away that Deep Blue

-death decisions by, 287–90 for stock trading, see automated stock trading Ali Al Salem Air Base (Kuwait), 138–39 Alphabet, 125 AlphaGo, 81–82, 125–27, 150, 242 AlphaGo Zero, 127 AlphaZero, 410 al-Qaeda, 22, 253 “always/never” dilemma, 175 Amazon, 205 AMRAAM (Advanced Medium-Range Air-to-Air Missile

. bombing campaigns against, 282 GGE (Group of Governmental Experts), 346 ghost tracks, 142 Global Hawk drone, 17 go (game), 124–28, 149, 150; see also AlphaGo goal-driven AI systems, 238–40 goal misalignment, 243 goal-oriented behavior, 32 Goethe, Johann Wolfgang von, 148 golems, 234 Good, I. J., 233 Google

The Singularity Is Nearer: When We Merge with AI

by Ray Kurzweil  · 25 Jun 2024

The Coming Wave: Technology, Power, and the Twenty-First Century's Greatest Dilemma

by Mustafa Suleyman  · 4 Sep 2023  · 444pp  · 117,770 words

The Rationalist's Guide to the Galaxy: Superintelligent AI and the Geeks Who Are Trying to Save Humanity's Future

by Tom Chivers  · 12 Jun 2019  · 289pp  · 92,714 words

Whiplash: How to Survive Our Faster Future

by Joi Ito and Jeff Howe  · 6 Dec 2016  · 254pp  · 76,064 words

Supremacy: AI, ChatGPT, and the Race That Will Change the World

by Parmy Olson  · 284pp  · 96,087 words

Rage Inside the Machine: The Prejudice of Algorithms, and How to Stop the Internet Making Bigots of Us All

by Robert Elliott Smith  · 26 Jun 2019  · 370pp  · 107,983 words

Architects of Intelligence

by Martin Ford  · 16 Nov 2018  · 586pp  · 186,548 words

AI Superpowers: China, Silicon Valley, and the New World Order

by Kai-Fu Lee  · 14 Sep 2018  · 307pp  · 88,180 words

Deep Medicine: How Artificial Intelligence Can Make Healthcare Human Again

by Eric Topol  · 1 Jan 2019  · 424pp  · 114,905 words

The Road to Conscious Machines

by Michael Wooldridge  · 2 Nov 2018  · 346pp  · 97,890 words

Machine, Platform, Crowd: Harnessing Our Digital Future

by Andrew McAfee and Erik Brynjolfsson  · 26 Jun 2017  · 472pp  · 117,093 words

A World Without Work: Technology, Automation, and How We Should Respond

by Daniel Susskind  · 14 Jan 2020  · 419pp  · 109,241 words

Scary Smart: The Future of Artificial Intelligence and How You Can Save Our World

by Mo Gawdat  · 29 Sep 2021  · 259pp  · 84,261 words

Rule of the Robots: How Artificial Intelligence Will Transform Everything

by Martin Ford  · 13 Sep 2021  · 288pp  · 86,995 words

Coders: The Making of a New Tribe and the Remaking of the World

by Clive Thompson  · 26 Mar 2019  · 499pp  · 144,278 words

The Perfect Police State: An Undercover Odyssey Into China's Terrifying Surveillance Dystopia of the Future

by Geoffrey Cain  · 28 Jun 2021  · 340pp  · 90,674 words

Ghost Work: How to Stop Silicon Valley From Building a New Global Underclass

by Mary L. Gray and Siddharth Suri  · 6 May 2019  · 346pp  · 97,330 words

Radical Technologies: The Design of Everyday Life

by Adam Greenfield  · 29 May 2017  · 410pp  · 119,823 words

Artificial Unintelligence: How Computers Misunderstand the World

by Meredith Broussard  · 19 Apr 2018  · 245pp  · 83,272 words

I, Warbot: The Dawn of Artificially Intelligent Conflict

by Kenneth Payne  · 16 Jun 2021  · 339pp  · 92,785 words

Artificial Whiteness

by Yarden Katz

Artificial Intelligence: A Modern Approach

by Stuart Russell and Peter Norvig  · 14 Jul 2019  · 2,466pp  · 668,761 words

WTF?: What's the Future and Why It's Up to Us

by Tim O'Reilly  · 9 Oct 2017  · 561pp  · 157,589 words

Future Politics: Living Together in a World Transformed by Tech

by Jamie Susskind  · 3 Sep 2018  · 533pp

Nexus: A Brief History of Information Networks From the Stone Age to AI

by Yuval Noah Harari  · 9 Sep 2024  · 566pp  · 169,013 words

The Future Is Faster Than You Think: How Converging Technologies Are Transforming Business, Industries, and Our Lives

by Peter H. Diamandis and Steven Kotler  · 28 Jan 2020  · 501pp  · 114,888 words

The Price of Tomorrow: Why Deflation Is the Key to an Abundant Future

by Jeff Booth  · 14 Jan 2020  · 180pp  · 55,805 words

Industry 4.0: The Industrial Internet of Things

by Alasdair Gilchrist  · 27 Jun 2016

The Technology Trap: Capital, Labor, and Power in the Age of Automation

by Carl Benedikt Frey  · 17 Jun 2019  · 626pp  · 167,836 words

The Globotics Upheaval: Globalisation, Robotics and the Future of Work

by Richard Baldwin  · 10 Jan 2019  · 301pp  · 89,076 words

The Means of Prediction: How AI Really Works (And Who Benefits)

by Maximilian Kasy  · 15 Jan 2025  · 209pp  · 63,332 words

On the Future: Prospects for Humanity

by Martin J. Rees  · 14 Oct 2018  · 193pp  · 51,445 words

New Dark Age: Technology and the End of the Future

by James Bridle  · 18 Jun 2018  · 301pp  · 85,263 words

The Optimist: Sam Altman, OpenAI, and the Race to Invent the Future

by Keach Hagey  · 19 May 2025  · 439pp  · 125,379 words

Novacene: The Coming Age of Hyperintelligence

by James Lovelock  · 27 Aug 2019  · 94pp  · 33,179 words

Shape: The Hidden Geometry of Information, Biology, Strategy, Democracy, and Everything Else

by Jordan Ellenberg  · 14 May 2021  · 665pp  · 159,350 words

What We Owe the Future: A Million-Year View

by William MacAskill  · 31 Aug 2022  · 451pp  · 125,201 words

The Book of Why: The New Science of Cause and Effect

by Judea Pearl and Dana Mackenzie  · 1 Mar 2018

The Myth of Artificial Intelligence: Why Computers Can't Think the Way We Do

by Erik J. Larson  · 5 Apr 2021

The Autonomous Revolution: Reclaiming the Future We’ve Sold to Machines

by William Davidow and Michael Malone  · 18 Feb 2020  · 304pp  · 80,143 words

Human Frontiers: The Future of Big Ideas in an Age of Small Thinking

by Michael Bhaskar  · 2 Nov 2021

Life After Google: The Fall of Big Data and the Rise of the Blockchain Economy

by George Gilder  · 16 Jul 2018  · 332pp  · 93,672 words

Ten Lessons for a Post-Pandemic World

by Fareed Zakaria  · 5 Oct 2020  · 289pp  · 86,165 words

Succeeding With AI: How to Make AI Work for Your Business

by Veljko Krunic  · 29 Mar 2020

Prediction Machines: The Simple Economics of Artificial Intelligence

by Ajay Agrawal, Joshua Gans and Avi Goldfarb  · 16 Apr 2018  · 345pp  · 75,660 words

Hands-On Machine Learning With Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems

by Aurélien Géron  · 13 Mar 2017  · 1,331pp  · 163,200 words

Possible Minds: Twenty-Five Ways of Looking at AI

by John Brockman  · 19 Feb 2019  · 339pp  · 94,769 words

Work in the Future: The Automation Revolution

by Robert Skidelsky and Nan Craig  · 15 Mar 2020

Money: Vintage Minis

by Yuval Noah Harari  · 5 Apr 2018  · 97pp  · 31,550 words

The Job: The Future of Work in the Modern Era

by Ellen Ruppel Shell  · 22 Oct 2018  · 402pp  · 126,835 words

The People vs Tech: How the Internet Is Killing Democracy (And How We Save It)

by Jamie Bartlett  · 4 Apr 2018  · 170pp  · 49,193 words

The Future We Choose: Surviving the Climate Crisis

by Christiana Figueres and Tom Rivett-Carnac  · 25 Feb 2020  · 197pp  · 49,296 words

A Hacker's Mind: How the Powerful Bend Society's Rules, and How to Bend Them Back

by Bruce Schneier  · 7 Feb 2023  · 306pp  · 82,909 words

MegaThreats: Ten Dangerous Trends That Imperil Our Future, and How to Survive Them

by Nouriel Roubini  · 17 Oct 2022  · 328pp  · 96,678 words

AI 2041: Ten Visions for Our Future

by Kai-Fu Lee and Qiufan Chen  · 13 Sep 2021

Deep Thinking: Where Machine Intelligence Ends and Human Creativity Begins

by Garry Kasparov  · 1 May 2017  · 331pp  · 104,366 words

The Economic Singularity: Artificial Intelligence and the Death of Capitalism

by Calum Chace  · 17 Jul 2016  · 477pp  · 75,408 words

From Bacteria to Bach and Back: The Evolution of Minds

by Daniel C. Dennett  · 7 Feb 2017  · 573pp  · 157,767 words

Heart of the Machine: Our Future in a World of Artificial Emotional Intelligence

by Richard Yonck  · 7 Mar 2017  · 360pp  · 100,991 words

The Ages of Globalization

by Jeffrey D. Sachs  · 2 Jun 2020

Thinking Machines: The Inside Story of Artificial Intelligence and Our Race to Build the Future

by Luke Dormehl  · 10 Aug 2016  · 252pp  · 74,167 words

People, Power, and Profits: Progressive Capitalism for an Age of Discontent

by Joseph E. Stiglitz  · 22 Apr 2019  · 462pp  · 129,022 words

Red Flags: Why Xi's China Is in Jeopardy

by George Magnus  · 10 Sep 2018  · 371pp  · 98,534 words

Empire of AI: Dreams and Nightmares in Sam Altman's OpenAI

by Karen Hao  · 19 May 2025  · 660pp  · 179,531 words

These Strange New Minds: How AI Learned to Talk and What It Means

by Christopher Summerfield  · 11 Mar 2025  · 412pp  · 122,298 words

The Long History of the Future: Why Tomorrow's Technology Still Isn't Here

by Nicole Kobie  · 3 Jul 2024  · 348pp  · 119,358 words

What Algorithms Want: Imagination in the Age of Computing

by Ed Finn  · 10 Mar 2017  · 285pp  · 86,853 words

Human + Machine: Reimagining Work in the Age of AI

by Paul R. Daugherty and H. James Wilson  · 15 Jan 2018  · 523pp  · 61,179 words

The Simulation Hypothesis

by Rizwan Virk  · 31 Mar 2019  · 315pp  · 89,861 words

Your Computer Is on Fire

by Thomas S. Mullaney, Benjamin Peters, Mar Hicks and Kavita Philip  · 9 Mar 2021  · 661pp  · 156,009 words

Work: A History of How We Spend Our Time

by James Suzman  · 2 Sep 2020  · 909pp  · 130,170 words

A Generation of Sociopaths: How the Baby Boomers Betrayed America

by Bruce Cannon Gibney  · 7 Mar 2017  · 526pp  · 160,601 words

Calling Bullshit: The Art of Scepticism in a Data-Driven World

by Jevin D. West and Carl T. Bergstrom  · 3 Aug 2020

Seeking SRE: Conversations About Running Production Systems at Scale

by David N. Blank-Edelman  · 16 Sep 2018

Outnumbered: From Facebook and Google to Fake News and Filter-Bubbles – the Algorithms That Control Our Lives

by David Sumpter  · 18 Jun 2018  · 276pp  · 81,153 words

Fixed: Why Personal Finance is Broken and How to Make it Work for Everyone

by John Y. Campbell and Tarun Ramadorai  · 25 Jul 2025

On the Edge: The Art of Risking Everything

by Nate Silver  · 12 Aug 2024  · 848pp  · 227,015 words

The Currency Cold War: Cash and Cryptography, Hash Rates and Hegemony

by David G. W. Birch  · 14 Apr 2020  · 247pp  · 60,543 words

System Error: Where Big Tech Went Wrong and How We Can Reboot

by Rob Reich, Mehran Sahami and Jeremy M. Weinstein  · 6 Sep 2021

How to Fix the Future: Staying Human in the Digital Age

by Andrew Keen  · 1 Mar 2018  · 308pp  · 85,880 words

Leadership by Algorithm: Who Leads and Who Follows in the AI Era?

by David de Cremer  · 25 May 2020  · 241pp  · 70,307 words

The Musical Human: A History of Life on Earth

by Michael Spitzer  · 31 Mar 2021  · 632pp  · 163,143 words

Move Fast and Break Things: How Facebook, Google, and Amazon Cornered Culture and Undermined Democracy

by Jonathan Taplin  · 17 Apr 2017  · 222pp  · 70,132 words

To Be a Machine: Adventures Among Cyborgs, Utopians, Hackers, and the Futurists Solving the Modest Problem of Death

by Mark O'Connell  · 28 Feb 2017  · 252pp  · 79,452 words

Machine Translation

by Thierry Poibeau  · 14 Sep 2017  · 174pp  · 56,405 words

Know Thyself

by Stephen M Fleming  · 27 Apr 2021

Hit Refresh: The Quest to Rediscover Microsoft's Soul and Imagine a Better Future for Everyone

by Satya Nadella, Greg Shaw and Jill Tracie Nichols  · 25 Sep 2017  · 391pp  · 71,600 words

Being You: A New Science of Consciousness

by Anil Seth  · 29 Aug 2021  · 418pp  · 102,597 words

Applied Artificial Intelligence: A Handbook for Business Leaders

by Mariya Yao, Adelyn Zhou and Marlene Jia  · 1 Jun 2018  · 161pp  · 39,526 words

Hands-On Machine Learning With Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems

by Aurelien Geron  · 14 Aug 2019

The AI Economy: Work, Wealth and Welfare in the Robot Age

by Roger Bootle  · 4 Sep 2019  · 374pp  · 111,284 words

This Is for Everyone: The Captivating Memoir From the Inventor of the World Wide Web

by Tim Berners-Lee  · 8 Sep 2025  · 347pp  · 100,038 words

The Everything Blueprint: The Microchip Design That Changed the World

by James Ashton  · 11 May 2023  · 401pp  · 113,586 words

Angrynomics

by Eric Lonergan and Mark Blyth  · 15 Jun 2020  · 194pp  · 56,074 words

Power and Progress: Our Thousand-Year Struggle Over Technology and Prosperity

by Daron Acemoglu and Simon Johnson  · 15 May 2023  · 619pp  · 177,548 words

Gods and Robots: Myths, Machines, and Ancient Dreams of Technology

by Adrienne Mayor  · 27 Nov 2018

The Precipice: Existential Risk and the Future of Humanity

by Toby Ord  · 24 Mar 2020  · 513pp  · 152,381 words

The Hype Machine: How Social Media Disrupts Our Elections, Our Economy, and Our Health--And How We Must Adapt

by Sinan Aral  · 14 Sep 2020  · 475pp  · 134,707 words

Driverless: Intelligent Cars and the Road Ahead

by Hod Lipson and Melba Kurman  · 22 Sep 2016

The Driver in the Driverless Car: How Our Technology Choices Will Create the Future

by Vivek Wadhwa and Alex Salkever  · 2 Apr 2017  · 181pp  · 52,147 words

Them and Us: How Immigrants and Locals Can Thrive Together

by Philippe Legrain  · 14 Oct 2020  · 521pp  · 110,286 words

Beginners: The Joy and Transformative Power of Lifelong Learning

by Tom Vanderbilt  · 5 Jan 2021  · 312pp  · 92,131 words

The Thinking Machine: Jensen Huang, Nvidia, and the World's Most Coveted Microchip

by Stephen Witt  · 8 Apr 2025  · 260pp  · 82,629 words

Blueprint: The Evolutionary Origins of a Good Society

by Nicholas A. Christakis  · 26 Mar 2019

If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All

by Eliezer Yudkowsky and Nate Soares  · 15 Sep 2025  · 215pp  · 64,699 words

How to Spend a Trillion Dollars

by Rowan Hooper  · 15 Jan 2020  · 285pp  · 86,858 words