Go Grandmaster Lee Sedol Grabs Consolation Win Against Google's AI

After three straight losses, score one for the humans.
Lee Sedol
Geordie Wood for WIRED

SEOUL, SOUTH KOREA --- Korean Go grandmaster Lee Sedol has won his first game against AlphaGo, Google's artificially intelligent computing system, after losing three straight in this week's historic match. AlphaGo had already claimed victory in the best-of-five contest, a test of artificial intelligence closely watched in Asia and across the tech world. But on Sunday evening inside Seoul's Four Seasons hotel, Lee Sedol clawed back a degree of pride for himself and the millions of people who watched the match online.

The gathered Korean reporters and photographers cheered and applauded when AlphaGo resigned roughly five hours into the game. And they cheered more loudly---and repeatedly---as Lee Sedol walked into the post-game press conference.

AlphaGo's dominance in the first three games was notable because no machine had previously beaten a top human player at Go---and because some technologies at the heart of the system are already used inside Google and other big-name Internet companies. AlphaGo highlights the enormous power of these technologies and points the way forward for the AI techniques that have driven its success this week---techniques that are poised to reinvent everything from scientific research to robotics. And yet, as Lee Sedol showed today, machines are by no means infallible.

Because AlphaGo is driven by machine learning technologies---technologies that allow machines to learn tasks largely on their own---Google could retrain AlphaGo to an even higher level of performance over the coming weeks and months. But Lee Sedol's win in Game Four is a reminder that even the most proficient AI still has a long way to go before it can truly duplicate human thought. Yes, a machine can beat a top human at Go. But that doesn't mean it can, for example, pass an eighth grade science test, much less converse like a human or exhibit good old common sense.

Where is the Weakness?

Though the match had been decided the day before---with AlphaGo taking a three-games-to-none lead in the best-of-five contest---Game Four began with its own drama. As match commentator Chris Garlock said just before the game began, one big question remained: Does AlphaGo have a weakness?

It was a question that arose during the press conference in the wake of Game Three, a solemn affair where Lee Sedol apologized to the Korean public and the larger Go community. "I don't know what to say today, but I think I will have to express my apologies first," he told the press, through an interpreter. "I should have shown a better result, a better outcome, a better contest in terms of the games played." The Korean admitted to straining under the immense public pressure. The match was literally front-page news in Korea, where an estimated 8 million people play Go and Lee Sedol is a national figure even among those who don't follow the game. But now that much of the pressure was off, he vowed to continue looking for that weakness.

"Although AlphaGo is a strong program, I would not say that it is a perfect program," he said. "Yes, compared to human beings, its moves are different and at times superior. But I do think there are weaknesses for AlphaGo."

He was particularly upset with his play during the second game, when he felt he made crucial mistakes and failed to capitalize on errors by AlphaGo. "There were a number of opportunities that I admittedly missed," he said.

Game Two All Over Again

Game Four began a lot like Game Two, as if Lee Sedol were trying to make amends for past mistakes. As in Game Two, he played white, which meant he moved second, and he responded to AlphaGo's opening much as he had in the second game. "It's just about the same game," commentator Michael Redmond said six moves into the match.

Commentator Chris Garlock asked whether Lee Sedol, in an effort to find a weakness, might resort to playing moves that were as unusual as possible. But as Redmond pointed out, that didn't really work in Game One. Judging from how AlphaGo has operated so far, it's unlikely that unusual or even blatantly weird moves would be particularly effective against the machine.

Using what are called deep neural networks---networks of hardware and software that mimic the web of neurons in the human brain---AlphaGo first learned the game of Go by analyzing thousands upon thousands of moves made by real live human players. Thanks to another technology called reinforcement learning, it then climbed to an entirely different and higher level by playing game after game after game against itself. In essence, these games generated all sorts of new moves that the machine could use to retrain itself. By definition, these are inhuman moves.
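That two-stage pipeline---learn from human games, then improve through self-play---can be illustrated with a deliberately tiny sketch. This is not AlphaGo's actual training code (which uses deep networks and policy-gradient methods); the counting "policy" and the random self-play outcome here are stand-ins chosen purely to make the structure visible.

```python
import random

def train_supervised(human_games):
    """Stage 1 (illustrative): build move preferences by counting how
    often human players chose each move. A toy stand-in for the
    supervised training of the policy network on human games."""
    policy = {}
    for game in human_games:
        for move in game:
            policy[move] = policy.get(move, 0) + 1
    return policy

def self_play_update(policy, rounds, lr=1.0):
    """Stage 2 (illustrative): reinforcement learning by self-play.
    The policy samples its own moves, and moves on the "winning" side
    are reinforced. A toy stand-in for policy-gradient updates."""
    for _ in range(rounds):
        moves = list(policy)
        weights = [policy[m] for m in moves]
        chosen = random.choices(moves, weights=weights, k=2)
        winner = random.choice(chosen)   # toy "game outcome"
        policy[winner] += lr             # reinforce the winning move
    return policy
```

The key structural point the sketch preserves: the second stage no longer needs human data at all, so the system's later games---and the moves it learns from them---are its own.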

In other words, the AlphaGo system does not operate by playing in familiar ways. It thrives by playing in ways no human ever would.

'Off The Map'

As the game progressed, Lee Sedol took far more time with each move than his inanimate opponent. This was also the case in Games Two and Three, when, after his clock ran down, the Korean was forced to play at a rapid-fire pace. AlphaGo, by contrast, has managed its time well. And this was no accident. Before the match, Demis Hassabis---who oversees DeepMind, the Google AI lab that built AlphaGo---said a neural network devoted specifically to managing time had been added to the system.

As in the early parts of Game Two, Lee Sedol seemed to command a large amount of territory on the board and AlphaGo very little. This was hardly a sign that Lee Sedol was ahead in the game, but it did indicate he was using much the same strategy he used in his losing second game.

But commentators proposed another strategy. "I'd just pull the plug," Redmond said of AlphaGo. "It's dependent on its Internet connection, isn't it? All we need is someone with scissors." Indeed, AlphaGo does depend on an Internet connection, which ties into a vast network of machines inside Google data centers across the globe. But a pair of scissors wouldn't be enough to cut the cord. Prior to the match, Google ran its own fiber optic cables into the Four Seasons to ensure the connection didn't go down.

When they first built AlphaGo, Hassabis and his team trained and ran the system on a single machine. But in October, just before AlphaGo's closed-door match with three-time European champion Fan Hui, researchers upgraded the system to a much higher level of processing power. Deep neural networks typically run across a large number of connected machines, each equipped with graphics processing units, or GPUs, chips that were originally designed to render images for games and other highly graphical software. These chips, it turns out, are also well suited to this breed of machine learning. In October, Hassabis said that AlphaGo ran on a network that spanned 170 GPU cards and 1,200 standard processors, or CPUs.

'A Very Dangerous Fight'

As the two-hour mark approached, Redmond called the contest Lee Sedol's type of game, saying it was developing "into a very dangerous fight." Lee Sedol likes to play on a knife edge. And he's very good at it. But as Redmond pointed out, so is AlphaGo.

Lee Sedol seemed to be in a better position than he had been in Game Three. He seemed calmer as well. But after another twenty minutes of play, Redmond, himself a very successful Go player, felt that AlphaGo had the edge. And Lee Sedol had only about 25 minutes left on his play clock, nearly an hour less than AlphaGo. The difference is key, since once a play clock runs out, a player must make each move in less than 60 seconds.

At this point, AlphaGo started to play what Redmond and Garlock considered unimpressive or "slack" moves. The irony is that this may have indicated that the machine was confident of a win. AlphaGo makes moves that maximize its probability of winning, not its margin of victory. "This was AlphaGo saying: 'I think I'm ahead. I'm going to wrap this stuff up,'" Garlock said. "And Lee Sedol needs to do something special, even if it doesn't work. Otherwise, it's just not going to be enough." Lee Sedol, leaning forward with his face in his hands, deep in concentration, took several minutes to make his next move.
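The distinction between maximizing win probability and maximizing margin is easy to show in miniature. In the hypothetical sketch below, the numbers are invented for illustration: a machine built AlphaGo's way prefers the safe one-and-a-half-point win over the risky twenty-point win, which is exactly why a confident AlphaGo can look "slack" to human commentators.

```python
# Candidate moves with an estimated win probability and an expected
# point margin. The moves and numbers are illustrative, not real
# AlphaGo output.
candidates = [
    {"move": "A", "win_prob": 0.92, "margin": 1.5},   # safe, small win
    {"move": "B", "win_prob": 0.70, "margin": 20.0},  # risky, big win
]

# AlphaGo-style selection: maximize the probability of winning and
# ignore the margin of victory entirely.
best = max(candidates, key=lambda c: c["win_prob"])
```

A human player chasing the twenty-point margin would pick move B; a win-probability maximizer picks move A every time.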

And then, a few minutes later, the big move came. It was move 78, when Lee Sedol played a "wedge" in the middle of the board. "It was the move that made [the game] the most complicated," Andrew Jackson, who was doing a separate online commentary for the US Go Association, later told me. And this added degree of complication shifted the play away from the machine. Demis Hassabis later tweeted that AlphaGo made a significant mistake with move 79, directly after the big move from Lee Sedol. And about 10 moves later, he said, AlphaGo's calculations indicated that its chances of winning had dropped.

Resignation?

Then, near the three hour mark, Lee Sedol's play clock ran out. But AlphaGo's play continued to confound the commentators. "I get the impression that AlphaGo has gone off on a tangent," Redmond said. Again, this hardly meant AlphaGo was in trouble, but it prompted Garlock to ask if the machine would ever resign.

Eventually it would, and its approach to resignation is surprisingly, well, human. According to David Silver, another DeepMind researcher who led the creation of AlphaGo, the machine will resign not when it has zero chance of winning, but when its chance of winning dips below 20 percent. "We feel that this is more respectful to the way humans play the game," Silver told me earlier in the week. "It would be disrespectful to continue playing in a position which is clearly so close to loss that it's almost over."

At this point in the game, the machine had not yet decided its chances had fallen below that threshold. But to the commentators, its play was starting to deteriorate. "I get the feeling AlphaGo is running out of winning moves," Redmond said, adding that AlphaGo in the past often made poor moves when it felt it was ahead. But, he said, "not that bad."

Finally, Redmond and Garlock agreed that the match was looking quite good for Lee Sedol. But the Korean continued to struggle with time. He had twice failed to make a move within the allotted 60 seconds. He then made a move just milliseconds before the clock ran out again. Had it run out a third time, Lee Sedol would have forfeited the match.

Then, about three hours and forty minutes into the match, Lee Sedol stood up from the table and left the room, taking a permitted break from play. Under the rules, his play clock did not resume until he had returned. The play had reached a level of excitement not seen since Game One. "Lee Sedol has a chance this time," Redmond said.

The End Game

But the Korean still faced clock trouble. And it was telling that AlphaGo had not resigned. The machine still had 15 minutes left on its original play clock as the end game arrived, with the two opponents rapidly trying to rack up the points they had angled for over the last four-and-a-half hours.

Then AlphaGo played what Redmond called "another nonsense move." And soon, the machine resigned.

After the match, Redmond and others pointed to what they saw as the game's turning point: move 78. In the middle of the contest, Go aficionados agreed, Lee Sedol was behind. But he built up to that move in the center of the board and turned the tide. "Lee Sedol played a brilliant move. It took me by surprise. I'm sure that it would take most opponents by surprise. I think it took AlphaGo by surprise," Redmond said.

The moment AlphaGo resigned, an enormous cheer rose from the Korean commentary room and its throng of Korean reporters and photographers. And then came the applause in the English room. A day before, the atmosphere had been palpably solemn. But on Sunday, Lee Sedol did find a weakness in AlphaGo. And the mood changed. The human could win after all.

Update: This story has been updated to show that Lee Sedol would have forfeited the match if he had failed to make each of three moves in under 60 seconds and to clarify the roles of Demis Hassabis and David Silver.