Recap: IBM’s Watson Dominates at Jeopardy!

“I, for one, welcome our new computer overlords.”

So wrote Ken Jennings as part of his correct response to the Final Jeopardy! clue in tonight’s final game between Ken Jennings, Brad Rutter, and the newcomer, IBM’s Watson.

Though it didn’t do as well today as it did in Game 1, Watson had another impressive showing in Game 2, earning $41,413. Combined with its $35,734 from Game 1, that won Watson the two-day affair with total winnings of $77,147. At the end of the Double Jeopardy! round, Ken Jennings computed that he wouldn’t be able to beat Watson even if he wagered everything, so he wagered conservatively in Final Jeopardy! The category was 19th Century Novelists and the clue was:

“William Wilkinson’s ‘An Account of the Principalities of Wallachia and Moldavia’ inspired this author’s most famous novel.”

All three contestants got the right answer: Bram Stoker, who wrote Dracula (I got it right at home, playing along). In the end, Ken Jennings wound up with $24,000. Brad Rutter came in third with $21,600 in winnings. Of course, IBM’s Watson was playing for charity, and the cool $1,000,000 prize will be split between World Vision and World Community Grid.
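The arithmetic behind the totals, and behind Ken's decision to wager conservatively, can be sketched in a few lines of Python. This is only an illustration: the function names are my own, the "can the trailer catch up" check assumes the leader wagers nothing, and the scores passed to it below are hypothetical because the pre-Final totals aren't quoted in this post.

```python
def two_day_total(game1: int, game2: int) -> int:
    """Combined winnings across both games of the match."""
    return game1 + game2


def can_catch_leader(trailer: int, leader: int) -> bool:
    """True if the trailer could pass the leader by wagering everything
    and answering correctly (assumes the leader wagers nothing)."""
    return 2 * trailer > leader


# Watson's totals as reported in this post.
print(two_day_total(35_734, 41_413))

# Hypothetical scores, purely to show the check.
print(can_catch_leader(trailer=10_000, leader=25_000))
print(can_catch_leader(trailer=15_000, leader=25_000))
```

When the check comes back false, a big wager can't change who wins, which is why a conservative bet makes sense.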

Overall, I was very impressed with Watson’s showing. And yes, I was wrong in the prediction I made last week. The last three days on Jeopardy! have been a blast.

So what’s next for IBM? According to the New York Times:

For I.B.M., the future will happen very quickly, company executives said. On Thursday it plans to announce that it will collaborate with Columbia University and the University of Maryland to create a physician’s assistant service that will allow doctors to query a cybernetic assistant. The company also plans to work with Nuance Communications Inc. to add voice recognition to the physician’s assistant, possibly making the service available in as little as 18 months.

###
References:

1) Selected Nuances of Watson’s Strategies (How does Watson know what it knows?)

2) Watson’s Wagering Strategies (excellent blog post from one of IBM’s researchers)

3) All the questions and answers from Game 1 (part 1 and part 2) and from Game 2 in this Jeopardy! contest between Watson, Ken Jennings, and Brad Rutter.

 

Ken Jennings and Brad Rutter vs. IBM’s Watson on Jeopardy!

Jeopardy! is one of my all-time favorite shows on television. When I was in high school, I used to watch every show (it was a nightly ritual). These days I watch Jeopardy! less than I used to, but I’ll certainly be tuning in next week to see Ken Jennings (winner of 74 consecutive games on the show, with a total haul of over $3 million in prize money) and Brad Rutter (the biggest all-time money winner on the show) take on IBM’s artificial intelligence software, Watson.

Perhaps I am downplaying my enthusiasm. I’m really, really looking forward to this Jeopardy! contest, which will take place over three days on February 14, 15, and 16. Watson doesn’t just represent a machine force-fed a bunch of encyclopedias, dictionaries, and thesauri (although that’s certainly a major component of it); no, the machine is also facing the arduous task of deciphering natural (English) language and everything it entails: nuance, hyperbole, puns, dialects, slang, metaphor, and so much more. In other words, Jeopardy! is a perfect test-drive for Watson: the questions on the show aren’t meant for a computer to answer. And this is why Watson is such a huge deal:

Watson is a powerful machine. Its setup is called “Massively Parallel Probabilistic Evidence-Based Architecture,” and it runs on 2,800 Power7 processing cores. If all that computing and interpretive power produces an answer that clears Watson’s confidence threshold, it “rings in” with the answer.

Concerning Watson, Richard Powers made a superb op-ed contribution to the New York Times. In “What is Artificial Intelligence?” (link via @openculture) he elaborates on Watson’s challenge:

Open-domain question answering has long been one of the great holy grails of artificial intelligence. It is considerably harder to formalize than chess. It goes well beyond what search engines like Google do when they comb data for keywords. Google can give you 300,000 page matches for a search of the terms “greyhound,” “origin” and “African country,” which you can then comb through at your leisure to find what you need.

Asked in what African country the greyhound originated, Watson can tell you in a couple of seconds that the authoritative consensus favors Egypt. But to stand a chance of defeating Mr. Jennings and Mr. Rutter, Watson will have to be able to beat them to the buzzer at least half the time and answer with something like 90 percent accuracy.

So the task is two-fold: first arrive at the correct answer and then “buzz in” before Ken Jennings or Brad Rutter. From my observation, Ken Jennings is one of the best players to have mastered the art of the buzz-in (I suspect that on the many questions both he and his competitors knew, his ability to buzz in first accounted for at least 30% of his daily winnings on the show). However, I believe the IBM engineers designed Watson for maximum efficiency: as soon as the last syllable escapes Alex Trebek’s lips, Watson will ring in with the answer.

Actually, Watson already competed in a practice round against humans and beat them, badly. The most interesting tidbit comes from this Discover Magazine piece:

The questions were fed in plain text to Watson, but it had to wait the same amount of time to ring in as the human players did. To make the game fair, it also had to trigger a mechanical signaling button. Watson spoke in a stilted computerized voice–and was almost never wrong.

So if that is the case, Watson isn’t truly “listening” to the questions posed by Trebek; rather, it is reading the plain text. This matters: if Alex Trebek accentuates a certain part of a clue, or changes his intonation or even his accent, that helps the human contestants. The discrepancy between how the humans and Watson process the clues cannot be overlooked.

For those unfamiliar with Jeopardy!’s buzzer system, it is designed to lock out the buzzers until Alex Trebek has finished reading the question. The lock-out period is determined by a human producer, who sits off-camera and has a button of his or her own that enables the buzzers. If a contestant buzzes in before the producer pushes that button, the contestant’s buzzer is automatically locked out for three seconds, and any attempt to buzz in before that “penalty” period expires locks the contestant’s buzzer for another three seconds. So this raises an important question: how does Watson know when it can buzz in (i.e., how does it receive notification that the lock-out period has passed)? Will the human players have the advantage here, or will Watson?
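To make the lock-out rules concrete, here is a rough Python sketch of how I understand them. It is a toy model, not the show's actual hardware: the function names are made up, times are seconds on an arbitrary clock, and I'm assuming each new three-second penalty is counted from the moment of the offending press.

```python
from typing import Dict, Iterable, List, Optional

PENALTY_SECONDS = 3.0  # lock-out length for buzzing in too early


def first_valid_buzz(attempts: Iterable[float], enabled_at: float,
                     penalty: float = PENALTY_SECONDS) -> Optional[float]:
    """Return the time of the first press that registers, or None."""
    locked_until = float("-inf")
    for t in sorted(attempts):
        if t < locked_until:
            # Pressed during a penalty: locked for another three seconds
            # (assumed to be measured from this press).
            locked_until = t + penalty
        elif t < enabled_at:
            # Pressed before the producer enabled the buzzers.
            locked_until = t + penalty
        else:
            return t
    return None


def who_rings_in(attempts_by_player: Dict[str, List[float]],
                 enabled_at: float) -> Optional[str]:
    """Name of the player whose press registers first, if anyone's does."""
    valid = {}
    for name, attempts in attempts_by_player.items():
        t = first_valid_buzz(attempts, enabled_at)
        if t is not None:
            valid[name] = t
    return min(valid, key=valid.get) if valid else None


# Hypothetical presses: "eager" jumps the gun and keeps mashing the button.
presses = {"eager": [4.9, 5.1], "steady": [5.2]}
print(who_rings_in(presses, enabled_at=5.0))
```

In this model a player who jumps the gun by a tenth of a second loses to one who waits, which is why knowing exactly when the buzzers go live matters so much.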

Another important question to ponder is this one raised by Richard Powers:

Answers, for Watson, are a statistical thing, a matter of frequency and likelihood. If, after a couple of seconds, the countless possibilities produced by the 100-some algorithms converge on a solution whose chances pass Watson’s threshold of confidence, it buzzes in.

This raises the question of whether Watson is really answering questions at all or is just noticing statistical correlations in vast amounts of data.

In a sense, Watson is like a machine trained in quantum mechanics: it can never be certain about any of the answers, but if it can break the confidence threshold (whatever that may be) for the answer (err, question), it will surely try to buzz in with the response.
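The "buzz only if confident enough" idea is easy to sketch in Python. To be clear, this is my own toy version and not IBM's code: the function name, the candidate scores, and the 0.5 threshold are all made up for illustration, since the real threshold isn't public as far as I know.

```python
from typing import Dict, Optional


def decide_to_buzz(candidates: Dict[str, float],
                   threshold: float) -> Optional[str]:
    """Return the top-scoring candidate if its confidence clears the
    threshold; otherwise return None (stay silent)."""
    if not candidates:
        return None
    best = max(candidates, key=candidates.get)
    return best if candidates[best] >= threshold else None


# Hypothetical confidence scores for the greyhound question.
confident = {"Egypt": 0.92, "Libya": 0.05, "Sudan": 0.03}
unsure = {"Egypt": 0.40, "Libya": 0.35, "Sudan": 0.25}

print(decide_to_buzz(confident, threshold=0.5))
print(decide_to_buzz(unsure, threshold=0.5))
```

With the first set of scores the sketch rings in with Egypt; with the second it keeps quiet, even though Egypt is still its best guess.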

I like Powers’s conclusion:

It does not matter who will win this $1 million Valentine’s Day contest. We all know who will be champion, eventually. The real showdown is between us and our own future. Information is growing many times faster than anyone’s ability to manage it, and Watson may prove crucial in helping to turn all that noise into knowledge.

I finish this post with a prediction: I believe Ken Jennings will be the champion of this Jeopardy! contest. I have high hopes for Watson, and I believe it will do quite well, but I am sticking with the all-time champion. Ken is just too good with the buzzer and an absolute whiz for me to bet against him.

What about you? Who do you think will win?