Watson wins Jeopardy! - trick or triumph
Written by Mike James   
Friday, 18 February 2011


Lexical Answer Type

Notice that this sort of matching would be done on multiple fragments, and syntax would be used to guide which results rank highest. It is all a question of working out what kind of entry the information is about - the LAT or Lexical Answer Type. In this case the LAT is "This archaic term", i.e. we are looking for a word. Knowing the LAT allows Watson to pick out the item in the matched record that is the answer. In this example the entry has to be a word - Rapscallion. Watson can then perform a simple transformation to get the question form of the answer: "What is Rapscallion?".
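As a minimal sketch of that last step, and not a description of Watson's actual code, the following assumes the matched record is a simple dictionary and that the LAT has already been reduced to the name of a field. The record contents, the field names and both helper functions are invented for illustration.

```python
# Hypothetical matched record; the field names and definition text are assumptions.
record = {"word": "Rapscallion", "definition": "a mischievous person"}


def pick_answer(matched_record, lat_field):
    """Return the item in the matched record that has the type named by the LAT."""
    return matched_record[lat_field]


def to_question(answer, is_person=False):
    """Apply the simple transformation into Jeopardy!'s question form."""
    pronoun = "Who" if is_person else "What"
    return f"{pronoun} is {answer}?"


# The LAT "This archaic term" tells us we want the word, not its definition.
question = to_question(pick_answer(record, "word"))
print(question)
```

The point is only that once the LAT is known, selecting the answer and phrasing it as a question are the easy parts; the hard work is in finding the right record and the right LAT.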

Of course this misses out lots of the detail in what Watson is doing but it gives you a flavour of the overall approach. The categories that the questions fall into can be used to narrow down the LAT. The question also needs to be treated in different ways depending on its overall type. For example,

Category: Diplomatic Relations

Clue: Of the four countries in the world that the United States does not have diplomatic relations with, the one that's farthest north.

In this case the Category isn't of any real use in working out the LAT - the answer isn't a diplomatic relation. The question itself reveals that the LAT is a country, one of a set of four, but this is a tough thing to work out. The answer algorithm also has to be customised to find, among the items that satisfy the first part of the clue, "the four countries in the world that the United States does not have diplomatic relations with", the one that also satisfies the second, "farthest north". Even so you can see that with enough time and tweaking you can eventually construct an algorithm that works most of the time, given a big enough sample of questions.
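The two-part structure of this clue can be sketched as a filter followed by a selection. The tiny knowledge base below is an assumption made for illustration: the latitudes are approximate values for each capital, the relations flag reflects the situation around the time of the contest, and none of it is taken from Watson's own data.

```python
# Illustrative data only: approximate capital latitudes in degrees north.
COUNTRIES = [
    {"name": "Canada", "latitude": 45.4, "has_relations": True},
    {"name": "Mexico", "latitude": 19.4, "has_relations": True},
    {"name": "Bhutan", "latitude": 27.5, "has_relations": False},
    {"name": "Cuba", "latitude": 23.1, "has_relations": False},
    {"name": "Iran", "latitude": 35.7, "has_relations": False},
    {"name": "North Korea", "latitude": 39.0, "has_relations": False},
]


def farthest_north_without_relations(countries):
    # Step 1: keep only the items that satisfy the first part of the clue.
    candidates = [c for c in countries if not c["has_relations"]]
    # Step 2: among those, pick the one that satisfies "farthest north".
    return max(candidates, key=lambda c: c["latitude"])["name"]


answer = farthest_north_without_relations(COUNTRIES)
print(f"What is {answer}?")
```

Canada is farther north than any of the four, which is why the order of the two steps matters: the superlative has to be applied after the set has been narrowed down, not before.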



Failure mode

Overall Watson is a huge complex conglomerate of algorithms.

Don't misunderstand, this isn't a criticism - to be able to create such a system is a remarkable feat but it does tend to produce AI systems that are "brittle" and likely to fail in ways that make a human gasp.

For example, the Final Jeopardy! question in the second round of the contest, in the category "U.S. Cities", was:

"Its largest airport is named for a World War II hero; its second largest, for a World War II battle"

Watson answered "What is Toronto?"

The audience gasped as the answer was stupidly wrong. How could this happen?

Without more data we can't be sure, but Watson cannot have used the category "U.S. Cities" to constrain the solution item to be a U.S. city - statistically this would have been the correct thing to do - and the question itself didn't contain fragments that pinned it down to a U.S. city either. So the wrong candidate item was picked. Watson doesn't understand anything.
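To see how much depends on the weight given to the category, here is a toy sketch. The evidence scores are invented, the additive scoring rule is an assumption chosen for simplicity, and nothing here claims to be how Watson actually combines its evidence.

```python
# Invented scores: how well each candidate matches the clue text alone.
CANDIDATES = [
    {"name": "Toronto", "evidence": 0.6, "is_us_city": False},
    {"name": "Chicago", "evidence": 0.5, "is_us_city": True},
]


def best_candidate(candidates, category_weight):
    """Pick the top candidate, adding a bonus for agreeing with the category."""
    def score(candidate):
        bonus = category_weight if candidate["is_us_city"] else 0.0
        return candidate["evidence"] + bonus

    return max(candidates, key=score)["name"]


# With the category ignored, the clue text alone decides.
ignored = best_candidate(CANDIDATES, category_weight=0.0)
# With the category weighted, a candidate outside it has to work much harder.
weighted = best_candidate(CANDIDATES, category_weight=0.5)
print(ignored, weighted)
```

A system that treats the category as weak evidence can be led astray by a small difference in the other scores, which is exactly the kind of brittleness the Toronto answer exposed.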

A massively parallel machine

Even so, by the end of the third round Watson had achieved a convincing win and there must be a great future for him and his technology. To make this approach work fast enough the algorithms have to be massively parallel. Fortunately, because so many of the subtasks don't interact, this is fairly easy. However, getting the time to answer down from the 2 hours of the first version took a lot of hardware - a Linux-based cluster using up to 90 IBM Power 750 servers with 16TB of RAM, occupying 10 racks and using 80kW of power.

Watson may have beaten the humans but it took a lot of computing power - a total of 2880 POWER7 cores.
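The reason independent subtasks are easy to run in parallel can be shown with a small sketch. The scoring function, the clue words and the evidence snippets are all invented for illustration, and a thread pool on one machine only shows the structure of the idea rather than anything like Watson's cluster.

```python
from concurrent.futures import ThreadPoolExecutor

CLUE = "largest airport named for a world war ii hero second largest for a battle"

# Hypothetical evidence snippets, one per candidate answer.
EVIDENCE = {
    "Chicago": "o'hare airport is named for a world war ii hero and midway for a battle",
    "Toronto": "pearson airport is named for a prime minister",
}


def score(candidate):
    """Score one candidate by counting clue words found in its evidence.

    Each call reads only shared, unchanging data, so calls cannot interfere.
    """
    clue_words = set(CLUE.split())
    evidence_words = set(EVIDENCE[candidate].split())
    return candidate, len(clue_words & evidence_words)


def rank_parallel(candidates, workers=4):
    # Because the scores don't depend on each other, they can simply be
    # farmed out to a pool of workers and collected afterwards.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scored = list(pool.map(score, candidates))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_parallel(list(EVIDENCE))
print(ranking[0][0])
```

Nothing needs to be locked or coordinated until the final sort, so adding more workers, or in Watson's case more servers, is mostly a matter of dividing up the candidates.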




Other applications?

IBM has plans to use the DeepQA algorithms in commercial products but there are a number of issues that haven't been answered. The approach used by Watson is brittle and while it might not matter that it made a laughing stock of itself by seeming to "think" that Toronto was a US city, similar mistakes in other areas of application would be no laughing matter.

There is also the small issue that Jeopardy! has a format that suits the approach that Watson takes, or rather suits statistical AI. Being provided with the answer and having to find the question isn't the same as being asked a question and finding the answer. In Jeopardy! the clue is long and has lots of information that allows the algorithms to pin down an entry in a knowledge base corresponding to candidate items, which can then be used to form a simple question with few words - like "What is Toronto?".

It often doesn't even matter that the form of the question isn't quite right for the answer given as the clue. Watson isn't a question answering machine - it is an answer questioning machine and it could well be that this is the easier of the two directions. In short, unless IBM can find applications that mimic Jeopardy!, Watson might well be a very expensive dead end.

Only time will tell if Watson is some sort of breakthrough to a new commercialization of AI, but for now we need to give it the credit for doing a very difficult task well and for revitalising AI in the mind of the public.

Well done Watson...

Watson wins on Jeopardy

Building Watson: An Overview of the DeepQA Project

Last Updated ( Monday, 19 March 2012 )
