Turing's Test, the Loebner Prize and Chatterbots

Written by Mike James

Thursday, 12 September 2013

The Loebner prize

For Turing, his test of intelligence was intended as a talking point rather than anything real - at least at the stage of development computers were at then.

After all, what is the point of seeing if a program actually passes the Turing test?

Apart from the fact that we are still nowhere near having a program that could make a serious attempt at the Turing test, most programs are produced to do something more specific than simulate a general intelligence. Left to itself, the field might one day pass the Turing test almost by accident, as it were.

But in 1990 the test was elevated into a goal in its own right by the offering of a $100,000 prize by Dr Hugh Loebner for the first program to pass the open-ended Turing test.

What's the "open-ended Turing test"?

Well, essentially the real goal is a program that can carry on a reasonable conversation via a terminal, and the "open-ended" part of the specification lets the judge direct the conversation to any topic they like. This is clearly a difficult test because it allows the judge to move the conversation to areas as diverse as arithmetic, poetry, world politics, philosophy and the inner emotions of humanity.

For a machine to cope well with this full range was very definitely out of reach when the prize was initiated, and would remain so for the foreseeable future. Had we insisted on sticking to the "open-ended" part of the test, the Loebner prize would have been a theoretical one with no challengers.

To make it possible to hold the contest at the time, the hurdle had to be lowered, and the obvious way to do this was to make the Turing test a "restricted-domain" test in which the judges were forced to keep to a small number of topics - romantic relationships, Shakespeare's plays, Burgundy wines and whimsical conversation.

The program now only had to convince the judge that it was a human when talking about a small range of knowledge, a more feasible task given the available computing power and AI techniques. Still not easy, mind you, but a low enough hurdle to tempt programmers to have a go.

The catch was that the full $100,000, plus the specially minted solid 18-carat gold medal, was to be awarded to the first program to pass the open-ended Turing test, with only a lesser award of $1500 and a bronze medal going to the best restricted-domain program each year.

The Loebner prize medal

The first contest

The first Loebner prize contest was held on November 8th 1991 at The Computer Museum in Boston.

There were ten finalists selected from over 100 entries worldwide. The judges slaved all day typing their side of the dialog into the programs, and at the end of the day some of the judges were fooled by the machines. More to the point, some of the judges were fooled by the humans into thinking that they were machines!

It has to be made clear here that it is an assumption of the Turing test that the human reference can do what it likes to influence the outcome of the test, but in this case none of the human challengers were doing anything other than responding normally, i.e. as humans!

So it seems that one outcome of the first Loebner Turing test was to demonstrate that not even real humans are good at passing even the restricted Turing test. Perhaps detecting a human by what they type at you is intrinsically more difficult than you might imagine. If this is the case then it makes it easier for a machine to "pass" the test just because of the errors in measurement.

Another important point was that the judges were not computer experts - and this coupled with another factor made the test far less satisfactory than it could have been.

The second factor was that the contestants were allowed to pick their restricted domain of discourse (i.e. what they were going to talk about!) and as long as it was within the expertise of "ordinary people" then it was deemed acceptable. This, in combination with non-expert judges, made one type of program and its restricted domain a dead certainty to win.

Chatterbots

For years there had been programs that could hold a convincing type of conversation that can best be described as "defensive". They worked by picking up key words in the input sentence and by "turning" the sentence around.

For example, if the input contained any of the words "no", "not" and so on, the program responded with "Why are you being negative?"; if the input contained "never", it responded with "Really never?", and so on.
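The keyword trick just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the code of any historical program: the two keyword rules come from the text, while the table name `KEYWORD_REPLIES`, the function `keyword_reply` and the fallback reply are assumptions made for the example.

```python
import re

# Keyword rules taken from the text; the table layout is an assumption.
KEYWORD_REPLIES = [
    ("never", "Really never?"),
    ("no", "Why are you being negative?"),
    ("not", "Why are you being negative?"),
]

# Hypothetical reply used when no keyword is present.
FALLBACK = "Please go on."


def keyword_reply(sentence: str) -> str:
    """Return the canned reply for the first rule whose keyword appears."""
    # Split into whole words so that "know" does not trigger the "no" rule.
    words = re.findall(r"[a-z']+", sentence.lower())
    for keyword, reply in KEYWORD_REPLIES:
        if keyword in words:
            return reply
    return FALLBACK
```

Note that the program never looks at what the sentence means; the reply depends only on which trigger word happens to be present, and the order of the rules decides which one wins when several match.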

Eliza, developed by Joseph Weizenbaum between 1964 and 1966, is the first example of what are now termed "chatterbots": programs designed to simulate an intelligent conversation with one or more human users via auditory or textual methods.

The Eliza program was based on a human mode of interaction typified by a Rogerian therapist, who is trained not to make any creative input to a conversation, only to keep it going so that patients can explore their feelings. Talking to a Rogerian therapist is very like talking to a brick wall with a slightly clever echo!

The important point is that this sort of program uses very simple template matching and transformation tricks to take the input sentence and generate an output sentence.

For example, it might have a template "I don't understand X" where X is the remainder of the sentence. The template transformation might be:

"I don't understand X" -> "Why don't you understand X?"

So if the user types in

"I don't understand life"

the machine responds with

"Why don't you understand life?"

Of course this simple template transformation soon shows its limitations. For example:

"I don't understand you"

generates

"Why don't you understand you?"

But the clever programmer can patch this up by adding the rule that

"you" -> "me"

Slowly the chatbot accumulates templates and transformations until it is a huge mess of specific rules. Then the day comes when the rules start to contradict and override each other and the true nature of the approach is finally revealed.
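The template-plus-patch approach described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Weizenbaum's implementation: the "I don't understand X" template and the "you" -> "me" rule come from the text, while the other pronoun swaps, the names `TEMPLATES`, `REFLECTIONS`, `reflect` and `respond`, and the fallback reply are hypothetical.

```python
import re

# Pronoun swaps applied to the captured fragment X.
# "you" -> "me" is from the text; the other entries are assumed additions.
REFLECTIONS = {"you": "me", "me": "you", "my": "your", "your": "my", "i": "you"}

# (pattern, reply template) pairs; only this one template is from the text.
TEMPLATES = [
    (re.compile(r"i don't understand (.+)", re.IGNORECASE),
     "Why don't you understand {}?"),
]


def reflect(fragment: str) -> str:
    """Swap pronouns word by word, leaving all other words untouched."""
    return " ".join(REFLECTIONS.get(word.lower(), word) for word in fragment.split())


def respond(sentence: str) -> str:
    """Apply the first matching template to the input sentence."""
    text = sentence.strip().rstrip(".!?")
    for pattern, template in TEMPLATES:
        match = pattern.fullmatch(text)
        if match:
            return template.format(reflect(match.group(1)))
    return "Please go on."  # hypothetical fallback when nothing matches
```

With these rules, `respond("I don't understand you")` gives "Why don't you understand me?". Every new awkward case needs another entry in one of the two tables, which is exactly how the rule set grows into the mess described above.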

You may have noticed that this is just Searle's Chinese room approach to AI. In this case the templates are a compressed form of the lookup table that the book represents.

Eliza and Parry

To give you some idea of how effective this method is, here is a conversation between two programs - Weizenbaum's Eliza, a therapist, and Parry, a program dating from 1972 written by psychiatrist Kenneth Colby, which uses similar methods to pretend to be paranoid:

1 Parry - I don't understand your motives

2 Eliza - Why don't you understand my motives?

3 Parry - Shouldn't I be?

4 Eliza - Why do you say that just now?

5 Parry - That's the way it is.

6 Eliza - Suppose that's just the way it were not.

7 Parry - So?

8 Eliza - I am not sure I understand you fully.

9 Parry - You harp on the same questions too much.

10 Eliza - What does that similarity suggest to you?

11 Parry - I don't get you.

12 Eliza - Do you wish you did get I?

13 Parry - It's about time you showed some interest in my feelings.

14 Eliza - Oh? I showed some interest in your feelings?

and so on.

Remember there are two programs talking to each other here and not a human in sight.

At first it seems like a convincing interaction, but the responses are partly random and partly triggered by key words. For example, the opening sentence is one of a random selection from a paranoid's phrase book. The response is just Eliza using the simple rule given as an example in the previous section to turn the sentence around.

The seemingly intelligent response at 10, where Eliza appears to understand the deeper significance of 9, is actually triggered simply by the presence of the word "same" in 9. The analysis of the conversation would carry on to reveal that each response was triggered by a trivial feature of the previous input. And yet we are fooled because we read intelligent interaction into the conversation. Eliza and Parry borrow our skill of interpreting vague and ambiguous language to make themselves seem intelligent.

This is fun, and people have claimed in the past that programs such as Eliza have passed the full open-ended Turing test, because the judges who talked to them were not given any instructions to limit their conversation and yet still believed that they were talking to a human.

Even after being told that they were talking to a program, many of the participants would not believe that it was so. Weizenbaum, the inventor of Eliza, wrote:

"Eliza created the most remarkable illusion of having understood in the minds of many people who conversed with it... They would often demand to be permitted to converse with the system in private, and would, after conversing with it for a time, insist, in spite of my explanations, that the machine really understood them."

Last Updated ( Friday, 13 September 2013 )