Copyright © 2001 The International Herald Tribune | www.iht.com
'Chatterbots' Fail to Fool the Judges
Conrad de Aenlle International Herald Tribune
Monday, October 15, 2001
Contest Shows Computerspeak Is Still Far From Human Conversation
LONDON: Distinguishing between a computer and a human being is
easy: Present a test subject with a nearly unsolvable problem. If it
performs millions of operations per second and comes up with the right
answer, it's a computer. If it gives up in frustration, shouts
obscenities, then grabs a beer from the fridge, it's a human being.
The nearly unsolvable problem for participants in the annual Loebner
Prize competition, staged Saturday at the Science Museum in London, is
to design an artificial-intelligence software program that mimics human
thought and communication so closely that half of a panel of judges
sitting at computer terminals, chatting with various programs, mistakes
one for the real thing. Hidden human operators hold up the other end of
some of the conversations.
The prize - a gold medal and $100,000 - is not likely to be awarded
soon. One of the stipulations made by Hugh Loebner, an American
business owner, when he created the contest 11 years ago was that a
program must be able to respond verbally and visually; it must sound
and look like a person as well as appearing to think like one.
The prize is based on the Turing Test, devised in the 1950s by Alan
Turing, a British mathematician and World War II code-breaker. An
optimist, Mr. Turing thought the test would have been passed by 2000.
There is a second Loebner prize, a silver medal and $25,000, for a
program that can fool half the judges by conversing in text. No one has
won that, either, but each year $2,000 is awarded to the program deemed
most human by the judges, who this year included a cybernetics
professor, a Science Museum director and a layman with no particular
technological expertise. Mr. Loebner also served as a judge.
The design of the type of program being showcased, called a
"chatterbot," or a robot that chats with people, is a
burgeoning field. It is still mainly experimental, but commercial
applications are anticipated once the technology improves. Chatterbots
could be used to build more sophisticated Web search engines, for
instance, or to serve as virtual secretaries or sales people.
The effort to perfect chatterbots reflects a fact confronting
practitioners of artificial intelligence: Computers need to think and
talk like people because people do not think and talk like computers.
We do not always mean what we say, we express the same ideas in
different ways, and we use the same words to express different ideas.
And sometimes we say stuff that doesn't mean much of anything.
"Humans and robots communicate very differently," said
Richard Wallace, chairman and co-founder of the Artificial Intelligence
Foundation in San Francisco, whose project is known as ALICE. "We
engage in idle chat with no real purpose. Computers give precise,
truthful answers."
ALICE, which can be found at www.alicebot.org, won the $2,000
prize this year and last year. It is something of a cybercelebrity,
serving on the Web site of the Steven Spielberg film "AI:
Artificial Intelligence," a pretty good gig for a chatterbot.
ALICE, short for Artificial Linguistic Internet Computer Entity, is
free, or "open source" technology, as opposed to proprietary
software.
Capturing human mannerisms - developing "the personality of the
bot," as one designer called it - is the tricky part of building a
chatterbot. Computers have lots of data stored in them, but they seem
geeky and mechanical when they communicate it, like Spock on "Star
Trek." Instead of saying it's about 12:30, a computer might say
it's 12:29 and 18 seconds - more accurate, maybe, but not very human.
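The time example hints at one trick available to designers: take the machine-precise value and deliberately blur it before answering. Here is a minimal sketch of that idea in Python; the function name, the rounding rule and the "about" hedging are illustrative assumptions, not a description of any program entered in the contest.

```python
from datetime import datetime

def humanize_time(t: datetime) -> str:
    """Round a precise clock reading to the loose phrasing a person would use."""
    hour = t.hour % 12 or 12
    # A person rounds to the nearest five minutes and hedges with "about".
    rounded = round(t.minute / 5) * 5
    if rounded == 60:              # rounding spilled into the next hour
        hour = hour % 12 + 1
        rounded = 0
    if rounded == 0:
        return f"about {hour} o'clock"
    return f"about {hour}:{rounded:02d}"

# The geeky "12:29 and 18 seconds" reading comes out as a human would say it.
print(humanize_time(datetime(2001, 10, 13, 12, 29, 18)))  # about 12:30
```

The same principle, trading accuracy for plausibility, applies to any quantity a bot reports, from dates to distances.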
But success has been limited. Chatterbots are still more bot than
chatter, which is why the text-based silver prize remains unwon. On
Saturday they were far too clumsy to fool the judges.
Some responses resembled infantile chatroom banter.
Judge: "Do you know what a non sequitur is?"
Program: "Yes, but I'm not going to tell you; trade
secret."
There was a lot of this puerile coyness, no doubt intended as a
cleverer way to paper over programming gaps.
Judge: "How are you feeling today?"
Program: "I know you know, but I don't know."
Other responses could have been spoken by HAL in a low-budget sequel
to "2001: A Space Odyssey."
Program: "Hello, Judge 04, how are you today?"
And there was the bizarre and vaguely psychopathological.
Judge: "I'm a scientist and adhere to the principle of parsimony. How
about you?"
Program: "I am god."
One stab was taken at the very human trait of humor. When told:
"You are making mistakes. Are you feeling OK?" the reply was,
"I need oil." It turned out that that response came from a
human operator.
"Some are giving very straight answers, some are giving very
convoluted answers," said Lisa Jamieson, a student who served as
the judge representing the nontechnical public. She wondered whether
the human confederates could have been trying to fake her out by
mimicking computers.
"They could be people who are too clever by half, or maybe it's
the computers that are too clever by half," she said. "I
don't know if they don't understand my questions or if they're trying
to double-bluff me."
In the end, no one was clever enough to win either of the big
prizes, and Mr. Loebner expressed disappointment with the overall
quality of the programs on display. After the judging, he offered an
uncomplimentary one-word description, then amended it, saying: "I
see a lot of room for improvement."
Miss Jamieson said the ALICE program was clearly best, maintaining
its conversation by giving complete, sensible answers and asking
questions of its own. Mr. Wallace stood up for ALICE and its rival
chatterbots, saying they had acted more human than the judges had given
them credit for.
"The contest is flawed," he said. "The judges know
they're trying to 'out' the robot. They ask aggressive questions. It's
different online. People don't know they're talking to a robot."
Maybe they are better off not knowing. Mr. Wallace said he had
received a letter from a woman who had threatened to sue after
discovering that his bot, of which she had grown fond, was not what it
seemed.
"She said she'd fallen in love with it," he said.
"Our robot broke her heart."