Can an Alligator Run the Hundred Meter Hurdles?

Gary Marcus, writing in The New Yorker, offers a summary of why artificial intelligence isn’t so intelligent (and has a long way to go to catch up with the human brain). He focuses on the research of Hector Levesque, who is a critic of modern A.I.:

In a terrific paper just presented at the premier international conference on artificial intelligence, Levesque, a University of Toronto computer scientist who studies these questions, has taken just about everyone in the field of A.I. to task. He argues that his colleagues have forgotten about the “intelligence” part of artificial intelligence.

Levesque starts with a critique of Alan Turing’s famous “Turing test,” in which a human, through a question-and-answer session, tries to distinguish machines from people. You’d think that if a machine could pass the test, we could safely conclude that the machine was intelligent. But Levesque argues that the Turing test is almost meaningless, because it is far too easy to game. Every year, a number of machines compete in the challenge for real, seeking something called the Loebner Prize. But the winners aren’t genuinely intelligent; instead, they tend to be more like parlor tricks, and they’re almost inherently deceitful. If a person asks a machine “How tall are you?” and the machine wants to win the Turing test, it has no choice but to confabulate. It has turned out, in fact, that the winners tend to use bluster and misdirection far more than anything approximating true intelligence. One program worked by pretending to be paranoid; others have done well by tossing off one-liners that distract interlocutors. The fakery involved in most efforts at beating the Turing test is emblematic: the real mission of A.I. ought to be building intelligence, not building software that is specifically tuned toward fixing some sort of arbitrary test.

The crux, it seems to me, is how machines interpret the subtleties of human communication. Marcus offers the following example, in which substituting a single word yields a different answer:

The large ball crashed right through the table because it was made of Styrofoam. What was made of Styrofoam? (The alternative formulation replaces Styrofoam with steel.)

a) The large ball
b) The table

Continuing, he explains:

These examples, which hinge on the linguistic phenomenon known as anaphora, are hard both because they require common sense—which still eludes machines—and because they get at things people don’t bother to mention on Web pages, and that don’t end up in giant data sets.

More broadly, they are instances of what I like to call the Long-Tail Problem: common questions can often be answered simply by trawling the Web, but rare questions can still stymie all the resources of a whole Web full of Big Data. Most A.I. programs are in trouble if what they’re looking for is not spelled out explicitly on a Web page. This is part of the reason for Watson’s most famous gaffe—mistaking Toronto for a city in the United States.
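To make the Styrofoam/steel example concrete, here is a minimal sketch of the kind of surface-statistics approach Marcus is criticizing. The tiny FAKE_WEB corpus, the cooccurrence_votes heuristic, and the resolve function are all invented for illustration; neither Marcus nor Levesque proposes this code, and real systems are far more elaborate. The point is only that counting co-occurrences can answer the question when the relevant fact happens to be written down somewhere, and has nothing to count when it lives in the long tail.

# A toy "resolver" for the ball/table example: pretend the Web is a handful
# of snippets and pick whichever candidate noun co-occurs more often with
# the disambiguating word. (Illustrative only, not anyone's real system.)

from typing import List

# Hypothetical stand-in for a web-scale corpus.
FAKE_WEB: List[str] = [
    "the steel ball was heavy",
    "a steel ball bearing",
    "a folding table made of steel",
    # Nobody bothers to write down that a Styrofoam table would shatter,
    # so the rare variant of the question finds no evidence at all.
]

def cooccurrence_votes(candidate: str, cue: str, corpus: List[str]) -> int:
    """Count snippets that mention both the candidate referent and the cue word."""
    return sum(1 for snippet in corpus if candidate in snippet and cue in snippet)

def resolve(cue: str, candidates: List[str], corpus: List[str]) -> str:
    """Return the candidate with the most co-occurrence evidence, or give up."""
    scores = {c: cooccurrence_votes(c, cue, corpus) for c in candidates}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "no evidence found"

candidates = ["ball", "table"]
print("made of steel ->", resolve("steel", candidates, FAKE_WEB))          # ball
print("made of Styrofoam ->", resolve("Styrofoam", candidates, FAKE_WEB))  # no evidence found

The steel variant happens to come out right only because "steel ball" is a common phrase on the Web; the Styrofoam variant, which people answer effortlessly, leaves the heuristic with nothing to count. That is Marcus's Long-Tail Problem in miniature.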

Levesque’s paper is short and easily accessible to the layman.
