Friday, October 28, 2005
By Byron Spice, Pittsburgh Post-Gazette
Stan Jou's lips were moving, but no sound was coming out.
Mr. Jou, a graduate student in language technologies at Carnegie
Mellon University, was simply mouthing words in his native Mandarin
Chinese. But 11 electrodes attached to his face and neck detected his
muscle movements, enabling a computer program to figure out what he
was trying to say and then translate his Mandarin into English.
The result boomed out of a loudspeaker a few seconds later:
"Let me introduce our new prototype," a synthesized voice
announced. "You can speak in Mandarin and it translates into English
or Spanish."
"This is a bit of science fiction," said Alex Waibel, director of the
International Center for Advanced Communications Technologies, "but
it is a vision that we think is very exciting." And where it once
seemed a distant dream, it now is being actively developed thanks to
recent advances in machine translation.
This particular gadget, when fully developed, might allow anyone to
speak in any number of languages or, as Dr. Waibel put it, "to switch
your mouth to a foreign language."
It was one of several translation devices his research group
demonstrated publicly for the first time yesterday in a
videoconference with reporters in Pittsburgh and at the University of
Karlsruhe in Germany.
"We want to make language translation transparent," explained Dr.
Waibel, a computer scientist who holds joint appointments at Carnegie
Mellon and Karlsruhe.
The true centerpiece of the demonstration was the videoconference
itself. As Dr. Waibel spoke, computer software translated his speech
into Spanish and German.
Previous computer systems have translated the spoken word in limited
contexts, or "domains," such as travel or medical information. But
yesterday's demonstration was of so-called "open domain"
speech-to-speech translation, a technically difficult feat to pull off
because the spoken word is often ungrammatical and filled with
colloquialisms.
"This is definitely a new frontier," said Kevin Knight, director of
the University of Southern California's Information Sciences
Institute. "If you look in the scientific literature, you couldn't
find too much today on open domain speech translation."
What has made this possible has been a dramatic change in how
computer translation programs are written. In the past, most
translation software has been based on sets of rules -- dictionary
definitions, grammatical rules and such. In other words, programmers
tried to make a computer think like a human.
But increasingly, the trend in artificial intelligence is to allow
the computers to think like computers, using statistical methods to
draw meaning out of masses of information, said Randall E. Bryant,
dean of Carnegie Mellon's School of Computer Science.
Speech recognition programs began using these statistical methods 15
years ago, Dr. Knight said. Only recently have they been applied to
speech translation "and that's why things have been improving a lot
The availability on the Internet of large amounts of translated text
has been a major boon, said Dr. Waibel.
The results aren't perfect. When Dr. Waibel announced he would take
questions from reporters in Germany and America, the computer heard
it as "so we glycogen it alternating questions between Germany and
America." And the systems don't really understand what they are
translating, so may have trouble sometimes when a speaker tries to be
humorous or ironic.
But he predicted open domain systems could be ready for use within
"As we make contact, people will be more likely to learn other
languages," Dr. Waibel said. U.S. soldiers in Iraq, for instance, who
have handheld devices that repeat foreign phrases, ultimately have
learned to speak those phrases themselves and discard the machines.
