-
How is it that so many
intergalactic species in movies and TV
-
just happen to speak perfect English?
-
The short answer is that no one
wants to watch a starship crew
-
spend years compiling an alien dictionary.
-
But to keep things consistent,
-
the creators of Star Trek
and other science-fiction worlds
-
have introduced the concept
of a universal translator,
-
a portable device that can instantly
translate between any languages.
-
So is a universal translator
possible in real life?
-
We already have many programs
that claim to do just that,
-
taking a word, sentence,
or entire book in one language
-
and translating it into almost any other,
-
whether it's modern English
or Ancient Sanskrit.
-
And if translation were just a matter
of looking up words in a dictionary,
-
these programs would run circles
around humans.
-
The reality, however,
is a bit more complicated.
-
A rule-based translation program
uses a lexical database,
-
which includes all the words
you'd find in a dictionary
-
and all grammatical forms they can take,
-
and a set of rules to recognize the basic
linguistic elements in the input language.
-
For a seemingly simple sentence like,
"The children eat the muffins,"
-
the program first parses its syntax,
or grammatical structure,
-
by identifying "the children"
as the subject,
-
and the rest of the sentence
as the predicate
-
consisting of a verb, "eat,"
-
and a direct object, "the muffins."
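The parsing step just described can be sketched in a few lines of Python. This is only a toy illustration, not how any real translation program is built: the `LEXICON` table and the `parse_svo` function are hypothetical names invented here, and the sketch assumes a sentence with exactly one verb in subject-verb-object order.

```python
# Toy lexicon mapping each word to a part of speech.
# Hypothetical and tiny; a real lexical database would be far larger.
LEXICON = {
    "the": "DET",
    "children": "NOUN",
    "muffins": "NOUN",
    "eat": "VERB",
}

def parse_svo(sentence):
    """Split a simple subject-verb-object sentence into its parts."""
    words = sentence.lower().rstrip(".").split()
    tags = [LEXICON[w] for w in words]  # raises KeyError on unknown words
    verb_index = tags.index("VERB")     # raises ValueError if there is no verb
    return {
        "subject": " ".join(words[:verb_index]),
        "verb": words[verb_index],
        "object": " ".join(words[verb_index + 1:]),
    }

parsed = parse_svo("The children eat the muffins.")
print(parsed)
```

Everything before the verb is treated as the subject and everything after it as the direct object, which only works for a language with a fixed word order like English.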
-
It then needs to recognize
English morphology,
-
or how the language can be broken down
into its smallest meaningful units,
-
such as the word, "muffin,"
-
and the suffix, "s,"
used to indicate plural.
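As a rough sketch of that morphological step, the Python below splits a word into a stem and a plural marker. The names `KNOWN_NOUNS`, `IRREGULAR_PLURALS` and `analyze` are assumptions made up for this example, and the single "strip the s" rule is a deliberate simplification of English.

```python
# Hypothetical mini-lexicon: noun stems the program knows about.
KNOWN_NOUNS = {"muffin", "child"}

# Irregular plurals have to be listed explicitly,
# because no suffix rule can derive them.
IRREGULAR_PLURALS = {"children": "child"}

def analyze(word):
    """Return (stem, marker), where marker is "PLURAL" or None."""
    word = word.lower()
    if word in IRREGULAR_PLURALS:
        return (IRREGULAR_PLURALS[word], "PLURAL")
    # Regular plural: a known noun stem followed by the suffix "s".
    if word.endswith("s") and word[:-1] in KNOWN_NOUNS:
        return (word[:-1], "PLURAL")
    return (word, None)

print(analyze("muffins"))
print(analyze("children"))
```

Even in this tiny case, "children" needs its own table entry, which hints at how quickly the exceptions pile up.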
-
Finally, it needs to understand
the semantics,
-
what the different parts of the sentence
actually mean.
-
To translate this sentence properly,
-
the program would refer to a different set
of vocabulary and rules
-
for each element of the target language.
-
But this is where it gets tricky.
-
The syntax of some languages
allows words to be arranged in any order,
-
while in others, doing so could make
the muffin eat the child.
-
Morphology can also pose a problem.
-
Slovenian distinguishes between
two children and three or more
-
using a dual suffix absent
in many other languages,
-
while Russian's lack of definite articles
might leave you wondering
-
whether the children are eating
some particular muffins,
-
or just eat muffins in general.
-
Finally, even when the semantics
are technically correct,
-
the program might miss their finer points,
-
such as whether the children
"mangiano" the muffins,
-
or "divorano" them.
-
Another method is
statistical machine translation,
-
which analyzes a database
of books, articles and documents
-
that have already
been translated by humans.
-
By finding matches between source
and translated text
-
that are unlikely to occur by chance,
-
the program can identify corresponding
phrases and patterns,
-
and use them for future translations.
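To make the idea of "matches that are unlikely to occur by chance" concrete, here is a minimal Python sketch. It is an assumption-laden stand-in, not the method of any particular system: it scores word pairs with the Dice coefficient over a made-up three-sentence English-Italian corpus, whereas real statistical translators rely on far larger data and more elaborate alignment models. `PARALLEL_CORPUS` and `best_matches` are hypothetical names.

```python
from collections import Counter
from itertools import product

# Tiny invented parallel corpus: (English, Italian) sentence pairs.
PARALLEL_CORPUS = [
    ("the children eat", "i bambini mangiano"),
    ("the children sleep", "i bambini dormono"),
    ("the dogs eat", "i cani mangiano"),
]

def best_matches(corpus):
    """Pair each source word with the target word it co-occurs with most reliably."""
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in corpus:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        # Count every source/target word pair seen in the same sentence pair.
        pair_count.update(product(src_words, tgt_words))

    best = {}
    for (s, t), both in pair_count.items():
        # Dice coefficient: 1.0 means the two words always appear together.
        score = 2 * both / (src_count[s] + tgt_count[t])
        if s not in best or score > best[s][1]:
            best[s] = (t, score)
    return {s: t for s, (t, _) in best.items()}

matches = best_matches(PARALLEL_CORPUS)
print(matches["eat"], matches["children"], matches["dogs"])
```

With only three sentence pairs the pairing works because each word appears in a different mix of sentences; with fewer or less varied samples the scores tie and the program can no longer tell the words apart.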
-
However, the quality
of this type of translation
-
depends on the size
of the initial database,
-
and the availability of samples
for certain languages
-
or styles of writing.
-
The difficulty that computers have
with the exceptions, irregularities
-
and shades of meaning
that seem to come instinctively to humans
-
has led some researchers to believe
that our understanding of language
-
is a unique product
of our biological brain structure.
-
In fact, one of the most famous
fictional universal translators,
-
the Babel fish from
The Hitchhiker's Guide to the Galaxy,
-
is not a machine at all,
but a small creature
-
that translates the brain waves
and nerve signals of sentient species
-
through a form of telepathy.
-
For now, learning a language
the old-fashioned way
-
will still give you better results than
any currently available computer program.
-
But this is no easy task,
-
and the sheer number
of languages in the world,
-
as well as the increasing interaction
between the people who speak them,
-
will only continue to spur greater
advances in automatic translation.
-
Perhaps by the time we encounter
intergalactic life forms,
-
we'll be able to communicate with them
through a tiny gizmo,
-
or we might have to start compiling
that dictionary, after all.