Google seeks world of instant translations

In Google’s vision of the future, people will be able to translate documents instantly into the world’s main languages, with machine logic, not expert linguists, leading the way.

Google’s approach, called statistical machine translation, differs from past efforts in that it forgoes language experts who program grammatical rules and dictionaries into computers.

Instead, Google's engineers feed the computers documents that humans have already translated between two languages, then rely on the machines to discern patterns for future translations.
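As a rough illustration of how patterns can be discerned from already-translated text, the sketch below counts which words tend to appear together across sentence pairs and picks the strongest match. It is not Google's system: the three-sentence corpus, the function names and the choice of the Dice coefficient as a score are all assumptions made for the example.

```python
from collections import Counter
from itertools import product

# Hypothetical toy parallel corpus: (German, English) sentence pairs.
PARALLEL = [
    ("das haus", "the house"),
    ("das buch", "the book"),
    ("ein buch", "a book"),
]

def learn_associations(pairs):
    """Count how often each word, and each cross-language word pair, occurs."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        s_words, t_words = set(src.split()), set(tgt.split())
        src_count.update(s_words)
        tgt_count.update(t_words)
        # Every source word co-occurs with every target word in the pair.
        joint.update(product(s_words, t_words))
    return src_count, tgt_count, joint

def best_translation(word, src_count, tgt_count, joint):
    """Pick the target word with the highest Dice score for `word`."""
    scores = {
        t: 2 * n / (src_count[word] + tgt_count[t])
        for (s, t), n in joint.items()
        if s == word
    }
    return max(scores, key=scores.get)

counts = learn_associations(PARALLEL)
print(best_translation("haus", *counts))
print(best_translation("buch", *counts))
```

No grammar rules or dictionary entries are written by hand here; the word pairings come only from the counts, which is why more translated text generally helps.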

While the quality is not perfect, it is an improvement on previous efforts at machine translation, said Franz Och (35), a German who heads Google’s translation effort at its Mountain View headquarters south of San Francisco.

“Some people that are in machine translations for a long time and then see our Arabic-English output, then they say, that’s amazing, that’s a breakthrough,” said Och.

“And then other people who have never seen what machine translation was … they read through the sentence and they say, the first mistake here in line five — it doesn’t seem to work because there is a mistake there.”

But for some tasks, a mostly correct translation may be good enough.

Speaking over lunch this week in a Google cafeteria famed for offering free, healthy food, Och showed a translation of an Arabic web news site into easily digestible English.

Two Google workers speaking Russian at a nearby table said, however, that a translation of a news site from English into their native tongue was understandable but a bit awkward.

Feeding the machine

Och, who speaks German, English and some Italian, feeds hundreds of millions of words from parallel texts such as Arabic and English into the computer, using United Nations and European Union documents as key sources.

Languages without considerable translated texts, such as some African languages, face greater obstacles.

“The more data we feed into the system, the better it gets,” said Och, who moved to the United States from Germany in 2002.

The program applies statistical analysis, an approach he hopes will avoid diplomatic faux pas, such as when Russian leader Vladimir Putin’s translator miffed then-German chancellor Gerhard Schroeder by calling him the German “Fuehrer.” The word is verboten in that context because of its association with Adolf Hitler.

“I would hope that the language model would say, well, Fuehrer Gerhard Schroeder is … very rare but Bundeskanzler Gerhard Schroeder is probably 100 times more frequent than Fuehrer and then it would make the right decision,” Och said.
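The decision Och describes can be sketched as a frequency comparison between candidate phrasings. The counts below are invented for illustration, chosen only to mirror his "100 times more frequent" remark, and the function name is hypothetical; a real language model estimates probabilities over far more text.

```python
from collections import Counter

# Hypothetical phrase counts, as if tallied from a large German corpus.
PHRASE_COUNTS = Counter({
    "Bundeskanzler Gerhard Schroeder": 1000,
    "Fuehrer Gerhard Schroeder": 10,
})

def pick_most_frequent(candidates, counts):
    """Return the candidate phrase seen most often; unseen phrases count as 0."""
    return max(candidates, key=lambda phrase: counts[phrase])

candidates = ["Fuehrer Gerhard Schroeder", "Bundeskanzler Gerhard Schroeder"]
print(pick_most_frequent(candidates, PHRASE_COUNTS))
```

The rare phrasing loses not because a rule forbids it but because it almost never appears in the text the system has seen.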

The centre of Google’s effort looks surprisingly modest. Och shares a spartan office with two others on his team, with little clutter other than a shelf of linguistic books above his desk. That’s because the muscle work is performed by machines.

So far, Google is offering its own statistical machine translations of Arabic, Chinese and Russian to and from English. Third-party software gives access on the site to German and other languages, Och said.

“So far, the focus is let’s make it really, really good,” Och said. “As part of a general Google philosophy, once it’s really useful and it has impact, then there will be found ways how to make money out of it.”

Miles Osborne, a professor at the University of Edinburgh, who spent a sabbatical last year working on the Google project, praises Google’s effort but sees limitations.

“The best systems [eg Google] can be very good indeed for language pairs such as Arabic-English,” he said.

But he added that software will not overtake humans in expert translation as it has in playing chess, and that it should be used for understanding documents rather than polishing them.

“It may also be useful when deciding whether to pay a human to do a good job: you could imagine looking at Japanese patent documents and seeing if they are relevant, for example,” he said.

Google chairperson Eric Schmidt also sees broad political consequences of a world with easy translations.

“What happens when we have 100 languages in simultaneous translation? Google and other companies are working on statistical machine translation so that we can on demand translate everything all the time,” he told a conference earlier this year.

“Many, many societies have operated in language-defined communities where they really don’t understand and are not particularly sympathetic to other peoples’ views because of the barrier of language. We’re about to have that breakthrough and it is a huge thing.” – Reuters
