Visit our company website at Admerix.com

Wednesday, November 10, 2010

Machine Translation and the "Over Reverence Phenomenon"

Like many companies, Admerix has experimented with MT engines and other methods to reconfigure existing localization processes.

The dream of many companies is to eliminate translator fees by first translating with machine translation and then having another translator edit (read: retranslate) the output at an editing rate. We have even seen some companies try to pass this off as a proofreading-only process!

This sort of approach is already used by some companies, especially for more difficult work. They have the cheapest unqualified linguist do the initial work and then try to contract a normally expensive subject specialist to “edit” the translation. Most professional linguists see through this kind of scheme and will turn down these “editing” jobs that are really retranslation at cut-rate editing prices.

Although Admerix has never used machine translation on any live job, we had to investigate and understand what all the fuss is about.

One thing we discovered as we were testing the new machine translate/edit paradigm is an interesting phenomenon that we call the “Over Reverence Phenomenon.”

As we review the efforts of linguists editing machine translation, we consistently find editors deferring to the often wrong or stilted output of the MT engine. This results in less refined translation work than when an editor edits the work of a live translator.

This reverence for machine-generated text occurs even with experienced editors. It seems evident that there is some conscious or unconscious desire to defer to the choices made by a computer-based translation engine, particularly with regard to terminology. Editors appear reluctant to challenge MT output on specific linguistic issues.

As we spoke with editors who undertook our tests, several of them expressed the opinion that the MT output seemed correct and they were happy to learn the correct terms from it.

Anyone experienced with the localization industry will find this state of affairs unusual, as many linguists revel in tearing up the work of other linguists, even over purely preferential and stylistic issues. There seems to be something about an MT engine, though, that makes editors hesitant to rewrite stilted or incorrect grammar; instead they defer to the supposedly superior knowledge of computers.

The other machine translation trend we've been aware of recently is MLVs discovering their linguists are exposing proprietary client data to machine translation engines through Google Translator Toolkit.

Many larger companies are finding that their resources have been using the service heavily and thus giving away valuable and confidential translations to a public MT engine. Non-disclosure agreements, formerly held in light regard as a bureaucratic nicety, are suddenly becoming important again to ensure that linguists are not exposing client data to Google Translate. Of course, the corollary to this is that end clients may begin to see translation gains from work that has been entered into the Google Translate system.

We are also aware that some Asian vendors have sold or bartered millions of units of client translation memory to private companies that are creating their own translation engines.

Just to be clear, Admerix doesn’t use machine translation on its projects and we never expose client material to public MT engines. We hold our NDAs to be inviolate.

We also don’t feel that eliminating linguists is key to profitable projects. Our experience is that expert project management is key for the inevitable challenges that come with every corporate project. Translation often turns out to be the most trouble-free aspect of any corporate localization project.

1 comment:

  1. Interesting results and possible thesis fodder for someone.

    I find this plausible based on a broad spectrum of experiences (mundane stuff, like a hunting acquaintance who asked me what "communicator" he should buy for a trip to Africa), and when I see the reaction of some to MT output that isn't total trash at first glance, I do sometimes wonder at the allowances made. While we are busy being amazed by MT parlor tricks, let's not forget the purpose of the text.

    It would be interesting to see the behavior of people given a human translation and told that it is computer-generated.
