Sunday, October 16, 2016

Machine Translation Blows Chunks

I picked up a copy of Stephen Hunter's 2008 novel Night of Thunder a few weeks ago at a thrift store.* In the story, the hero says these words, in Vietnamese, to a young girl whose warning saved him and a friend from being ambushed:
"Can on co em. Co that gan da va su can dam cua co da cuu sinh mang chung toi."
I entered that into Google Translate, and this is the result:
"Can shrink on you. Liver is real leather and courage. You saved my life."
All was not lost, however. In a forum about learning Vietnamese, someone had asked about the lines in the book. A Vietnamese-speaker said that this is what they meant:
"Thank you sister. You're very brave and your bravery has saved our lives."
Which makes a lot more sense than does Google's result.
________________________________________
* As a pissant-level writer, I suppose that I should be buying new books so that the authors get paid.

5 comments:

  1. I tried the same thing, and got the same result. However, it asked underneath "did you mean" and offered "Cảm ơn cô em. Cô thật gan dạ và sự can đảm của cô đã cứu sinh mạng chúng tôi". It translated THAT as "Thank you brother. She was brave and courage she has saved our lives". Much closer, so I suspect the more serious flaw in translation was caused by the lack of accent marks.

    I have used Google Translate with moderate success on several documents in German I received when requesting records for genealogical research. It saved me going to the local college and finding a German professor or grad student to pay to translate everything. It's a useful idiot, if you respect its simplicity and its general inability to consider idiom, dialect and some context. For things such as government or military forms, it is usually quite good for understanding the entries, since they are very standardized. When you get to a free entry portion, it gets a bit more wobbly.

  2. Google being what it is, it is also likely that Google Translate offering up that EXACT alternative was the result of the googlemachine seeing multiple posts about the exact problem the post refers to. Note that forvo.com also offers translations and, in many cases, short recordings of native speakers actually pronouncing those words.

    If googlemachine was able to figure out that most of the words were misspelled (lacking accent marks) all by itself, then it is a LOT smarter than we realized.

  3. Probably just the lack of accents in American English preconditioned you to deem them unimportant? As shown above, adding the accents makes the problem go away. Maybe that restriction in American English is why your car-cab company is called Uber, not Über?

  4. There weren't any accent marks in the original text.

  5. And that's the problem here. Without the diacritical markings, a majority of other languages can lose varying amounts of meaning. Why the author or the publisher decided to print it that way, I don't know. If you ask around, I bet some of the "better" translations relied upon context to interpolate the meaning in the absence of the accents.

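To make the point about diacritics concrete, here is a minimal Python sketch of what dropping the accent marks does to Vietnamese text. The function name `strip_diacritics` and the sample word list are illustrative assumptions, not anything taken from the book or from Google Translate; the sketch only shows that the accented sentence suggested by Google in comment 1 flattens into plain ASCII, and that several different words become the same string once the marks are gone.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove accent and tone marks, leaving plain base letters."""
    # The letters đ/Đ have no canonical decomposition, so map them by hand.
    text = text.replace("đ", "d").replace("Đ", "D")
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# The accented sentence that Google Translate suggested in comment 1.
accented = (
    "Cảm ơn cô em. Cô thật gan dạ và sự can đảm của cô "
    "đã cứu sinh mạng chúng tôi"
)
flattened = strip_diacritics(accented)
print(flattened)

# Different words (miss, to have, grass, flag, neck) that differ only in
# their marks all collapse to the same unaccented spelling.
distinct_words = ["cô", "có", "cỏ", "cờ", "cổ"]
collapsed = {strip_diacritics(word) for word in distinct_words}
print(collapsed)  # prints {'co'}
```

Going the other way, from "co" back to the intended word, requires context, which is the guesswork a translator (human or machine) has to do when the marks are missing.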
