Facebook apologizes after wrong translation sees Palestinian man arrested (theverge.com)
103 points by pulisse on Oct 24, 2017 | 89 comments


Given how ridiculously common the phrase "Good morning" is in Arabic, (I mean, look at any Arabic 101 textbook, the Sabah Al-khair / Sabah al-noor exchange is one of the first things they teach you), it's puzzling why Facebook's translation algorithm didn't see that as a fixed phrase, rather than trying to translate it word by word.

So much for machine learning.


I have to play the resident native Arabic speaker here and weigh in on this controversy.

1) The guy wrote يصبحهم (non-standard Arabic), but FB read it as يذبحهم, which translates to "he slaughters/slays them".

2) The smaller problem is that the word يصبحهم is not standard Arabic and is probably Palestinian slang, just as in Egypt we have many variations on the morning greeting, like صباحو or صبّح صبّح, etc.

3) The bigger problem is why the FB translator went out of its way to change the letter from ص to ذ instead of returning "Not Found", "Please double check spelling and try again", or "Did you mean ....?"

It's really very strange.
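
To make the ص/ذ substitution from point 1 concrete, here is a minimal Python sketch. It assumes a plain Levenshtein edit distance is a fair measure of how close the two words are; the names `edit_distance` and `suggest` are hypothetical, and nothing here reflects how Facebook's translator actually works.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca -> cb
        prev = curr
    return prev[-1]


def suggest(word: str, vocabulary: list[str], max_distance: int = 1) -> list[str]:
    """Hypothetical "Did you mean ...?" helper: list near matches for an
    unknown word instead of silently swapping one in."""
    return [v for v in vocabulary if 0 < edit_distance(word, v) <= max_distance]


written = "يصبحهم"  # what the man wrote, per the comment above
misread = "يذبحهم"  # what the translator reportedly picked instead

print(edit_distance(written, misread))
print(suggest(written, [misread]))
```

If a system snaps an unknown word to its nearest known neighbor, a neighbor this close is an easy target; showing the candidates to the reader rather than committing to one would be the more cautious design.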


Most sources for machine translation from Arabic-English pairs are data from US Army / anti-terrorist data.

So the models learn to translate terrorist messages very well, as well as army terms. However, they don't translate day-to-day usage that well.

Machine learning is only as good as the source data, and the Arabic-English language pair specifically has data from the US military complex, which loves translating terrorist messages. So translating Al Qaeda messaging will be good, and translating regular message boards will be worse. The machine learns to translate everything as an attack, since that is what it knows.

(I cannot give you exact sources, but I knew people from inside Google. Facebook uses Bing, but I think the data sources will be similar.)


> Most sources for machine translation from Arabic-English pairs are data from US Army / anti-terrorist data.

Interesting! This is HN, where's the source?

> I cannot give you exact sources

I find your claim hard enough to believe but this seals the deal.


It'll turn out in due time that Bin Laden was just sending the US his best wishes.

Joking of course, but up to a point. There are cases of words from the "enemy" being deliberately translated in their more literal sense when their sense is figurative (one example for all: "death to America", where even the official translation is "down with America"). And statements by Bin Laden himself have had the parts that made more sense from a political point of view deliberately ignored, to reduce them to a more comfortable "we hate you because you're free".


Given that both Arabic and English are official languages of the UN (in which all UN documents are written), wouldn't those documents form a large part of the training data? Not that they would have much in the way of "Good morning" in them, except perhaps in speech transcriptions.


You're going to have to back that up. This would be a clearly biased dataset.


Makes me less confident in US CT capabilities.


Right, though this was the slightly less textbook (but just as common in day-to-day language) "Yisbah-hum", a shortened and less formal 'good morning' greeting.


A while back I heard a story from a doctor friend about colleagues who would use Google Translate to translate notes to Spanish-speaking patients. That kind of stuff is an accident waiting to happen.

The issue is not machine learning per se, although that's behind a lot of the free services that have popped up. Rather, it's that once you get outside the scope of what the system can handle, the results may be catastrophically bad. You can't judge the risk unless you understand the assumptions behind the translation service. Most people don't know this and just use free services without thinking.


Always get the machine to translate it back again as a first check that you haven't got something that the system isn't good at. I use it often when I write Norwegian because even after thirty years I still don't remember what gender everything is (I'm English but I write in Norwegian when writing to the locals). If I don't get my original Norwegian back I know it's worth a closer look.

Doing it from your own language to a language you don't speak at all without at least that simple double check is asking for trouble.
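
The round-trip check described above can be sketched in a few lines of Python. This is only an illustration: `round_trip_check`, `toy_translate`, and the lookup table are hypothetical stand-ins, and a real check would plug in whatever translation service you actually use.

```python
from typing import Callable, Tuple

# Assumed shape of a translator: (text, source_lang, target_lang) -> text
Translate = Callable[[str, str, str], str]


def normalize(text: str) -> str:
    """Ignore case and extra whitespace when comparing."""
    return " ".join(text.lower().split())


def round_trip_check(text: str, src: str, dst: str,
                     translate: Translate) -> Tuple[bool, str, str]:
    """Translate src -> dst -> src and report whether the original came back."""
    forward = translate(text, src, dst)
    back = translate(forward, dst, src)
    return normalize(back) == normalize(text), forward, back


# Toy stand-in for a real service, for demonstration only.
TOY_TABLE = {
    ("no", "en"): {"god morgen": "good morning"},
    ("en", "no"): {"good morning": "god morgen"},
}


def toy_translate(text: str, src: str, dst: str) -> str:
    return TOY_TABLE[(src, dst)].get(normalize(text), text)


ok, forward, back = round_trip_check("God morgen", "no", "en", toy_translate)
print(ok, forward, back)
```

A failed round trip is a signal to look closer. A successful one is only a first check, since a system can make the same mistake consistently in both directions.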


More importantly, why are police in a multilingual nation relying on buggy machine translation to make arrests? It's not like it was a post in some obscure language from PNG or something.


It's not that uncommon. Algorithms make a lot of mistakes, and this is becoming a critical issue as more and more of our lives are ruled by algorithms.


The operative quote from the article is: "no Arabic-speaking officer had read the man’s Facebook post."

This reminds me of the kind of alarmist, ham-fisted policing we often see in America, where police will barge into an elderly person's home and trash the place because they forgot to double check the address.


I would consider it obvious that it'd be enormously advantageous for Israeli police to have at least a basic understanding of Arabic.

It's like a Miami policeman without any Spanish/Portuguese knowledge.


Useful to whom? Oftentimes the police aren't really there to be helpful to minorities. You'd think Miami police would know a bit of Spanish just out of sheer necessity, but in reality very few do. I'm not surprised at all that more Israeli police don't speak any Arabic.


Spanish speaking people are definitely _not_ a minority down here in Miami hehe.


I took the time to look this up. Miami is 70% Hispanic of any race according to the 2010 census.

Though that doesn't mean 70% of Miamians only speak Spanish.

Source: https://www.google.com/search?q=miami+demographics


However, it's quite possible that a plurality speaks at least Spanish.

Being able to know what folks are saying around you is a survival skill for jobs like law enforcement.


Minorities are usually defined in terms of power rather than sheer numbers.


The mayor of Miami, Tomás Pedro Regalado, was born in Cuba, and 3 of the 5 city commissioners were either born in Cuba or born to Cuban parents. The city commissioners are:

* Wifredo "Willy" Gort - born in Cuba, https://en.wikipedia.org/wiki/Willy_Gort

* Ken Russell - born in Coral Gables, FL, http://www.miamiherald.com/news/local/community/miami-dade/a... . Does not appear to have a Cuban connection.

* Frank Carollo - born in Miami of Cuban-born parents, http://www.miamiherald.com/news/local/community/miami-dade/a...

* Francis Suarez - born in Miami. His father, Xavier Suarez, became Miami’s first Cuban-born mayor in 1985, http://www.miamiherald.com/news/local/community/miami-dade/a...

* Keon Hardemon - born in Miami, "scion of one of Liberty City’s most politically active families and the son of a Miami police officer", http://www.miamiherald.com/news/local/community/miami-dade/a... . Does not appear to have a Cuban connection.

I think that helps demonstrate local majority political power.


I don't subscribe to that nonsense lol, but I realize that's the normal definition.


A lot of them do; it's very common for Israelis to be fluent in Arabic. But not all of them, obviously.

And in this case, considering what someone with evil intent and a bulldozer can do, I can understand arrest first, figure out what's going on second.

It's not like arresting someone after the fact - in that situation you can take some time. Here it's before the fact, and seconds count.


You'd think they'd take the 30 seconds required to grab literally the first person in the office who can understand the language it's written in.

Moreover, how did they arrest him without someone speaking a modicum of Arabic? Or did they just run into his home and snatch him by force?

I personally find this kind of behavior by the armed representatives of government to be horribly undemocratic.


A typical bulldozer has a top speed of less than 10 mph. I reckon they had time to think.


[flagged]


Are you trying to derail the conversation? This is not appropriate.

Delete it before people start posting defense/accusations of what happened then, which has absolutely nothing to do with this thread.


This is unsurprising considering the US / Israeli police coordination [1]. Additionally, many similar cases of Israeli police being repressive against Palestinians have been well documented [2].

[1] https://theintercept.com/2017/09/15/police-israel-cops-train...

[2] https://www.amnesty.org/en/countries/middle-east-and-north-a...


This is scary and 1984-ish. A man was arrested because of a bad translation.


More like the movie Brazil, during the opening of which:

> a fly gets jammed in a printer and results in the incarceration and accidental death during interrogation of cobbler Archibald Buttle – instead of renegade air conditioning specialist and suspected terrorist Archibald Tuttle

https://en.wikipedia.org/wiki/Brazil_(1985_film)


Collateral damage. People are routinely murdered for being in the same area as people who are suspected of terrorism on the basis of evidence which has never seen a courtroom.


Not sure if you're saying that makes it OK. I don't think that makes it OK.


[flagged]


Extrajudicial killings are more of a US drone strike topic than an Israeli one.


Arrested because a huge centralized corporation, built on collecting people's personal data to profile them, has replaced humans with machines, which made an obvious mistake.

Then other people reported said man to local authorities who acted on it without checking the source.


What's even more 1984-ish is that the phrase "attack them" will get you arrested in Israel.


Only if you're not of the favored ethnic group.


Especially when you're part of the non-favored ethnic group that perpetrates and celebrates the routine murder of innocent men, women, and children. Murder that is sometimes committed using construction equipment.

Now if you were a cop and had just received a photo of what looks like a member of the non-favored ethnic group with a murder weapon, saying "kill them", I'd bet you wouldn't just shrug it off. You take action to save innocent lives. Ask questions later.


What an awful society that is where you can get arrested for posting two words on social media by a police department run by politicians you can't vote for.


The predator will not eat a bad tasting butterfly, but we can trick it to do so.

They talk about a "linguistic translation" error in the attempt to deflect focus away from the meta data triggers that this person generated.

Cyber phrenology is real.

https://www.cultstate.com/2017/10/13/The-Butterfly-War/


An amusing read, but the overbearing narcissism (I did this; something happened; that something is therefore caused by my action) makes it hard to read as a rational argument and not a psychoanalysis case study.


This is where I saw the author being a blackhatter parading as a greyhatter. Or maybe even worse, a blackhatter who doesn't know he's a blackhatter.

The idea that increasing the noise/signal ratio (ie: "make them eat more butterflies") is going to cause Facebook and their ilk to see the flaw in reasoning is not only malicious (Milo's 20,000 fake followers + residuals did not go away), but it also betrays a profoundly scary naivete over the subject matter.

It makes it worse when you write off your body of work as "for research" but fail to set up a methodology more complex than "provoke social justice people into attacking each other, algorithmically."

The entire piece neglects to consider the "rm monkey", i.e. the trolls that have no such motives beyond chaos. Some of us, growing up, just liked to see systems implode. He surrounds himself with them on 4chan but fails to recognize the complexity of their brigading.

The author, a self-perceived "arbiter of truth" waving his scepter (an infantile set of scripts), finds himself far more intelligent than his merry band of trolls on 4chan, Google/Facebook, and protected groups all at the same time - but he may be wrong on all 3 counts.

Dunning-Kruger is not understanding that many on 4chan prefer the butterfly war.


The author performed all of this during a three-year period, I believe.

If ego was the only driving focus, wouldn't he have revealed himself earlier? Especially during the rise of alt-right micro-celebrities? How can you be a mere narcissist with a Hollywood background and completely avoid intentionally giving yourself the limelight?

This strikes me as something far more methodical.


Are you implying this was part of some cyber-phrenological attack to poison the "predator"? Or something? Pardon my ignorance, as my attempts to understand just what the hell cyber phrenology is all about are a bit stymied by the fact that this theory reads like a William Gibson novel...


The Butterfly War describes a theory of meta data warfare that forces a predator to change its behavior and target people it's been trained not to target.

In this case, the connection between Silicon Valley and national security apparatuses is so hypervigilant that everyone in the analysis pipeline accepted the meta data that got this guy arrested as sufficient.

The Butterfly War teaches you how to put anyone you want into that pipeline.


I'm with you so far. What I'm wondering is if you're saying this Palestinian guy was "put into the pipeline" by someone performing such a metadata attack, since he seems at first glance to fit the profile of someone the security apparatus would target anyway.

In any case, it's an intriguing tactic.


I think the correlation is that if the metadata-to-security response pipeline is incorrectly targeting people it is already biased against, then this is proof that the pipeline can also incorrectly target the people it is biased to ignore.


The referenced article was interesting.

A single blackhatter posing as a greyhatter (read: "I did it for 'research purposes'" shortsightedly) can cause massive social disruption - and Facebook holds no culpability.


The fact that you can get arrested for a Facebook post in Israel is already alarming.


Also it appears they have a feed of posts they parse automatically. There is nothing in the article that says someone called the police over the post.


In what country can you never be arrested for a Facebook post?


In the US, which has freedom of speech laws. But which country has arrested users for Facebook posts other than authoritarian countries? Particularly for such a relatively innocuous post.


People have been arrested over Facebook in the US, for example:

https://www.theguardian.com/us-news/2017/may/18/facebook-com...

http://www.businessinsider.com/people-arrested-for-facebook-...

AFAIK most countries have had arrests over Facebook posts, and those that have not are following this global trend.


Interesting, I did not know that. Of course there are limits to free speech. Threats and involvement in criminal activity, like if I order somebody to shoot someone, are not protected by free speech. This was a pretty innocuous statement, not a threat.


"Authoritarian country" is too prone to no-true-scottsmanship. So I'll just say that it's fairly common in the U.K., and definitely not unheard of in the U.S. Even if someone legally should not be arrested, people are people, and the police will often detain someone just to be on the safe side, when responding to a complaint. Putting something on Facebook opens you up to complaints from a potentially wide audience of lurkers who may not understand the context of a post, or worse, might be angry with you and looking for petty revenge. So, people do get arrested for Facebook posts in practice, regardless of whether it is legal to post the content.


Sorry, I should have added "charged and found guilty", because Israel has imprisoned people based on Facebook posts.


People have been arrested in the US for social media posts. And well before the internet, for things they've said or written.

Also, generally, the bar for "arrested" is pretty low. Cops can arrest you with very little fear of reprisal.


You can be arrested in the US if you post a legitimate threat.


The bar for arrest is lower than that. You can be arrested for posting rap lyrics that somebody else interprets as a threat.


Do you have a specific case in mind? I would have thought it would require specifically the police to interpret it as a threat.



It only requires the police to be ambivalent or in the mood to cuff someone. I do know of 2 such cases. One in the sensitive climate following the Boston bombing. Another that was revenge, like a SWATting but with regular police.


Threats, or rather what the police consider threats, get people arrested in the US.


Facebook often asks me if I want to translate posts that are already in English. I'm not surprised that "mistakes may happen".


This is both the fault of Facebook and the police. Unfortunately, not all police forces require evidence to arrest people. I would argue that the larger problem is actually the way in which the police handled this and not so much Facebook's inaccurate translation.



Maybe they feared he was preparing an attack using a goedendag [1] (itself named after a possibly mythical shibboleth [2])?

[1] https://en.wikipedia.org/wiki/Goedendag

[2] https://en.wikipedia.org/wiki/Shibboleth


Note that in Arabic there are no vowels. Also note that it's not the common "good morning" (Sabah Al-khair), but a rarer term: one word, 6 letters, no vowels. The equivalent in English would be "gdmrng".


No, seeing as there are no vowels in Arabic, the equivalent is not an English word with vowels, with the vowels removed.

It is a shortened form though, and is equivalent to "Morning!"


Arabic text has much higher entropy than English. The last two letters in the Arabic word he used (H M) are also the common suffix for "them". That's not strictly equivalent to simply "Morning!" in English.

Maybe the translation algorithm confused "Yisbah-hum" (good morning) with "Yitbah-hum" (Kill them)?


Thank you! After all this reading, including the source article, finally someone gives a likely source for the mistake.


AFAIK, spoken Arabic has vowels, but the script is an abjad with optional vowel markers. So English text written using only the consonants would be a good approximation.
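
As a rough illustration of that approximation, here is a small Python sketch that drops the vowels from English words and shows how distinct words collapse into the same consonant skeleton. The helper names are made up for this example, and the analogy is loose: it says nothing about how Arabic is actually tokenized or translated.

```python
VOWELS = set("aeiouAEIOU")


def strip_vowels(text: str) -> str:
    """Keep only the non-vowel characters, as a crude abjad-style spelling."""
    return "".join(ch for ch in text if ch not in VOWELS)


def collisions(words: list[str]) -> dict[str, list[str]]:
    """Group words that become identical once their vowels are removed."""
    groups: dict[str, list[str]] = {}
    for word in words:
        groups.setdefault(strip_vowels(word), []).append(word)
    return {skeleton: ws for skeleton, ws in groups.items() if len(ws) > 1}


print(strip_vowels("good morning"))
print(collisions(["bat", "bet", "bit", "but", "cat"]))
```

A reader, human or machine, has to restore the missing vowels from context, which is where a short, informal, out-of-vocabulary word can go wrong.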


Actually, good morning صباح الخير has a strong vowel right there "ا" in the first word.

One of the problems people face with learning Arabic is that weak vowels are implicit by nature and you need to know them by heart in order to read the words properly.


How does that happen?

I know that "googlebombing" Google Translate used to be a thing, as I have seen it done fairly often (usually stuff like translating Android to iPhone, etc.) in the Danish/English translator, where a few votes could change a word's meaning.

I'm wondering if what happened here is a less harmless version of that, where a group of people were deliberately messing with crowdsourced translation tools, and whether Facebook has any kind of sanity checking on its translate function or is as vulnerable as Google Translate to that kind of manipulation.


I'm not exactly a Facebook fan but the blame for this needs to be shared by the Israeli police.


Blame should rest mostly on the police.


IMHO responsibility is 90% Facebook, 8% the people who reported it and 2% police. I do not expect police officers to be experts in using Facebook.


I think it is 99% police. I do not use FB translate, but Google translate. Sometimes it works, sometimes it does not. I expect that it is not 100% reliable, and anybody who frequently uses the automated translation tools for critical tasks (presumably the police in this case) should know this. My suspicion is that this was just an excuse for the police to arrest somebody.


It's not so much being Facebook experts as it is actually having a solid pretext for an arrest.

I guess if Facebook isn't marking the automatic translations that is a problem, but they shouldn't be giving someone a bad day before they understand which part of the post he wrote and what it likely means.


I pasted the offending Arabic into Google translate and got "Become them" (https://translate.google.com/#auto/en/%D9%8A%D8%B5%D8%A8%D8%...).

Bing just says "become", Translate.com and Cambridge dictionaries have "Come in the morning to them".

Yandex says "ISG world"!

All a bit odd or downright weird but none so threatening as Facebook's. Was Facebook's translation engine sabotaged?


The trend will continue as long as the 0 accountability standard remains.


For Facebook or the Israeli state?


Both?


Precisely - but Facebook sits upstream here.


I have no love for Facebook, but it should always be the LEO's duty to verify everything, even if just to save their own ass.



Crazy!!! ... let's set politics aside for a sec. What if the man was actually planning an attack? It would be really crazy if an AI tool could actually find that from the context of the picture.

Let's give another example ... let's say I pose with a fake gun and I write "Ready to party". Would the algorithm render a different translation than if I posed with balloons?


That's strange. I'm sure it could just as easily have happened to an Israeli citizen. (Just kidding!)


[flagged]


[flagged]


The parent comment wasn't good, but would you please follow the site guidelines regardless? In this case you broke the one that asks: "Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize."

https://news.ycombinator.com/newsguidelines.html


Translation: don't have a dark sense of humor about things.

It's not because I lack empathy, you dolt. Obviously every one of us is far removed from that risk. It just reads like something that could have been an Onion article: ridiculous and exasperating. Enjoy being a stiff, and go fuck yourself.


Ai yai yai, we ban accounts that break the rules like this, so would you please read them and follow them when commenting here? Even if another comment was unduly stiff.

https://news.ycombinator.com/newsguidelines.html



