Given how ridiculously common the phrase "Good morning" is in Arabic, (I mean, look at any Arabic 101 textbook, the Sabah Al-khair / Sabah al-noor exchange is one of the first things they teach you), it's puzzling why Facebook's translation algorithm didn't see that as a fixed phrase, rather than trying to translate it word by word.
I have to play the resident native Arabic speaker here and weigh in on this controversy:
1) The guy wrote يصبحهم (non-standard Arabic), but FB read it as يذبحهم, which translates to "he slaughters/slays them".
2) The smaller problem is that the word يصبحهم is not standard Arabic and is probably Palestinian slang. In Egypt, likewise, we have many variations on the morning greeting, such as صباحو or صبّح صبّح, etc.
3) The bigger problem was why FB translator went out of its way to change the letter from ص to ذ instead of returning a "Not Found" or "Please double check spelling and try again" or "Did you mean ....?"
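To make point 3 concrete, here is a minimal sketch of the "Did you mean ...?" behaviour using Python's standard `difflib`. The two-entry lexicon and the `translate_word` helper are made up for illustration and say nothing about how Facebook's translator actually works; the point is only that an unknown word can be surfaced with suggestions instead of being silently replaced by its nearest neighbour.

```python
from difflib import get_close_matches

# Hypothetical toy lexicon; a real system would have a far larger vocabulary.
LEXICON = {
    "يذبحهم": "he slaughters them",
    "صباح": "morning",
}

def translate_word(word, lexicon=LEXICON, cutoff=0.8):
    """Return (translation, suggestions).

    A known word is translated directly. An unknown word is NOT swapped
    for a similar-looking entry; the near matches are returned as
    suggestions so the caller can ask "Did you mean ...?".
    """
    if word in lexicon:
        return lexicon[word], []
    suggestions = get_close_matches(word, list(lexicon), n=3, cutoff=cutoff)
    return None, suggestions

translation, suggestions = translate_word("يصبحهم")
if translation is None:
    print("Not found. Did you mean:", suggestions)
```

The colloquial word is absent from the toy lexicon, so the caller gets no translation plus a list of look-alike candidates to confirm, rather than a confident wrong answer.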
Most training data for Arabic-English machine translation comes from US Army / anti-terrorism sources.
So the models learn to translate terrorist messages and army terms very well, but they don't translate day-to-day usage nearly as well.
Machine learning is only as good as the source data, and the Arabic-English language pair specifically has data from the US military complex, which loves translating terrorist messages. So translating Al Qaeda messaging will be good, and translating regular message boards will be worse. The machine learns to translate everything as an attack, since that is what it knows.
(I cannot give you exact sources, but I knew people from inside Google. Facebook uses Bing, but I think the data sources will be similar.)
It'll turn out in due time that Bin Laden was just sending the US his best wishes.
Joking of course, but only up to a point. There are cases of words from the "enemy" being deliberately translated in their more literal sense when their sense is figurative (the best-known example being "death to America", where even the official translation is "down with America"). And statements by Bin Laden himself have been deliberately ignored in the parts that made more sense from a political point of view, to reduce them to a more comfortable "we hate you because you're free".
Given that both Arabic and English are official languages of the UN (and in which all UN documents are written), wouldn't those documents form (a large part) of the data for training? Not that that would have much in the way of "Good morning" in it except perhaps speech transcriptions.
Right, though this was the slightly less textbook (but just as common in day-to-day language) "Yisbah-hum", a shortened and less formal 'good morning' greeting.
A while back I heard a story from a doctor friend about colleagues who would use Google Translate to translate notes to Spanish-speaking patients. That kind of stuff is an accident waiting to happen.
The issue is not machine learning per se, although that's behind a lot of the free services that have popped up. Rather, it's that once you get outside the scope of translations the system can handle, the results may be catastrophically bad. You can't judge the risk unless you understand the assumptions behind the translation service. Most people don't know this and just use free services without thinking.
Always get the machine to translate it back again as a first check that you haven't got something that the system isn't good at. I use it often when I write Norwegian because even after thirty years I still don't remember what gender everything is (I'm English but I write in Norwegian when writing to the locals). If I don't get my original Norwegian back I know it's worth a closer look.
Doing it from your own language to a language you don't speak at all without at least that simple double check is asking for trouble.
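The translate-it-back check is easy to sketch. The code below assumes you supply your own `forward` and `backward` translation functions; the dictionary-backed ones here are toy stand-ins invented for the example, not calls to any real translation service.

```python
from typing import Callable

def round_trip_ok(text: str,
                  forward: Callable[[str], str],
                  backward: Callable[[str], str]) -> bool:
    """Translate there and back; True if the original text comes back.

    Comparison ignores case and extra whitespace. A False result does not
    prove the translation is wrong, only that it deserves a closer look.
    """
    def norm(s: str) -> str:
        return " ".join(s.lower().split())
    return norm(backward(forward(text))) == norm(text)

# Toy stand-in translators (hypothetical data, Norwegian <-> English).
NO_TO_EN = {"et hus": "a house", "en hus": "a house"}
EN_TO_NO = {"a house": "et hus"}

def no_to_en(text: str) -> str:
    return NO_TO_EN.get(text, text)

def en_to_no(text: str) -> str:
    return EN_TO_NO.get(text, text)

print(round_trip_ok("et hus", no_to_en, en_to_no))  # correct gender
print(round_trip_ok("en hus", no_to_en, en_to_no))  # wrong gender
```

With these toy tables the wrong-gender phrase does not survive the round trip, which is exactly the kind of signal described above. A passing round trip is still only a first check: a system can be wrong consistently in both directions.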
More importantly, why are police in a multilingual nation relying on buggy machine translation to make arrests? It's not like it was a post in some obscure language from PNG or something.
The operative quote from the article is: "no Arabic-speaking officer had read the man’s Facebook post."
This reminds me of the kind of alarmist, ham-fisted policing we often see in America, where police will barge into an elderly person's home and trash the place because they forgot to double-check the address.
Useful to whom? Oftentimes the police aren't really there to be helpful to minorities. You'd think Miami Police would know a bit of Spanish just out of sheer necessity, but in reality very few do. I'm not surprised at all that more Israeli police don't speak any Arabic.
The mayor of Miami, Tomás Pedro Regalado was born in Cuba, and 3 of the 6 city commissioners were either born in Cuba or born to Cuban parents. The city commissioners are:
You'd think they'd take the 30 seconds required to grab literally the first person in the office they could find who can understand the language it's written in.
Moreover, how did they arrest him without someone speaking a modicum of Arabic? Or did they just run into his home and snatch him by force?
I personally find this kind of behavior by the armed representatives of government to be horribly undemocratic.
This is unsurprising considering the US / Israeli Police coordination. [1] Additionally many similar cases of Israeli police being repressive against Palestinians have been well documented [2].
More like the movie Brazil, during the opening of which:
> a fly gets jammed in a printer and results in the incarceration and accidental death during interrogation of cobbler Archibald Buttle – instead of renegade air conditioning specialist and suspected terrorist Archibald Tuttle
Collateral damage. People are routinely murdered for being in the same area as people who are suspected of terrorism on the basis of evidence which has never seen a courtroom.
Arrested because a huge centralized corporation, built on collecting people's personal data to profile them, has replaced humans with machines, which made an obvious mistake.
Then other people reported said man to local authorities who acted on it without checking the source.
Especially when you're part of the non-favored ethnic group that perpetrates and celebrates the routine murder of innocent men, women, and children. Murder that is sometimes committed using construction equipment.
Now if you were a cop, and just received a photo of what looks like a member of the non-favored ethnic group with a murder weapon, saying "kill them" I'd bet you wouldn't just shrug it off. You take action to save innocent lives. Ask questions later.
What an awful society that is where you can get arrested for posting two words on social media by a police department run by politicians you can't vote for.
An amusing read, but the overbearing narcissism (I did this; something happened; that something is therefore caused by my action) makes it hard to read as a rational argument and not a psychoanalysis case study.
This is where I saw the author being a blackhatter parading as a greyhatter. Or maybe even worse, a blackhatter who doesn't know he's a blackhatter.
The idea that increasing the noise-to-signal ratio (i.e., "make them eat more butterflies") is going to cause Facebook and their ilk to see the flaw in their reasoning is not only malicious (Milo's 20,000 fake followers + residuals did not go away), but it also betrays a profoundly scary naivete about the subject matter.
It makes it worse when you write off your body of work as "for research" but fail to set up a methodology more complex than "provoke social justice people into attacking each other, algorithmically."
The entire piece neglects to consider the "rm monkey", i.e., the trolls who have no such motives, only chaos. Some of us, growing up, just liked to see systems implode. He surrounds himself with them on 4chan but fails to recognize the complexity of their brigading.
The author, a self-perceived "arbiter of truth" waving his scepter (an infantile of scripts), finds himself far more intelligent than his merry band of trolls on 4chan, Google/Facebook, and protected groups all at the same time, but he may be wrong on all 3 counts.
Dunning-Kruger is not understanding that many on 4chan prefer the butterfly war.
The author performed all of this during a three year period, I believe.
If ego was the only driving focus, wouldn't he have revealed himself earlier? Especially during the rise of alt-right micro-celebrities? How can you be a mere narcissist with a Hollywood background and completely avoid intentionally giving yourself the limelight?
Are you implying this was part of some cyber-phrenological attack to poison the "predator"? Or something? Pardon my ignorance, as my attempts to understand just what the hell cyber phrenology is all about are a bit stymied by the fact that this theory reads like a William Gibson novel...
The Butterfly War describes a theory of meta data warfare that forces a predator to change its behavior and target people it's been trained not to target.
In this case, the connection between Silicon Valley and the national security apparatus is so hypervigilant that everyone in the analysis pipeline accepted the metadata that got this guy arrested as sufficient.
The Butterfly War teaches you how to put anyone you want into that pipeline.
I'm with you so far. What I'm wondering is if you're saying this Palestinian guy was "put into the pipeline" by someone performing such a metadata attack, since he seems at first glance to fit the profile of someone the security apparatus would target anyway.
I think the correlation is that if the metadata-to-security response pipeline is incorrectly targeting people it is already biased against, then this is proof that the pipeline can also incorrectly target the people it is biased to ignore.
A single blackhatter posing as a greyhatter (read: "I did it for 'research purposes'" shortsightedly) can cause massive social disruption - and Facebook holds no culpability.
In the US which has freedom of speech laws. But which country has arrested users for Facebook posts other than authoritarian countries? Particularly for such a relatively innocuous post.
Interesting, I did not know that. Of course there are limits to free speech. Threats and involvement in criminal activity, like if I order somebody to shoot someone, are not protected by free speech. This was a pretty innocuous statement, not a threat.
"Authoritarian country" is too prone to no-true-Scotsmanship. So I'll just say that it's fairly common in the U.K., and definitely not unheard of in the U.S. Even if someone legally should not be arrested, people are people, and the police will often detain someone just to be on the safe side when responding to a complaint. Putting something on Facebook opens you up to complaints from a potentially wide audience of lurkers who may not understand the context of a post, or worse, might be angry with you and looking for petty revenge. So, people do get arrested for Facebook posts in practice, regardless of whether it is legal to post the content.
It only requires the police to be ambivalent or in the mood to cuff someone. I do know of 2 such cases. One in the sensitive climate following the Boston bombing. Another that was revenge, like a SWATting but with regular police.
This is both the fault of Facebook and the police. Unfortunately, not all police forces require evidence to arrest people. I would argue that the larger problem is actually the way in which the police handled this and not so much Facebook's inaccurate translation.
Note that in Arabic there are no vowels. Also note that it's not the common "good morning" (Sabah Al-khair), but a rarer term: one word, 6 letters, no vowels. The equivalent in English would be "gdmrng".
Arabic text has much higher entropy than English. The last two letters in the Arabic word he used (H M) are also the common suffix for "them". That's not strictly equivalent to simply "Morning!" in English.
Maybe the translation algorithm confused "Yisbah-hum" (good morning) with "Yitbah-hum" (Kill them)?
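For a sense of how close those two transliterations are, here is a small sketch that computes the Levenshtein (edit) distance between them. This is a textbook dynamic-programming implementation written for illustration; nothing here is known about what similarity measure, if any, Facebook's system uses.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]

print(levenshtein("Yisbah-hum", "Yitbah-hum"))
```

The two strings differ by a single substitution, so any system that tolerates even one character of "typo correction" on an unknown word can land on the wrong one.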
AFAIK, spoken Arabic has vowels, but the script is an abjad with optional vowel markers. So English text written using only the consonants would be a good approximation.
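As a rough illustration of that approximation, the sketch below reduces English words to their consonant skeletons and groups words that collapse to the same skeleton. The word list is arbitrary, and treating only a, e, i, o, u as vowels is a simplifying assumption; it is an analogy for reading unvowelled text, not a model of Arabic orthography.

```python
from collections import defaultdict

VOWELS = set("aeiou")  # simplifying assumption: "y" is treated as a consonant

def skeleton(word: str) -> str:
    """Drop vowels and non-letters, keeping only the consonants."""
    return "".join(ch for ch in word.lower()
                   if ch.isalpha() and ch not in VOWELS)

def group_by_skeleton(words):
    """Map each consonant skeleton to the words that share it."""
    groups = defaultdict(list)
    for w in words:
        groups[skeleton(w)].append(w)
    return dict(groups)

print(group_by_skeleton(["bad", "bed", "bid", "bud", "dog", "dig"]))
```

Several distinct words share one skeleton, so a reader (or a model) has to rely on context to pick the right one, which is where short, context-free posts become risky.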
Actually, good morning صباح الخير has a strong vowel right there "ا" in the first word.
One of the problems people face with learning Arabic is that weak vowels are implicit by nature and you need to know them by heart in order to read the words properly.
I know that "googlebombing" Google Translate used to be a thing, as I have seen it done fairly often in the Danish/English translator (usually stuff like translating "android" to "iphone"), where a few votes could change a word's meaning.
I'm wondering if what happened here is a less harmless version of that, where a group of people is deliberately messing with crowdsourced translation tools, and whether Facebook has any kind of sanity checking on its translate function or is as vulnerable as Google Translate to that kind of manipulation.
I think it is 99% police. I do not use FB translate, but Google translate. Sometimes it works, sometimes it does not. I expect that it is not 100% reliable, and anybody who frequently uses the automated translation tools for critical tasks (presumably the police in this case) should know this. My suspicion is that this was just an excuse for the police to arrest somebody.
It's not so much being Facebook experts as it is actually having a solid pretext for an arrest.
I guess if Facebook isn't marking the automatic translations that is a problem, but they shouldn't be giving someone a bad day before they understand which part of the post he wrote and what it likely means.
Crazy!!! ... let's take politics aside for a sec.
What if the man was actually planning an attack? That would be really crazy that an AI tool could actually find that with the context of the picture.
Let's give another example ... let's say I pose with a fake gun and I write: "Ready to party" would the algorithm render a different translation than if I posed with balloons?
The parent comment wasn't good, but would you please follow the site guidelines regardless? In this case you broke the one that asks: "Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize."
Translation: don't have a dark sense of humor about things.
It's not because I lack empathy, you dolt. Obviously every one of us is far removed from that risk. It just reads like something that could have been an Onion article: ridiculous and exasperating. Enjoy being a stiff, and go fuck yourself.
Ai yai yai, we ban accounts that break the rules like this, so would you please read them and follow them when commenting here? Even if another comment was unduly stiff.
So much for machine learning.