Businessweek Archives

What's Google's algorithm for ferreting out "racism" in Portuguese?



September 05, 2006


Stephen Baker

Google has agreed to provide Brazilian authorities with data on users who encourage racism, homophobia and pedophilia. (ex Battelle) Plenty of serious questions about privacy and freedom of expression, of course. But I'm wondering exactly how Google goes about locating hate speech.

It can't just be a question of looking for hateful words. If so, a literary analysis of Huckleberry Finn could end up in the batch. There are plenty of more advanced methods, which analyze the syntax and verb groupings in a text. That takes up a lot of computing power and produces lots of false positives.
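To make the false-positive problem concrete, here is a minimal sketch of the naive "look for hateful words" approach. It is not a description of anything Google actually runs. The watch list, the placeholder tokens standing in for offensive terms, and the function name `flag_by_keyword` are all made up for illustration.

```python
import re

# Hypothetical watch list; the placeholder tokens stand in for real offensive terms.
WATCH_LIST = {"slur_a", "slur_b"}

def flag_by_keyword(text, watch_list=WATCH_LIST):
    """Return True if any watch-list term appears as a whole word in the text."""
    words = set(re.findall(r"[a-z_]+", text.lower()))
    return bool(words & watch_list)

hateful_post = "Those people are slur_a and should leave."
book_review = "Twain has Huck say slur_a to expose the racism of his era."
harmless_post = "Happy birthday to my daughter!"

for post in (hateful_post, book_review, harmless_post):
    print(flag_by_keyword(post), "-", post)
```

The book review is flagged along with the hateful post, because a word match cannot tell using a term from discussing it.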

Blog analysis companies like Umbria Inc. use human readers to pick out examples of what they're looking for. Then they use these as templates to "teach" the machine how to find more of the same. Some anti-spam companies use a similar approach. As we all know, they don't always get it right. Regardless of the technical specifics, I'm betting that some Brazilian who puts off-color jokes on Orkut, or perhaps pictures of his eight-year-old daughter's birthday party, is going to be IDed by Google's computers as a criminal suspect.
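The "teach the machine from human-picked examples" idea can be sketched with a tiny word-count classifier. This is a toy naive Bayes model, chosen here only because it is simple; nothing in the post says Umbria, Google, or any anti-spam company uses this particular technique. The training sentences, the labels "flag" and "ok", and all function names are invented for the example.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()

def train(labeled_examples):
    """Count word frequencies per label from (text, label) pairs."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in labeled_examples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    return word_counts, doc_counts

def classify(text, model):
    """Return the label with the highest naive Bayes log score."""
    word_counts, doc_counts = model
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, float("-inf")
    for label, counts in word_counts.items():
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(counts.values())
        for word in tokenize(text):
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Examples a human reader has labeled by hand (made up for this sketch).
examples = [
    ("those people are vermin and should leave", "flag"),
    ("i hate those people they are vermin", "flag"),
    ("happy birthday to my daughter", "ok"),
    ("we had a great party with friends", "ok"),
]
model = train(examples)

print(classify("those people are vermin", model))
print(classify("great birthday party", model))
print(classify("the villain calls those people vermin", model))
```

The third sentence merely describes a character in a story, yet it shares enough words with the flagged examples to be labeled "flag" too. A model trained this way inherits whatever blind spots its templates have.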

09:48 AM

international, search

The framers of the US Constitution were a gutsy lot. They feared neither words nor ideas. What they did fear was repression of free expression. They assumed people could sort through the words and ideas that were out there and decide which ones to pay attention to and which ones to ignore. We in the US, as well as people in many other nations, are now clearly ruled by those who fear words and ideas that differ from their own. The logical next step must be repression of free speech, which tends to spread out like an oil spill until only the ideas of those in power are considered appropriate. Perhaps the framers were a mere blip on history's screen. Perhaps their time has come and gone and no longer has relevancy in today's world. Perhaps what once resonated as universal truths were nothing more than a bunch of lofty-sounding homilies designed to protect the free-thinking Founding Fathers. Have we been misled all along? Or are we being misled now? You can only select one.

Posted by: dan cook at September 5, 2006 05:24 PM

Awww, Danny boy, how I wish you were right. If the framers of the Constitution were truly fearless, they would have been gutsy enough to address, in the US Constitution itself, the plight of the approximately 22% of the US population in slavery. Mind you, it is also true that if they had addressed that question, they would probably have lost Virginia, the Carolinas, Georgia, etc. So it was practical and innovative, but it may have lacked some guts, a lack that would eventually cost us over 700,000 Civil War lives and an abolitionist hero named John Brown. But I am digressing far from the path I have chosen, free expression.

The problem is not free expression. It is the idea that we have to tolerate expressions we do not agree with. The perfect example: why should rational people have to listen to Hitlerite hate speech? But to take a page from Washington, we don't have to tolerate anyone. The same right that gives us the protective umbrella to express ourselves is the one that allows those with other ideas to verbalise their opinions. So, to paraphrase Voltaire, "I disapprove of what you say, but I will defend to the death your right to say it."

The only way to combat fallacies and half-truths is to bring them out into the open and expose them to discerning light.

Posted by: Nicholas Padilla at September 24, 2006 06:46 AM

