Saturday, February 7, 2015

And Thus Spake Google

The humanities are under attack. Enrollments are plummeting, tax-cutting zombies in state legislatures are looking for more reasons to cut higher education funding and, most worrisome, a national panel of distinguished persons has published a report.

As an historian and former lesser deanlet in a college with the word “humanities” in its official title, I find the attacks discouraging. And, as someone who writes about technology, I can see further dangers. Specifically, is computer technology in general, and Google in particular, going to destroy the role of the humanities in studying foreign countries and cultures? Even foreign languages themselves? Is French writer Fabien Cazenave right to suggest Google’s Translate software as a solution to the EU’s multiplicity of languages?[i]

Before I explain, a disclaimer. I love Google. Every day I use Search, Gmail, Calendar, Chrome, Maps, and the excellent and eerily prescient Google Now. “The Google,” as W called it, is truly great.

But.

Consider Google’s effort to digitize libraries. Google started this massive effort and others have since joined the party, frantically scanning books, letters, newspapers, diplomatic cables, boring legislative zombie speeches and much more. Soon, all of this will be on the Web and all the text will be searchable.

This digitizing was clearly something Google did to help scholars (they’ll sell ads, but I don’t expect many businesses will be clamoring to pay to “capture the eyeballs” of people studying nineteenth-century Ottoman diplomacy).

Unfortunately, the Law of Unintended Consequences and the rule of No Good Deed Goes Unpunished both apply here. Specifically, it’s likely that historians, literary scholars and other humanists who currently need to spend years abroad will soon be constrained to sit at home instead.

It’s easy to imagine the conversations, at least those in public universities:

Faculty member:  “Even though I can get all the stuff online and the legislature has cut our budget again, it isn’t enough to simply read the documents. I also have to go to Ankara for a year to better understand the culture. So, can I have a sabbatical to…”
Dean:  “No.”

I’m a former area studies person (Eastern Europe and Russia) and this travel issue has worried me for some time. It’s not much appreciated, but people in the humanities, with their deep knowledge of foreign cultures, have long formed an important part of America’s foreign policy and security foundation (for example, they had key roles in the early CIA). In addition to knowledge, humanists provided perspective. Looking back on the Cold War, one can see where their insights provided a much-needed antidote to rigid military and diplomatic views.

What about an essential key to culture, language learning? Is it under threat too? I mean, Google is telling us that its Chrome browser-based Translate software provides a billion translations a day for some 200 million users.[ii]  Wow.

There’s even a rumor on the Internet that the National Security Agency, overwhelmed by the volume of call and e-mail intercepts coming from its PRISM Project, turned to Google to help with the translating. Hmm. Assuming this started right after 9/11, it could sure explain…

Anyway, I know that Google’s not kidding about its translations being widely used, since I see many technology articles in English that refer to translations of press releases and software and hardware reviews that are provided by Google.

Will it no longer be necessary for people to learn other languages? Will our literary scholars not merely miss going to Ankara, but also skip learning Turkish, too? 

Not yet. Here, finally, comes the good news:  if the NSA did use Google Translate for intelligence gathering, it wouldn’t make it a part of Project PRISM. “Operation Funhouse Mirror” might be more like it. Google Translate at the moment is often useful but it can be…odd. Let me explain.

In addition to compulsive reading about technology, I also try to maintain my withered foreign language skills by scanning news in various languages and occasionally employ Translate for help, which it sometimes provides. Mostly, though, I use it when I’m bored and need a good laugh.

Here are some Google translations from three different languages. The common thread is that they don’t make any sense in English.

“Pain to us ad nauseam to listen to the poor pensioners of inadequate mothering in misery swept the country.”[iii]
“Sinner's mouth really speaks. Talking Bible is more suggestive than anything.'s What I said everything.”[iv]

These first two are from Bulgarian and Romanian respectively. A bit on the esoteric side. But Translate will do vastly better with a major language like French. Right?

No.

Much Translate fun can be had with the case of Dominique Strauss-Kahn, the former head of the International Monetary Fund who is famously persona non grata at the New York Sofitel. Given his experience in Gotham, you won’t be surprised to learn that DSK, as the French call him, was recently indicted for acting as pimp for “libertine” parties at the Carlton Hotel in Lille.

Want to know more? Translate elucidates with phrases like:

“The case of Carlton summarize it thus a history of false brethren feet nickel?”[v]

Different things went wrong in these three translations. In the Bulgarian example, Translate missed the author’s passion, with the result that a very emphatic but still entirely lucid statement comes out as gibberish.

In Romanian, Translate tripped on a proverbial expression (and mistranslated some words). In French, the software stumbled on a cultural reference to a popular comic series. Should Google have known these references? Yes. Both phrases were used on the front page of major newspapers; they aren’t esoteric.

A reasonable critic might ask if the sentences would make sense if read as part of the entire translated article. Probably not (the links referenced after each will allow you to see for yourself). Each of these stories, like most Google Translations, is threaded with many hopelessly garbled and incoherent sentences, sometimes with words that can be read in a way that contradicts the original intent. The resulting confusion is so great that in most cases you’d have to do a lot of guessing about meaning. Needless to say, that can be very dangerous when you don’t know the language yourself—the only reason you’d be using the software.

And don’t think that technical articles necessarily do better. It’s hard to have an overall Translate favorite, but this one, from a Korean-language cell phone review a while back, is tough to beat:

“Being frank more, more when it tries to talk, it leans against you in the land of the product which is Samsung also there is different mysterious expectation. Like all things to in gear composition and Samsung SPH m4300 which degree is big in PDA market, hoyk with the product which it draws it is visible with the polyvalent opinion thing. Of course at the degree where the reaction of the market against hereupon will correspond in him hot thing authorization also is an unknown….”[vi]

Google recently announced that it is planning to move Translate into “real-time communication.”

Imagine the possibilities! Pilots from around the world could crew together without having to learn English! As the big 797 comes in for a landing, conversations will go something like this:

Pilot [speaking in Korean]: “Lower the landing gear!”
Co-pilot [hearing in English]: “Mysterious expectation in him hot thing?”

In the same vein, think about an international visitor renting a Google self-driving car at the Paris airport and telling it what to do via Translate:

Visitor from Korea [in Korean]:  Take me to the Carlton Hotel!
Google Self-Drive Car [Replying in Korean]:  Hey Sexy lady! Sinner’s mouth of inadequate mothering really speaks!

After reading a while English this kind, you start yourself doing it too. So, Googlers, the translation work keep up. For posterity do it. Really. Grandchildren your will think funny also. But give up the day job not.

Sorry, got carried away there. A more serious summary would go like this. Google Translate is very much at the leading edge of technology, an example of the amazing potential of Cloud Computing. Besides running on state-of-the-art hardware, it uses paired translations to augment the standard dictionary files.

The use of these translation pairs means that the system is often very context aware:  type a word from an article or story into the website and you’ll get one translation. Then, if you submit the entire original sentence or paragraph, you’ll often see the rendering change as the big interpreter in the sky analyzes the additional data and builds relationships of meaning.

This means that Translate can scale and grow. More CPUs, more memory and especially more paired translations will make it continuously more potent.
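For readers who like to see the gears, here is a minimal sketch of how paired translations can make a rendering shift with context. It is a toy, not a description of how Translate actually works: the `PHRASE_TABLE` entries and the `translate` function are hypothetical, invented for illustration, and the lookup is a simple greedy longest-match rather than anything statistical.

```python
# Toy phrase table. Each key is a tuple of source-language (French) words and
# each value is an English rendering. These entries are hypothetical examples,
# not data from any real translation system.
PHRASE_TABLE = {
    ("une",): "a",
    ("pomme",): "apple",
    ("de",): "of",
    ("terre",): "earth",
    ("pomme", "de", "terre"): "potato",
}


def translate(text, table=PHRASE_TABLE):
    """Translate by greedy longest-match lookup against the phrase table."""
    words = text.lower().split()
    longest_key = max(len(key) for key in table)
    output = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first, then progressively shorter ones.
        for n in range(min(longest_key, len(words) - i), 0, -1):
            chunk = tuple(words[i:i + n])
            if chunk in table:
                output.append(table[chunk])
                i += n
                break
        else:
            # No entry covers this word, so pass it through untranslated.
            output.append(words[i])
            i += 1
    return " ".join(output)


print(translate("pomme"))               # the word on its own
print(translate("une pomme de terre"))  # the same word inside a longer phrase
```

On its own, "pomme" comes back as "apple"; inside "une pomme de terre" the longer pair wins and the phrase comes back as "a potato". Adding more pairs to the table widens the contexts the toy can recognize, which is the scaling idea in miniature.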

Will it ever catch up to humans?  There’s reason for skepticism.

First, the human brain is no slouch when it comes to computation. In linear kinds of problems we don’t match up well, but in parallel, branching kinds of analyses—what’s needed for language—we’re far ahead of the machines.

Just as important, language is always evolving (think of the recent evolution of the word “wicked” from bad to good).  Culture is one reason for language change, and, as illustrated in the Romanian and French translations above, as long as people read books and watch films and TV, computers are going to have a hard time keeping up.

So, can Google eliminate the need to learn languages? Maybe, but not soon. The company has an awesome track record but, before people rely on Translate, they’re going to have to be sure it isn’t dangerous. We’re talking a decade or more, I think.

Finally, if Google wants to help the humanities—and enhance its chances of getting Translate right some day—a really good idea would be to create a Google Scholars program. These would be awards to faculty and graduate students in history, literature, philosophy and such who could spend a year or so studying or doing research in a foreign land.

When they come back, many Google Scholars would get jobs at universities or federal agencies or businesses while others would be hired to work at Google itself, maybe on Translate. Still others could be given fellowships at a new Google Humanities Center to look at big-picture technology-humanities issues, sort of a Xerox PARC[vii] for the soft side. I’m confident that, just as happened at PARC, the creativity of folks at this institute would lead to lots of useful new services and products (in fact, taking a good first step, Google has just opened a cultural center in Paris[viii]).

By the way, and apropos of new services, don’t worry too much about the Internet rumor that the NSA is using Google Translate—I made that up a few paragraphs back so it might not be true.

Finally, Google, while I’ve got your attention, how about we collaborate on a new “first person shooter” computer game? We could call it Zap the Finance Committee Zombies.  The public is ready for this. I couldn’t being frank more.




[vi] The link to this review still exists in Google but the review itself is gone—it was in 2005.