Imagine discovering a secret language spoken only online by a knowledgeable and learned few. Over a period of weeks, as you begin to tease out the meaning of this curious tongue and ponder its purpose, the language appears to shift in subtle but fantastic ways, remaking itself daily before your eyes. And just when you are poised to share your findings with the rest of the world, the entire thing vanishes.
This fairly describes my roller coaster experience of curiosity, wonder and disappointment over the past few weeks, as I’ve worked alongside security researchers in an effort to understand how “lorem ipsum” — common placeholder text on countless Web sites — could be transformed into so many apparently geopolitical and startlingly modern phrases when translated from Latin to English using Google Translate. (If you have no idea what “lorem ipsum” is, skip ahead to a brief primer here).
Admittedly, this blog post would make more sense if readers could fully replicate the results described below using Google Translate. However, as I’ll explain later, something important changed in Google’s translation system late last week that currently makes the examples I’ll describe impossible to reproduce.
It all started a few months back when I received a note from Lance James, head of cyber intelligence at Deloitte. James pinged me to share something discovered by FireEye researcher Michael Shoukry and another researcher who wished to be identified only as “Kraeh3n.” They noticed a bizarre pattern in Google Translate: When one typed “lorem ipsum” into Google Translate, the default results (with the system auto-detecting Latin as the language) returned a single word: “China.”
Capitalizing the first letter of each word changed the output to “NATO” — the acronym for the North Atlantic Treaty Organization. Reversing the words in both lower- and uppercase produced “The Internet” and “The Company” (the “Company” with a capital “C” has long been a code word for the U.S. Central Intelligence Agency). Repeating and rearranging the word pair with a mix of capitalization generated even stranger results. For example, “lorem ipsum ipsum ipsum Lorem” generated the phrase “China is very very sexy.”
Kraeh3n said she discovered the strange behavior while proofreading a document for a colleague, a document that had the standard lorem ipsum placeholder text. When she began typing “l-o-r..e..” and saw “China” as the result, she knew something was strange.
“I saw words like Internet, China, government, police, and freedom and was curious as to how this was happening,” Kraeh3n said. “I immediately contacted Michael Shoukry and we began looking into it further.”
And so the duo started testing the limits of these two words using a mix of capitalization and repetition. Below is just one of many pages of screenshots taken from their results:
The researchers wondered: What was going on here? Has someone outside of Google figured out how to map certain words to different meanings in Google Translate? Was it a secret or covert communications channel? Perhaps a form of communication meant to bypass the censorship erected by the Chinese government with the Great Firewall of China? Or was this all just some coincidental glitch in the Matrix?
For his part, Shoukry checked in with contacts in the U.S. intelligence industry, quietly inquiring if divulging his findings might in any way jeopardize important secrets. Weeks went by and his sources heard no objection. One thing was for sure, the results were subtly changing from day to day, and it wasn’t clear how long these two common but obscure words would continue to produce the same results.
“While Google translate may be incorrect in the translations of these words, it’s puzzling why these words would be translated to things such as ‘China,’ ‘NATO,’ and ‘The Free Internet,'” Shoukry said. “Could this be a glitch? Is this intentional? Is this a way for people to communicate? What is it?”
When I met Shoukry at the Black Hat security convention in Las Vegas earlier this month, he’d already alerted Google to his findings. Clearly, it was time for some intense testing, and the clock was already ticking: I was convinced (and unfortunately, correct) that much of it would disappear at any moment. Continue reading