Sunday, February 28, 2010

Of Namesakes and Mandrakes: T9 Dictionary Collisions

I've always been fascinated and bemused by the mixups that happen when you use T9 predictive typing on your cell phone. While it is quite good overall, it can lull you into a false sense of security: if you don't actually read every word, you might text a friend asking if they'd like to go watch 'some mother' instead of 'some movies.' That's because 668437 is the T9 code for both 'mother' and 'movies.' Bus ride can become cup ride, Jordan can become Korean, and (fittingly) kiss can become lips.
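For the curious, here is a minimal sketch of how a word turns into its T9 code, assuming the standard phone keypad layout (abc=2, def=3, ghi=4, jkl=5, mno=6, pqrs=7, tuv=8, wxyz=9). The class name `T9Code` and the method `encode` are made up for illustration; this is not the code a phone actually runs.

```java
import java.util.Locale;

class T9Code {
    // Keypad digit for each letter: index 0 is 'a', index 25 is 'z'.
    private static final String DIGITS = "22233344455566677778889999";

    // Translate a word into the string of keys you would press for it.
    static String encode(String word) {
        StringBuilder code = new StringBuilder();
        for (char c : word.toLowerCase(Locale.ROOT).toCharArray()) {
            if (c < 'a' || c > 'z') {
                throw new IllegalArgumentException("Not a plain letter: " + c);
            }
            code.append(DIGITS.charAt(c - 'a'));
        }
        return code.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("mother")); // 668437
        System.out.println(encode("movies")); // 668437
    }
}
```

Two words collide whenever `encode` returns the same string for both of them.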

I decided to put my Computer Science degree to use and look for other fun "collisions"--cases where the same T9 code produces multiple words, preferably with humorous results. While I couldn't code in the humor criterion, I did create a simple Java program that prints out all the T9 collisions in a dictionary text file*.
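I haven't posted the original program, but a rough sketch of the idea looks something like this: group every word by its T9 code and keep only the codes that end up with more than one word. Everything named here (`T9Collisions`, `findCollisions`, the default path `/usr/share/dict/words`) is an assumption for illustration, and so is the choice to lowercase words before comparing them. The filtering follows the footnote below: one-letter words and words with anything other than A-Z are skipped.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

class T9Collisions {
    // Keypad digit for each letter: index 0 is 'a', index 25 is 'z'.
    private static final String DIGITS = "22233344455566677778889999";

    // Assumes the word is already lowercase a-z only.
    static String encode(String word) {
        StringBuilder code = new StringBuilder();
        for (char c : word.toCharArray()) {
            code.append(DIGITS.charAt(c - 'a'));
        }
        return code.toString();
    }

    // Map each T9 code to the words sharing it, dropping codes with only one word.
    static Map<String, SortedSet<String>> findCollisions(Iterable<String> words) {
        Map<String, SortedSet<String>> byCode = new TreeMap<>();
        for (String raw : words) {
            String word = raw.trim().toLowerCase(Locale.ROOT);
            // Skip one-letter words and anything containing non-letters.
            if (word.length() < 2 || !word.matches("[a-z]+")) {
                continue;
            }
            byCode.computeIfAbsent(encode(word), k -> new TreeSet<>()).add(word);
        }
        byCode.values().removeIf(group -> group.size() < 2);
        return byCode;
    }

    public static void main(String[] args) throws IOException {
        // Assumed location of the system word list; pass another path as the first argument.
        String path = args.length > 0 ? args[0] : "/usr/share/dict/words";
        List<String> words = Files.readAllLines(Paths.get(path));
        findCollisions(words).forEach((code, group) ->
                System.out.println(code + ": " + String.join(" & ", group)));
    }
}
```

Using a sorted set per code means capitalized and lowercase duplicates in the word list collapse into one entry, so the count you get this way may not match mine exactly.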

Perusing this file of 4717 collisions, I found a few I probably never would have stumbled across on my own. Namesake & mandrake are equivalent according to your phone, as are enemy & endow and imam & hobo (please don't hate me, Muslim fundamentalists--I love Islam and am only reporting what I found. Wage jihad against Tegic Communications, the heathen company that developed this godless technology).

Of course, there are a lot of boring collisions, like producer & produces, but there are some fun ones to find as well. I uploaded the output of the program to a public Google doc--check it out and see if you notice any particularly funny or unusual ones (I have only glanced over random sections of it).

After doing all this, I found this page that analyzes T9 collisions by looking for the numeric codes with the most words associated with them. That's also interesting, but its author never would have found out that pennant & remnant go together, so I like my approach too.

So have you ever had any good mixups using T9 predictive typing? Did you find any gems from my list that I missed?

* I wish I had access to the actual T9 dictionary they use in phones, but I couldn't find it online, so I just used the default Linux (American English) dictionary. I also didn't count one-letter words, because they are stupid, and I ignored any words that include anything other than the letters A-Z.


  1. spence and i were just talking about this today! my friend brooke is arnold and it always makes me laugh.

  2. Aly, I love it!

    A few others that people shared on my buzz but didn't bother to click over to comment on my blog (slackers!) were:

    "Stoke my boat" for "stole my coat"
    "I accidentally told a girl we could still be friends without 'eating each other', instead of 'dating each other'. She replied with, 'I'm not a cannibal.'"
    "Turkmenistan" for "thermodynamics"

  3. A guy in my high school said that when he tried to say "church" he always ended up saying "b*tch," but I'm pretty sure that was a result of him missing a keystroke since there's one less letter...

  4. This is fun. And oh so nerdy. I love it!

  5. And then just yesterday I texted someone about 'the cool of mormon.' That would probably be the most awesome book of scripture yet.

  6. Can't remember any examples at the moment, but excellent topic.

    How easy would it be to generate a list of words that were within one letter difference of a given word? That seems to happen a lot, too, as noted above, and might be an interesting ultra-nerdy programming project.

  7. James: not too hard, I'll have to get on that ASAP.

    And Twinky, yeah, but still pretty funny. I wonder if it was at all on purpose, or perhaps a Freudian slip?
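On James's question about words within one letter of a given word, here is one possible sketch. It reads "one letter difference" as a single changed, added, or dropped letter, which is only one interpretation of the question. The names `OneLetterOff`, `withinOneEdit`, and `neighbors` are invented for illustration, and the word list in `main` is just a toy sample.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class OneLetterOff {
    // True if a and b are equal or differ by one substituted, inserted, or deleted letter.
    static boolean withinOneEdit(String a, String b) {
        if (Math.abs(a.length() - b.length()) > 1) {
            return false;
        }
        // Make sure a is the shorter (or equal-length) word.
        if (a.length() > b.length()) {
            String tmp = a;
            a = b;
            b = tmp;
        }
        int i = 0;
        int j = 0;
        boolean edited = false;
        while (i < a.length() && j < b.length()) {
            if (a.charAt(i) == b.charAt(j)) {
                i++;
                j++;
                continue;
            }
            if (edited) {
                return false; // second mismatch: more than one letter apart
            }
            edited = true;
            if (a.length() == b.length()) {
                i++; // substitution: move past the letter in both words
            }
            j++; // otherwise skip the extra letter in the longer word
        }
        return true;
    }

    // All dictionary words one letter away from the given word, excluding the word itself.
    static List<String> neighbors(String word, Collection<String> dictionary) {
        List<String> result = new ArrayList<>();
        for (String candidate : dictionary) {
            if (!candidate.equals(word) && withinOneEdit(word, candidate)) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("boat", "coat", "cot", "coats", "moth");
        System.out.println(neighbors("coat", sample));
    }
}
```

This compares the given word against every entry in the list, which is slow but plenty fast for a single lookup in a typical word list.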

