March 25, 2005

Statistically Improbable Phrases

I was looking at the Amazon listing for the book Debugging by Thinking, and the listing shows the book's "Statistically Improbable Phrases" (a self-referential term if I ever saw one). One example is "debugging tactics". If you click on it, you get other books that contain the same phrase, and further clicks show you where it appears in each book.

This is from an explanatory page on Amazon: "Our computers scan the text of all books in the Search Inside program. If they find a phrase that occurs a large number of times in a particular book relative to how many times it occurs across all Search Inside books, that phrase is a SIP in that book."
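
To make that description concrete, here is a toy sketch in Python. It is only my guess at the general idea, not Amazon's actual method, which the explanatory page does not spell out. The scoring formula, the function names, and the three tiny "books" are all made up for illustration: a phrase scores high if it shows up often in one book and most of its occurrences anywhere are in that book.

```python
from collections import Counter


def phrases(text, n=2):
    """Yield every n-word phrase (lowercased) in the text."""
    words = text.lower().split()
    for i in range(len(words) - n + 1):
        yield " ".join(words[i:i + n])


def improbable_phrases(books, title, n=2, top=3):
    """Rank the phrases of one book by how concentrated they are in it."""
    in_book = Counter(phrases(books[title], n))
    everywhere = Counter()
    for text in books.values():
        everywhere.update(phrases(text, n))
    # Assumed score: count in this book times the share of all
    # occurrences (across every book) that fall in this book.
    scored = {p: c * c / everywhere[p] for p, c in in_book.items()}
    # Highest score first; ties broken alphabetically.
    return sorted(scored, key=lambda p: (-scored[p], p))[:top]


# Hypothetical miniature corpus standing in for Search Inside.
books = {
    "debugging": "debugging tactics help and debugging tactics work when the bug hides",
    "cooking": "the bug hides in the flour when the oven is hot",
    "gardening": "the oven is hot and the bug hides in the soil",
}

print(improbable_phrases(books, "debugging", top=1))
```

With this corpus, "debugging tactics" comes out on top for the debugging book because it appears twice there and nowhere else, while "the bug" ranks low because every book has it.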

You can actually enter any phrase into it with some trivial MBA-applicant-style URL tweaking. For example, if I search on my name, it comes up with a bunch of references. Unfortunately, they are all bogus (or refer to someone else). My name somehow wound up on a list of sample names that Microsoft Press uses for books. I need to nag Addison-Wesley to get Find the Bug into Search Inside, so it will show up here.

You can have fun with this. I can track down the precise reference for the phrase "what cavemen used to debug fire" that I used a couple of days ago. Or find all books that refer to "Robert Scoble". Or "squeamish ossifrage" or "throbbing genitalia" or "naked lacrosse". Let's say you decide that any book with the phrase "dripping honey pot" in it belongs on your night table. Now you can find them. Or rather, find it, since there is only one right now. That makes it a "sipwhack".

I feel like a little kid again, typing "eat boogers" into a Unix shell so that I can laugh at the response "I don't know how to eat boogers".

Posted by AdamBa at March 25, 2005 12:05 PM
