
Hack 13. Seek Out Weblog Commentary

Build queries to search only recent commentary appearing in weblogs.

There was a time when, if you needed to find current commentary, you didn't turn to a full-text search engine like Google. You searched Usenet, combed mailing lists, or searched through current news sites like CNN.com and hoped for the best.

But as search engines have evolved, they've been able to index pages more quickly than once every few weeks. In fact, Google tunes its engine to more readily index sites with a high information churn rate. At the same time, a phenomenon called the weblog (http://www.oreilly.com/catalog/essblogging/) has arisen: an online site that keeps a running commentary and associated links, updated daily, and in many cases even more often. Google indexes many of these sites on an accelerated schedule. If you know how to find them, you can build a query that searches just these sites for recent commentary.

1.25.1. Finding Weblogs

When weblogs first appeared on the Internet, they were generally updated manually or by using homemade programs. Thus, there were no standard words you could add to a search engine query to find them. Now, however, many weblogs are created using either specialized software packages (such as Movable Type, http://www.movabletype.org, or Radio Userland, http://radio.userland.com) or web services (such as Blogger, http://www.blogger.com/). Weblogs built with these programs and services are more easily found online with some clever use of special syntaxes or magic words.

For hosted weblogs, the site: syntax makes things easy. Blogger weblogs hosted at blog*spot (http://www.blogspot.com) can be found using site:blogspot.com. Even though Radio Userland is a software program able to post its weblogs to any web server, you can find the majority of Radio Userland weblogs at the Radio Userland community server (http://radio.weblogs.com) using site:radio.weblogs.com.

Finding weblogs powered by weblog software and hosted elsewhere is more problematic; Movable Type weblogs, for example, can be found all over the Internet. However, most of them sport a "powered by movable type" link of some sort; searching for the phrase "powered by movable type" will, therefore, find many of them.

It comes down to magic words typically found on weblog pages: shout-outs, if you will, to the software or hosting sites. The following is a list of some of these packages and services and the magic words used to find them in Google:


Blogger

"powered by blogger" or site:blogspot.com


Blosxom

"powered by blosxom"


Greymatter

"powered by greymatter"


Geeklog

"powered by geeklog"


Manila

"a manila site" or site:editthispage.com


Pitas (a service)

site:pitas.com


pMachine

"powered by pmachine"


uJournal (a service)

site:ujournal.org


LiveJournal (a service)

site:livejournal.com


Radio Userland

intitle:"radio weblog" or site:radio.weblogs.com


WordPress

"powered by wordpress"

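The list above can be kept as a small lookup table and combined with a topic to produce ready-made queries. The following is a minimal sketch, not an official tool: the names MAGIC_WORDS, weblog_queries, and search_url are made up for illustration, and the search_url helper assumes that Google's public results page accepts the query in a q parameter.

```python
from urllib.parse import urlencode

# Magic words copied from the list above. The dictionary itself is an
# illustrative structure, not part of any Google API.
MAGIC_WORDS = {
    "Blogger": ['"powered by blogger"', "site:blogspot.com"],
    "Blosxom": ['"powered by blosxom"'],
    "Greymatter": ['"powered by greymatter"'],
    "Geeklog": ['"powered by geeklog"'],
    "Manila": ['"a manila site"', "site:editthispage.com"],
    "Pitas": ["site:pitas.com"],
    "pMachine": ['"powered by pmachine"'],
    "uJournal": ["site:ujournal.org"],
    "LiveJournal": ["site:livejournal.com"],
    "Radio Userland": ['intitle:"radio weblog"', "site:radio.weblogs.com"],
    "WordPress": ['"powered by wordpress"'],
}


def weblog_queries(topic, package):
    """Return one query per magic word known for the given package."""
    return [f"{topic} {magic}" for magic in MAGIC_WORDS[package]]


def search_url(query):
    """Build a results-page URL (assumption: the query goes in 'q')."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    for query in weblog_queries('"baseball strike"', "Blogger"):
        print(query)
        print(search_url(query))
```

Each magic word gets its own query rather than being joined with the others, which keeps every query short and makes it easy to see which kind of weblog produced a given result.
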
1.25.2. Using These "Magic Words"

Because you can't have more than 10 words in a Google query, there's no way to build a query that includes every conceivable weblog's magic words. It's best to experiment with the various words, and see which weblogs have the materials you're interested in.
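
One way to stay under the limit is to pair the topic with one set of magic words at a time and discard any combination that runs too long. The sketch below assumes that every whitespace-separated token counts as one word, including words inside quoted phrases and special syntaxes such as site:. Google's actual counting rules may differ, and the function names are hypothetical.

```python
MAX_WORDS = 10  # the query limit stated in the text


def count_words(query):
    """Count whitespace-separated tokens (an assumed counting rule)."""
    return len(query.split())


def queries_within_limit(topic, magic_words, max_words=MAX_WORDS):
    """Pair the topic with each magic word; keep queries that fit."""
    candidates = [f"{topic} {magic}" for magic in magic_words]
    return [q for q in candidates if count_words(q) <= max_words]


if __name__ == "__main__":
    magic = ['"powered by movable type"', "site:blogspot.com"]
    for query in queries_within_limit('"baseball strike" "red sox"', magic):
        print(count_words(query), query)
```
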

First of all, realize that weblogs are usually informal commentary and you'll have to keep an eye out for misspelled words, names, etc. Generally, it's better to search by event than by name, if possible. For example, if you were looking for commentary on a potential strike, the phrase "baseball strike" would be a better search, initially, than a search for the name of the Commissioner of Major League Baseball, Bud Selig.

You can also try to search for a word or phrase relevant to the event. For example, for a baseball strike you could try searching for "baseball strike" "red sox" (or "baseball strike" bosox). If you're searching for information on a wildfire and wondering if anyone has been arrested for arson, try wildfire arrested and, if that doesn't work, wildfire arrested arson.

Why not search for arson to begin with? Because it's not certain that a weblog commentator would use the word "arson." Instead, he might just refer to someone being arrested for setting the fire. "Arrested" in this case is a more certain word than "arson."
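
The habit of starting with the more certain words and adding the less certain ones only when needed can be written down as an ordered list of queries to try. This is a rough sketch: refinement_ladder and first_useful are hypothetical helpers, and has_results stands in for whatever check you apply to a query, such as reading the results yourself.

```python
def refinement_ladder(base_terms, extra_terms):
    """Return queries in the order to try them.

    The first query uses only the base (more certain) terms; each later
    query adds one more of the extra (less certain) terms.
    """
    ladder = [" ".join(base_terms)]
    for i in range(1, len(extra_terms) + 1):
        ladder.append(" ".join(base_terms + extra_terms[:i]))
    return ladder


def first_useful(queries, has_results):
    """Return the first query that has_results accepts, or None."""
    for query in queries:
        if has_results(query):
            return query
    return None


if __name__ == "__main__":
    for query in refinement_ladder(["wildfire", "arrested"], ["arson"]):
        print(query)
```
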

