
Hack 14 Study Channel Statistics with pisg


Most IRC clients will give you the option of saving messages to a log file. Generate entertaining statistics from these log files.

pisg is the Perl IRC Statistics Generator. It's available from the web site http://pisg.sourceforge.net and is one of the most popular IRC statistics generators in use today. This hack will show you how to use it to create amusing statistics for your channels and display them to everybody on the Web.

3.4.1 Running pisg

The most important thing you need in order to run pisg is a log file. This log file should contain timestamps so pisg can tell when each message was sent. pisg supports several log file formats, including those used by mIRC, XChat, Eggdrop, irssi, infobot, and PircBot. You will also need Perl in order to run pisg.
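
To illustrate why the timestamps matter, here is a minimal Python sketch (not pisg's own code) that counts messages per hour. It assumes a hypothetical line shape of "[HH:MM] <nick> message"; real client formats differ, which is why pisg needs to be told the format. The names LINE, hourly_activity, and the sample lines are invented for this example.

```python
import re
from collections import Counter

# Assumed line shape: "[HH:MM] <nick> message". This is an illustration
# only; each IRC client writes its own format.
LINE = re.compile(r"^\[(\d{2}):(\d{2})\] <([^>]+)> (.*)$")

def hourly_activity(lines):
    """Count chat messages per hour of the day."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line)
        if match is None:
            continue  # joins, quits, and lines without a timestamp
        counts[int(match.group(1))] += 1
    return counts

# Hypothetical sample log lines.
sample = [
    "[08:15] <Bob> morning all",
    "[08:47] <Fennec> hi Bob",
    "[12:02] <Bob> lunch time",
    "* Bob has quit",
]
activity = hourly_activity(sample)
```

Without the leading timestamp, none of the sample lines could be assigned to an hour, and no activity chart could be drawn.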

3.4.1.1 Editing pisg.cfg

Editing pisg.cfg should be your first step. Set up a channel item containing the options you want for your channel. This lets you specify the name of the channel, the log file to read from, the format of the log file, the maintainer of the log file, and the name of the output file. For example:

<channel="#irchacks">
 Logfile = "#irchacks.freenode.log"
 Format = "mIRC"
 Maintainer = "Bob"
 OutputFile = "irchacks.html"
</channel>

Once everything is set up, run the pisg script:

% ./pisg

pisg will then tear away at your log files and churn out its statistics. In a matter of seconds to minutes (depending on your computer's speed and the size of the log), you will have a file called irchacks.html (or whatever else you called it) containing all of the statistics.

3.4.2 Publishing pisg Statistics

Copy the output HTML file to somewhere that can display web pages. Any old web server will do the job, as it is just a static HTML page with no server-side content.

If you run your own web host, you could set the OutputFile to be a full path in a directory where the document would be visible on the Web. On a Unix/Linux box, you could even set up a symlink to the file. Wherever you decide to place the HTML file, you must also ensure that the files from the gfx directory are in the same place. These are used to create the colored bar charts in the pisg output.
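
The copy step can be scripted. The following Python sketch is one possible approach, not part of pisg; the function name publish and the paths in the usage comment are hypothetical. It places the output page and the files from the gfx directory side by side in the web directory.

```python
import shutil
from pathlib import Path

def publish(output_html, gfx_dir, web_dir):
    """Copy the stats page and the gfx files into one web directory."""
    web_dir = Path(web_dir)
    web_dir.mkdir(parents=True, exist_ok=True)
    # The generated page itself.
    shutil.copy2(output_html, web_dir / Path(output_html).name)
    # The bar chart images must sit in the same place as the page.
    for image in Path(gfx_dir).iterdir():
        if image.is_file():
            shutil.copy2(image, web_dir / image.name)
    return web_dir

# Hypothetical usage; adjust the paths to your own setup:
# publish("irchacks.html", "gfx", "/var/www/stats")
```

Running this after each pisg run keeps the published page and its images in step.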

3.4.2.1 Setting up statistics options

pisg has more configuration options than you can shake a stick at. They are generally well documented. One common change is to enable ShowWords and SortByWords so that users are ranked by number of words rather than by number of lines, a measure that is easier for users to inflate when they try to pad their stats.
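
To see why ranking by words resists padding better than ranking by lines, consider this small Python sketch. It is an illustration only, not how pisg is implemented; the tally function and the sample messages are made up.

```python
from collections import Counter

def tally(messages):
    """messages is an iterable of (nick, text) pairs."""
    lines, words = Counter(), Counter()
    for nick, text in messages:
        lines[nick] += 1
        words[nick] += len(text.split())
    return lines, words

# Hypothetical channel traffic: Padder sends many one-word lines.
messages = [
    ("Padder", "lol"), ("Padder", "ok"), ("Padder", "hi"), ("Padder", "yes"),
    ("Bob", "I finally got the log format working with the new bot"),
    ("Bob", "the stats page should update every night from now on"),
]
lines, words = tally(messages)
by_lines = [nick for nick, _ in lines.most_common()]  # Padder first
by_words = [nick for nick, _ in words.most_common()]  # Bob first
```

Padder wins on lines (4 to 2) but loses on words (4 to 21), so splitting messages into many short lines no longer helps.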

3.4.2.2 Nickname tracking

pisg has automatic nickname tracking. When it is enabled, this feature watches for people who change their IRC nickname and will merge the statistics for two nicknames if it thinks it is appropriate. Unfortunately, many channels have periods of silliness in which people may temporarily play a game of "musical nicks," or various people may switch to the same nick temporarily. This can seriously mess up the statistics. If this is an issue, you can use user lines instead.

3.4.2.3 User lines

User lines are short entries in the configuration file that describe a user. They support several options:

 <user nick="Fennec" alias="Fennec* Foo* Jacob* Jake|PDA" sex="m"
        link="http://fennec.homedns.org">

The user's nick is the name of the user as it will appear on the stats page. The aliases are all other nicknames that should be treated as the same user. Wildcards are allowed with the * character, but they tend to slow down statistics generation. The sex can be set to m or f; this displays the name in blue or pink and selects the matching pronouns in the output, for example, "he" or "she" instead of "he/she."
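
The way aliases map several nicknames onto one user can be sketched in Python as follows. This is an assumption-laden illustration, not pisg's actual matching code: the function names are hypothetical, only the * wildcard is supported, and case-insensitive matching is an assumption made here.

```python
import re

def alias_to_regex(pattern):
    """Turn an alias such as "Fennec*" into a compiled regex."""
    # Escape everything, then turn the escaped "*" back into "match anything".
    body = re.escape(pattern).replace(r"\*", ".*")
    return re.compile(body, re.IGNORECASE)  # case-insensitivity is assumed

def build_alias_table(users):
    """users maps a canonical nick to its space-separated alias string."""
    table = []
    for nick, aliases in users.items():
        for pattern in [nick] + aliases.split():
            table.append((alias_to_regex(pattern), nick))
    return table

def canonical_nick(seen, table):
    """Return the user a nickname belongs to, or the nickname itself."""
    for regex, nick in table:
        if regex.fullmatch(seen):
            return nick
    return seen

table = build_alias_table({"Fennec": "Fennec* Foo* Jacob* Jake|PDA"})
canonical_nick("Fennec_away", table)  # -> "Fennec"
canonical_nick("Jake|PDA", table)     # -> "Fennec"
canonical_nick("Jake", table)         # -> "Jake" (no alias matches)
```

In this sketch every nickname seen in the log is tested against every pattern in turn, which hints at why a long list of wildcard aliases can slow things down.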

Either nickname tracking or user lines are needed for a meaningful Users With Most Nicknames section.

Other useful options available for user lines include the ignore="y" option, which can be added to ignore a user. This is often applied to bots; however, some channels also include their bots in statistics, and it can be particularly amusing if the bots talk as much as some regular users.

3.4.2.4 Photos and photo galleries

If you can cajole a channel's user base into sending pictures of themselves (or if you manage to track them down, stalk them, and take pictures yourself), you can use pisg as a sort of impromptu photo gallery. First, set your ImagePath to where the images will be accessible. Then you can add pic="nickname.png" to each user line.

With PicHeight and PicWidth, you can set a default picture height and width for your page. Dimensions of approximately 66 x 48 pixels allow for a compact but effective gallery.

Setting a user's bigPic option will cause the user's picture to link to the specified file. Including a wildcard as a user's picture will cause one of the pictures that match the wildcard for that user to be randomly selected. Setting the UserPics option will allow more than one picture per row. The DefaultPic option will allow you to set up a default user picture.

3.4.2.5 Headers and footers

A custom header (or footer) with some spiffy and topical images or a quote is a nice way of adding a personalized touch to your statistics. This should be in HTML (ideally, XHTML). For example, here are the contents of a generalized header file:

<table border="1"><tr><td>
 <table border="0">
  <tr><td><img src="image.png" alt="caption" /></td>
      <td align="center"><div align="center">
        Spiffy amusing headline here!
         <hr />
        <font size="-4"><span style="color: #AAAAAA; font-size: 9px;">
          Informational byline here.
        </span></font>
       </div></td>
      <td><img src="picture.png" alt="Caption" /></td>
  </tr>
 </table>
</td></tr></table>

Change image.png, picture.png, the captions, and the headline/byline as you see fit. This header works well with images approximately 48 pixels high.

3.4.3 The Results

If you've set everything up correctly, you'll end up with something like Figure 3-5, with a colorful bar chart showing which times of the day are most active, along with pictures of each user. This bar chart is interesting in that it shows activity starting at 8 a.m. and steadily growing before falling back down at lunchtime. Even IRC users have to stop for lunch.

Figure 3-5. Output from pisg, showing activity periods and user info


pisg also generates several other pieces of information that are not readily obvious, such as the Big Numbers section, shown in Figure 3-6. This shows who asked the most questions, who shouted the most, who was most aggressive, who was most disliked, and who was the happiest.

Figure 3-6. Some of the other statistics obtained from pisg


The pisg web site (http://pisg.sourceforge.net) contains links to hundreds of real examples of pisg in action.

Thomas Whaples
