Hack 71. Work with RSS Feeds

Suck syndicated news and blog updates into Firefox using scripts.

Really Simple Syndication (RSS) is a collection of XML-based file formats. RSS support in Firefox is in flux at the time of writing, and that heavily affects the ease with which you can get anything done. This hack provides some jumping-off points for integrating this technology with Firefox.

RSS support in Firefox and Thunderbird is still evolving, so the remarks made here should be reviewed against any version later than 1.0. That said, most of the comments here are general and should be accurate for some time.


6.15.1. Understand the RSS Mess

Here is a list of reality checks for RSS:

  • There is no feeding of any kind. RSS does not use an event or message-forwarding architecture, unless an RSS client and RSS server pair are particularly sophisticated. In order to receive RSS information, the client must poll the server at regular intervals. Mozilla's RSS support works this way.

  • RSS describes documents. Feed content is in the form of XML documents, usually delivered over HTTP, and has no special status beyond that.

  • RSS is not based on RDF. Early versions were a mixture of alleged RDF and non-RDF XML. Modern versions have nothing to do with RDF. No version can be considered RDF.

  • There are many versions of RSS. According to this URL, the number of subtle variants is quite high, although some are no longer popular:

http://diveintomark.org/projects/feed_parser/

  • Atom and CDF are not specifically RSS, unless you use RSS as an umbrella term for XML-based feed technology. Even that, however, is not an accurate use of RSS, because asynchronously delivered SOAP messages [Hack #65] can be used to implement a true feed system (which is not trivial to do in Firefox). Such SOAP messages have nothing to do with RSS.
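
Because a client learns of new items only by polling, each poll has to compare the freshly downloaded item list with what was seen before. The following is a minimal sketch of that bookkeeping, not Mozilla's own implementation. The function name newItems, the seen map, and the use of each item's link as its identity are all assumptions made for illustration.

```javascript
// Hypothetical helper: return the items that no earlier poll has reported.
// 'seen' is a plain object used as a set, keyed on item.link. Using the link
// as the item's identity is an assumption; some feeds offer a better key.
function newItems(seen, items) {
  var fresh = [];
  for (var i = 0; i < items.length; i++) {
    var key = items[i].link;
    if (!Object.prototype.hasOwnProperty.call(seen, key)) {
      seen[key] = true;       // remember it for the next poll
      fresh.push(items[i]);
    }
  }
  return fresh;
}
```

A timer such as window.setInterval() could download the feed at each tick, pass the parsed items to newItems(), and act only on what comes back.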

6.15.2. Exploit Firefox Support for RSS

RSS documents are XML, so they can be loaded into Firefox as XML using standard web techniques. The XMLHttpRequest object can be used to download such documents from the server of origin, and from elsewhere if the right security is in place.
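
As a rough illustration of that download step, here is a sketch that fetches a feed and hands the parsed XML document to a callback. The function name loadFeed, its callbacks, and the optional XHR argument (a constructor to use in place of the global XMLHttpRequest, handy for testing) are assumptions, not part of any Firefox API.

```javascript
// Hypothetical helper: fetch 'url' and pass the resulting XML document to
// onDocument, or the HTTP status to onError.
function loadFeed(url, onDocument, onError, XHR) {
  var Ctor = XHR || XMLHttpRequest;
  var req = new Ctor();
  req.open('GET', url, true);               // true = asynchronous
  // Some feeds arrive with a non-XML content type; where the Mozilla-specific
  // overrideMimeType() exists, ask for XML parsing anyway.
  if (req.overrideMimeType) {
    req.overrideMimeType('text/xml');
  }
  req.onreadystatechange = function () {
    if (req.readyState != 4) return;        // 4 = request complete
    if (req.status == 200 && req.responseXML) {
      onDocument(req.responseXML);
    } else {
      onError(req.status);
    }
  };
  req.send(null);
}
```

From an ordinary web page, this only works for feeds on the page's own server, as noted above.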

Loading RSS documents as plain XML is not much of a start, though, because the RSS-specific tags receive no special treatment. That raw XML must be reduced to more useful information. There are several alternative strategies for turning RSS into RDF-like content, which is a more useful format for processing purposes:

  • Use W3C DOM 2 Core operations

  • Use code borrowed from Thunderbird

  • Deliver the content to Firefox's bookmark datasource as a set of Live Bookmarks

In the first two cases, the syndicated information ends up in JavaScript as a DOM 2 Document Fragment or Document. It can be further processed into RDF facts for use in a datasource. If that is done, XUL templates and other Mozilla RDF technology can be applied to the feed data in a simple and flexible way. In the third case, the data is automatically inserted into the bookmarks datasource, where it is automatically visible to the user.
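
To make the first strategy concrete, here is a minimal sketch that uses only DOM 2 Core calls to reduce a feed document to an array of plain JavaScript objects. It assumes an RSS 2.0-style layout with un-prefixed item, title, link, and description elements; other RSS variants need different tag names. The function names childText and extractItems are made up for this example.

```javascript
// Hypothetical helper: text of the first descendant of 'parent' named 'name',
// or '' if that element is missing or empty.
function childText(parent, name) {
  var nodes = parent.getElementsByTagName(name);
  if (nodes.length == 0 || !nodes[0].firstChild) {
    return '';
  }
  return nodes[0].firstChild.nodeValue;
}

// Hypothetical helper: turn every <item> in the document into an object.
function extractItems(doc) {
  var result = [];
  var items = doc.getElementsByTagName('item');
  for (var i = 0; i < items.length; i++) {
    result.push({
      title: childText(items[i], 'title'),
      link: childText(items[i], 'link'),
      description: childText(items[i], 'description')
    });
  }
  return result;
}
```

The resulting objects are easy to inspect in script, and each one maps naturally onto a handful of RDF facts about a single subject.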

Datasources can be used only in a secure environment, not in ordinary web pages. Firefox does not support Internet Explorer's addBookmark() feature for security reasons.

To see how to parse the XML content of an RSS feed, study or steal Thunderbird source code last seen at the following URL:

http://lxr.mozilla.org/aviarybranch/source/mail/extensions/newsblog/content/Feed.js

To put that content in a datasource, just use script operations to assert the required facts [Hack #70].
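
A sketch of that assertion step follows. It assumes the feed has already been reduced to an array of objects with link and title properties, and it uses a made-up vocabulary under http://example.org/feed-rdf# for the predicates; both are illustrative assumptions. The calls GetResource(), GetLiteral(), and Assert() are those of nsIRDFService and nsIRDFDataSource, and the function name assertItems is hypothetical.

```javascript
// In privileged (chrome) code, 'rdf' and 'ds' might be obtained like this:
//   var rdf = Components.classes['@mozilla.org/rdf/rdf-service;1']
//               .getService(Components.interfaces.nsIRDFService);
//   var ds = Components.classes['@mozilla.org/rdf/datasource;1?name=in-memory-datasource']
//               .createInstance(Components.interfaces.nsIRDFDataSource);

// Hypothetical helper: assert two facts per item and return the fact count.
function assertItems(rdf, ds, feedUrl, items) {
  var ns = 'http://example.org/feed-rdf#';      // made-up vocabulary
  var feed = rdf.GetResource(feedUrl);
  var hasItem = rdf.GetResource(ns + 'item');
  var hasTitle = rdf.GetResource(ns + 'title');
  var count = 0;
  for (var i = 0; i < items.length; i++) {
    var subject = rdf.GetResource(items[i].link);
    // fact 1: the feed contains this item
    ds.Assert(feed, hasItem, subject, true);
    // fact 2: the item has this title
    ds.Assert(subject, hasTitle, rdf.GetLiteral(items[i].title), true);
    count += 2;
  }
  return count;
}
```

Passing the service and the datasource in as arguments keeps the function independent of where those objects come from.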

Firefox does not offer any XPCOM components that directly support RSS feeds. The parsing of RSS data is buried deep in the platform and can't be accessed by JavaScript. The bookmarks datasource provides indirect access via Live Bookmarks. Here are the component details:

@mozilla.org/browser/bookmarks-service;1 nsIBookmarksService

Treating RSS data as a Live Bookmark just to get automated RSS support is a major hack. Care must be taken to bury the Live Bookmark deep in the bookmarks folder hierarchy so that the user doesn't see it on the Bookmarks Toolbar, unless that is a desirable effect.

6.15.3. Receive Notification of New Items

If you turn RSS content into RDF facts and use a datasource, then all the facilities of datasources are laid open to scripting. This provides a flexible environment for complex handling of RSS information.

If RDF facts in a datasource change, a script can learn of the changes by installing an observer, rather than by constantly probing the datasource. This tactic is similar to using the DOM 2 Events addEventListener() method for events like click. Here's part of an RDF datasource observer:

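// object implementing nsIRDFObserver
var obs = {
  // called each time a fact is added to the datasource
  onAssert : function (ds, sub, pred, obj) {
    window.alert('New fact added to datasource ' + ds.URI +
          ' with fact subject of ' + sub.Value);
  },
  // called each time a fact is removed from the datasource
  onUnassert : function (ds, sub, pred, obj) { /* as for onAssert() */ },
  // called when a fact's object changes from oldObj to newObj
  onChange : function (ds, sub, pred, oldObj, newObj) { /* as for onAssert() */ }
  // ... the remaining nsIRDFObserver methods go here
};

// lodge the observer; 'ds' is a datasource object
ds.AddObserver(obs);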

// finished initialization

The incomplete part of the definition is much the same as the sample implementation of the onAssert() method. Search Mozilla Cross-Reference (http://lxr.mozilla.org) for the nsIRDFObserver.idl file for the interface details.

Once the observer has been added to the datasource, it will be notified of all datasource changes. In this particular observer, an alert will result each time a fact is added. Any processing at all can be done where the alert occurs.
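
As one example of such processing, the sketch below builds an observer that ignores every fact except those using a particular predicate, and passes the subject of each match to a callback. The factory name makeItemObserver and the predicate URI in the usage comment are hypothetical. The method names are given as I understand nsIRDFObserver.idl and should be checked against that file.

```javascript
// Hypothetical factory: build an observer that reports only newly asserted
// facts whose predicate URI equals 'predicateUri'.
function makeItemObserver(predicateUri, onNewItem) {
  return {
    onAssert : function (ds, sub, pred, obj) {
      if (pred.Value == predicateUri) {
        onNewItem(sub.Value, obj);
      }
    },
    // The remaining notifications are deliberately ignored here. Their names
    // and arguments are assumed from nsIRDFObserver.idl; verify before use.
    onUnassert : function (ds, sub, pred, obj) {},
    onChange : function (ds, sub, pred, oldObj, newObj) {},
    onMove : function (ds, oldSub, newSub, pred, obj) {},
    onBeginUpdateBatch : function (ds) {},
    onEndUpdateBatch : function (ds) {}
  };
}

// Possible use in chrome code, with a made-up predicate URI:
//   ds.AddObserver(makeItemObserver('http://example.org/feed-rdf#title',
//                                   function (uri) { dump(uri + '\n'); }));
```

Filtering on the predicate keeps the callback from firing several times for one feed item when that item is described by more than one fact.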
