Hack 70. Work with RDF Facts

Make Firefox act like a small database server that is smart about RDF data.

The W3C's Resource Description Framework (RDF) standard is one of the more complex XML standards. RDF is an application of XML, but it makes little use of the XML tag hierarchy. Instead, it turns any specified tags into facts. Facts are stored in memory separate from the DOM tree of the RDF content. This hack shows how to use facts. Firefox has both scripting and direct XML support for facts.

What is Firefox RDF good for? It's a very general mechanism for managing any kind of data, whether that data fits in an XML hierarchy or not. The output of an SQL query is an obvious example.

6.14.1. Learn RDF

If you are new to RDF, be selective about the material you lean on. Much RDF literature is not suited for beginners. The book Rapid Application Development with Mozilla (Prentice Hall PTR) is one source of a digestible tutorial. An alternative approach is to learn a little Prolog. RDF and Prolog are both examples of predicate calculus (a field of mathematical logic). On top of these ideas, you also need a bit of experience scripting Mozilla XPCOM components.

Very briefly, RDF is a more general way of representing data than relational tables. RDF is concerned with making statements about things. Each statement consists of three pieces of information: the thing we're making the statement about, a property of the thing, and the value of that property. In RDF parlance, this is called a triple, and the thing, property, and value are called subject, predicate, and object, respectively. For example, Firefox Hacks (the thing) has an author (the property) of Nigel McFarlane (the value).
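To make the idea of a triple concrete, here is a minimal sketch in plain JavaScript. It is not a Firefox API; the statement() and describe() helpers and their field names are assumptions made up for illustration.

```javascript
// Hypothetical helper: one RDF-style statement as a plain object.
function statement(subject, predicate, object) {
  return { subject: subject, predicate: predicate, object: object };
}

// Hypothetical helper: read a statement back as an English sentence.
function describe(t) {
  return t.subject + " has a property " + t.predicate +
         " with value " + t.object;
}

// The example from the text: thing, property, value.
const fact = statement("Firefox Hacks", "author", "Nigel McFarlane");

console.log(describe(fact));
```

Every statement has the same three-part shape, whatever it describes, which is what lets generic code process any set of statements.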

To compare RDF with SQL, here's an example of two SQL tables that have one record each. They might look like this, with the schema information in the header row of each table and the data beneath it:

table employee:

empid name   jobcode
----- ------ -------
2     Fred   33

table department:

depid name   manager
----- ------ -------
3     sales  2

RDF requires that all information be stated as facts. Look at the standard (or chrome) .rdf examples for a first glimpse. Here's a shorthand (one of several) for the facts covering the above two tables:

<- employee, empid, 2 ->
<- 2, name, Fred ->
<- 2, jobcode, 33 ->

<- department, depid, 3 ->
<- 3, name, sales ->
<- 3, manager, 2 ->

That's six facts, with the first one reading "employee has a property empid with value 2." Each fact must have three parts (obscurely named subject, predicate, and object). That provides regularity and makes it possible for processing systems to act in a general way. A set of facts is lumped together into a fact store. No tables are required. Here's a sample RDF document for these facts:

<?xml version="1.0"?>
<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

  <Description about="urn:example:employee">
    <empid resource="urn:example:key:2"/>
  </Description>                              <!-- one fact -->

  <Description about="urn:example:key:2"
    name="Fred"
    jobcode="33"/>                            <!-- two facts -->

  <Description about="urn:example:department">
    <depid resource="urn:example:key:3"/>
  </Description>                              <!-- one fact -->

  <Description about="urn:example:key:3">
    <name>sales</name>
    <manager resource="urn:example:key:2"/>
  </Description>                              <!-- two facts -->

</RDF>
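The mapping from table rows to facts can also be sketched in a few lines of plain JavaScript. This is only an illustration, not part of Firefox; the rowToFacts() helper, its parameters, and the row objects are assumptions.

```javascript
// Hypothetical helper: turn one table row into three-part facts.
// tableName, keyColumn, and the row layout are illustrative assumptions.
function rowToFacts(tableName, keyColumn, row) {
  const key = row[keyColumn];
  // The first fact links the table to the row's key,
  // for example <- employee, empid, 2 ->
  const facts = [[tableName, keyColumn, key]];
  // Every other column becomes a property of that key.
  for (const column of Object.keys(row)) {
    if (column !== keyColumn) {
      facts.push([key, column, row[column]]);
    }
  }
  return facts;
}

const facts = rowToFacts("employee", "empid",
                         { empid: 2, name: "Fred", jobcode: 33 })
  .concat(rowToFacts("department", "depid",
                     { depid: 3, name: "sales", manager: 2 }));

// Print each fact in the <- subject, predicate, object -> shorthand.
for (const [s, p, o] of facts) {
  console.log("<- " + s + ", " + p + ", " + o + " ->");
}
```

Each row yields one fact that ties its key to the table, plus one fact per remaining column, so the two single-row tables produce six facts in all.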

You can see that the RDF content is hard to digest, which is why shorthand notations are popular. If you prefer procedures, read this next bit on scripting RDF first. If you're more visual and interactive, skip down to the bit about XUL templates. If you understand database transactions, try the example in [Hack #72], although that one is more advanced.

Use of RDF facts in Firefox is restricted to secure content and to remotely delivered XUL. There are plans to make some RDF available to ordinary web pages, but that functionality is not ready as of this writing.

6.14.2. Manipulate Content in Firefox's Head

To load an RDF file, you need a datasource that can turn XML into facts. Firefox provides a one-stop shop for datasources called the RDF Service. You can also make your own datasource. Here's a script that loads the contents of an RDF URL into a new datasource:

var klass = Components.classes["@mozilla.org/rdf/rdf-service;1"];
var rs = klass.getService(Components.interfaces.nsIRDFService);
var ds = rs.GetDataSourceBlocking("file:///tmp/test.rdf");

The URL specified is a Unix one. For Windows, try file:///C|/tmp/test.rdf. The GetDataSourceBlocking() method is preferable to GetDataSource() if you're just starting out. It avoids strange behavior caused by asynchronous loading.

After this code runs, the ds object holds the datasource. It should be retained as a variable as long as the set of RDF facts is needed. The datasource is invisibly full of RDF facts, and you can add, delete, modify, or query it as you see fit. In RDF land, asserting a fact pretty much means adding or inserting one. The methods that do so are defined in the file nsIRDFDataSource.idl. Read that file here:

http://lxr.mozilla.org/mozilla/source/rdf/base/idl/nsIRDFDataSource.idl

Here's some code that checks whether Fred is still employed and adds Joe if that's not the case. This kind of thing is similar to transaction processing in SQL:

var found      = false;
var ns         = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

// RDF nodes
var pred_name  = rs.GetResource(ns + "name");
var pred_empid = rs.GetResource(ns + "empid");
var emp        = rs.GetResource("urn:example:employee");

var who        = rs.GetLiteral("Fred");
var dep        = rs.GetLiteral("sales");

// query for Fred
var empid = ds.GetSource(pred_name, who, true);
if ( empid != null ) {
  found = ds.HasAssertion(emp, pred_empid, empid, true);
}

// insert Joe
if (!found)
{
  who   = rs.GetLiteral("Joe");
  empid = rs.GetResource("urn:example:key:" + 5);
  ds.Assert(emp,   pred_empid, empid, true);
  ds.Assert(empid, pred_name,  who, true);
}

In this code, the variables pred_name and pred_empid are roughly equivalent to object property names: "employee 2 has a property called name with a value of Fred." pred is short for predicate. For RDF reasons, predicates must be URLs. The URLs are never downloaded. Each piece of a fact must be an nsIRDFNode object or a subtype of that type, so we have to construct them laboriously. The logic is simple: test for the presence of two facts and insert two more if necessary.

The test part is done with the GetSource() and HasAssertion() methods. The latter method tests for the existence of one fact. The former tests for any fact that has its second and third parts as described, and returns the unknown first part. That's how we find out Fred's employee number. Always use true as the final argument; it's almost irrelevant, but it's required. Here are the two facts we test for:

<- 2, name, Fred ->
<- employee, empid, 2 ->

If either one is missing, then Fred's not there, so Joe is added using the Assert() method. You can add only fully described facts. Here are Joe's two facts:

<- employee, empid, 5 ->
<- 5, name, Joe ->
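The query-then-insert logic can be tried outside Firefox with a toy fact store written in plain JavaScript. The FactStore class and isEmployed() helper below are assumptions for illustration only; they imitate the behavior described for GetSource(), HasAssertion(), and Assert(), but they are not the Firefox datasource and use plain strings instead of nsIRDFNode objects.

```javascript
// Toy in-memory fact store; an illustrative model, not the Firefox datasource.
class FactStore {
  constructor() {
    this.facts = [];
  }

  // Test for the existence of one exact fact.
  hasAssertion(subject, predicate, object) {
    return this.facts.some(
      (f) => f.subject === subject && f.predicate === predicate && f.object === object
    );
  }

  // Given the second and third parts, return the first part of a match, or null.
  getSource(predicate, object) {
    const match = this.facts.find(
      (f) => f.predicate === predicate && f.object === object
    );
    return match ? match.subject : null;
  }

  // Add a fully described fact; this toy model ignores duplicates.
  assert(subject, predicate, object) {
    if (!this.hasAssertion(subject, predicate, object)) {
      this.facts.push({ subject, predicate, object });
    }
  }
}

// Hypothetical helper: true only if both facts about the person exist.
function isEmployed(store, name) {
  const id = store.getSource("name", name);
  return id !== null && store.hasAssertion("employee", "empid", id);
}

const store = new FactStore();
store.assert("2", "name", "Fred"); // name fact only; no employee fact for key 2

// Fred fails the two-fact test, so Joe's two facts are inserted.
if (!isEmployed(store, "Fred")) {
  store.assert("employee", "empid", "5");
  store.assert("5", "name", "Joe");
}

console.log(isEmployed(store, "Fred"), isEmployed(store, "Joe"));
```

In this run the store starts with only one of Fred's two facts, so the test fails and Joe's facts are added, mirroring the "if either one is missing" rule.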

Unlike SQL, there's no commit required, since the datasource is all in memory. The whole thing can be flushed back out to disk as RDF if required:

var rds = ds.QueryInterface(Components.interfaces.nsIRDFRemoteDataSource);
rds.Flush();

You can also receive notification if a datasource has changed [Hack #71], either because it was refreshed from its origin or because extra facts were inserted by scripts.

6.14.3. Display Facts with Templates

Firefox XUL templates are one of the hardest Mozilla technologies, but a lot can be done with them once the art is learned. When starting out, do not experiment with tree-based templates, which are tricky, but do regularly check all the XUL and RDF XML that you create with an XML validation tool [Hack #59].

Here's a simple XUL page that displays the dogs and cats in an RDF datasource using two templates. This example uses the simple template syntax (as opposed to examples of the extended template syntax [Hack #72]). Notice here (and everywhere) how the template syntax hooks into the normal XUL syntax:

<?xml version="1.0"?>
<?xml-stylesheet href="chrome://global/skin"?>
<window
 xmlns="http://www.mozilla.org/keymaster/gatekeeper/there.is.only.xul">
  <hbox>
    <vbox style="border:solid thin"
     datasources="pets.rdf" ref="urn:test:dogs">
      <template>
        <label uri="rdf:*"
         value="Dog: rdf:http://www.example.org/test#name"/>
      </template>
    </vbox>
    <vbox style="border:solid thin"
     datasources="pets.rdf" ref="urn:test:cats">
      <template>
        <label uri="rdf:*"
         value="Cat: rdf:http://www.example.org/test#name"/>
      </template>
    </vbox>
  </hbox>
</window>

A template always starts with the datasources attribute lodged with the parent tag of the <template> tag. Figure 6-14 shows one window that might result.

Figure 6-14. Simple XUL window showing two template queries

The contents of the window are driven by the datasource that is filled with facts from pets.rdf. Here's a sample file with three pets in it. The highlighted items are used by the XUL template code. Compare the two documents. This is standard RDF, but laid out in a way that templates can automatically understand:

<?xml version="1.0"?>
<RDF
  xmlns:Test="http://www.example.org/test#"
  xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

  <Description about="http://www.example.org">
    <Test:Container>
      <Seq about="urn:test:dogs">
        <li resource="urn:test:dog1"/>
        <li resource="urn:test:dog2"/>
      </Seq>
      <Seq about="urn:test:cats">
        <li resource="urn:test:cat1"/>
      </Seq>
    </Test:Container>
  </Description>
  <Description about="urn:test:dog1" Test:name="Fido"/>
  <Description about="urn:test:dog2" Test:name="Spot"/>
  <Description about="urn:test:cat1" Test:name="Puss"/>
</RDF>

Even though there's just one datasource and just one RDF file, the displayed content is revealed in two different ways. If the two templates were identical, the same content would be displayed in each half of the page. If the RDF file changes, the displayed XUL changes without any change to the XUL file. The user interface is thus data-driven.
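As a rough mental model of what the two templates do with pets.rdf, the plain JavaScript sketch below generates one label string per member of a container. It is a simplification under stated assumptions, not Mozilla's template builder: the containers and properties objects are hand-copied from the sample file, and expandTemplate() is a hypothetical helper.

```javascript
// Toy data mirroring pets.rdf: container members and per-resource properties.
const NAME = "http://www.example.org/test#name";
const containers = {
  "urn:test:dogs": ["urn:test:dog1", "urn:test:dog2"],
  "urn:test:cats": ["urn:test:cat1"]
};
const properties = {
  "urn:test:dog1": { [NAME]: "Fido" },
  "urn:test:dog2": { [NAME]: "Spot" },
  "urn:test:cat1": { [NAME]: "Puss" }
};

// Hypothetical stand-in for a template: one label value per container member.
function expandTemplate(ref, valuePattern) {
  const members = containers[ref] || [];
  return members.map(function (uri) {
    const props = properties[uri] || {};
    // Replace each rdf:<predicate> token with that member's property value.
    return valuePattern.replace(/rdf:(\S+)/g, function (whole, predicate) {
      return predicate in props ? props[predicate] : "";
    });
  });
}

const dogLabels = expandTemplate("urn:test:dogs", "Dog: rdf:" + NAME);
const catLabels = expandTemplate("urn:test:cats", "Cat: rdf:" + NAME);

console.log(dogLabels.join("\n"));
console.log(catLabels.join("\n"));
```

Adding another member to either list in the toy data changes the generated labels without touching expandTemplate(), which is the data-driven behavior in miniature.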