
Using XSLT to Describe HTML Output Using XML Input

XSLT (Extensible Stylesheet Language Transformations) is an XML-based language, published as a recommendation by the World Wide Web Consortium (W3C), that describes how elements in an XML document should be mapped to other elements, most commonly HTML or XHTML elements suitable for display in a common Web browser.

Several modules in PHP provide the tools needed to help you to transform XML documents into their HTML or XHTML counterparts for display purposes, according to the XSLT templates that you provide.

XSL Stylesheets

XSL stylesheets are essentially XML-formatted lists of templates designed to match elements in an XML file whose document type is known in advance. Each time an element listed in the XSL stylesheet is found in the XML input document, the element and its associated data in the XML input document are replaced or altered according to the instructions given in the matching XSL stylesheet template.

For PHP users, the primary use of XSL stylesheets is to transform XML files into HTML- or XHTML-formatted files. In short, XML elements in a document containing data that must be presented on the World Wide Web are transformed into HTML or XHTML tags according to the instructions given in the XSLT stylesheet file. Thus, the logical process of parsing an XML document into HTML or XHTML using XSLT goes something like this:

1. Parse an element in the XML-formatted input document.

2. Search for a template in the XSLT stylesheet that matches the newly parsed element.

3. If a template is found that matches the element in question, replace the original element and data with new elements and (optionally) altered or transformed data, and process any additional XSLT instructions that occur in the matched template.

4. If no template is found to match the element in question, fall back on the processor's built-in rules, which add no markup for the element itself but continue with its children and copy any text they contain to the output.

5. Repeat this process for each element in the XML file until the entire element tree of the file in question has been processed.

Given the number of tools available to PHP users through PHP extension modules such as DOM XML, XSLT, and XML, and given the flexibility of XSLT elements, the process can be made slightly more complex than this, but this list of steps provides a simple and essential overview of the logic behind XSLT stylesheets and their interaction with XML files to produce HTML or XHTML documents.

XSLT File Format Basics

Before you learn how to use PHP extension modules and their functions to parse XML documents into Web-friendly HTML or XHTML documents, it is important that you have a basic understanding of the XSLT file format and be able to construct your own templates and stylesheet files. After you have this understanding, you will be able to create well-formatted XSL documents that correlate well to your XML data and your own HTML or XHTML output needs.

Because XSLT is a purpose-specific subset of the XSL document type, it must be generalized enough to allow for transformations from and to any elements that might occur in an XML file. To this end, XSLT files use the XML namespaces feature: XSL instruction elements all use the xsl: prefix to differentiate themselves from the actual elements in the templates that are defined in the stylesheet. The stylesheet binds this prefix to the namespace URI http://www.w3.org/1999/XSL/Transform.

Each XSLT file must contain a root <xsl:stylesheet> element (or a root <xsl:transform> element; the two are considered synonymous in the W3C specification). This root element can then contain any of a number of XSLT instruction elements from the xsl: namespace, along with HTML or XHTML elements or elements from other namespaces, as needed. Together, they act as templates during the transformation from well-formatted XML to well-formatted and renderable HTML or XHTML data.
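
As a minimal sketch of this structure, the stylesheet below contains only the root element, the namespace declaration, and one template. The <greeting> element is a hypothetical input element chosen for illustration; it does not appear elsewhere in this chapter.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- The xsl: prefix is bound to the XSLT namespace URI on the root element -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Elements carrying the xsl: prefix are instructions; unprefixed
         elements such as <p> are copied to the output as written -->
    <xsl:template match="greeting">
        <p><xsl:value-of select="." /></p>
    </xsl:template>

</xsl:stylesheet>
```

Applied to an input document consisting of <greeting>Hello</greeting>, a conforming XSLT 1.0 processor should emit a <p> element containing the text Hello.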

Among children of the <xsl:stylesheet> node, two XSLT instructions do the bulk of the work in most XSLT stylesheets. They are the <xsl:template> instruction and the <xsl:apply-templates> instruction. By combining these two instructions judiciously, relatively complex transformations can be described.

Commonly Used XSLT Instructions

For a basic introduction to XSLT processing, you should know at least four XSLT instruction elements. These are <xsl:template>, <xsl:apply-templates>, <xsl:value-of>, and <xsl:if>. Many more exist, and some of them appear in brief in Table 9.1; for more information on the additional instructions that appear in Table 9.1, refer to the W3C XSLT specification.

Table 9.1. Subset of Additional XSLT Instructions

Instruction

Use or Meaning

<xsl:attribute-set>

Use in conjunction with <xsl:attribute> and the xsl:use-attribute-set attribute to create named, extensible sets of element attributes for use in template output.

<xsl:decimal-format>

Use in conjunction with <xsl:value-of> and the format-number() function to streamline formatted numeric output.

<xsl:for-each>

Use to repeat processing of a given template segment for each matching element in the input document; requires select attribute.

<xsl:import>

Use to insert the children of another stylesheet's <xsl:stylesheet> element into the current stylesheet in place of the <xsl:import> element, with template rules present in the importing document overriding rules from the imported document; requires the href attribute and must be a top-level element.

<xsl:include>

Use to insert the children of another stylesheet's <xsl:stylesheet> element into the current stylesheet in place of the <xsl:include> element; requires the href attribute and must be a top-level element.

<xsl:number>

Use in conjunction with numeric functions, such as position(), and looping instructions, such as <xsl:for-each>, to output a sequence of formatted numbers over repeated calls, suitable for numbering paragraphs, items in a list, and so on.

<xsl:preserve-space>

Use to indicate that for the given list of space-separated elements, whitespace-only text in the input document should be preserved during processing (XSLT default); requires the elements attribute.

<xsl:sort>

Use inside <xsl:apply-templates> or <xsl:for-each> to sort the order in which the selected elements will be processed; the optional select attribute names the sort key and defaults to the element's own text value.

<xsl:strip-space>

Use to indicate that for the given list of space-separated elements, whitespace-only text in the input document should be stripped before processing; requires the elements attribute.

<xsl:text>

Use to create literal text in processing output that would otherwise be altered or lost, for example, comments or entities.
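
To give a feel for two of the instructions in Table 9.1, the following sketch combines <xsl:for-each> and <xsl:sort>. It assumes the freedomland.xml input shown later in Listing 9.2; the choice of an ordered HTML list for the output is purely illustrative.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="/nation">
        <ol>
            <!-- Visit every <forest> child of <nation>, ordered
                 alphabetically by the text of its <name> child -->
            <xsl:for-each select="forest">
                <xsl:sort select="name" />
                <li><xsl:value-of select="name" /></li>
            </xsl:for-each>
        </ol>
    </xsl:template>

</xsl:stylesheet>
```

With that input, the list should show Dark West Woods before Jim's Woods, the reverse of their order in the source document.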


The <xsl:template> instruction is used to apply a transformation template to a given node or element in the XML input file. The value of a single attribute, match, determines the list of node(s) or element(s) to which the template in question will apply. The value of the match attribute can be a simple element name, or it can be one of the patterns, as shown in Table 9.2, used to contextualize the match either within the document tree or by identity or other properties.

Table 9.2. Elementary XSLT Patterns and Meanings

Pattern

Description

*

Matches any element that occurs while processing the XML input file.

/

Matches only the root node of the XML input file.

.

Matches the current node.

elA/elB

Matches any child node elB that has specific parent node elA.

el1//elN

Matches any node elN that has node el1 as one of its ancestors.

id("NodeID")

Matches any element with an ID attribute of NodeID.

el1[N]

Matches any element el1 that is the Nth child of the same type belonging to this parent node.

position()=N

Used as a predicate or test expression, for example *[position()=N], this is true for the node at position N in the set being processed; use position()=1 to match the first node or position()=last() to match the last node (XPath defines no first() function).

el1[@attrib="Value"]

Matches any element el1 that has an attrib attribute with a value of Value.

@attrib

Matches the attribute named attrib itself; in a select expression, it selects the attrib attribute of the current node.

el1|el2|elN|pat1|...

Matches any member of the bar-separated list, including element el1, element el2, element elN, pattern pat1, and so on.
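
The sketch below shows three of the patterns from Table 9.2 used as match attribute values. It assumes the element names of the freedomland.xml input in Listing 9.2, and the HTML elements emitted are arbitrary choices made for illustration.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- elA/elB: a <name> element only when its parent is <forest> -->
    <xsl:template match="forest/name">
        <h2><xsl:value-of select="." /></h2>
    </xsl:template>

    <!-- el1[N]: only the first <tree> among its parent's <tree> children -->
    <xsl:template match="tree[1]">
        <b><xsl:value-of select="." /></b>
    </xsl:template>

    <!-- el1//elN: any <river> that has a <nation> ancestor -->
    <xsl:template match="nation//river">
        <i><xsl:value-of select="." /></i>
    </xsl:template>

</xsl:stylesheet>
```

Because this sketch defines no templates for the remaining elements, a processor falls back on its built-in rules for them, so their text would also appear in the output, without markup.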


The <xsl:apply-templates> instruction causes the XSLT processor to recursively process the XSLT stylesheet and the templates it includes in order to transform one or more of the child elements of the current match. The pattern value of the optional select attribute (refer again to Table 9.2) specifies which child elements of the current node should be matched against the list of templates in the XSLT stylesheet. If the select attribute is not present, all children of the current node will be processed using the templates present in the XSLT stylesheet.
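
The difference between the two forms can be sketched as follows, again assuming the freedomland.xml input from Listing 9.2. The first template omits select, and the second uses it to narrow processing.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="/">
        <!-- No select attribute: every child of the root node is processed -->
        <xsl:apply-templates />
    </xsl:template>

    <xsl:template match="nation">
        <!-- select narrows processing to the <forest> children only -->
        <xsl:apply-templates select="forest" />
    </xsl:template>

    <xsl:template match="forest">
        <p><xsl:value-of select="name" /></p>
    </xsl:template>

</xsl:stylesheet>
```

This should yield one <p> element per forest (Jim's Woods, then Dark West Woods) and nothing for <majorcity>, because the <nation> template never selects it.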

The <xsl:value-of> instruction causes the XSLT processor to insert the text data from an element or attribute in the document tree. By supplying a pattern from Table 9.2 as the select attribute for <xsl:value-of>, you can instruct the XSLT processor to insert text data from nearly any node in the document tree at any place in your templates.
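
As an illustrative sketch, the template below pulls text from three different places relative to each <forest> element in the Listing 9.2 input. The sentence wording placed in the <xsl:text> elements is invented for this example.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="/">
        <xsl:apply-templates select="nation/forest" />
    </xsl:template>

    <xsl:template match="forest">
        <p>
            <!-- Text of the <name> and <size> children of this <forest> -->
            <xsl:value-of select="name" />
            <xsl:text> covers </xsl:text>
            <xsl:value-of select="size" />
            <xsl:text>; first tree listed: </xsl:text>
            <!-- Text of a grandchild, reached with a path and a position -->
            <xsl:value-of select="naturalfeatures/tree[1]" />
        </p>
    </xsl:template>

</xsl:stylesheet>
```

For the first forest in Listing 9.2, the paragraph should read: Jim's Woods covers 100 Acres; first tree listed: Spruce.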

The <xsl:if> instruction allows for conditional processing in XSLT templates. The test attribute contains the expression to test, which is formed using patterns from Table 9.2 and operators from Table 9.4. When the statement contained in the test attribute is true, the section of the template within the <xsl:if> element will be processed. Note that the operators shown in Table 9.4 are a subset of comparison operators defined by the XML Path Language (XPath), which is documented completely at http://www.w3.org/TR/xpath.

Table 9.4. Operators for Forming Boolean Expressions

Operator

Meaning

=

True if values on either side of the operator match (are equal).

> or &gt;

True if the value on the left side of the operator is greater than the value on the right side of the operator.

< or &lt;

True if the value on the left side of the operator is less than the value on the right side of the operator.

!=

True if the values on either side of the operator do not match (are not equal).

>= or &gt;=

True if the value on the left side of the operator is greater than or equal to the value on the right side of the operator.

<= or &lt;=

True if the value on the left side of the operator is less than or equal to the value on the right side of the operator.


For syntactic and other information about the XSLT instructions shown in Table 9.1, or for details on additional XSLT instructions not mentioned here, refer to the aforementioned W3C documentation for XSLT, "XSL Transformations," REC-xslt-19991116.

Using XSLT Instruction Elements with XSLT Patterns

The pattern given in the <xsl:template> element's match attribute or the <xsl:apply-templates> or <xsl:value-of> elements' select attributes determines whether a given template or instruction in the XSLT input file will be processed. Though the pattern used by the XSLT template for matching can be a simple element name situated within a specific context (that is, a similar relative position in the document object model tree), a number of more complex or flexible patterns can also be used when attempting to match nodes or elements in the input file.

Table 9.2 gives a partial list of the types of patterns that can be used in XSLT instruction attributes for template matching or evaluation, along with the meaning of each. Refer to the W3C's aforementioned XSLT specification for a complete list.

Multiple elements from this pattern-matching grammar can be used in single match or select attribute values to construct fairly sophisticated match criteria. Table 9.3 shows some sample patterns that combine these grammatical operations and their meanings.

Table 9.3. Sample Patterns and Their Meanings

Pattern

Meaning

forest//*|tree/leaf|id('Spruce')

Matches any descendant of a <forest> element, any <leaf> element that is the child of a <tree> element or any element whose ID is 'Spruce'.

mountain//river/bend|bend[1]

Matches any <bend> element that is the child of a <river> element and that has a <mountain> element as its ancestor or any <bend> element that is the first <bend> element child of its parent node.

*[position()=1]|city[3]

Matches either the first child element of its parent or the third <city> child element of its parent.

city[@class='Major']|nation//city

Matches any <city> element with class 'Major' or any <city> element that has as one of its ancestor(s) a <nation> element.


By using the <xsl:template> and <xsl:apply-templates> elements judiciously in concert with the flexible pattern-matching grammar shown in Table 9.2, it is possible to create complex and nuanced sets of transformation rules for XML input documents containing relatively complex element trees.

By combining the patterns listed previously with the XPath comparison operators shown in Table 9.4 and (when desired) numerical or text data, Boolean expressions can be constructed. Such expressions can then be used in concert with XSLT instructions (for example, as the test attribute in an <xsl:if> instruction) for conditional statements or similar processing.

When you use the comparison operators listed in Table 9.4, note that the less-than operator must be written in its entity form (&lt;) because a literal < is not allowed inside an XML attribute value. The greater-than operator may appear literally, but it is commonly written as &gt; for consistency.
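
A conditional of this kind might look like the following sketch, which assumes the Listing 9.2 input. It relies on the XPath count() function, which is not covered in this chapter, and the "(mountainous)" label is invented for illustration.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="/">
        <xsl:apply-templates select="nation/forest" />
    </xsl:template>

    <xsl:template match="forest">
        <p>
            <xsl:value-of select="name" />
            <!-- The greater-than operator is written in its entity form -->
            <xsl:if test="count(naturalfeatures/mountain) &gt; 1">
                <xsl:text> (mountainous)</xsl:text>
            </xsl:if>
        </p>
    </xsl:template>

</xsl:stylesheet>
```

Jim's Woods has a single <mountain> element, so only Dark West Woods, with two, should receive the extra label.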

Sample XML to HTML Transformation Using XSLT

Before delving into PHP and the modules that can be used to apply XSLT transformations, you should take some time to study the following sample XSLT stylesheet. The stylesheet shown in Listing 9.1 uses only a few XSLT instructions, along with patterns of the type discussed in Table 9.2. When applied against the XML input file shown in Listing 9.2, it produces the output shown in Listing 9.3.

A detailed discussion of these listings, including a step-by-step account of the logical flow of the XSLT stylesheet in Listing 9.1, follows the three listings. Note that the lines in each listing have been numbered here so that it will be easier to discuss the function of each line in the explanations that follow.

Listing 9.1. Sample XSLT Stylesheet forest.xsl
1   <?xml version="1.0" encoding="ISO-8859-1"?>
2   <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
3
4   <!-- This is a sample XSLT stylesheet to transform a simple XML file into
5        HTML output suitable for rendering in a Web browser. Note that the file's
6        root node is an <xsl:stylesheet> node and that the version and xmlns
7        attributes are appropriately defined. -->
8
9       <xsl:template match="/">
10       <!-- This template matches only the root of the XML input file -->
11
12           <html>
13               <head>
14                   <title>XSLT Example: Natural Features of Forests</title>
15               </head>
16               <body>
17
18                   <!-- Now apply templates to all children -->
19                     <xsl:apply-templates />
20
21                 </body>
22           </html>
23
24       </xsl:template>
25
26       <xsl:template match="nation">
27
28           <h1>National forests in <i>
29               <xsl:value-of select="name" /> </i> </h1>
30           <b><xsl:value-of select="name" /> Population: </b>
31           <xsl:value-of select="population" /> <br />
32           <b><xsl:value-of select="name" /> Size: </b>
33           <xsl:value-of select="size" /> <br />
34
35           <xsl:apply-templates select="forest" />
36
37       </xsl:template>
38
39       <xsl:template match="forest">
40
41           <h2>National Forest
42               <i><xsl:value-of select="name" /></i> </h2>
43           <b>Size: </b> <xsl:value-of select="size" /> <br />
44
45           <xsl:apply-templates select="naturalfeatures" />
46
47       </xsl:template>
48
49       <xsl:template match="naturalfeatures">
50
51           <b>Trees: </b> <xsl:apply-templates select="tree" /> <br />
52           <b>Rivers: </b> <xsl:apply-templates select="river" /> <br />
53           <b>Mountains: </b> <xsl:apply-templates select="mountain" /> <br />
54
55       </xsl:template>
56
57       <xsl:template match="tree|river|mountain">
58       <!-- This template matches <tree>, <river>, and <mountain> elements;
59            it outputs the content of the element, followed by a comma if
60            and only if the element is not the last one in the selected set -->
61
62           <xsl:value-of select="." />
63           <xsl:if test="position()!=last()">, </xsl:if>
64
65       </xsl:template>
66
67   </xsl:stylesheet>

Listing 9.2. Sample XML Input File freedomland.xml
1   <?xml version="1.0" encoding="ISO-8859-1"?>
2   <nation>
3       <name>Freedomland</name>
4       <size>10,000 Square Miles</size>
5       <population>417,267</population>
6       <forest>
7           <name>Jim's Woods</name>
8           <size>100 Acres</size>
9           <naturalfeatures>
10               <tree>Spruce</tree>
11               <tree>Fir</tree>
12               <tree>Pine</tree>
13               <river>Northern Bend River</river>
14               <river>Walleye River</river>
15               <mountain>Hell's Peak</mountain>
16           </naturalfeatures>
17       </forest>
18       <forest>
19           <name>Dark West Woods</name>
20           <size>200 Square Miles</size>
21           <naturalfeatures>
22              <tree>Dendrite King Juniper</tree>
23              <tree>Twisted Birch</tree>
24              <river>Devil's Bend Creek</river>
25              <mountain>Crazy Crag</mountain>
26              <mountain>Dead Soldier Swell</mountain>
27           </naturalfeatures>
28           <manmadefeatures>
29              <campsite>Jackson Memorial Campsite</campsite>
30              <dam>Devil's Bend Dam</dam>
31           </manmadefeatures>
32       </forest>
33       <majorcity>
34          <name>Citizenville</name>
35          <size>26 Square Miles</size>
36           <population>236,717</population>
37           <transportation>
38               <highway>Route 1</highway>
39               <highway>Interstate 2</highway>
40               <masstransit>Metro Bus</masstransit>
41               <masstransit>Metro Train</masstransit>
42               <masstransit>Citizen Rail Inc.</masstransit>
43          </transportation>
44       </majorcity>
45   </nation>

Listing 9.3. HTML Output, forest.xsl and freedomland.xml (formatted for readability)
1     <html>
2         <head>
3
4             <meta http-equiv="Content-Type"
5                 content="text/html; charset=UTF-8">
6             <title>XSLT Example:
7                 Natural Features of Forests</title>
8
9         </head>
10         <body>
11
12             <h1>National forests in <i>Freedomland</i></h1>
13             <b>Freedomland Population: </b>417,267<br>
14             <b>Freedomland Size: </b>10,000 Square Miles<br>
15
16             <h2>National Forest <i>Jim's Woods</i></h2>
17             <b>Size: </b>100 Acres<br>
18             <b>Trees: </b>Spruce, Fir, Pine<br>
19             <b>Rivers: </b>Northern Bend River, Walleye River<br>
20             <b>Mountains: </b>Hell's Peak<br>
21
22             <h2>National Forest <i>Dark West Woods</i></h2>
23             <b>Size: </b>200 Square Miles<br>
24             <b>Trees: </b>Dendrite King Juniper, Twisted Birch<br>
25             <b>Rivers: </b>Devil's Bend Creek<br>
26             <b>Mountains: </b>Crazy Crag, Dead Soldier Swell<br>
27
28         </body>
29     </html>

The following steps describe the logical flow of the XSLT stylesheet shown in Listing 9.1. Follow along with the XML input file in Listing 9.2 and the sample output in Listing 9.3 to gain some understanding of how XSLT stylesheets work in the broadest sense.

  1. Lines 1 and 2 declare the file to be an XML format file and, more specifically, an XSLT stylesheet.

  2. Lines 9-24 form a single template that matches the root node of the input XML file; all elements in the input XML file are descendants of this node.

  3. At line 19, as the root node of the input document is being processed, the XSLT stylesheet is recursively applied to the input file, so that other matched elements in the file can also be processed.

  4. Lines 26-37 are matched first in the second pass through the XSLT stylesheet, generating output for the <nation> element.

  5. Lines 29-33 in particular generate output from the values of various child elements of the <nation> element.

  6. At line 35, as the <nation> element is being processed, the XSLT stylesheet is recursively applied yet again to the input file, this time specifically to process the <forest> elements and their children.

  7. Lines 39-47 form the template that matches the <forest> elements selected in step 6.

  8. Lines 42 and 43 in particular generate output from the values of various child elements of each <forest> element.

  9. At line 45, as each <forest> element is being processed, the XSLT stylesheet is recursively applied once more to the input file, this time specifically to process the <naturalfeatures> elements and their children.

  10. Lines 49-55 form the template that matches the <naturalfeatures> elements selected in step 9.

  11. At lines 51-53, in processing each <naturalfeatures> element, the XSLT stylesheet is recursively applied again, once each for <tree>, <river>, and <mountain> elements.

  12. Lines 57-65 form the template that matches the <tree>, <river>, and <mountain> elements selected in step 11.

  13. Lines 62 and 63 output the value of the current <tree>, <river>, or <mountain> element, appending a following comma and space only if the element in question is not the last of its kind to be processed. Because no further <xsl:apply-templates> instructions are in this template, the recursion ends here.

Note that in each template that is applied (with the exception of the last), the <xsl:apply-templates> instruction is called to prune the document tree further and apply a new matching template to the smaller set of elements. This process is repeated until all desired elements have been processed, and output in the desired format has been generated.

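Note also that elements never selected by an <xsl:apply-templates> instruction are not processed and generate no output at all. This can be seen in the case of the <majorcity> element and its children or the <manmadefeatures> element and its children in the sample XML file shown in Listing 9.2. Had they been selected without a matching template, the processor's built-in rules would have copied their text content to the output without any markup.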
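
To bring the <manmadefeatures> data into the output, a stylesheet needs both an <xsl:apply-templates> instruction that selects those elements and a template that matches them. The following is one possible sketch, modeled on the approach of Listing 9.1 rather than taken from it; the enclosing <div> element and the "Campsites" and "Dams" labels are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="/">
        <div>
            <xsl:apply-templates select="nation/forest" />
        </div>
    </xsl:template>

    <xsl:template match="forest">
        <h2><xsl:value-of select="name" /></h2>
        <!-- Select the man-made features that Listing 9.1 never visits -->
        <xsl:apply-templates select="manmadefeatures" />
    </xsl:template>

    <xsl:template match="manmadefeatures">
        <b>Campsites: </b> <xsl:apply-templates select="campsite" /> <br />
        <b>Dams: </b> <xsl:apply-templates select="dam" /> <br />
    </xsl:template>

    <!-- Same comma-separation technique as the last template of Listing 9.1 -->
    <xsl:template match="campsite|dam">
        <xsl:value-of select="." />
        <xsl:if test="position()!=last()">, </xsl:if>
    </xsl:template>

</xsl:stylesheet>
```

In the Listing 9.2 input, only Dark West Woods contains a <manmadefeatures> element, so Jim's Woods should produce a heading and nothing more.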

Now that you have gained a basic understanding of XSLT stylesheets using the staple <xsl:template> and <xsl:apply-templates> instruction elements, it's time to learn how to instruct PHP to apply XSLT stylesheets to XML input files to generate HTML or XHTML output on-the-fly.

There are several ways to cause on-the-fly XML transformations using PHP; the method that you prefer will depend in large part on the version of PHP that you use or that is shipped by your operating system maintainer or manufacturer.
