XSL Transformations
In this chapter, I’m going to start working with the Extensible Stylesheet Language (XSL). XSL has two parts: a transformation language and a formatting language.
The transformation language lets you transform documents into different forms, while the formatting language actually formats and styles documents in various ways. These two parts of XSL can function quite independently, and you can think of XSL as two languages, not one. In practice, you often transform a document before formatting it because the transformation process lets you add the tags the formatting process requires. In fact, that is one of the main reasons that W3C supports XSLT as the first stage in the formatting process, as we’ll see in the next chapter.
This chapter covers the transformation language, and the next details the formatting language. The XSL transformation language is often called XSLT, and it has been a W3C recommendation since November 11, 1999. You can find the W3C recommendation for XSLT at www.w3.org/TR/xslt.
XSLT is a relatively new specification, and it’s still developing in many ways. There are some XSLT processors of the kind we’ll use in this chapter, but bear in mind that the support offered by publicly available software is not very strong as yet. A few packages support XSLT fully, and we’ll see them here. However, no browser supports XSLT fully yet.
I’ll start this chapter with an example to show how XSLT works.
Using XSLT Style Sheets in XML Documents
You use XSLT to manipulate documents, changing and working with their markup as you want. One of the most common transformations is from XML documents to HTML documents, and that’s the kind of transformation we’ll see in the examples in this chapter.
To create an XSLT transformation, you need two documents—the document to transform, and the style sheet that specifies the transformation. Both documents are well-formed XML documents.
Here’s an example; this document, planets.xml, is a well-formed XML document that holds data about three planets—Mercury, Venus, and Earth. Throughout this chapter, I’ll transform this document to HTML in various ways. For programs that can understand it, you can use the <?xml-stylesheet?> processing instruction to indicate what XSLT style sheet to use, where you set the type attribute to "text/xml" and the href attribute to the URI of the XSLT style sheet, such as planets.xsl in this example (XSLT style sheets usually have the extension .xsl).
<?xml version="1.0"?>
<?xml-stylesheet type="text/xml" href="planets.xsl"?>
<PLANETS>
    <PLANET>
        <NAME>Mercury</NAME>
        <MASS UNITS="(Earth = 1)">.0553</MASS>
        <DAY UNITS="days">58.65</DAY>
        <RADIUS UNITS="miles">1516</RADIUS>
        <DENSITY UNITS="(Earth = 1)">.983</DENSITY>
        <DISTANCE UNITS="million miles">43.4</DISTANCE><!--At perihelion-->
    </PLANET>
    <PLANET>
        <NAME>Venus</NAME>
        <MASS UNITS="(Earth = 1)">.815</MASS>
        <DAY UNITS="days">116.75</DAY>
        <RADIUS UNITS="miles">3716</RADIUS>
        <DENSITY UNITS="(Earth = 1)">.943</DENSITY>
        <DISTANCE UNITS="million miles">66.8</DISTANCE><!--At perihelion-->
    </PLANET>
    <PLANET>
        <NAME>Earth</NAME>
        <MASS UNITS="(Earth = 1)">1</MASS>
        <DAY UNITS="days">1</DAY>
        <RADIUS UNITS="miles">2107</RADIUS>
        <DENSITY UNITS="(Earth = 1)">1</DENSITY>
        <DISTANCE UNITS="million miles">128.4</DISTANCE><!--At perihelion-->
    </PLANET>
</PLANETS>
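To make the role of the <?xml-stylesheet?> processing instruction concrete, here is a minimal sketch of how a program might look it up. It assumes the DOM parser bundled with a current JDK rather than any tool used in this chapter, and the class name StylesheetPI and the method find are hypothetical names chosen for illustration. The sketch only reads the type and href pseudo-attributes; it does not load or apply the style sheet.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.ProcessingInstruction;
import org.xml.sax.InputSource;

// Hypothetical helper: finds the first <?xml-stylesheet?> instruction in a document.
public class StylesheetPI {
    // Pseudo-attributes look like name="value" or name='value'.
    private static final Pattern PSEUDO_ATTR =
        Pattern.compile("([A-Za-z][\\w-]*)\\s*=\\s*(\"([^\"]*)\"|'([^']*)')");

    // Returns the pseudo-attributes of the instruction, or an empty map if there is none.
    public static Map<String, String> find(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        Map<String, String> attrs = new LinkedHashMap<>();

        // Processing instructions in the prolog are children of the document node.
        for (Node n = doc.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.PROCESSING_INSTRUCTION_NODE) {
                continue;
            }
            ProcessingInstruction pi = (ProcessingInstruction) n;
            if ("xml-stylesheet".equals(pi.getTarget())) {
                Matcher m = PSEUDO_ATTR.matcher(pi.getData());
                while (m.find()) {
                    attrs.put(m.group(1), m.group(3) != null ? m.group(3) : m.group(4));
                }
                break;
            }
        }
        return attrs;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?>\n"
            + "<?xml-stylesheet type=\"text/xml\" href=\"planets.xsl\"?>\n"
            + "<PLANETS/>";
        System.out.println(find(xml)); // prints {type=text/xml, href=planets.xsl}
    }
}
```

A program that understands the instruction would then resolve the href value relative to the document's own location before loading the style sheet.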
XSL Style Sheets
Here’s what the style sheet planets.xsl might look like. In this case, I’m converting planets.xml into HTML, stripping out the names of the planets, and surrounding those names with HTML <P> elements:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="PLANETS">
        <HTML>
            <xsl:apply-templates/>
        </HTML>
    </xsl:template>

    <xsl:template match="PLANET">
        <P>
            <xsl:value-of select="NAME"/>
        </P>
    </xsl:template>

</xsl:stylesheet>
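If it helps to see what those two templates amount to, the following sketch does roughly the same job by hand. It is not XSLT and not part of the chapter's toolset: it assumes the javax.xml.xpath classes bundled with a current JDK, and the class name ByHand and the method namesAsHtml are hypothetical. The first template corresponds to the <HTML> wrapper, and the second to the loop that emits one <P> element per <PLANET>.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration: mimics the effect of planets.xsl without an XSLT processor.
public class ByHand {
    public static String namesAsHtml(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Like match="PLANET": visit every PLANET child of PLANETS.
        NodeList planets = (NodeList) xpath.evaluate(
            "/PLANETS/PLANET",
            new InputSource(new StringReader(xml)),
            XPathConstants.NODESET);

        // Like the PLANETS template: wrap the results in an HTML element.
        StringBuilder html = new StringBuilder("<HTML>\n");
        for (int i = 0; i < planets.getLength(); i++) {
            // Like <xsl:value-of select="NAME"/>: the text of the first NAME child.
            // Note: no escaping of special characters is done in this sketch.
            String name = xpath.evaluate("string(NAME)", planets.item(i));
            html.append("<P>").append(name).append("</P>\n");
        }
        return html.append("</HTML>").toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<PLANETS>"
            + "<PLANET><NAME>Mercury</NAME><MASS>.0553</MASS></PLANET>"
            + "<PLANET><NAME>Venus</NAME><MASS>.815</MASS></PLANET>"
            + "</PLANETS>";
        System.out.println(namesAsHtml(xml));
    }
}
```

The style sheet expresses the same idea declaratively: you describe what to produce for each kind of node, and the XSLT processor handles the traversal.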
All right, we have an XML document and the style sheet we’ll use to transform it. So, how exactly do you transform the document?
Making a Transformation Happen
You can transform documents in three ways:
- In the server. A server program, such as a Java servlet, can use a style sheet to transform a document automatically and serve it to the client. One such example is the XML Enabler, which is a servlet that you’ll find at the XML for Java Web site, www.alphaworks.ibm.com/tech/xml4j.
- In the client. A client program, such as a browser, can perform the transformation, reading in the style sheet that you specify with the <?xml-stylesheet?> processing instruction. Internet Explorer can handle transformations this way to some extent.
- With a separate program. Several standalone programs, usually based on Java, will perform XSLT transformations. I’ll use these programs primarily in this chapter.
In this chapter, I’ll use standalone programs to perform transformations because those programs offer by far the most complete implementations of XSLT. I’ll also take a look at using XSLT in Internet Explorer.
Two popular programs will perform XSLT transformations: XT and XML for Java.
James Clark’s XT
You can get James Clark’s XT at www.jclark.com/xml/xt.html. Besides XT itself, you’ll also need a SAX-compliant XML parser, such as the one we used in the previous chapter that comes with the XML for Java packages, or James Clark’s own XP parser, which you can get at www.jclark.com/xml/xp/index.html.
XT is a Java application. Included in the XT download is the JAR file you’ll need, xt.jar. The XT download also comes with sax.jar, which holds James Clark’s SAX parser. You can also use the XML for Java parser with XT; to do that, you must include both xt.jar and xerces.jar in your CLASSPATH, something like this:
%set CLASSPATH=%CLASSPATH%;C:\XML4J_3_0_1\xerces.jar;C:\xt\xt.jar;
Then you can use the XT transformation class, com.jclark.xsl.sax.Driver. You supply the name of the SAX parser you want to use, such as the XML for Java class org.apache.xerces.parsers.SAXParser, by setting the com.jclark.xsl.sax.parser variable with the java -D switch. Here’s how I use XT to transform planets.xml, using planets.xsl, into planets.html:
%java -Dcom.jclark.xsl.sax.parser=org.apache.xerces.parsers.SAXParser com.jclark.xsl.sax.Driver planets.xml planets.xsl planets.html
XT is also packaged as a Win32 exe. To use xt.exe, however, you will need the Microsoft Java Virtual Machine (VM) installed (included with Internet Explorer). Here’s an example in Windows that performs the same transformation as the previous command:
C:\>xt planets.xml planets.xsl planets.html
XML for Java
You can also use the IBM alphaWorks XML for Java XSLT package, called LotusXSL. LotusXSL implements an XSLT processor in Java that can be used from the command line, in an applet or a servlet, or as a module in another program. By default, it uses the XML4J XML parser, but it can interface to any XML parser that conforms to either the DOM or the SAX specification.
Here’s what the XML for Java site says about LotusXSL: "LotusXSL 1.0.1 is a complete and a robust reference implementation of the W3C Recommendations for XSL Transformations (XSLT) and the XML Path Language (XPath)."
You can get LotusXSL at www.alphaworks.ibm.com/tech/xml4j; just click the XML item in the frame at left, click LotusXSL, and then click the Download button (or you can go directly to www.alphaworks.ibm.com/tech/lotusxsl, although that URL may change). The download includes xerces.jar, which includes the parsers that the rest of the LotusXSL package uses (although you can use other parsers), and xalan.jar, which is the LotusXSL JAR file. To use LotusXSL, make sure that you have xalan.jar in your CLASSPATH; to use the XML for Java SAX parser, make sure that you also have xerces.jar in your CLASSPATH, something like this:
%set CLASSPATH=%CLASSPATH%;C:\lotusxsl_1_0_1\xalan.jar;C:\xsl\lotusxsl_1_0_1\xerces.jar;
Unfortunately, the LotusXSL package does not have a built-in class that takes a document name, a style sheet name, and an output file name the way XT does. However, I’ll create one named xslt, and you can use this class quite generally for transformations. Here’s what xslt.java looks like:
import org.apache.xalan.xslt.*;

public class xslt
{
    public static void main(String[] args)
    {
        try {
            // Get the default LotusXSL processor
            XSLTProcessor processor = XSLTProcessorFactory.getProcessor();

            // args[0] is the source document, args[1] the style sheet,
            // and args[2] the file that receives the result
            processor.process(new XSLTInputSource(args[0]),
                new XSLTInputSource(args[1]),
                new XSLTResultTarget(args[2]));
        }
        catch (Exception e) {
            System.err.println(e.getMessage());
        }
    }
}
After you’ve set the CLASSPATH as indicated, you can create xslt.class with javac like this:
%javac xslt.java
The file xslt.class is all you need. With the CLASSPATH still set, you can use xslt.class like this to transform planets.xml, using the style sheet planets.xsl, into planets.html:
%java xslt planets.xml planets.xsl planets.html
What does planets.html look like? In this case, I’ve set up planets.xsl to simply place the names of the planets in <P> HTML elements. Here are the results, in planets.html:
<HTML>
    <P>Mercury</P>
    <P>Venus</P>
    <P>Earth</P>
</HTML>
That’s the kind of transformation we’ll see in this chapter.
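The xslt class shown earlier depends on the LotusXSL 1.0.1 API. As a rough counterpart, and purely as an assumption outside what this chapter uses, the same three-argument driver can be sketched with the javax.xml.transform (JAXP) classes that ship with current JDKs. The class name JaxpXslt and the in-memory helper transform are hypothetical names chosen for illustration.

```java
import java.io.File;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical JAXP-based driver: source document, style sheet, output file.
public class JaxpXslt {
    // In-memory variant: applies the style sheet text to the document text.
    public static String transform(String xml, String xsl) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("Usage: java JaxpXslt source.xml stylesheet.xsl output.html");
            return;
        }
        // Compile the style sheet (args[1]), then transform args[0] into args[2].
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new File(args[1])));
        transformer.transform(new StreamSource(new File(args[0])),
                              new StreamResult(new File(args[2])));
    }
}
```

You would run it the same way as the xslt class, for example java JaxpXslt planets.xml planets.xsl planets.html. The exact whitespace in the output may differ from one processor to another.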
There’s another way to transform XML documents without a standalone program—you can use a client program such as a browser to transform documents.
Using Browsers to Transform XML Documents
Internet Explorer includes a partial implementation of XSLT; you can read about Internet Explorer support at http://msdn.microsoft.com/xml/XSLGuide/. That support is based on the W3C XSL working draft of December 16, 1998 (which you can find at www.w3.org/TR/1998/WD-xsl-19981216.html); as you can imagine, things have changed considerably since then.
To use planets.xml with Internet Explorer, I have to make a few modifications. For example, I have to convert the type attribute in the <?xml-stylesheet?> processing instruction from "text/xml" to "text/xsl":
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="planets.xsl"?>
<PLANETS>
    <PLANET>
        <NAME>Mercury</NAME>
        <MASS UNITS="(Earth = 1)">.0553</MASS>
        <DAY UNITS="days">58.65</DAY>
        <RADIUS UNITS="miles">1516</RADIUS>
        <DENSITY UNITS="(Earth = 1)">.983</DENSITY>
        <DISTANCE UNITS="million miles">43.4</DISTANCE><!--At perihelion-->
    </PLANET>
    <PLANET>
        <NAME>Venus</NAME>
        <MASS UNITS="(Earth = 1)">.815</MASS>
        <DAY UNITS="days">116.75</DAY>
        <RADIUS UNITS="miles">3716</RADIUS>
        <DENSITY UNITS="(Earth = 1)">.943</DENSITY>
        <DISTANCE UNITS="million miles">66.8</DISTANCE><!--At perihelion-->
    </PLANET>
    <PLANET>
        <NAME>Earth</NAME>
        <MASS UNITS="(Earth = 1)">1</MASS>
        <DAY UNITS="days">1</DAY>
        <RADIUS UNITS="miles">2107</RADIUS>
        <DENSITY UNITS="(Earth = 1)">1</DENSITY>
        <DISTANCE UNITS="million miles">128.4</DISTANCE><!--At perihelion-->
    </PLANET>
</PLANETS>
I can also convert the style sheet planets.xsl for use in Internet Explorer. A major difference between the W3C XSL recommendation and the XSL implementation in Internet Explorer is that Internet Explorer doesn’t implement any default XSL rules (which I’ll discuss in this chapter). This means that I have to explicitly include an XSL rule for the root of the document, which you specify with /. I also have to use a different namespace in the style sheet, http://www.w3.org/TR/WD-xsl, and omit the version attribute in the <xsl:stylesheet> element:
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">

    <xsl:template match="/">
        <HTML>
            <xsl:apply-templates/>
        </HTML>
    </xsl:template>

    <xsl:template match="PLANETS">
        <xsl:apply-templates/>
    </xsl:template>

    <xsl:template match="PLANET">
        <P>
            <xsl:value-of select="NAME"/>
        </P>
    </xsl:template>

</xsl:stylesheet>
You can see the results of this transformation in Figure 13.1.
Figure 13.1 Performing an XSL transformation in Internet Explorer.
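To see what the default rules buy you in a processor that does implement them, here is a small sketch that runs two style sheets against the same data: one containing only the PLANET rule, and one that also spells out rules for the root and for PLANETS. It assumes the XSLT 1.0 processor bundled with a current JDK (reached through javax.xml.transform) and the W3C namespace, not Internet Explorer or its working-draft namespace. The class name DefaultRules and its methods are hypothetical, and the sample data is trimmed for brevity.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical comparison of implicit (default) and explicit rules.
public class DefaultRules {
    private static final String OPEN =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>";
    private static final String PLANET_RULE =
        "<xsl:template match=\"PLANET\">"
        + "<P><xsl:value-of select=\"NAME\"/></P></xsl:template>";
    // The rules a processor without defaults would need to be given explicitly.
    private static final String EXPLICIT_RULES =
        "<xsl:template match=\"/\"><xsl:apply-templates/></xsl:template>"
        + "<xsl:template match=\"PLANETS\"><xsl:apply-templates/></xsl:template>";
    private static final String CLOSE = "</xsl:stylesheet>";

    public static String relyingOnDefaults() {
        return OPEN + PLANET_RULE + CLOSE;
    }

    public static String withExplicitRules() {
        return OPEN + EXPLICIT_RULES + PLANET_RULE + CLOSE;
    }

    public static String apply(String xsl, String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<PLANETS>"
            + "<PLANET><NAME>Mercury</NAME></PLANET>"
            + "<PLANET><NAME>Venus</NAME></PLANET>"
            + "</PLANETS>";
        String implicit = apply(relyingOnDefaults(), xml);
        String explicit = apply(withExplicitRules(), xml);
        System.out.println(implicit);
        System.out.println(explicit);
        System.out.println("Same result: " + implicit.equals(explicit));
    }
}
```

In a conforming processor the default rules walk from the root down to the PLANET elements on their own, so the shorter style sheet should reach them just as the longer one does. A processor without default rules needs the explicit version.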
We now have an overview of XSL transformations and have seen them at work. It’s time to see how to create XSLT style sheets in detail.