October 05, 2004

Zen and the Art of XSLT

I have been learning XSLT (Extensible Stylesheet Language Transformations) for a project at work. XSLT is a language for translating XML into something else (different XML, HTML, broccoli soup, etc.).

What I was trying to do was mostly to enforce a particular XML schema, rather than transform the input into a different schema.

I bought an XSLT book, but it wasn't that good. I bought it because most of the books were 500 pages and this one was 300 pages. The 500-page books all had about 300 pages of examples and then a 200-page reference at the end. The 300-page book had only the 300 pages of examples, without the reference. And it had a lousy index that only referenced the first time something was used. So the only way to figure out the specifics of how something worked was literally to scan through the book looking for examples that used it.

The effect of this was that I wound up redoing my .xsl file several times. Maybe it was for the best, since I think I understand XSLT better now than if I had just copied an example.

The interesting thing is how my XSLT progressed in different iterations. For example, I was trying to enforce a rule that there was a big outer tag "foo", and that "foo" contained a tag called "xyzzy", and that "xyzzy" could only have a subtag called "ossifrage". First I started out with some brawny, C-like XSL:

<xsl:template match="foo">
  <foo>
    <xyzzy><xsl:apply-templates select="xyzzy"/></xyzzy>
  </foo>
</xsl:template>

<xsl:template match="foo/xyzzy/*">
  <xsl:choose>
    <xsl:when test="name(.)='ossifrage'">
      <xsl:copy><xsl:apply-templates/></xsl:copy>
    </xsl:when>
  </xsl:choose>
</xsl:template>

So I'm explicitly laying down the <foo> tags, and then I match each child of "xyzzy" and do an explicit check for "ossifrage" subtags. Plus, if "foo" has multiple "xyzzy" subtags, their contents just get lumped together. And if "xyzzy" has multiple "ossifrage" subtags, I process them all. Ugh!
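To make the lumping concrete, here is a small sample input and what the stylesheet above should produce for it. The sample document and the "plugh" tag are made up for illustration, and the output is shown with whitespace ignored:

```xml
<!-- Hypothetical input: two "xyzzy" subtags, one of them with a stray "plugh". -->
<foo>
  <xyzzy>
    <ossifrage>a</ossifrage>
    <plugh>b</plugh>
  </xyzzy>
  <xyzzy>
    <ossifrage>c</ossifrage>
  </xyzzy>
</foo>
<!-- Result of the first version, whitespace aside:
     <foo><xyzzy><ossifrage>a</ossifrage><ossifrage>c</ossifrage></xyzzy></foo>
     Both "xyzzy" subtags are merged into one, and "plugh" vanishes without a complaint. -->
```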

Then I decided to relax a bit and let the XSLT pattern matcher do some of the work. Plus, I needed to ensure there was only one "xyzzy" subtag below each "foo" tag. So I changed it to:

<xsl:template match="foo">
  <foo>
    <xyzzy><xsl:apply-templates select="xyzzy[1]"/></xyzzy>
  </foo>
</xsl:template>

<xsl:template match="foo/xyzzy">
  <xsl:apply-templates select="ossifrage"/>
</xsl:template>

<xsl:template match="xyzzy/ossifrage">
  <xsl:copy><xsl:apply-templates/></xsl:copy>
</xsl:template>

This is better, because I'm letting the pattern matcher figure out if "xyzzy" has an "ossifrage" subtag. And I use the "select=" attribute of the xsl:apply-templates for "foo/xyzzy" to limit the subtags to "ossifrage". Plus, I will only match one "xyzzy" subtag of "foo", because of the "[1]" predicate in the select="xyzzy[1]" attribute.

But I am still indicating in two places that "xyzzy" should have "ossifrage" subtags. And I don't complain about unexpected tags; I just silently ignore them.

So then I resigned myself to simply going with the XSLT flow, which really means shifting your mindset from imperative to functional programming (it helped that I figured out how the priority attribute works; my book was too old to mention it). So my .xsl now looks like:

<xsl:template match="foo" priority="2">
  <xsl:copy>
    <xsl:apply-templates select="xyzzy"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="foo/xyzzy[1]" priority="2">
  <xsl:copy><xsl:apply-templates/></xsl:copy>
</xsl:template>

<xsl:template match="foo/xyzzy/ossifrage" priority="2">
  <xsl:copy><xsl:apply-templates/></xsl:copy>
</xsl:template>

So now I don't emit any tags explicitly; it's all done by xsl:copy. I removed the unneeded select= attributes from the apply-templates tags (NOTE: I keep the select="xyzzy" one under "foo" because I want to order the emitted subtags in that case; otherwise I wouldn't need it). I moved the enforcement of the '"foo" only has one "xyzzy" subtag' rule from a select attribute to a match attribute, which is cleaner.
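A note on how priority behaves when the attribute is left off: XSLT 1.0 assigns a default based on the shape of the pattern. A bare element name such as "foo" gets 0, a wildcard such as "*" gets -0.5, and a pattern with more than one step or with a predicate, such as "foo/xyzzy[1]", gets 0.5. As a sketch of an alternative (not what the stylesheet above does), the same three rules could rely on those defaults:

```xml
<!-- Default priority 0: the pattern is a single element name. -->
<xsl:template match="foo">
  <xsl:copy>
    <xsl:apply-templates select="xyzzy"/>
  </xsl:copy>
</xsl:template>

<!-- Default priority 0.5: the pattern has two steps and a predicate. -->
<xsl:template match="foo/xyzzy[1]">
  <xsl:copy><xsl:apply-templates/></xsl:copy>
</xsl:template>

<!-- Default priority 0.5: the pattern has three steps. -->
<xsl:template match="foo/xyzzy/ossifrage">
  <xsl:copy><xsl:apply-templates/></xsl:copy>
</xsl:template>
```

The explicit priority="2" starts to matter once a wildcard rule is given an explicit priority="0", because that would tie with the default for a plain "foo" pattern. A wildcard rule left at its default of -0.5 would lose to all three templates without any priority attributes at all.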

And finally, I can now add a lower-priority match to catch anything I don't want, like a second "xyzzy" tag below "foo":

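<!-- Catch-all: any element not matched by a higher-priority template aborts the transform. -->
<xsl:template match="*" priority="0">
  <xsl:message terminate="yes">Unexpected element: <xsl:value-of select="name()"/></xsl:message>
</xsl:template>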
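To try the final version end to end, the templates need a stylesheet wrapper, which isn't shown above. Here is a minimal sketch, assuming XSLT 1.0 and an input whose root element is "foo"; the wrapper and the message text are illustrative additions:

```xml
<?xml version="1.0"?>
<!-- Wrapper assumed for illustration: XSLT 1.0 and the standard XSLT namespace. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Copy "foo" and visit only its "xyzzy" children. -->
  <xsl:template match="foo" priority="2">
    <xsl:copy>
      <xsl:apply-templates select="xyzzy"/>
    </xsl:copy>
  </xsl:template>

  <!-- Only the first "xyzzy" under "foo" is accepted. -->
  <xsl:template match="foo/xyzzy[1]" priority="2">
    <xsl:copy><xsl:apply-templates/></xsl:copy>
  </xsl:template>

  <!-- "ossifrage" is the only element allowed inside "xyzzy". -->
  <xsl:template match="foo/xyzzy/ossifrage" priority="2">
    <xsl:copy><xsl:apply-templates/></xsl:copy>
  </xsl:template>

  <!-- Anything else (a second "xyzzy", a stray child, a different root) stops the run. -->
  <xsl:template match="*" priority="0">
    <xsl:message terminate="yes">Unexpected element: <xsl:value-of select="name()"/></xsl:message>
  </xsl:template>

</xsl:stylesheet>
```

Text inside "ossifrage" still comes through, because the built-in rule for text nodes copies it. Attributes are not copied, since xsl:copy on an element copies only the element itself.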

And I'm much happier.

I need to improve the processing of unexpected attributes; I still can't seem to make it match on "@*", even when I follow the examples. It could be the XSL processor I am using (which is just the standard "four lines of C#" one that is shown as an example in the Visual Studio documentation). Off to explore I go...
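One likely cause, offered as a guess rather than a diagnosis of this particular processor: an xsl:apply-templates with no select attribute visits only child nodes, and attributes are not children in the XPath data model. A template with match="@*" therefore never fires unless something selects the attributes explicitly. A sketch of what that could look like for "ossifrage" (the attribute rule and its message text are hypothetical):

```xml
<xsl:template match="foo/xyzzy/ossifrage" priority="2">
  <xsl:copy>
    <!-- Select attributes explicitly; a bare apply-templates visits only child nodes. -->
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

<!-- Hypothetical rule: reject every attribute that reaches this template. -->
<xsl:template match="@*" priority="0">
  <xsl:message terminate="yes">Unexpected attribute: <xsl:value-of select="name()"/></xsl:message>
</xsl:template>
```

The same select="@*|node()" change would be needed in the "foo" and "xyzzy" templates for their attributes to be checked as well.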

Posted by AdamBa at October 5, 2004 11:43 AM
