Tricky XSLT optimization

Rick Jelliffe writes:

Perhaps some tricky implementation of XSLT could figure out if a stylesheet is streamable and switch to a streaming strategy.
That would indeed be a rather effective optimization. But how could it be implemented in an XSLT/XQuery processor? Full-blown stylesheet analysis is probably feasible only when schema information is available (which means XSLT 2.0/XQuery 1.0), but even without it, some common streaming-friendly cases are still easy to detect, such as:

1. Renaming elements or changing namespaces, e.g.:
<xsl:stylesheet version="1.0" 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="foo">
    <bar>
      <xsl:apply-templates select="@*|node()"/>
    </bar>
  </xsl:template>
</xsl:stylesheet>
It's easy to see that the stylesheet consists of an identity transformation plus a template for the "foo" element, which simply replaces "foo" with "bar". This pattern is detectable, and the transformation could be done more efficiently with an XmlReader alone or with an XmlReader/XmlWriter pipeline.
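As a rough illustration of both halves of that idea, here is a minimal sketch in Python, whose SAX API stands in for the XmlReader/XmlWriter pair. The names `detect_renames`, `RenameFilter` and `stream_rename` are hypothetical, the detector is deliberately conservative (anything beyond an identity template plus plain rename templates is rejected), and namespaces are ignored for brevity, so this is not how any real XSLT processor does it.

```python
import io
import xml.etree.ElementTree as ET
from xml.sax import parseString
from xml.sax.saxutils import XMLFilterBase, XMLGenerator

XSL = "{http://www.w3.org/1999/XSL/Transform}"
COPY_ALL = "@*|node()"


def has_text(el):
    """True if el has non-whitespace text directly inside it."""
    parts = [el.text] + [child.tail for child in el]
    return any(p and p.strip() for p in parts)


def is_apply_all(el):
    """True for an empty <xsl:apply-templates select="@*|node()"/>."""
    return (el.tag == XSL + "apply-templates"
            and el.attrib == {"select": COPY_ALL} and len(el) == 0)


def detect_renames(stylesheet_text):
    """Return {old_name: new_name} if the stylesheet is only an identity
    template plus plain renaming templates; otherwise return None."""
    root = ET.fromstring(stylesheet_text)
    if root.tag != XSL + "stylesheet":
        return None
    renames, has_identity = {}, False
    for tmpl in root:
        body = list(tmpl)
        if (tmpl.tag != XSL + "template" or set(tmpl.attrib) != {"match"}
                or len(body) != 1 or has_text(tmpl) or has_text(body[0])):
            return None
        inner = list(body[0])
        if len(inner) != 1 or not is_apply_all(inner[0]):
            return None
        match, wrapper = tmpl.get("match"), body[0]
        if wrapper.tag == XSL + "copy" and match == COPY_ALL:
            has_identity = True
        elif (match.isidentifier() and not wrapper.tag.startswith("{")
              and not wrapper.attrib):
            # Literal result element with no namespace and no attributes.
            renames[match] = wrapper.tag
        else:
            return None
    return renames if has_identity else None


class RenameFilter(XMLFilterBase):
    """SAX filter that renames elements on the fly; no tree is built."""

    def __init__(self, renames):
        super().__init__()
        self.renames = renames

    def startElement(self, name, attrs):
        super().startElement(self.renames.get(name, name), attrs)

    def endElement(self, name):
        super().endElement(self.renames.get(name, name))


def stream_rename(xml_bytes, renames):
    """Apply the renames while parsing and return the serialized result."""
    out = io.StringIO()
    handler = RenameFilter(renames)
    handler.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    parseString(xml_bytes, handler)
    return out.getvalue()
```

A processor could call something like `detect_renames` once at compile time and fall back to its normal tree-based engine whenever it returns `None`.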

2. Translating attributes to elements or similar, e.g.
<xsl:stylesheet version="1.0" 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="foo">
    <xsl:copy>
      <xsl:for-each select="@*">
        <xsl:element name="{name()}">
          <xsl:value-of select="."/>
        </xsl:element>
      </xsl:for-each>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
What this stylesheet does is likewise detectable, and it could be implemented internally with just an XmlReader or an XmlReader/XmlWriter pipeline instead.
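The streaming equivalent of that second stylesheet might look roughly like the following Python SAX sketch, again used here only as a stand-in for XmlReader/XmlWriter. `AttributesToElements` and `stream_attrs_to_elements` are hypothetical names, and the only namespace handling is that `xmlns` declarations are left in place rather than turned into elements.

```python
import io
from xml.sax import parseString
from xml.sax.saxutils import XMLFilterBase, XMLGenerator
from xml.sax.xmlreader import AttributesImpl


def is_ns_decl(attr_name):
    """True for xmlns / xmlns:prefix, which are not real attributes."""
    return attr_name == "xmlns" or attr_name.startswith("xmlns:")


class AttributesToElements(XMLFilterBase):
    """SAX filter that turns the attributes of one element type into
    child elements, passing every other event through unchanged."""

    def __init__(self, target):
        super().__init__()
        self.target = target

    def startElement(self, name, attrs):
        if name != self.target:
            super().startElement(name, attrs)
            return
        # Keep namespace declarations on the element itself.
        ns_decls = {n: v for n, v in attrs.items() if is_ns_decl(n)}
        super().startElement(name, AttributesImpl(ns_decls))
        # Emit each remaining attribute as <name>value</name>.
        for attr_name, value in attrs.items():
            if attr_name in ns_decls:
                continue
            super().startElement(attr_name, AttributesImpl({}))
            super().characters(value)
            super().endElement(attr_name)


def stream_attrs_to_elements(xml_bytes, target):
    """Run the filter over xml_bytes and return the serialized result."""
    out = io.StringIO()
    handler = AttributesToElements(target)
    handler.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    parseString(xml_bytes, handler)
    return out.getvalue()
```

Because the new child elements are written as soon as the start tag of "foo" is seen, nothing beyond the current event has to be held in memory.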

3. Pretty-printing using XSLT - a frequent case, easily detectable, and an ideal candidate for optimization. Just stream the input through an XmlTextWriter internally.

4. Adding a root element or adding a header/footer - ditto.
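For the root-element case, a streaming version only has to emit one extra start tag before the first input event and one extra end tag after the last. A minimal Python SAX sketch, assuming a hypothetical `WrapInRoot` filter and a wrapper element without attributes or namespace:

```python
import io
from xml.sax import parseString
from xml.sax.saxutils import XMLFilterBase, XMLGenerator
from xml.sax.xmlreader import AttributesImpl


class WrapInRoot(XMLFilterBase):
    """SAX filter that wraps the whole input document in a new root."""

    def __init__(self, wrapper_name):
        super().__init__()
        self.wrapper_name = wrapper_name

    def startDocument(self):
        super().startDocument()
        # Open the wrapper before any event from the input is forwarded.
        super().startElement(self.wrapper_name, AttributesImpl({}))

    def endDocument(self):
        # Close the wrapper after the original root element has ended.
        super().endElement(self.wrapper_name)
        super().endDocument()


def stream_wrap(xml_bytes, wrapper_name):
    """Run the filter over xml_bytes and return the serialized result."""
    out = io.StringIO()
    handler = WrapInRoot(wrapper_name)
    handler.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    parseString(xml_bytes, handler)
    return out.getvalue()
```

A header or footer could be handled the same way by writing extra events right after the wrapper's start tag or right before its end tag.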

5. Changing PIs in the prolog (<?xml-stylesheet>).

6. What else?

Obviously, to gain anything from all of the above, the XSLT processor should be given a plain Stream/TextReader/XmlReader as input, not an already-in-memory XML store.

3 Comments

todoList.Add(this, Priority.Normal);

This is a very difficult problem. I'd love to read some whitepapers on solving this problem. :)

There was an incremental xslt processor white paper you blogged about a while back. Some of the ideas presented there can be applied to determine the effects of various xslt constructs.

http://www.tkachenko.com/blog/archives/000092.html
