Perhaps a clever XSLT implementation could figure out whether a stylesheet is streamable and switch to a streaming strategy. That would be a rather effective optimization indeed. But how could that be implemented in an XSLT/XQuery processor? Obviously, full-blown stylesheet analysis would be feasible only with schema information available (which means XSLT 2.0/XQuery 1.0), but even without it, it's still easy to detect some common streaming-friendly cases, such as:
1. Renaming elements or changing namespaces, e.g.:
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="@*|node()">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>
      <xsl:template match="foo">
        <bar>
          <xsl:apply-templates select="@*|node()"/>
        </bar>
      </xsl:template>
    </xsl:stylesheet>

It's easy to see that the stylesheet consists of an identity transformation plus a template for the "foo" element, which simply replaces "foo" with "bar". This pattern is detectable, and the same result could be produced more efficiently with XmlReader alone or with an XmlReader/XmlWriter pipeline.
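To make the streaming alternative concrete, here is a minimal sketch of the rename case. It is written in Python, with `xml.sax` and `XMLGenerator` standing in for the XmlReader/XmlWriter pair; the names `RenameElement` and `rename_streaming` are made up for this illustration. It assumes namespace processing is off, so it compares raw qualified names, which is looser than the XSLT `match="foo"` pattern for documents with a default namespace.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class RenameElement(XMLFilterBase):
    """SAX filter that renames one element and passes everything else through."""

    def __init__(self, parent, old_name, new_name):
        super().__init__(parent)
        self._old = old_name
        self._new = new_name

    def _map(self, name):
        return self._new if name == self._old else name

    def startElement(self, name, attrs):
        # Attributes are forwarded untouched, like apply-templates select="@*".
        super().startElement(self._map(name), attrs)

    def endElement(self, name):
        super().endElement(self._map(name))


def rename_streaming(source, old_name, new_name):
    """Stream `source` (file name or file object) and return the renamed XML."""
    out = io.StringIO()
    reader = RenameElement(xml.sax.make_parser(), old_name, new_name)
    reader.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    reader.parse(source)  # events flow parser -> filter -> writer, no tree built
    return out.getvalue()
```

Calling `rename_streaming(io.BytesIO(b"<root><foo/></root>"), "foo", "bar")` never holds more than the current event in memory, which is the whole point of the optimization.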
2. Translating attributes to elements or similar, e.g.:

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="@*|node()">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>
      <xsl:template match="foo">
        <xsl:copy>
          <xsl:for-each select="@*">
            <xsl:element name="{name()}">
              <xsl:value-of select="."/>
            </xsl:element>
          </xsl:for-each>
          <xsl:apply-templates select="node()"/>
        </xsl:copy>
      </xsl:template>
    </xsl:stylesheet>

What this stylesheet does is also detectable, and it could be implemented internally with XmlReader alone or with an XmlReader/XmlWriter pipeline instead.
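The attribute-to-element case can be streamed in much the same way. The sketch below again uses Python's `xml.sax` as a stand-in for XmlReader/XmlWriter; `AttributesToElements` and `attributes_to_elements` are hypothetical names, and namespace processing is assumed to be off, so prefixed attributes are emitted under their qualified names as `name()` would give them.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator
from xml.sax.xmlreader import AttributesImpl


class AttributesToElements(XMLFilterBase):
    """SAX filter that turns the attributes of one element into child elements."""

    def __init__(self, parent, target):
        super().__init__(parent)
        self._target = target

    def startElement(self, name, attrs):
        if name != self._target:
            super().startElement(name, attrs)  # identity for everything else
            return
        # Emit the element itself without attributes (what xsl:copy does) ...
        super().startElement(name, AttributesImpl({}))
        # ... then one child element per attribute, before the original children.
        for attr_name, value in attrs.items():
            super().startElement(attr_name, AttributesImpl({}))
            super().characters(value)
            super().endElement(attr_name)


def attributes_to_elements(source, target):
    """Stream `source` (file name or file object) and return the rewritten XML."""
    out = io.StringIO()
    reader = AttributesToElements(xml.sax.make_parser(), target)
    reader.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    reader.parse(source)
    return out.getvalue()
```

Only the start tag needs special handling here, because all attributes of an element arrive together with its start event; that is what makes this transformation streamable.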
3. Pretty-printing using XSLT - a frequent case, easily detectable, and an ideal candidate for optimization. Just stream the input through XmlTextWriter internally.
4. Adding a root element or a header/footer - ditto.
5. Changing PIs in the prolog (<?xml-stylesheet>).
6. What else?
Obviously, to gain anything from implementing all of the above, the XSLT processor must be given a plain Stream/TextReader/XmlReader as input rather than an already-in-memory XML store.
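As for the detection step itself, here is a deliberately conservative sketch for the first case (identity transformation plus a single rename template). It is Python with `xml.etree.ElementTree`, and the function name `detect_rename` and its helpers are invented for this illustration; a real processor would work on its own compiled stylesheet representation. Anything the sketch doesn't recognize exactly (modes, priorities, extra templates, literal text, imports) makes it give up and return `None`, so the processor would fall back to the normal tree-based strategy.

```python
import re
import xml.etree.ElementTree as ET

XSL = "{http://www.w3.org/1999/XSL/Transform}"
COPY_ALL = {"@*|node()", "node()|@*"}
SIMPLE_NAME = re.compile(r"[A-Za-z_][\w.-]*")  # unprefixed element name only


def _only_child(element):
    """Return the single child element, or None if there is any other content."""
    children = list(element)
    if len(children) != 1:
        return None
    if any(t and t.strip() for t in (element.text, children[0].tail)):
        return None  # literal text would end up in the output
    return children[0]


def _is_apply_all(element):
    """True for a bare <xsl:apply-templates select="@*|node()"/>."""
    return (element is not None
            and element.tag == XSL + "apply-templates"
            and set(element.attrib) == {"select"}
            and element.get("select").replace(" ", "") in COPY_ALL
            and len(element) == 0)


def _is_identity(template):
    """True for the classic identity template."""
    copy = _only_child(template)
    return (set(template.attrib) == {"match"}
            and template.get("match").replace(" ", "") in COPY_ALL
            and copy is not None
            and copy.tag == XSL + "copy"
            and not copy.attrib
            and _is_apply_all(_only_child(copy)))


def detect_rename(stylesheet_text):
    """Return (old_name, new_name) for 'identity + one rename', else None."""
    root = ET.fromstring(stylesheet_text)
    if root.tag not in (XSL + "stylesheet", XSL + "transform"):
        return None
    templates = list(root)
    if len(templates) != 2 or any(t.tag != XSL + "template" for t in templates):
        return None
    # Try both orders: identity first or rename first.
    for identity, rename in (templates, templates[::-1]):
        literal = _only_child(rename)
        if (_is_identity(identity)
                and set(rename.attrib) == {"match"}
                and SIMPLE_NAME.fullmatch(rename.get("match"))
                and literal is not None
                and not literal.tag.startswith("{")  # no-namespace literal element
                and not literal.attrib
                and _is_apply_all(_only_child(literal))):
            return rename.get("match"), literal.tag
    return None
```

The design choice is to match whole-stylesheet shapes rather than analyze individual instructions: it is cheap, it can never misfire on a stylesheet it doesn't fully understand, and each additional streaming-friendly case is just another recognizer of the same kind.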
todoList.Add(this, Priority.Normal);
This is a very difficult problem. I'd love to read some whitepapers on solving this problem. :)
There was a white paper on an incremental XSLT processor that you blogged about a while back. Some of the ideas presented there could be applied to determine the effects of various XSLT constructs.
http://www.tkachenko.com/blog/archives/000092.html