Helena Kupkova, previously known for FastXML (which she claimed is 5x faster than MSXML), now works at Microsoft on XmlReader and is behind the amazing "Microsoft XML Diff and Patch 1.0" tool. She has published an article at MSDN XML Dev Center called "XML Reader with Bookmarks".
In the article Helena discusses XmlBookmarkReader, an XmlReader implementation that lets you set a bookmark at an XML node, read on, and then rewind the reader back to the bookmarked node if you wish. That's really cool. It's implemented by caching all XML nodes after the bookmarked one along with their context (such as Depth, attributes, namespaces etc). If you think for a moment about how XmlReader works, you realize that it can be modeled as simply traversing a non-circular singly linked list of nodes. The nodes in that list are in document order (except for attributes and namespaces), which is usually called preorder tree traversal in non-XML circles. So at any moment you can start recording the nodes XmlReader reads into a linked list, and then come back and replay them by reading from that list instead of from the source XML.
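To make the record-and-replay idea concrete, here is a minimal sketch in C#. It is not Helena's implementation: the CachedNode struct and the RecordElement and Describe helpers are hypothetical names made up for illustration, and the sketch caches only node type, name, value and depth, whereas the real XmlBookmarkReader also has to cache attributes and namespaces.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical snapshot of one node. A real bookmarking reader would also
// have to cache attributes and in-scope namespaces.
public struct CachedNode
{
    public XmlNodeType Type;
    public string Name;
    public string Value;
    public int Depth;
}

public static class RecordReplaySketch
{
    // Records the element the reader is positioned on, plus everything up to
    // and including its matching end tag, into a linked list.
    public static LinkedList<CachedNode> RecordElement(XmlReader reader)
    {
        var cache = new LinkedList<CachedNode>();
        int startDepth = reader.Depth;
        bool isEmpty = reader.IsEmptyElement;
        cache.AddLast(Snapshot(reader));
        if (isEmpty) return cache; // <x/> has no separate end tag node

        while (reader.Read())
        {
            cache.AddLast(Snapshot(reader));
            // An EndElement node reports the same Depth as its start tag.
            if (reader.NodeType == XmlNodeType.EndElement && reader.Depth == startDepth)
                break;
        }
        return cache;
    }

    private static CachedNode Snapshot(XmlReader r)
    {
        return new CachedNode { Type = r.NodeType, Name = r.Name, Value = r.Value, Depth = r.Depth };
    }

    // "Replays" the cached nodes by walking the list instead of the source XML.
    public static string Describe(IEnumerable<CachedNode> cache)
    {
        var parts = new List<string>();
        foreach (CachedNode node in cache)
        {
            string label = node.Type == XmlNodeType.Text ? node.Value : node.Name;
            parts.Add(node.Type + "(" + label + ")");
        }
        return string.Join(" ", parts.ToArray());
    }

    public static void Main()
    {
        const string xml = "<books><book><title>XML in Action</title></book></books>";
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.ReadToFollowing("book");              // the "bookmark"
            LinkedList<CachedNode> cache = RecordElement(reader);
            Console.WriteLine(Describe(cache));          // first pass over the nodes
            Console.WriteLine(Describe(cache));          // "rewind" and replay them again
        }
    }
}
```

The underlying reader is never rewound here. Only the cached list is traversed a second time, which is the whole trick: the source stream stays forward-only.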
As a demonstration Helena shows an example of XML filtering that requires look-ahead, such as selecting "/books/book[contains(title, 'XML')]". Obviously this can't be done with XmlTextReader, nor with XPathReader, but it is easy with XmlBookmarkReader. As a matter of fact, XmlBookmarkReader is the feature XPathReader really needs. We can leverage XmlBookmarkReader when evaluating predicates in XPathReader, so we can get back to the context node once we are done with a predicate. Then XPathReader will finally be able to work with the notorious "book[contains(title, 'XML')]". That's the way to go. With "look ahead" and "look back through ancestors" features XPathReader can finally be really useful. Btw, the XPathReader workspace is open to everybody interested in participating. I just found out I'm admin there :)
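Here is a rough sketch of why such a predicate needs look-ahead, and how buffering solves it. It does not use XmlBookmarkReader itself (I don't want to guess at its API). Instead it buffers each book with the standard XmlReader.ReadSubtree() and replays the buffer only if the predicate matched. The Node class and the SelectBooks and Replay helpers are hypothetical names, and the sketch ignores attributes, namespaces and prefixed names, and treats any title inside the book as a match.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;

public static class LookAheadFilterSketch
{
    // Hypothetical minimal node snapshot; attributes and namespaces are omitted.
    private sealed class Node
    {
        public XmlNodeType Type;
        public string Name;
        public string Value;
        public bool IsEmpty;
    }

    // Returns the markup of every <book> whose <title> text contains the keyword.
    public static List<string> SelectBooks(string xml, string keyword)
    {
        var selected = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.ReadToFollowing("book"))
            {
                // We cannot know whether this <book> matches until we have read
                // its <title>, so buffer the subtree while looking ahead.
                var buffer = new List<Node>();
                bool inTitle = false, matches = false;
                using (XmlReader sub = reader.ReadSubtree())
                {
                    while (sub.Read())
                    {
                        buffer.Add(new Node { Type = sub.NodeType, Name = sub.Name,
                                              Value = sub.Value, IsEmpty = sub.IsEmptyElement });
                        if (sub.NodeType == XmlNodeType.Element && sub.Name == "title")
                            inTitle = !sub.IsEmptyElement;
                        else if (sub.NodeType == XmlNodeType.EndElement && sub.Name == "title")
                            inTitle = false;
                        else if (inTitle && sub.NodeType == XmlNodeType.Text && sub.Value.Contains(keyword))
                            matches = true;
                    }
                }
                // "Rewind": replay the buffered nodes only for matching books.
                if (matches) selected.Add(Replay(buffer));
            }
        }
        return selected;
    }

    // Writes the buffered nodes back out as markup.
    private static string Replay(List<Node> buffer)
    {
        var sb = new StringBuilder();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            foreach (Node n in buffer)
            {
                switch (n.Type)
                {
                    case XmlNodeType.Element:
                        writer.WriteStartElement(n.Name);
                        if (n.IsEmpty) writer.WriteEndElement(); // no EndElement node follows <x/>
                        break;
                    case XmlNodeType.Text:
                        writer.WriteString(n.Value);
                        break;
                    case XmlNodeType.EndElement:
                        writer.WriteEndElement();
                        break;
                }
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        const string xml =
            "<books><book><title>XML in Action</title></book>" +
            "<book><title>Cooking</title></book></books>";
        foreach (string book in SelectBooks(xml, "XML"))
            Console.WriteLine(book);
    }
}
```

Only one book is buffered at a time, so memory use is bounded by the largest book rather than by the whole document.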
PS. There is a small typo in XmlBookmarkReader.cs. The line
if ( bookmarks.Count > 0 != null ) {
should probably be
if ( bookmarks != null ) {
Yes, you can use it in XPathReader, that would be great! It has the same licence/EULA as XPathReader (the original one from Dare's MSDN article) and that is a licence we ship with samples.
I should have tried to compile the bookmarking reader with .NET 1.1. I was careful not to use any new features, but I should have done that nevertheless.
Hi Helena! I'm excited to see you on my blog!
Looks like you are working with C# 2.0 :) VS.NET 2003 does complain about that line. And also about String.Contains() method in the sample class.
Btw, what's the license? I mean, would you mind if I go and use XmlBookmarkReader in the XPathReader project?
Hi Oleg, thanks for pointing out the typo. I had the bookmarks stored in a linked list before, but then I switched to a hashtable. The interesting thing is that the C# compiler does not complain when comparing bool and null.
Has potential, but as with Mark's post "Combining the XmlReader and XmlWriter classes for simple streaming transformations" last week, it seems like we need to be thinking of real editable streams that are random-access compatible, instead of either rewriting the whole document or caching it in a mini-DOM.