August 2003 Archives
In related news: yesterday I was given Mono CVS commit access, thanks to Ben and Miguel. It seems I'm the first Oleg amongst the Mono guys, so my account is just "oleg".
Now I desperately need one more hour in the day. It's a pity the Earth is so close to the Sun; 24 hours is really not enough for us!
Am I right that it's impossible to validate an in-memory XmlDocument without serializing it to a string and reparsing?
XmlValidatingReader requires an instance of XmlTextReader, and what's worse, it uses internal properties that are not exposed in XmlTextReader's public API, so it won't work even if one provides a fake XmlTextReader subclass that encapsulates an XmlNodeReader. :(
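For reference, the serialize-and-reparse workaround mentioned above can be sketched roughly as follows. This is only a minimal sketch, assuming the .NET 1.x-era XmlValidatingReader API; the class name RoundTripValidation, the CountErrors helper and the inline schema are hypothetical and serve only as illustration.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public class RoundTripValidation
{
    // Hypothetical schema: a <root> element with a required "name" attribute.
    const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='root'>
    <xs:complexType>
      <xs:attribute name='name' type='xs:string' use='required'/>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // Serializes the in-memory document, reparses it through a validating
    // reader and returns the number of validation events raised.
    public static int CountErrors(XmlDocument doc)
    {
        int errors = 0;
        // The round trip: OuterXml -> string -> XmlTextReader -> validator.
        XmlTextReader tr = new XmlTextReader(new StringReader(doc.OuterXml));
        XmlValidatingReader vr = new XmlValidatingReader(tr);
        vr.ValidationType = ValidationType.Schema;
        // null means "use the schema's own targetNamespace" (none here).
        vr.Schemas.Add(null, new XmlTextReader(new StringReader(Xsd)));
        vr.ValidationEventHandler += (sender, e) => errors++;
        while (vr.Read()) { /* just pump the reader */ }
        vr.Close();
        return errors;
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<root/>"); // the required attribute is missing
        Console.WriteLine("Validation events: " + CountErrors(doc));
    }
}
```

The obvious cost is that the whole document gets serialized and parsed a second time, which is exactly what the question is trying to avoid.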
For those interested: I'm selling an old Hebrew book, "Diaspora and Assimilation" by Zeev Zhabotinsky. Just found it recently in the loft :) Published in 1936 in Tel-Aviv (Palestine at that time).
From Mono CVS Commit Rules:
Also, remember to pat yourself on the back after the commit, smile and think we're a step closer to a better free software world.
According to the XPath data model, an element node may have a unique identifier (ID), which can then be used to select the node by its ID using XPath's id() function and to navigate using the XPathNavigator.MoveToId method. Querying by ID is extremely efficient because it doesn't require traversing the XML document; instead, almost every XPath implementation I've ever seen just keeps an internal hashtable of IDs, hence querying by ID is merely a matter of getting a value from a hashtable by a key.
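To make the DTD-based case concrete, here is a minimal sketch of selecting by ID with id(). The document, its internal DTD subset and the DtdIdDemo/TitleOf names are hypothetical, made up for illustration; the point is only that the id attribute is declared as type ID in the DTD.

```csharp
using System;
using System.Xml;

public class DtdIdDemo
{
    // Hypothetical document: the internal DTD subset declares "id" as type ID.
    const string Xml = @"<!DOCTYPE root [
  <!ELEMENT root (file*)>
  <!ELEMENT file EMPTY>
  <!ATTLIST file id ID #REQUIRED title CDATA #IMPLIED>
]>
<root>
  <file id='F001' title='abc'/>
  <file id='F002' title='xyz'/>
</root>";

    // Returns the title of the element with the given ID, or null if absent.
    public static string TitleOf(string id)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        // id() resolves through the document's ID table, no tree traversal.
        XmlNode title = doc.SelectSingleNode("id('" + id + "')/@title");
        return title == null ? null : title.Value;
    }

    public static void Main()
    {
        Console.WriteLine(TitleOf("F002"));
    }
}
```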
The XPath 1.0 Recommendation, published back in 1999, of course says nothing about XML Schema, which was published in 2001. Maybe that's the reason why the XmlDocument and XPathDocument (and therefore XslTransform) classes in .NET don't support the above tasty functionality when an XML document is defined using XML Schema. Only DTD is supported, unfortunately. Even if you have defined an xs:ID typed attribute in your schema and validated the document by reading it via XmlValidatingReader, it won't work. As a matter of fact, it does work in MSXML4 though.

Whether it's right or wrong, I have no idea; it's quite a debatable question. On the one hand, the XPath spec explicitly says "If a document does not have a DTD, then no element in the document will have a unique ID." On the other hand, XML Schema was published 2 years after XPath 1.0 and provides semantically the same functionality as DTD does, and XPath 2.0 is now deeply integrated with XML Schema. And it works in MSXML4... I'm wondering what people think about it?

Anyway, here is another act of hackery: how to force the XmlDocument and XPathDocument classes to turn on id() and XPathNavigator.MoveToId support when the document is validated against XML Schema and not DTD.

    using System;
    using System.Xml;
    using System.Xml.XPath;

    public class IdAssuredValidatingReader : XmlValidatingReader {
        // True while the reader pretends to be positioned on a DocumentType node
        private bool _exposeDummyDoctype;
        // True until the first element has been reached
        private bool _isInProlog = true;

        public IdAssuredValidatingReader(XmlReader r) : base (r) {}

        public override XmlNodeType NodeType {
            get {
                return _exposeDummyDoctype ?
                    XmlNodeType.DocumentType : base.NodeType;
            }
        }

        public override bool MoveToNextAttribute() {
            // The dummy DocumentType node exposes no attributes
            return _exposeDummyDoctype ?
                false : base.MoveToNextAttribute();
        }

        public override bool Read() {
            if (_isInProlog) {
                if (!_exposeDummyDoctype) {
                    //We are looking for the very first element
                    bool baseRead = base.Read();
                    if (base.NodeType == XmlNodeType.Element) {
                        //Report a dummy DocumentType node before the element
                        _exposeDummyDoctype = true;
                        return true;
                    } else {
                        return baseRead;
                    }
                } else {
                    //Done, switch back to normal flow; the underlying reader
                    //is still positioned on the first element
                    _exposeDummyDoctype = false;
                    _isInProlog = false;
                    return true;
                }
            } else
                return base.Read();
        }
    }

And proof of concept:

source.xml

    <root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="D:\Untitled1.xsd">
        <file id="F001" title="abc" size="123"/>
        <file id="F002" title="xyz" size="789"/>
        <notification id="PINK" title="Pink Flowers"/>
    </root>

In the Untitled1.xsd schema (elided for clarity) the id attributes are declared as xs:ID. The usage:

    public class Test {
        static void Main(string[] args) {
            XmlValidatingReader vr = new IdAssuredValidatingReader(
                new XmlTextReader("source.xml"));
            vr.ValidationType = ValidationType.Schema;
            vr.EntityHandling = EntityHandling.ExpandEntities;
            XmlDocument doc = new XmlDocument();
            doc.Load(vr);
            Console.WriteLine(
                doc.SelectSingleNode("id('PINK')/@title").Value);
        }
    }

Another one:

    public class Test {
        static void Main(string[] args) {
            XmlValidatingReader vr = new IdAssuredValidatingReader(
                new XmlTextReader("source.xml"));
            vr.ValidationType = ValidationType.Schema;
            vr.EntityHandling = EntityHandling.ExpandEntities;
            XPathDocument doc = new XPathDocument(vr);
            XPathNavigator nav = doc.CreateNavigator();
            XPathNodeIterator ni = nav.Select("id('PINK')/@title");
            if (ni.MoveNext())
                Console.WriteLine(ni.Current.Value);
        }
    }

In both cases the result is "Pink Flowers". I'm not sure which semantics this hack breaks. The only deficiency I see is that the dummy emulated DocumentType node actually becomes visible in the resulting XmlDocument (XPathDocument is not affected because the XPath data model knows nothing about the DocumentType node type). Any comments?
An interesting question has been raised in the microsoft.public.dotnet.xml newsgroup: how to compile an XPath expression without an XML document at hand? The XPathNavigator class does provide such functionality via its Compile() method, but XPathNavigator is an abstract class, hence this functionality is available only through its implementers, such as the internal DocumentXPathNavigator and XPathDocumentNavigator classes, which are accessible only via the corresponding XmlDocument and XPathDocument.
Therefore the obvious solutions are: use a dummy XmlDocument or XPathDocument object to get an XPathNavigator and make use of its Compile() method, or implement a dummy XPathNavigator class. Dummy object vs dummy implementation, hehe. Well, the dummy implementation at least doesn't allocate memory for a document, so I'm advocating this solution. Below is the implementation and its usage:

    using System;
    using System.IO;
    using System.Xml;
    using System.Xml.XPath;

    public sealed class XPathCompiler {
        // Navigator over nothing: every property is empty and every move
        // fails. It exists only to reach the inherited Compile() method.
        private sealed class DummyXpathNavigator : XPathNavigator {
            public override XPathNavigator Clone() {
                return new DummyXpathNavigator();
            }
            public override XPathNodeType NodeType {
                get { return XPathNodeType.Root; }
            }
            public override string LocalName {
                get { return String.Empty; }
            }
            public override string NamespaceURI {
                get { return String.Empty; }
            }
            public override string Name {
                get { return String.Empty; }
            }
            public override string Prefix {
                get { return String.Empty; }
            }
            public override string Value {
                get { return String.Empty; }
            }
            public override string BaseURI {
                get { return String.Empty; }
            }
            public override String XmlLang {
                get { return String.Empty; }
            }
            public override bool IsEmptyElement {
                get { return false; }
            }
            public override XmlNameTable NameTable {
                get { return null; }
            }
            public override bool HasAttributes {
                get { return false; }
            }
            public override string GetAttribute(string localName,
                string namespaceURI) {
                return string.Empty;
            }
            public override bool MoveToAttribute(string localName,
                string namespaceURI) {
                return false;
            }
            public override bool MoveToFirstAttribute() {
                return false;
            }
            public override bool MoveToNextAttribute() {
                return false;
            }
            public override string GetNamespace(string name) {
                return string.Empty;
            }
            public override bool MoveToNamespace(string name) {
                return false;
            }
            public override bool MoveToFirstNamespace(
                XPathNamespaceScope namespaceScope) {
                return false;
            }
            public override bool MoveToNextNamespace(
                XPathNamespaceScope namespaceScope) {
                return false;
            }
            public override bool HasChildren {
                get { return false; }
            }
            public override bool MoveToNext() {
                return false;
            }
            public override bool MoveToPrevious() {
                return false;
            }
            public override bool MoveToFirst() {
                return false;
            }
            public override bool MoveToFirstChild() {
                return false;
            }
            public override bool MoveToParent() {
                return false;
            }
            public override void MoveToRoot() {}
            public override bool MoveTo( XPathNavigator other ) {
                return false;
            }
            public override bool MoveToId(string id) {
                return false;
            }
            public override bool IsSamePosition(XPathNavigator other) {
                return false;
            }
            public override XPathNodeIterator SelectDescendants(string name,
                string namespaceURI, bool matchSelf) {
                return null;
            }
            public override XPathNodeIterator SelectChildren(string name,
                string namespaceURI) {
                return null;
            }
            public override XPathNodeIterator SelectChildren(
                XPathNodeType nodeType) {
                return null;
            }
            public override XmlNodeOrder ComparePosition(
                XPathNavigator navigator) {
                return new XmlNodeOrder();
            }
        }

        // Single shared navigator, reused for every compilation
        private static XPathNavigator _nav = new DummyXpathNavigator();

        public static XPathExpression Compile(string xpath) {
            return _nav.Compile(xpath);
        }
    }

    public class XPathCompilerTest {
        static void Main(string[] args) {
            //Document-free compilation
            XPathExpression xe = XPathCompiler.Compile("/foo");
            //Usage of the compiled expression
            XPathDocument doc = new XPathDocument(new StringReader("<foo/>"));
            XPathNavigator nav = doc.CreateNavigator();
            XPathNodeIterator ni = nav.Select(xe);
            while (ni.MoveNext()) {
                Console.WriteLine(ni.Current.Name);
            }
        }
    }
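For comparison, the other option mentioned above, the dummy document object, might look roughly like this. It is a minimal sketch; the DummyDocumentCompiler name is made up for illustration, and the empty XmlDocument is kept alive only to hand out a navigator.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;

public class DummyDocumentCompiler
{
    // An empty XmlDocument whose only job is to provide a navigator.
    private static readonly XPathNavigator _nav =
        new XmlDocument().CreateNavigator();

    public static XPathExpression Compile(string xpath)
    {
        return _nav.Compile(xpath);
    }

    public static void Main()
    {
        // Compile against the dummy document...
        XPathExpression xe = Compile("/foo");
        // ...then evaluate against a real one.
        XPathDocument doc = new XPathDocument(new StringReader("<foo/>"));
        XPathNodeIterator ni = doc.CreateNavigator().Select(xe);
        while (ni.MoveNext())
            Console.WriteLine(ni.Current.Name);
    }
}
```

It is much shorter than the dummy navigator, at the price of one empty document object that lives as long as the class does.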
I managed to transfer my xsl.info domain from NetworkSolutions (what an annoying registrar! terrible! very expensive!) to GoDaddy.com. Gosh, finally.
Any ideas on how to build it are welcome. Meanwhile, XPath.info has got a chance to get out of its permanent under-construction stage; more info coming soon!