October 2004 Archives

I've got a problem, a .NET one. In XInclude.NET I'm fetching resources by URI using the WebRequest/WebResponse classes. Everything seems to work fine, except for one thing: when the URI is a file system URI, the content type property is always "application/octet-stream". Looks like it's hardcoded in the System.Net.FileWebResponse class (sic!). I mean, when I open Windows Explorer, the file's properties are "Type of the file: XML File" and "Opens with: XMLSPY". So Windows definitely knows it's XML, and in the registry I can see that the .xml file extension is associated with the "text/xml" content type. So why does FileWebResponse always say "application/octet-stream"? Am I doing something wrong, or is it really so limited in that matter? Any workarounds?
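One possible workaround, sketched here only as an assumption and not as a confirmed fix: for file URIs, ignore what the response reports and look up the "Content Type" value that Windows keeps under HKEY_CLASSES_ROOT for the file extension. The ContentTypeGuesser class and the GuessContentType method are names made up for this sketch, and the registry lookup is Windows-only.

```csharp
using System;
using System.IO;
using Microsoft.Win32;

public static class ContentTypeGuesser
{
    // Returns the registry-registered content type for a file URI,
    // or falls back to whatever the response reported.
    // Hypothetical helper: not part of XInclude.NET or System.Net.
    public static string GuessContentType(Uri uri, string reportedContentType)
    {
        // Non-file URIs: trust the content type from the response.
        if (!uri.IsFile)
            return reportedContentType;

        string extension = Path.GetExtension(uri.LocalPath);
        if (string.IsNullOrEmpty(extension))
            return reportedContentType;

        // HKEY_CLASSES_ROOT\.xml may carry a "Content Type" value such as "text/xml".
        using (RegistryKey key = Registry.ClassesRoot.OpenSubKey(extension))
        {
            object value = (key == null) ? null : key.GetValue("Content Type");
            return (value as string) ?? reportedContentType;
        }
    }
}
```

It would be called with the request URI and the ContentType reported by the WebResponse, so that anything other than a file URI keeps the value the server sent.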

The W3C has published fresh working drafts for XQuery/XPath/XSLT: XQuery 1.0: An XML Query Language, XML Path Language (XPath) 2.0, XQuery 1.0 and XPath 2.0 Data Model, XQuery 1.0 and XPath 2.0 Functions and Operators, and XSLT 2.0 and XQuery 1.0 Serialization. These address comments received on previous drafts.

XQuery 1.0. What's new:

This working draft includes a number of changes made in response to comments received during the Last Call period that ended on Feb. 15, 2004. The working group is continuing to process these comments, and additional changes are expected. This document reflects decisions taken up to and including the face-to-face meeting in Redmond, WA during the week of August 23, 2004. These decisions are recorded in the Last Call issues list (http://www.w3.org/2004/10/xquery-issues.html). Some of these decisions may not yet have been made in this document. A list of changes introduced by this draft can be found in I Revision Log.

Note:

A proposal that is currently under discussion would introduce a new form of type promotion, similar to numeric type promotion. Under this proposal, values of type xs:anyURI would be promotable to the type xs:string and could therefore be passed to functions such as fn:substring. One problem with this proposal is that values of type xs:anyURI are compared on a code-point basis, whereas values of type xs:string are compared using a collation. For this reason, promotion of xs:anyURI to xs:string might cause value comparison operators such as eq and gt to lose their transitive property. This proposal is pending further discussion and is not reflected in this document. However, the signatures of certain functions in [XQuery 1.0 and XPath 2.0 Functions and Operators], such as fn:doc and fn:QName, were written with the expectation that xs:anyURI would be promotable to xs:string. The signatures of these functions may change when this issue is resolved.
Still evolving, still too far from RTM...

From Microsoft Research:

Comega is an experimental language which extends C# with new constructs for relational and semi-structured data access and asynchronous concurrency.
Cw is an extension of C# in two areas:
- A control flow extension for asynchronous wide-area concurrency (formerly known as Polyphonic C#).
- A data type extension for XML and table manipulation (formerly known as Xen and as X#).
The preview download includes the Cw command-line compiler, a Visual Studio .NET 2003 package that extends VS.NET to support Cw (really nice integration), and lots of samples. Cw supports XML as a native data type, so you can write something like:
// This class returns the sample bib.xml data as the above Comega objects.
public class BibData
{
  public static bib GetData() {    
    return <bib>
                <book year="1994">
                    <title>TCP/IP Illustrated</title>
                    <author><last>Stevens</last><first>W.</first></author>
                    <publisher>Addison-Wesley</publisher>
                    <price> 65.95</price>
                </book>             
            </bib>;
  }
}
And what's more interesting - Cw partially supports XQuery-like constructs natively:
public static results RunQuery(prices ps)
{
  return <results>{
               foreach(t in distinct(ps.book.title))
               {
                 yield return
                 <minprice title="{t}">
                   <price>{Min(ps.book[it.title == t].price)}</price>
                 </minprice>;
               }   
             }</results>;
}
Online documentation is available here. I wish I had some spare time to play with it...

XSL-FO to WordML stylesheet

Jirka Kosek has announced a tool (an XSLT stylesheet, actually) for converting XSL-FO documents to WordML. Get it at http://fo2wordml.sourceforge.net.

Implementing XML Base in .NET

XML Base is a tiny W3C Recommendation, just a couple of pages. It facilitates defining base URIs for parts of XML documents via the semantically predefined xml:base attribute (similar to the HTML BASE element). It's an XML Core spec, standing in line with "Namespaces in XML" and the XML InfoSet, and it was published back in 2001. Small, simple, no strings attached and no mind-boggling complexity added. Still, unfortunately, neither MSXML nor System.Xml in .NET supports it (Dare Obasanjo once wrote about the reasons and the plans to implement it). Instead, XmlResolver is the facility for manipulating URIs. But while XmlResolvers are a powerful technique for resolving URIs, they are a procedural facility: one has to write a custom resolver to implement the resolving per se. XML Base is a declarative facility: one just adds an xml:base attribute to some element and that's it, the base URI for this element and its subtree is changed. So now that you see how useful it is, here is a small how-to introducing an amazingly simple way to implement XML Base for .NET.
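To illustrate just the declarative rule (this is not the approach from the how-to, only a DOM-level sketch under my own assumptions), the effective base URI of an element can be computed by walking up its ancestors and applying every xml:base found, outermost first. XmlBaseResolver and GetBaseUri are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class XmlBaseResolver
{
    // The namespace the reserved "xml" prefix is always bound to.
    private const string XmlNamespace = "http://www.w3.org/XML/1998/namespace";

    // Hypothetical helper: resolves the base URI of an element against
    // the URI the document was loaded from.
    public static Uri GetBaseUri(XmlElement element, Uri documentUri)
    {
        // Collect xml:base values from the element up to the root;
        // the stack then pops them root-first.
        Stack<string> bases = new Stack<string>();
        for (XmlNode node = element; node is XmlElement; node = node.ParentNode)
        {
            XmlElement e = (XmlElement)node;
            if (e.HasAttribute("base", XmlNamespace))
                bases.Push(e.GetAttribute("base", XmlNamespace));
        }

        // Each xml:base is resolved against the base URI in effect above it.
        Uri result = documentUri;
        while (bases.Count > 0)
            result = new Uri(result, bases.Pop());
        return result;
    }
}
```

An element without any xml:base on itself or its ancestors simply gets the document URI back.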

I missed that point somehow:

The trouble is that XSLT allows regions of a stylesheet to belong to different versions. In XSLT 1.0, you can put an xsl:version attribute on any literal result element to indicate the version of XSLT used in the content of that element. In XSLT 2.0, any XSLT element can have a version attribute, and any other element can have a xsl:version attribute that does the same thing.

The rationale is that it allows you to upgrade part of your stylesheet without having to upgrade all of it. The parts of an XSLT 2.0 stylesheet that are marked as XSLT 1.0 run under backwards-compatibility mode, which means that (in general) things work as they did under XSLT 1.0 (e.g. you have weak typing, first-item semantics, numeric comparisons). This is handy if you have a big XSLT 1.0 stylesheet, and you want a little bit of XSLT 2.0 functionality but don't want to upgrade the entire thing just now.

Jeni Tennison on xsl-list.
It can be quite useful when upgrading stylesheets step by step, but I don't think such a mix is useful otherwise, given the huge differences between the XPath 1.0 and XPath 2.0 data models and between XSLT 1.0 and XSLT 2.0 behaviours (even in backwards-compatible mode). And it's a disaster for anyone implementing XSLT 2.0 from scratch. Now I wonder how we are going to implement this feature in the XQP project.

XmlTextWriter in .NET 1.X only supports indentation of the following node types: DocumentType, Element, Comment, ProcessingInstruction, and CDATA. No attributes. So how do you get attributes indented anyway? If you can, wait for .NET 2.0 with its cool XmlWriterSettings.NewLineOnAttributes; otherwise, here is a hack to get attributes indented with XmlTextWriter in .NET 1.X.
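For the .NET 2.0 route, a minimal sketch might look like the following (the 1.X hack itself is not shown here). The AttributeIndentDemo class and the sample element and attribute names are made up; XmlWriterSettings and XmlWriter.Create are the real 2.0 APIs.

```csharp
using System.Text;
using System.Xml;

public static class AttributeIndentDemo
{
    // Writes a small sample element with NewLineOnAttributes switched on
    // and returns the resulting markup as a string.
    public static string Write()
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = true;               // indent child elements
        settings.NewLineOnAttributes = true;  // put attributes on their own lines
        settings.OmitXmlDeclaration = true;   // keep the sample output short

        StringBuilder output = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartElement("book");
            writer.WriteAttributeString("year", "1994");
            writer.WriteAttributeString("lang", "en");
            writer.WriteElementString("title", "TCP/IP Illustrated");
            writer.WriteEndElement();
        }
        return output.ToString();
    }
}
```

Note that NewLineOnAttributes only has an effect when Indent is true as well.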

Samples are templates

DonXML writes on viral coding examples in presentations on using XML in .NET:

Joe Fawcett (fellow XML MVP) came across a great example (from the Microsoft.Public.Xml newsgroup) of one of my biggest pet peeves, "We (the community) are doing a very poor job teaching the average developer how to use XML properly in .Net".

I want to draw your attention to a line from the original post:

"So, is it possible to directly modify the xml file instead of using the dataset."

And the first response was:

"you can do it using Data Island"

Why does this question bug me so much? Because we (the community) have done a very bad job using XML correctly in our articles and presentations.
Yeah, samples are in fact templates, stuff to copy-n-paste. Keep that in mind while preparing samples for your presentation/article/blog.

SAX for .NET 1.0 released

Karl Waclawek has announced the first production release of the SAX for .NET library, an open source C#/.NET port of the SAX API. It contains the API and an Expat-based implementation. An AElfred-based implementation is expected soon.

OPath language intro

The article "An Introduction to "WinFS" OPath" by Thomas Rizzo and Sean Grimaldi has been published on MSDN. Summary:

WinFS introduces a query language that supports searching the information stored in WinFS called WinFS OPath. WinFS OPath combines the best of the SQL language with the best of XML style languages and the best of CLR programming.
Necessary update:
In spite of what may be stated in this content, "WinFS" is not a feature that will come with the Longhorn operating system. However, "WinFS" will be available on the Windows platform at some future date, which is why this article continues to be provided for your information.

Yeah, I know it's an old problem and everybody is tired of it, but it's still a newsgroup hit. Sometimes XSLT is the off-the-shelf solution (not really perf-friendly though), but <xsl:output indent="yes"/> is just ignored in MSXML. In .NET one can leverage XmlTextWriter's formatting capabilities, but what about MSXML? Well, as apparently many have forgotten, MSXML implements SAX2 and includes the MXXMLWriter class, which implements XML pretty-printing and is also a SAX ContentHandler, so it can handle SAXXMLReader's events. That's all that is needed to pretty-print an XML document in a pretty streaming way:

<html>
   <head>
      <title>MXXMLWriter sample.</title>
      <script type="text/javascript">
      var reader = new ActiveXObject("Msxml2.SAXXMLReader.4.0");
      var writer = new ActiveXObject("Msxml2.MXXMLWriter.4.0");        
      writer.indent = true;
      writer.standalone = true;
      reader.contentHandler = writer;            
      reader.putProperty("http://xml.org/sax/properties/lexical-handler", writer);
      reader.parseURL("source.xml");
      alert(writer.output);           
      </script>
   </head>
   <body>
      <p>MXXMLWriter sample.</p>
   </body>
</html>

Dare's The XML Litmus Test

MSDN has published the article "The XML Litmus Test - Understanding When and Why to Use XML" by Dare Obasanjo. Cool and useful stuff. But the example of inappropriate XML usage is, I believe, chosen quite poorly: in this kind of article samples must be clear and clean, while using XML as a syntax for programming languages is a rather debatable and dubious sample. Sure, o:XML syntax is terrible, but there is another programming language, highly successful for years now, whose syntax is pure XML, which was created in just one year and which just rocks. After all, choosing a non-XML syntax for an XML-processing language is not a trivial decision either, and in a recent wave of the "Why *is* XQuery taking so long?" permathread on xml-dev it was clearly stated that one of the reasons XQuery has been in development for so many years is the complexity brought by the choice of a non-XML syntax:

2. Syntax issues. The mix of an XML syntax for construction with a keyword syntax for operations is intuitive for users, but has required a lot of work on the grammar side.
Jonathan Robie, http://lists.xml.org/archives/xml-dev/200410/msg00129.html

Derek Denny-Brown is blogging

That's the sort of news that makes my day: Derek Denny-Brown is finally blogging. Derek has been working on XML/SGML for the last 9 years and is currently the dev lead for both MSXML and System.Xml.

Here is his atom feed if you can't find it on that dark-colored page. Subscribed.

[Via Dare]

Ok, last one:

Consider a function which, for a given whole number n, 
returns the number of ones required when writing out all numbers 
between 0 and n. For example, f(13) = 6. Notice that f(1) = 1. 
What is the next largest n such that f(n) = n? 
Again I failed to solve it without help from my good old Pentium :( I came up with some sort of formula, which I believe is right, but the number seemed to be quite big, so... 5 lines of code solved it. Can anybody show how it can be deduced in one's head?
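The brute-force route can be sketched roughly like this. It is not the original five lines, just one way to do it: keep a running count of ones while walking n upwards and stop at the first n that equals the count. The OnesPuzzle class and its method names are invented for the sketch.

```csharp
public static class OnesPuzzle
{
    // Number of '1' digits in the decimal notation of k.
    public static int OnesIn(int k)
    {
        int count = 0;
        for (; k > 0; k /= 10)
            if (k % 10 == 1) count++;
        return count;
    }

    // f(n): number of ones needed to write out all numbers from 0 to n.
    public static long F(int n)
    {
        long total = 0;
        for (int k = 0; k <= n; k++)
            total += OnesIn(k);
        return total;
    }

    // Smallest n in (start, limit] with f(n) == n, or -1 if there is none.
    public static int NextFixedPoint(int start, int limit)
    {
        long total = F(start);
        for (int n = start + 1; n <= limit; n++)
        {
            total += OnesIn(n);   // f(n) = f(n - 1) + ones in n
            if (total == n) return n;
        }
        return -1;
    }
}
```

Calling OnesPuzzle.NextFixedPoint(1, 1000000) searches everything above n = 1 up to a million, which a modern machine does in well under a second.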

PS. Oh, forgot to mention - the puzzles were taken from the GLAT.

F# Compiler Preview

Interesting news from Microsoft Research:

The F# compiler is an implementation of an ML programming language for .NET. F# is essentially an implementation of the core of the OCaml programming language (see http://caml.inria.fr). F#/OCaml/ML are mixed functional-imperative programming languages which are excellent for medium-advanced programmers and for teaching. In addition, you can access hundreds of .NET libraries using F#, and the F# code you write can be accessed from C# and other .NET languages.
Find more on the F# homepage.

Yet another google puzzle

And what about this one:

              
               1 
             1   1
             2   1
          1  2   1   1
       1  1  1   2   2   1

What is the next line?
I found several solutions, one good and a couple not so much, but none of them matches another property this sequence seems to follow. Hmmm.
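One well-known reading of these rows, which may or may not be the solution meant here, is the "look-and-say" rule: each row describes the previous one as runs of equal digits (run length first, then the digit). A small sketch of that rule, with LookAndSay and Next as made-up names:

```csharp
using System.Text;

public static class LookAndSay
{
    // Describes the given row as "count digit" pairs,
    // e.g. one run of a single '1' becomes "11".
    public static string Next(string row)
    {
        StringBuilder result = new StringBuilder();
        int i = 0;
        while (i < row.Length)
        {
            // Find the end of the current run of identical digits.
            int j = i;
            while (j < row.Length && row[j] == row[i]) j++;
            result.Append(j - i).Append(row[i]);
            i = j;
        }
        return result.ToString();
    }
}
```

Feeding it the rows above with the spaces removed reproduces each following row, so applying it to the last row gives this reading's candidate for the next line.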

XEP 4.0 released

RenderX has released a new major version of their famous XSL-FO formatter - XEP 4.0, "with many more features and performance improvements".

The engine supports the XSL Formatting Objects (XSL FO) Recommendation and the Scalable Vector Graphics (SVG) Recommendation for uniform, powerful, industry standard representation of source documents. XEP renders multi-media content in Adobe's Portable Document Format (PDF) and Adobe Postscript, the de-facto standards for digital typography. It conforms to Extensible Stylesheet Language (XSL) Version 1.0, a W3C recommendation. It also supports a subset of the Scalable Vector Graphics (SVG) 1.1 Specification. XEP outputs formatted documents in Adobe's PDF version 1.3 (with optional support for features from new versions) and PostScript level 2 or 3 formats.

Dare writes about "Upcoming Changes to System.Xml in .NET Framework 2.0 Beta 2". In short:

  • No XQuery (only in SQL Server 2005 aka Yukon)
  • New push-model XML Schema validator - XmlSchemaValidator.
  • XPathDocument is reverted to what it was in version 1.1 of the .NET Framework.
  • XmlReader - added methods for reading large streams of text or binary data embedded in an XML document in a streaming fashion.
  • The XPathEditableNavigator has been merged into the XPathNavigator, making it an editable XML cursor model API.
  • XmlDocument can be edited via the new editable XPathNavigator; XPathDocument cannot.
  • XmlDocument now supports in-memory validation.
  • XslTransform is obsolete, XslCompiledTransform is the replacement.
  • XslCompiledTransform compiles XSLT to MSIL for best perf and implements MSXML4 extension functions.
  • XPathExpression has a static method to compile XPath expressions.
  • XmlArgumentList is removed - stick with XsltArgumentList.
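As a rough sketch of two of the items above as they look in .NET 2.0 (the Net20XmlDemo class and its method names are made up for illustration): XslCompiledTransform as the XslTransform replacement, and the static XPathExpression.Compile method.

```csharp
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

public static class Net20XmlDemo
{
    // Runs a stylesheet over a document with XslCompiledTransform.
    public static string Transform(string xml, string xslt)
    {
        XslCompiledTransform transform = new XslCompiledTransform();
        using (XmlReader stylesheet = XmlReader.Create(new StringReader(xslt)))
            transform.Load(stylesheet);

        StringWriter output = new StringWriter();
        using (XmlReader input = XmlReader.Create(new StringReader(xml)))
            transform.Transform(input, null, output);   // null: no XSLT arguments
        return output.ToString();
    }

    // Compiles an XPath expression once with the static Compile method
    // and evaluates it; only suitable for expressions returning a number.
    public static double EvaluateNumber(string xml, string xpath)
    {
        XPathExpression expression = XPathExpression.Compile(xpath);
        XPathDocument document = new XPathDocument(new StringReader(xml));
        return (double)document.CreateNavigator().Evaluate(expression);
    }
}
```

A compiled XPathExpression can be kept around and reused across navigators, which is the main reason to compile it up front.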

A question was raised in the microsoft.public.dotnet.xml newsgroup today: "How to retrieve the namespace collection of all the document namespaces for which there is at least one element in the document". The purpose is validation against different schemas. Well, the most effective way of doing it is during the XML document reading phase, not after it. In .NET that means adding a slim layer to the XML reading stack (aka a SAX filter in the Java world). In this layer, which is just a custom class extending XmlTextReader, one can have a handler for an element start tag, whose only duty is to collect the element's namespace and delegate the real work to the base XmlTextReader. Here is how easily it can be implemented.

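using System;
using System.Collections;
using System.Xml;

// Thin reader layer that records the namespace URI of every element it reads.
public class NamespaceCollectingXmlReader : XmlTextReader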
{
    private Hashtable namespaces = new Hashtable();

    //Add constructors as needed
    public NamespaceCollectingXmlReader(string url) : base(url) {}

    public Hashtable CollectedNamespaces 
    {
        get { return namespaces; }
    }

    public override bool Read()
    {
        bool baseRead = base.Read();
        if (base.NodeType == XmlNodeType.Element && 
              base.NamespaceURI != "" &&
              !namespaces.ContainsKey(base.NamespaceURI))
            namespaces.Add(base.NamespaceURI, "");
        return baseRead;
    }
}
And here is how it can be used to collect namespaces while loading XML into an XmlDocument.
XmlDocument doc = new XmlDocument();
NamespaceCollectingXmlReader ncr = new NamespaceCollectingXmlReader("foo.xml");
doc.Load(ncr);
foreach (object ns in ncr.CollectedNamespaces.Keys)       
  Console.WriteLine(ns);

Another google puzzle

Here is another cool puzzle from Google:

Solve this cryptic equation, realizing of course that values 
for M and E could be interchanged. No leading zeros are allowed.

WWWDOT - GOOGLE = DOTCOM

I must admit I failed to solve it with just a pen and a piece of paper. Either I'm stupid or I was too busy, but I wrote a small C# program in just two minutes and my computer cracked it in another couple of minutes by brute force. Viva computers - no brains are needed anymore :)
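The original two-minute program isn't shown, so here is only a sketch of what such a brute force could look like: try every assignment of distinct digits to the nine letters, skip leading zeros, and keep the assignments for which the subtraction holds. The Cryptarithm class and its members are names invented for this example.

```csharp
using System.Collections.Generic;

public static class Cryptarithm
{
    // The nine distinct letters used in WWWDOT - GOOGLE = DOTCOM.
    private const string Letters = "WDOTGLECM";

    // Numeric value of a word under the current letter-to-digit assignment.
    private static long Value(string word, int[] digit)
    {
        long value = 0;
        foreach (char c in word)
            value = value * 10 + digit[Letters.IndexOf(c)];
        return value;
    }

    // Returns every solution formatted as "a - b = c".
    public static List<string> Solve()
    {
        List<string> found = new List<string>();
        Search(0, new int[Letters.Length], new bool[10], found);
        return found;
    }

    // Assigns a digit to the letter at position pos, then recurses.
    private static void Search(int pos, int[] digit, bool[] used, List<string> found)
    {
        if (pos == Letters.Length)
        {
            long a = Value("WWWDOT", digit);
            long b = Value("GOOGLE", digit);
            long c = Value("DOTCOM", digit);
            if (a - b == c) found.Add(a + " - " + b + " = " + c);
            return;
        }
        for (int d = 0; d <= 9; d++)
        {
            if (used[d]) continue;
            // No leading zeros: W, G and D start the three words.
            if (d == 0 && "WGD".IndexOf(Letters[pos]) >= 0) continue;
            used[d] = true;
            digit[pos] = d;
            Search(pos + 1, digit, used, found);
            used[d] = false;
        }
    }
}
```

With only a few million assignments to check, today's hardware gets through this in about a second rather than a couple of minutes.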

Aggregated by the Planet XMLhack

Oh boy, I just realized my blog is aggregated by Planet XMLhack. Wow. Thanks for that. Must stop writing narrow-minded rubbish and start focusing on XML hacking.

While old gray XPath 1.0 supports only DTD-determined IDs, the XPointer Framework also supports schema-determined IDs: an element in an XML document can be identified by the value of an attribute, or even of a child element, whose type is xs:ID. I've been implementing support for schema-determined IDs for the XPointer.NET/XInclude.NET library (which currently has no online presence, after the gotdotnet workspace crashed and I moved the code to mvp-xml.sf.net).

I was using my old hack: a custom XmlReader that emulates a dummy DOCTYPE to force .NET's System.Xml (both XmlDocument and XPathDocument) to use ID info collected from a schema. But you know, as it turned out, that hack is quite limited: System.Xml only recognizes ID attributes (not elements) and only for globally defined elements! Oh. Looks like that piece of code in System.Xml was designed only for DTDs, where all elements are globally defined, and it works for schemas too only due to the unified implementation in System.Xml.Schema. OK, so XPointer.NET's support for schema-determined IDs is going to be rather limited: only ID-typed attributes and only for globally defined elements.

Dan Wahlin is blogging

Dan Wahlin, author of the "XML for ASP.NET Developers" book and xmlforasp.net portal, Microsoft MVP for XML Web Services, etc, is finally blogging. Really better late than never.

Steve Ball has announced XSLT Standard Library version 1.2.1 - an open source, pure XSLT (extension-free) collection of commonly used templates. New stuff includes new SVG and comparison modules and new functions in the string, date-time and math modules.

Mark Fussell:

In between re-writing and updating the chapters for the beta version of the my book A First Look at ADO.NET and System.Xml V2.0, I found some time to write an article on Building Custom XmlResolvers for MSDN.
It's a really good article, highly recommended reading for those who still don't feel the magic power of XmlResolvers.

Edd Dumbill has announced planet.xmlhack.com - aggregating weblogs of the XML developer community.

The weblogs are chosen to have a reasonable technical content, but because this is as much about the community as it is about the tech, expect the usual personal ramblings and digressions as well. In short, Planet XMLhack's for you if you enjoy being around people from the XML community.
Aggregated blogs at the moment include:

The RSS feed is http://planet.xmlhack.com/index.rdf. Subscribed.

How to join XQP project

Well, here are some clarifications on how to join the XQP project. You have to be registered at SourceForge.net (here is where you can get a new user account) and then send me a free-form request along with your SourceForge user name. That's it. Oh, and subscribe to the xqp-development mailing list - that's the meeting place (actually you don't have to be an XQP team member to subscribe, it's an open list).

XInclude goes Proposed Rec

The W3C has published the XInclude 1.0 Proposed Recommendation. Now there is only one step left for XInclude to become a W3C Recommendation.

That's what I call "just in time"! I just finished integrating XInclude.NET into the Mvp-Xml codebase, cleaning up the code and optimizing it using great goodies of Mvp-Xml such as XPathCache, XPathNavigatorReader etc., and planned to align the code with the recent XInclude CR - and here comes another spec refresh. As far as I can see, there is no new stuff and there are no syntax changes, just editorial fixes (such as mentioning XML 1.1 along with XML 1.0) and clarifications based on previous feedback. Comments are due by 29 October 2004. I expect to release the renewed XInclude.NET next week.

PS. For those unfamiliar with XInclude, the MSDN article "Combining XML Documents with XInclude" awaits you.