Signs on the Sand: November 2003 Archives

The power of XmlResolver

November 27, 2003 7:20 PM | No TrackBacks | Tags: XML

Finally I found the time to fully implement support for XmlResolver in XInclude.NET (see Extending XInclude.NET). Wow, this stuff looks so powerful! A friend of mine is writing an article about using resolvers in System.Xml, so no spoilers here; all I want is to illustrate what can be done now using XInclude.NET and a custom XmlResolver.

So, somebody wants to include a list of Northwind employees in a report XML document. Yeah, directly from a SQL Server database. Here comes the XInclude.NET solution: a custom XmlResolver, which queries the database and returns an XmlReader (via SQLXML of course).

report.xml:

<report>
  <p>Northwind employees:</p>
  <xi:include 
href="sqlxml://LOCO055/Northwind?query=
SELECT FirstName, LastName FROM Employees FOR XML AUTO"
xmlns:xi="http://www.w3.org/2001/XInclude"/>
</report>

The sqlxml:// URI scheme is a proprietary scheme, supported by my custom XmlResolver. LOCO055 is my SQL Server machine name, Northwind is the database I want to query and query is the query.
Here goes the SqlXmlResolver class:

using System;
using System.Xml;
using Microsoft.Data.SqlXml;

public class SqlXmlResolver : XmlUrlResolver {
  //Connection string template: {0} is the server, {1} is the database
  static string NorthwindConnString = 
    "Provider=SQLOLEDB;Server={0};" +
    "database={1};Integrated Security=SSPI";
  public override object GetEntity(Uri absoluteUri, 
          string role, Type ofObjectToReturn) {
    if (absoluteUri.Scheme == "sqlxml") {
      //Extract server and database names from the URI
      SqlXmlCommand cmd = 
        new SqlXmlCommand(string.Format(NorthwindConnString, 
        absoluteUri.Host, absoluteUri.LocalPath.Substring(1)));
      cmd.RootTag = "EmployeesList";
      //Extract SQL statement from the URI (only escaped spaces are decoded)
      cmd.CommandText = 
        absoluteUri.Query.Split('=')[1].Replace("%20", " ");
      return cmd.ExecuteXmlReader();
    } else {
      //Any other scheme is handled by the standard resolver
      return base.GetEntity(absoluteUri, role, ofObjectToReturn);
    }
  }
}

Not really a sophisticated one: it just checks whether the URI scheme is sqlxml, then extracts the data from the URI and runs the query via SQLXML plumbing. Then we can read report.xml via XIncludingReader:

XIncludingReader reader = new XIncludingReader("report.xml");
reader.XmlResolver = new SqlXmlResolver();
XPathDocument doc = new XPathDocument(reader);
...

And finally the result is:

<report>
  <p>Northwind employees:</p>
  <EmployeesList>
    <Employees FirstName="Nancy" LastName="Davolio"/>
    <Employees FirstName="Andrew" LastName="Fuller"/>
    <Employees FirstName="Janet" LastName="Leverling"/>
    <Employees FirstName="Margaret" LastName="Peacock"/>
    <Employees FirstName="Steven" LastName="Buchanan"/>
    <Employees FirstName="Michael" LastName="Suyama"/>
    <Employees FirstName="Robert" LastName="King"/>
    <Employees FirstName="Laura" LastName="Callahan"/>
    <Employees FirstName="Anne" LastName="Dodsworth"/>
  </EmployeesList>
</report>

That magic is supported by XInclude.NET version 1.2, which I'm going to release right now. Well, actually I don't think putting SQL into a URI is a good idea, but bear in mind that this is just a dummy sample to illustrate the power of XmlResolvers. Enjoy!
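A side note on the query extraction in the resolver above: it only decodes %20 and takes whatever sits between the first and second '=', so a statement such as WHERE x=1 would get cut off. Here is a slightly more careful sketch. SqlXmlUri and ExtractQuery are names made up for this illustration (they are not part of XInclude.NET or SQLXML), and it assumes Uri.UnescapeDataString is available, which is not the case in .NET 1.1.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class SqlXmlUri {
    // Returns the unescaped value of the "query" parameter, or null if absent.
    public static string ExtractQuery(Uri uri) {
        string queryString = uri.Query.TrimStart('?');
        foreach (string pair in queryString.Split('&')) {
            int eq = pair.IndexOf('=');
            if (eq < 0) continue;
            if (pair.Substring(0, eq) == "query") {
                // Everything after the first '=' is the value,
                // so an '=' inside the SQL text survives.
                return Uri.UnescapeDataString(pair.Substring(eq + 1));
            }
        }
        return null;
    }
}
```

A literal '&' inside the SQL would still have to be escaped as %26 in the URI, which is one more reason the whole SQL-in-URI idea is only good for demos.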

Mantra of the day

November 26, 2003 1:25 PM | No Comments | No TrackBacks | Tags: XML

XML is syntax, and only Unicode in angle brackets is real XML.
Elliotte Rusty Harold

Close your eyes and repeat it 100 times to yourself, then feel free to read xml-dev mail.

Exhausted

November 25, 2003 7:12 PM | No Comments | No TrackBacks | Tags: Personal

8 hours of meetings on an extremely boring topic... Oooooooh, I feel like I'm in a dead message queue.

Extending XInclude.NET

November 24, 2003 11:56 AM | No Comments | No TrackBacks | Tags: XML

It turns out people already use XInclude.NET and, even more, now they want to extend it! First, one user wanted to be able to resolve URIs himself, via a custom XmlResolver. I did that yesterday (download XInclude.NET v1.2beta if you're interested in such behaviour), but I didn't go beyond a call to XmlResolver.ResolveUri().

The new use case is about including XML documents generated on the fly. To avoid any interim layers like temporary files or HTTP calls, this can be implemented by extending the XmlResolver support further: XInclude.NET would also call the XmlResolver.GetEntity() method on custom resolvers. This way a custom XmlResolver may generate XML on the fly and return it, say, as an XmlReader for best performance. Sounds interesting, will do.
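To make the idea concrete, here is a minimal sketch of a resolver that generates XML on the fly in GetEntity(). The "gen" scheme, the class name and the generated element are all made up for this example. It returns a Stream, which is what the standard XmlResolver contract expects; returning an XmlReader instead is the XInclude.NET-specific option mentioned above.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical resolver: serves generated XML for URIs in a made-up "gen" scheme.
public class OnTheFlyResolver : XmlUrlResolver {
    public override object GetEntity(Uri absoluteUri,
            string role, Type ofObjectToReturn) {
        if (absoluteUri.Scheme == "gen") {
            // Build the document in memory: no temporary file, no HTTP call.
            string xml = "<generated at=\"" +
                DateTime.UtcNow.ToString("o") + "\"/>";
            return new MemoryStream(Encoding.UTF8.GetBytes(xml));
        }
        // Everything else goes through the default resolution logic.
        return base.GetEntity(absoluteUri, role, ofObjectToReturn);
    }
}
```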

Bookworm's joy

November 24, 2003 11:12 AM | No Comments | No TrackBacks | Tags: Ramblings

By the way, Fawcette XML and Web Services Magazine has published a free chapter of the book "A First Look at ADO.NET and System.Xml v. 2.0" by Alex Homer, Dave Sussman, and Mark Fussell. I devoured the chapter last night and now I think I'm going to buy the book to be prepared for the future. For my taste it's too data-oriented, but that's exactly what a document-oriented guy with an HTML/Docbook/XSL-FO past like me really needs.

Oh, and the recently published "The C# Programming Language" by Anders Hejlsberg et al of course!

Sued for antispam

November 23, 2003 10:45 AM | No Comments | No TrackBacks | Tags: Personal

This Wired report is overwhelming: the guy who sent threatening messages back to spammers who refused to unsubscribe him from their spam mailing list now faces up to five years in prison and a $250,000 fine. /. discussion here. Mark Pilgrim's prediction has been proven.

[Agitprop rant] On the Geneva Draft

November 22, 2003 10:29 PM | No Comments | No TrackBacks | Tags: Personal

Well, in fact I want my blog to be free of agitprop; I really got fed up with that stuff, having been born and raised in the USSR. But today I feel tired after jogging on the beach and then shopping with my wife for too long, so forgive me this one.
Note: if you happily have no idea what the Geneva Draft is, just skip this rant.

Tim Bray has ranted about the Geneva Draft and even published it on his site. Seems like he likes it so much that it makes him blind. I know Tim lived in Lebanon for some time and the permanent Middle East crisis makes his heart bleed, but this is just ridiculous, if not rude:

Murderous Warmongering Scum This document has been denounced by the current government of Israel (no surprise there) and the Swiss Government has received complaints from the World Jewish Congress for sponsoring it (astounding). The title of this section expresses my feelings for anyone who stands in the way of the hope of peace.

Oh boy, how familiar that smells, just like an editorial from some Soviet newspaper in the middle of the 1970s. I mean, really, as a matter of fun, "Israeli warmongers" was the most typical slogan in Soviet agitprop since Israel won the War of Independence. It's funny that Tim is using this slogan too, but it's sad at the same time, because according to his words the current government of Israel and probably all Israelis who voted for it (70% by the way, /me included) are just murderous warmongering scum.
Well, maybe I'm a murderous warmonger too, but I threw my copy of the Geneva Draft into the trash as soon as I got it, because as far as I understand it, it is trash.
Besides my belief that national losers (the authors of the Geneva Draft lost the last couple of elections) don't have any legitimate right to carry on state-level negotiations, I just don't believe in peace with terrorists, sorry Tim, just like many didn't believe in peace with Hitler. The "land-for-hope-for-peace" deal doesn't work; WWII and the last "intifada" proved that perfectly. Terrorists must be stopped, otherwise one day you will see a suicide bomber parking his heavily loaded car near your office.

An appeaser is one who feeds a crocodile, hoping it will eat him last.

That was said by Sir Winston Churchill.

Update: Don't get me wrong, I'm not a radical. I am (like the vast majority of Israelis) against the "Let's give up everything just for the hope of peace" approach, which the Geneva Draft represents. The road map plan is far more realistic - first stop terrorism, destroy Hamas and friends and then get the state, but not vice versa.

Don't think XQuery is like XSLT

November 18, 2003 6:10 PM | No Comments | No TrackBacks | Tags: XML

Interesting finding on XQuery from Elliotte Rusty Harold:

In XSLT 1.0 all output is XML. A transformation creates a result tree, which can always be serialized as either an XML document or a well-formed document fragment. In XSLT 2.0 and XQuery the output is not a result tree. Rather, it is a sequence. This sequence may contain XML; but it can also contain atomic values such as ints, doubles, gYears, dates, hexBinaries, and more; and there's no obvious or unique serialization for these things. For instance, what exactly are you supposed to do with an XQuery that generates a sequence containing a date, a document node, an int, and a parentless attribute? How do you serialize this construct? That a sequence has no particular connection to an XML document was very troubling to many attendees.

Looking at it now, I'm seeing that perhaps the flaw is in thinking of XQuery as like XSLT; that is, a tool to produce an XML document. It's not. It's a tool for producing collections of XML documents, XML nodes, and other non-XML things like ints. (I probably should have said it that way last night.) However, the specification does not define any concrete serialization or API for accessing and representing these non-XML collections. That's a pretty big hole left to implementers to fill.

Hmmm, that's kinda confusing. Let's see. Formally speaking, what XQuery produces is one(~~zero~~) or more instances of the XPath 2.0 and XQuery 1.0 Data Model (DM), which are then subject to the serialization process defined in the XSLT 2.0 and XQuery 1.0 Serialization spec. The problem (typo?) is that the XQuery spec says:

Serialization is the process of converting a set of nodes from the data model into a sequence of octets...

and thus doesn't mention what happens to items in the resulting DM that are not nodes but atomic values. I believe that's a mistake in the XQuery spec, because XSLT 2.0 and XQuery 1.0 Serialization handles that pretty well - it defines serialization of the DM including everything it can contain; in particular, atomic values are converted to their string representations.

Mark Pilgrim on weblog spam

November 18, 2003 5:21 PM | No Comments | No TrackBacks

Oh boy, that's even worse than I suspected:

Savor this moment, folks. You can tell your children stories of how, back in the early days of weblogging, you could print out the entire spam blacklist on a single sheet of paper. Maybe with two or three columns and a smallish font, but still. Boy, those were the days.
And they won’t last. They absolutely won’t last. They won’t last a month. The domain list will grow so unwieldy so quickly, you won’t know what hit you. It’ll get so big that it will take real bandwidth just to host it. Keeping it a free download will make you go broke. Code is free, but bandwidth never will be. Do you have a business plan? You’ll need one within 6 months.

A second year in Wonderland

November 18, 2003 2:50 PM | No Comments | No TrackBacks | Tags: Personal

By the way, I've rummaged a bit in the Google archive and found my first posting to the microsoft.public.xsl newsgroup. It was 2002-11-02, more than a year ago. I was a totally Java-oriented guy at that time, just starting to learn .NET and feeling like I was entering a new world. And the new world hooked me on its drugs rapidly and cruelly. That's all.

WordML is free

November 17, 2003 5:13 PM | No Comments | No TrackBacks | Tags: Office

Microsoft Announces Availability of Open and Royalty-Free License For Office 2003 XML Reference Schemas:

To ensure broad availability and access, Microsoft is offering the royalty-free license using XML Schema Definitions (XSDs), the cross-industry standard developed by the W3C. The license provides access to the schemas and full documentation to interested parties and is designed for ease of use and adoption. The Microsoft Office 2003 XML Reference Schemas include WordprocessingML (Microsoft Office Word 2003), SpreadsheetML (Microsoft Office Excel 2003) and FormTemplate XML schemas (Microsoft Office InfoPath 2003).

Wow, respect. I hope the next step will be standardizing the schemas, just as was done with CLI and C#. By the way, the "Generating Word documents using XSLT" approach I was talking about back in May is completely legal now and even kinda encouraged.

Funny enough, WordML is now called WordprocessingML, probably the longest ML-related acronym ever. Download the WordprocessingML schema and documentation now and get back to that link on 12/5/2003 to grab the SpreadsheetML (Microsoft Office Excel 2003) and FormTemplate XML schemas (Microsoft Office InfoPath 2003).

Idee fixe

November 17, 2003 2:25 PM | 2 Comments | No TrackBacks | Tags: XML

All morning I've been trying to get rid of the idee fixe of writing an XmlReader/XmlWriter-based XML updater. The aim is to be able to update XML without loading it into DOM or even XPathDocument (which, as rumored, is going to be editable in .NET 1.2). Stream-oriented reading via XmlReader, some on-the-fly logic (quite limited though - filtering, modifying values) in between, and then writing to XmlWriter. Cache-free and forward-only, just as XmlReader is. If you're aware of SAX filters you know what I'm talking about. But I want the filtering/updating logic (hmmm, did you notice I'm avoiding the term "transforming"?) to be expressed declaratively.

Obviously the key task is how to express and detect the nodes to be updated. If we go the XPath patterns way, we generally get limited to a single update per process, due to the forward-only restriction. Subsetting XPath can help though. The only way to evaluate an XPath expression without building a tree is the so-called ForwardOnlyXPathNavigator, aka XPathNavigator over XmlReader. This beast is mentioned sometimes in articles, but I'm not aware of any implementation available online yet. Btw, a friend of mine did that almost a year ago, maybe I can get him to publish it. As the name says, it limits XPath to forward axes only (the subset seems to be the same as Arpan Desai's SXPath) and of course can't evaluate more than one absolute location path. But it can evaluate multiple relative location paths, e.g. /foo/a, then b/c in

<foo>
    <a>
        <b>
            <c/>
        </b>
    </a>
</foo>

tree. Another way to express which nodes are to be updated is a [NodeType][NodeName] pattern, probably plus some simple attribute-based predicates. Sounds ugly, I know, but limiting the scope to a single node fits better with the forward-only way of thinking I'm trying to follow.

Another problem is how to express update semantics. I have no idea how to avoid inventing new syntax. Something like:

<update match="/books/book[@title='Effective XML']">
    <set-attribute name="on-load" value="Arthur"/>
</update>

I have no idea if it's really feasible to implement though. All unmatched nodes should be passed forward to the result untouched; on a matched one the update logic should be evaluated, and then processing goes on.

Yes, I'm aware of STX, but I feel uneasy about this technology. Too coupled to SAX (CDATA nodes in the data model, ugh!), assignable variables etc. No, I'm talking about a different thing, an even more lightweight one (though even more limited).

Does it make any sense, huh?
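To show the plumbing I have in mind, here is a rough sketch of the non-declarative core: a forward-only copy from XmlReader to XmlWriter that sets one attribute on matched elements and passes everything else through. It matches by element local name only, not by an XPath pattern, and all names (StreamingUpdater, SetAttribute, Run) are invented for this example. It assumes the XmlReader.Create/XmlWriter.Create factories, and it skips the XML declaration and DOCTYPE for brevity.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical sketch of a cache-free, forward-only updater.
public static class StreamingUpdater {
    public static void SetAttribute(XmlReader reader, XmlWriter writer,
            string elementName, string attrName, string attrValue) {
        while (reader.Read()) {
            switch (reader.NodeType) {
                case XmlNodeType.Element: {
                    // Capture these before moving onto the attributes.
                    bool isEmpty = reader.IsEmptyElement;
                    bool matched = reader.LocalName == elementName;
                    bool replaced = false;
                    writer.WriteStartElement(reader.Prefix,
                        reader.LocalName, reader.NamespaceURI);
                    while (reader.MoveToNextAttribute()) {
                        if (matched && reader.Name == attrName) {
                            // Overwrite the existing attribute value.
                            writer.WriteAttributeString(attrName, attrValue);
                            replaced = true;
                        } else {
                            writer.WriteAttributeString(reader.Prefix,
                                reader.LocalName, reader.NamespaceURI,
                                reader.Value);
                        }
                    }
                    // Add the attribute if the matched element lacked it.
                    if (matched && !replaced)
                        writer.WriteAttributeString(attrName, attrValue);
                    if (isEmpty) writer.WriteEndElement();
                    break;
                }
                case XmlNodeType.EndElement:
                    writer.WriteFullEndElement();
                    break;
                case XmlNodeType.Text:
                    writer.WriteString(reader.Value);
                    break;
                case XmlNodeType.CDATA:
                    writer.WriteCData(reader.Value);
                    break;
                case XmlNodeType.Whitespace:
                case XmlNodeType.SignificantWhitespace:
                    writer.WriteWhitespace(reader.Value);
                    break;
                case XmlNodeType.Comment:
                    writer.WriteComment(reader.Value);
                    break;
                case XmlNodeType.ProcessingInstruction:
                    writer.WriteProcessingInstruction(reader.Name, reader.Value);
                    break;
            }
        }
    }

    // Convenience wrapper for trying the updater on an in-memory string.
    public static string Run(string xml, string elementName,
            string attrName, string attrValue) {
        StringWriter output = new StringWriter();
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        using (XmlWriter writer = XmlWriter.Create(output, settings)) {
            SetAttribute(reader, writer, elementName, attrName, attrValue);
        }
        return output.ToString();
    }
}
```

Calling StreamingUpdater.Run(xml, "book", "on-load", "Arthur") would roughly correspond to the set-attribute rule above, minus the title predicate. The declarative layer would sit on top of a loop like this one and decide, node by node, whether a rule applies.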

Daily asana for efficient coding

November 16, 2003 2:30 PM | 3 Comments | No TrackBacks | Tags: Personal

Here is the sacred list of simple exercises to improve your karma and become a real guru. Just for neophytes and those who somehow missed this practice:

Incremental XSLT

November 13, 2003 5:37 PM | No Comments | No TrackBacks | Tags: XSLT

Interesting article about incremental XSLT. I only wish it comes true some day.

The great nine get updated - something to read at night

November 13, 2003 12:58 PM | No Comments | 1 TrackBack | Tags: XML

Last Call Working Drafts for XSLT/XPath/XQuery have been published. The Last Call period ends 15 February 2004. Oh my, when am I going to read it all?

Quote of the Day

November 13, 2003 12:36 PM | No Comments | No TrackBacks | Tags: Ramblings

From saxon-love-in-department:

>>
>> How did Michael do it .
>>

The biggest factors are a total absence of project managers, marketeers, junior programmers, and paying customers who think they know best.

Michael Kay

One more surprise from Longhorn team - OPath query language

November 12, 2003 11:21 AM | 1 Comment | No TrackBacks | Tags: .NET

Just found a new beast in the Longhorn SDK documentation - the OPath language:

The OPath language is the query language used to query for objects using an ObjectSpace. The syntax of OPath also allows you to query for objects using standard object oriented syntax. OPath enables you to traverse object relationships in a query as you would with standard object oriented application code and includes several operators for complex value comparisons.

The OPath expression Orders[Freight > 5].Details.Quantity > 50 should remind you of something familiar. Object-oriented XPath crossbred with SQL? Hmm, xml-dev flamers would love it.

The approach seems to be exactly the opposite of ObjectXPathNavigator's - instead of representing object graphs in XPathNavigable form, a brand new query language is invented to fit the data model. Actually that makes some sense: XPath, as an XML-oriented query language, can't fit everything. I wonder what Dare thinks about it. More studying is needed, but as for me (note I'm not a DBMS-oriented guy though) it's still too crude.

XInclude is Working Draft again

November 11, 2003 9:44 PM | No Comments | No TrackBacks | Tags: XML

The day started with bad news from W3C - XInclude 1.0 has been withdrawn back to Working Draft maturity level. Actually Last Call WD, but a step backward anyway. The main reason is most likely an architectural one - it seems the URI syntax with XPointers in the fragment identifier part has been considered too revolutionary, so now they have split it into two separate attributes: the href attribute contains the URI of the resource to include and the xpointer attribute contains the XPointer identifying the target portion of the resource. So instead of

<xi:include href="books.xml#bk101"/>

another syntax should be used:

<xi:include href="books.xml" xpointer="bk101"/>

While it sounds good from the "Make structure explicit through markup" perspective, it does smell bad with regard to URI syntax, which has allowed fragment identifiers for years.

Another new feature - now it's possible to control HTTP content negotiation via the new accept, accept-charset and accept-language attributes. Well, again quite dubious stuff. And a possible security hole, as Elliotte pointed out.

Also the XInclude namespace is now "http://www.w3.org/2003/XInclude", but the old one should be supported somehow too.

Anyway I have to update XInclude.NET library now. No big changes fortunately, so I'm going to release it in a couple of days.
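For anybody migrating documents by hand, the mechanical part of the change is small. Here is a sketch of splitting an old-style href into the new href/xpointer pair; XIncludeHref and Split are names made up for this illustration, not XInclude.NET API, and the sketch ignores any percent-escaping inside the fragment.

```csharp
using System;

// Hypothetical migration helper, for illustration only.
public static class XIncludeHref {
    // "books.xml#bk101" becomes href "books.xml" and xpointer "bk101".
    // When there is no fragment, xpointer is set to null.
    public static void Split(string oldHref,
            out string href, out string xpointer) {
        int hash = oldHref.IndexOf('#');
        if (hash < 0) {
            href = oldHref;
            xpointer = null;
        } else {
            href = oldHref.Substring(0, hash);
            xpointer = oldHref.Substring(hash + 1);
        }
    }
}
```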

20 minutes of real fun

November 11, 2003 5:13 PM | No Comments | No TrackBacks | Tags: Personal

Via Carnage4Life: Top 50 IRC Quotes
My favorite one:
*** Quits: TITANIC (Excess Flood)

"How to XSLT CSV file" revisited

November 10, 2003 3:37 PM | 10 Comments | No TrackBacks | Tags: .NET

Well, it's an extremely well-chewed topic, well covered by many posters, but since people keep asking I feel I have to give a complete example of the most effective way (IMO) of solving this old recurring question - how to transform a CSV or tab-delimited file using XSLT?

The idea is to represent non-XML data as pure XML in order to leverage everybody's favorite XML hammer - XSLT. I want to make it clear that approaching the problem this way doesn't abuse XSLT as an XML transformation language. The non-XML data is represented as XML, and XSLT operates on it through the prism of the XPath data model, having no idea it was actually a CSV file on the hard disk.

Let's say what's given is this tab-delimited file, containing some info about customers, such as customer ID, name and address. You need to produce an HTML report with customers grouped by country. How? Here's how: all you need is XmlCsvReader (kudos to Chris Lovett), an XSLT stylesheet and a couple of lines of code to glue the solution together:

Code:

using System;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;
using System.IO;
using Microsoft.Xml;

public class Sample {
    public static void Main() {
        //XMLCSVReader setup
        XmlCsvReader reader = new XmlCsvReader();
        reader.Href = "sample.txt";
        reader.Delimiter = '\t';
        reader.FirstRowHasColumnNames = true;
		
        //Usual transform
        XPathDocument doc = new XPathDocument(reader);
        XslTransform xslt = new XslTransform();
        xslt.Load("style.xsl");
        StreamWriter sw = new StreamWriter("report.html");
        xslt.Transform(doc, null, sw);
        sw.Close();
    }
}

XSLT stylesheet

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:key name="countryKey" match="/*/*" use="country"/>
    <xsl:template match="root">
        <html>
            <head>
                <title>Our Customers Worldwide</title>
            </head>
            <body>
                <table style="border:thin solid orange;">
                    <xsl:for-each select="*[count(.|key('countryKey', 
						country)[1])=1]">
                        <xsl:sort select="country"/>
                        <tr>
                            <th colspan="2" 
                                style="text-align:center;color:blue;">
                                <xsl:value-of select="country"/>
                            </th>
                        </tr>
                        <tr>
                            <th>Customer Name</th>
                            <th>Account Number</th>
                        </tr>
                        <xsl:apply-templates 
                            select="key('countryKey', country)"/>
                    </xsl:for-each>
                </table>
            </body>
        </html>
    </xsl:template>
    <xsl:template match="row">
        <tr>
            <xsl:if test="position() mod 2 = 1">
                <xsl:attribute name="bgcolor">silver</xsl:attribute>
            </xsl:if>
            <td>
                <xsl:value-of 
                select="concat(fname, ' ',mi, ' ', lname)"/>
            </td>
            <td>
                <xsl:value-of select="account_num"/>
            </td>
        </tr>
    </xsl:template>
</xsl:stylesheet>

Resulting HTML:

Canada
Customer Name	Account Number
Derrick I. Whelply	87470586299
Michael J. Spence	87500482201
Brenda C. Blumberg	87544797658
Mexico
Customer Name	Account Number
Sheri A. Nowmer	87462024688
Rebecca Kanagaki	87521172800
Kim H. Brunner	87539744377
USA
Customer Name	Account Number
Jeanne Derry	87475757600
Maya Gutierrez	87514054179
Robert F. Damstra	87517782449
Darren M. Stanz	87568712234

The main virtue of this approach is that all transformation and presentation logic is concentrated in only one place - the XSLT stylesheet (add CSS according to your taste), while the C# code is fully agnostic about the data being processed. In the same fashion a CSV file can be queried using XQuery or XPath. Once the data is represented as XML, all doors are open.
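As a small illustration of the XPath route, here is a sketch that counts customers by country. To stay self-contained it reads the XML from a string shaped like what the stylesheet above expects (root/row/country); with XmlCsvReader you would pass the reader to the XPathDocument constructor instead, as in the sample code. CsvXPathDemo and CountByCountry are names invented for this example.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

// Hypothetical demo class, for illustration only.
public static class CsvXPathDemo {
    // Counts row elements whose country child equals the given value.
    // The country is assumed not to contain an apostrophe.
    public static int CountByCountry(string xml, string country) {
        XPathDocument doc = new XPathDocument(new StringReader(xml));
        XPathNavigator nav = doc.CreateNavigator();
        XPathNodeIterator rows =
            nav.Select("/root/row[country='" + country + "']");
        return rows.Count;
    }
}
```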

XML 1.1 is coming

November 6, 2003 1:03 PM | No Comments | No TrackBacks | Tags: XML

In W3C news:

5 November 2003: W3C is pleased to announce the advancement of Extensible Markup Language (XML) 1.1 and Namespaces in XML 1.1 to Proposed Recommendations. Comments are welcome through 5 December. XML 1.1 addresses Unicode, control character, and line ending issues. Namespaces 1.1 incorporates errata corrections and provides a mechanism to undeclare prefixes.

For those from another planet, here is a summary of changes:

Namespaces can be undeclared now, using xmlns:foo="" syntax
Namespace IRIs instead of namespace URIs
Change in the allowed-in-names-characters philosophy - in XML 1.1 everything that is not forbidden (for a specific reason) is permitted, including those characters not yet assigned
Two more line-end characters - NEL (#x85) and the Unicode line separator character, #x2028
Control characters from #x1 to #x1F are now allowed in XML 1.1 (provided they are escaped as character references)

Dreams come closer

November 6, 2003 12:35 PM | No Comments | No TrackBacks | Tags: .NET

Seems like the old dream of deeply extending VisualStudio.NET, up to adding new languages, editors and debuggers, without funny-not-for-me COM programming but using beloved C#, is finally coming true! Microsoft is inviting beta testers to the VSIP Extras Beta program. The killer feature:

.NET Framework support. Interop assemblies are provided to allow VSIP packages to be developed in C#, managed extensions for C++, or Visual Basic. New samples have been provided in managed languages and the documentation has been updated to include information about managed code development.

Go fill in the Beta Nomination Survey, maybe you'll be lucky enough to be chosen.

I've got a bunch of ideas, from XSLT debugger to XQuery editor, postponed till this can be done in C#, because I'm really weak in COM.

Everything that has a beginning has an end

November 5, 2003 6:55 PM | 2 Comments | No TrackBacks | Tags: Personal

Well, it's over. Just came back from The Matrix Revolutions. A couple of spoilers - it's not really about revolution, but about peace talks. Nothing unexpected, the Hero is sacrificing himself to save Mankind from the Dragon, an eternal archetype...
Anyway, this installment is certainly way better than the Reloaded one.

Rest in peace, DOM

November 4, 2003 2:04 PM | No Comments | No TrackBacks | Tags: .NET

While Don Box is declaiming the glory of VB, Mark Fussell is busy with quite the opposite business - he's reading the burial service over XmlDocument aka DOM, which is worth quoting in full:

The XML DOM is dead. Long live the DOM. 

Dearest DOM, it is with little remorse,
to see that your API has run its course.
You expose your nodes naked and bare,
with no chance of any optimizations there.
Your (cough) data model is just to complex,
and causes developers to vex
over how to deal with CDATA, notations and entity refs. 

So it is with a small tear welling in my eye,
that I watch the completion of your demise.
In .NET the XPathDocument has now taken your throne,
as the king of the XML API-dom.
Goodbye DOM, just disappear and die,
I will not miss you with your unweildly API.
Goodbye DOM, goodbye.

RIP DOM. Viva XPath!

nxslt 1.3 released

November 2, 2003 12:30 PM | No Comments | No TrackBacks | Tags: .NET

So, nxslt version 1.3 is at your service. New features include:

Support for XML Inclusions (XInclude) 1.0 Candidate Recommendation. Done by incorporating XInclude.NET library into nxslt. XML Inclusions are processed in both source XML and XSLT stylesheet, by default it's turned on and can be disabled using -xi option.
Improved EXSLT support. Now nxslt leverages EXSLT.NET implementation. That means more EXSLT extension functions supported with much better performance and compatibility.
Small advanced feature for EXSLT.NET developers - support for external EXSLT.NET assembly.

Download it here or here (GotDotNet). It's free of course. Thorough documentation is here.
