When working with XPath, be it in XSLT, C# or JavaScript, apostrophes and quotes in string literals are the most annoying thing, and they drive people crazy. The classic example is a selection like foo[bar="Tom's BBQ"]. This one can actually be written correctly as source.selectNodes("foo[bar=\"Tom's BBQ\"]"), but what if your string is something as crazy as A'B'C"D"? XPath syntax doesn't allow such a value to be used as a string literal at all: it can't be surrounded with either apostrophes or quotes. How do you eliminate such annoyances?
The solution is simple: don't build XPath expressions by concatenating strings. Use variables as you would in any other language. Say no to
selectNodes("foo[bar=\"Tom's BBQ\"]")
and say yes to
selectNodes("foo[bar=$var]")
How do you implement this in .NET? The framework provides all the functionality you need in XPathExpression (System.Xml.XPath) together with XsltContext and IXsltContextVariable (System.Xml.Xsl), but using them directly is pretty cumbersome and too geeky for the majority of developers, who just love the SelectNodes() method for its simplicity.
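To give an idea of what "cumbersome" means here, below is a minimal sketch of the do-it-yourself route. The class names StringVariable, VariableContext and Demo are made up for illustration and are not part of the framework or of Mvp.Xml; the sketch also assumes string-valued variables only and skips custom functions entirely.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

// Hypothetical helper: an XPath variable that always evaluates to a fixed string.
class StringVariable : IXsltContextVariable
{
    private readonly string value;
    public StringVariable(string value) { this.value = value; }

    public object Evaluate(XsltContext context) { return value; }
    public bool IsLocal { get { return false; } }
    public bool IsParam { get { return false; } }
    public XPathResultType VariableType { get { return XPathResultType.String; } }
}

// Hypothetical helper: a context that resolves $name from a dictionary.
class VariableContext : XsltContext
{
    private readonly Dictionary<string, string> vars = new Dictionary<string, string>();

    public void Add(string name, string value) { vars[name] = value; }

    public override IXsltContextVariable ResolveVariable(string prefix, string name)
    {
        string value;
        if (!vars.TryGetValue(name, out value))
            throw new ArgumentException("Undefined XPath variable: $" + name);
        return new StringVariable(value);
    }

    // No custom functions in this sketch; built-in XPath functions still work.
    public override IXsltContextFunction ResolveFunction(
        string prefix, string name, XPathResultType[] argTypes) { return null; }

    public override bool PreserveWhitespace(XPathNavigator node) { return true; }
    public override int CompareDocument(string baseUri, string nextBaseUri) { return 0; }
    public override bool Whitespace { get { return true; } }
}

class Demo
{
    static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<root><foo><bar/></foo></root>");
        // Set the awkward value through the DOM so no XML escaping is needed.
        doc.SelectSingleNode("/root/foo/bar").InnerText = "A'B'C\"D\"";

        VariableContext ctx = new VariableContext();
        ctx.Add("var", "A'B'C\"D\"");

        XPathNavigator nav = doc.CreateNavigator();
        XPathExpression expr = nav.Compile("//foo[bar=$var]");
        expr.SetContext(ctx); // $var is resolved through ctx at evaluation time

        XPathNodeIterator matches = nav.Select(expr);
        Console.WriteLine(matches.Count); // number of matching foo elements
    }
}
```

The value never becomes part of the expression text, so there is nothing to quote or escape. The price is two helper classes and five abstract members to override before you can select a single node.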
The Mvp.Xml project comes to the rescue, providing the XPathCache class:
XPathCache.SelectSingleNode("//foo[bar=$var]", doc, new XPathVariable("var", "A'B'C\"D\""))
And this is not only stunningly simple, but also safe. Remember XPath injection attacks?
You can download the latest Mvp.Xml v2.0 drop from our new project homepage on CodePlex.
Or, you can hack it by replacing apostrophes with a single right quotation mark (Alt + 0146).
Yes, you are completely right, Dimitre. I still live in the 1.0 world. I have to start thinking 2.0.
> The problem is that XPath allows string
> literals to be enclosed only in "" or '',
> with no escaping supported. You can't represent
> the string A'B'C"D" as an XPath string literal.
More precisely, the above statement is true for XPath 1.0.
Not so in XPath 2.0, where quotes and apostrophes can be escaped by doubling:
[75] EscapeQuot ::= '""'
[76] EscapeApos ::= "''"
http://www.w3.org/TR/xpath20/#doc-xpath-EscapeQuot
Cheers,
Dimitre Novatchev
Nope. I have no idea if this is possible in regular expressions. But then, I'm neither a regexp expert nor a fan. Ask Roy Osherove, he must know.
nice trick! thanks for sharing
The problem of quotes is also very annoying when using .NET regular expressions. Do you know a similar trick that I can use in these cases?
Yes, I love ''' in Python. But the problem is not in the host language, be it Python, C# or JavaScript. The problem is that XPath allows string literals to be enclosed only in "" or '', with no escaping supported. You can't represent the string A'B'C"D" as an XPath string literal.
Python allows triple quotes to avoid this issue.