Update: Fixed minor bug in implementation of IsOfClass.
In a previous post, I talked about how I'm hoping to be able to use Linq for XML to allow me to process XHTML, my current favorite data serialization format. At the end of that post, I wrote code more or less like this:
var alternates = from ul
    in document.Element(xhtmlns + "html").Element(xhtmlns + "body").Elements(xhtmlns + "ul")
    where ul.Attribute("class").Value == "alternates"
    select from li
        in ul.Elements(xhtmlns + "li")
        select li.Value;
The idea was to be able to pull the values out of a bunch of XHTML list items. The problem with this code is that it doesn't really give me what I want. If you were to look at the type of the object referred to by alternates, you'd discover that it's a
Which - if you can read that expression without going blind - indicates that what I've got is essentially "a sequence of a sequence of strings". No, that's not a typo: it's a sequence of sequences, and iterating over it with a nested loop is sort of annoying.
Fortunately, it appears that Don Box reads this blog, or at least read that post. :) He's the co-author of the rather excellent article found here, and had I read it I would have known the solution. But even though I hadn't (I have now - you should, too), he was kind enough to drop by with a comment that made everything work. Here's the code I'm using now:
var alternates =
I've made a few changes beyond just the Linq bits, but I'll explain those in a minute. The key to making the query work was removing the first "select". Having the second "from" clause follow without an intervening "select" results in a SelectMany method call (all the Linq keywords like select, from, where, etc. are just shorthand for method calls). And that's exactly what we want: SelectMany collapses the query to a single dimension. Read the article for a better explanation. With this change, the query now returns something we can iterate directly over with a "foreach (string alternate in alternates)". Nice.
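To make the difference concrete, here is a minimal sketch that runs both query shapes against a small made-up document. The sample markup, the class name SelectManyDemo, and the helper AlternateLists are assumptions for illustration and are not from the original code. It uses the System.Xml.Linq API as it eventually shipped, which may differ from the preview bits this post was written against.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SelectManyDemo
{
    // The XHTML namespace; the field name mirrors the "xhtmlns" used earlier.
    static readonly XNamespace xhtmlns = "http://www.w3.org/1999/xhtml";

    // Made-up sample document, for illustration only.
    public const string Sample =
        "<html xmlns='http://www.w3.org/1999/xhtml'><body>" +
        "<ul class='alternates'><li>one</li><li>two</li></ul>" +
        "<ul class='other'><li>ignored</li></ul>" +
        "<ul class='alternates'><li>three</li></ul>" +
        "</body></html>";

    // Hypothetical helper: every <ul class="alternates"> directly under <body>.
    static IEnumerable<XElement> AlternateLists(XDocument document)
    {
        return document.Element(xhtmlns + "html")
                       .Element(xhtmlns + "body")
                       .Elements(xhtmlns + "ul")
                       .Where(ul => (string)ul.Attribute("class") == "alternates");
    }

    // Nested select: one inner sequence per matching <ul>.
    public static IEnumerable<IEnumerable<string>> Nested(XDocument document)
    {
        return from ul in AlternateLists(document)
               select from li in ul.Elements(xhtmlns + "li")
                      select li.Value;
    }

    // Two from clauses: the compiler turns this into SelectMany, so the result is flat.
    public static IEnumerable<string> Flat(XDocument document)
    {
        return from ul in AlternateLists(document)
               from li in ul.Elements(xhtmlns + "li")
               select li.Value;
    }

    // The same flat query written directly as a method call.
    public static IEnumerable<string> FlatMethodSyntax(XDocument document)
    {
        return AlternateLists(document)
            .SelectMany(ul => ul.Elements(xhtmlns + "li"), (ul, li) => li.Value);
    }

    public static void Main()
    {
        XDocument document = XDocument.Parse(Sample);

        // Prints one, two, three on separate lines.
        foreach (string alternate in Flat(document))
            Console.WriteLine(alternate);
    }
}
```

With the nested form you would need a foreach inside a foreach to reach the strings. The flat form and the explicit SelectMany call produce the same single sequence.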
As for the other changes, there were a couple. One was to create a class called Xhtml with a static method called Tag that creates my XNames. This just cleans up the code a little bit from all that "xhtmlns +" stuff I had before. I also created this extension method:
public static bool IsOfClass(this XElement element, string className)
{
    // TODO: this should really account for the fact that the class
    // attribute is multivalued - i.e. it's legal to have
    // class="foo bar quux", and we should return true for any of
    // foo, bar, or quux.
    XAttribute classAttribute = element.Attribute("class");
    return classAttribute != null && classAttribute.Value == className;
}
to let me use IsOfClass on an XElement - I just think the syntax is cleaner, and as I do more and more XHTML processing, stuff like this should help contribute to my goal of a reasonable syntax.
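For reference, here is one way the two helpers could look, including a possible answer to the TODO about multivalued class attributes. This is a sketch under assumptions, not the code from this post: only the names Xhtml, Tag, and IsOfClass come from the text above. The Namespace field, the XhtmlExtensions class, and the whitespace-splitting approach are all hypothetical.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical sketch of the helper class described above; only the names
// Xhtml and Tag come from the post, the body is an assumption.
public static class Xhtml
{
    public static readonly XNamespace Namespace = "http://www.w3.org/1999/xhtml";

    // Builds a namespace-qualified XName, replacing the xhtmlns + "..." pattern.
    public static XName Tag(string localName)
    {
        return Namespace + localName;
    }
}

public static class XhtmlExtensions
{
    private static readonly char[] Whitespace = { ' ', '\t', '\r', '\n', '\f' };

    // Variant of IsOfClass that treats the class attribute as a
    // whitespace-separated list, so class="foo bar quux" matches foo, bar, or quux.
    public static bool IsOfClass(this XElement element, string className)
    {
        XAttribute classAttribute = element.Attribute("class");
        if (classAttribute == null)
            return false; // no class attribute at all

        return classAttribute.Value
            .Split(Whitespace, StringSplitOptions.RemoveEmptyEntries)
            .Contains(className);
    }
}

public static class Program
{
    public static void Main()
    {
        XElement ul = new XElement(Xhtml.Tag("ul"), new XAttribute("class", "foo bar quux"));

        Console.WriteLine(ul.IsOfClass("bar")); // True
        Console.WriteLine(ul.IsOfClass("ba"));  // False: whole tokens only
        Console.WriteLine(new XElement(Xhtml.Tag("li")).IsOfClass("foo")); // False: no class attribute
    }
}
```

Splitting on whitespace and comparing whole tokens avoids the false positives a plain substring check would give, such as "ba" matching "bar". The comparison is case-sensitive, which matches how XHTML treats class names.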