CraigBlog: April 2006

Thursday, April 27, 2006

Irony, At Least, Is Highly Available

It's a great irony that I'm spending today working with a client trying to design a system that must be highly available, and simultaneously struggling with my email server becoming unavailable. For added contrast, the problem appears to be with the company that hosts my backup mail servers - a service I pay for in the name of high availability. :p

At any rate, I hope this will get sorted out soon, because at the moment our email is bouncing immediately rather than simply queuing. If you're trying to reach me (or my wife) and you got a bounce, at least now you know what's happening. You can always use my gmail address instead.

Tuesday, April 25, 2006

It's All Free, Y'All

Phil Haack (one of my favorite bloggers, I'm not ashamed to admit) makes an excellent point. To that end, I hereby state the following:

Any code that appears on this blog, or in the sections of the Pluralsight wiki, for which I own the copyright, is hereby licensed under the MIT License, unless otherwise stated.

I'll try to make a practice of being more explicit in the future when I post code. In the meantime, you're covered.

Wednesday, April 19, 2006

The Perils of WSDL First (Again)

Tim and Aaron have both pitched in on this Perils of WSDL First thread. I just want to point out that I largely agree with what they are saying. Which is good, because disagreeing with either of them on matters pertaining to web services would be enough to make someone with even my huge ego question themselves! ;)

They advocate use of an abstraction layer between your domain types and the XmlSerializable types you use as your web service parameters. This is a fantastic idea, and it's the way I write web services myself, having been bitten more than once by direct exposure of domain classes. Generally speaking, this will go a long way towards preventing the sort of problems I warn about in my original post. In many cases, it will even be sufficient, especially when coupled with Tim's idea of having the serializable types be in a separate assembly owned by someone who understands the implications of code changes on the wire protocol. And the DCS stuff that Aaron talks about looks like an excellent further step in the right direction.

So if I still appear skittish, it's because I know that most people don't program this way. Most people follow the path of least resistance, and will hesitate to write the boring, repetitive mapping code involved in the most straightforward implementation of such an abstraction layer. Yes, I realize more sophisticated implementations are possible, but I'm talking here about people and organizations that don't necessarily recognize the long-term benefits of such a setup. As a consultant, I've seen this more than once.
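To make that cost concrete, here is a minimal sketch of the straightforward, hand-written version of such a layer. Every name in it (Customer, CustomerMessage, CustomerMapper) is hypothetical and invented for illustration; none of it comes from Tim's or Aaron's posts.

```csharp
using System;
using System.Globalization;
using System.Xml.Serialization;

// Hypothetical domain type: free to change shape as the business logic evolves.
public class Customer
{
    public string FirstName;
    public string LastName;
    public DateTime Joined;
}

// Hypothetical wire type: its only job is to define what goes over the wire.
[XmlRoot("Customer")]
public class CustomerMessage
{
    public string FullName;
    public string JoinedOn;
}

// The boring, repetitive part: copy values from one shape to the other.
public static class CustomerMapper
{
    public static CustomerMessage ToMessage(Customer customer)
    {
        CustomerMessage message = new CustomerMessage();
        message.FullName = customer.FirstName + " " + customer.LastName;
        message.JoinedOn = customer.Joined.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
        return message;
    }
}

public static class Program
{
    public static void Main()
    {
        Customer customer = new Customer();
        customer.FirstName = "Pat";
        customer.LastName = "Example";
        customer.Joined = new DateTime(2006, 4, 19);

        // Only the wire type is ever handed to the serializer.
        CustomerMessage message = CustomerMapper.ToMessage(customer);
        new XmlSerializer(typeof(CustomerMessage)).Serialize(Console.Out, message);
    }
}
```

The point of the split is that Customer can be refactored freely, while any change to CustomerMessage is visibly a change to the contract.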

The other thing that makes me a bit nervous is that even with an abstraction layer, you still have to be very careful not to muck with it in ways that would change the contract - a fact that both Tim and Aaron are clearly aware of. This is obvious to people who've been doing web services for a while, but for a lot of developers the angle brackets are still invisible. Basically, I'm trying to point out that web service development needs to be "both first". Fragile code sucks; the more things you have to remember, the more likely you are to make a mistake.

Whether or not any of this means your organization needs to go whole-hog and implement IXmlSerializable really depends on the dynamics of your development process. Probably you don't need to go that far, especially given XmlElementAttribute.Order. At the end of the day, just remember that when you let the infrastructure generate your XML automatically, seemingly innocuous changes in your code can create breaking changes in your wire protocol. Remember it, and use appropriate techniques to address the risk.

Tuesday, April 18, 2006

Overkill? I think not

It's good to see Tim blogging again. Hopefully he'll keep it up this time (nudge, nudge, Tim). At any rate, I'll blame his lack of practice for his post "Craig urges overkill, XmlSerializer sky not falling". Either that, or he had a high temperature/blood alcohol level. :)

Tim writes:

Craig got caught in a very particular set of circumstances. First, he started with WSDL. Then he hand-wrote his serializable types. Then he followed his preferred set of rules for ordering members of those types alphabetically.

The implication here is that unless you're doing all those things, you won't have the issues I describe in my post. This is simply not true. The problem I describe is an issue for anyone who does not explicitly control the order of serialization of their web service-visible types. Period. While I wouldn't exactly say the sky is falling, this is definitely a Big Deal.

I think the reason you don't see this occurring as a problem more often in the wild is that people tend to write .NET clients for their .NET web services, and XmlSerializer doesn't care about order. Or, more generally, schema validity. But if reach is important to your web service, you should.

I actually talked with Tim about this on the phone, and it came out during the conversation that the problem is even worse than I first thought. I had detected the issue with return types, but the fact of the matter is that if you reorder your type members for either input or output parameters, you change the generated schema in the WSDL. And that's a breaking change (from a schema standpoint).
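One way to see the schema change for yourself is to ask the serialization infrastructure what schema it infers from a type. The following is a minimal sketch, assuming XmlReflectionImporter and XmlSchemaExporter from System.Xml.Serialization are available on your platform. The V1 and V2 namespaces and the SchemaDump helper are made up for the example.

```csharp
using System;
using System.IO;
using System.Xml.Schema;
using System.Xml.Serialization;

// Two versions of the same type; only the member order differs.
namespace V1 { public class Person { public string Name; public int Age; } }
namespace V2 { public class Person { public int Age; public string Name; } }

public static class SchemaDump
{
    // Returns the XSD that the serialization infrastructure infers for a type.
    public static string For(Type type)
    {
        XmlSchemas schemas = new XmlSchemas();
        XmlSchemaExporter exporter = new XmlSchemaExporter(schemas);
        XmlTypeMapping mapping = new XmlReflectionImporter().ImportTypeMapping(type);
        exporter.ExportTypeMapping(mapping);

        StringWriter text = new StringWriter();
        foreach (XmlSchema schema in schemas)
        {
            schema.Write(text);
        }
        return text.ToString();
    }
}

public static class Program
{
    public static void Main()
    {
        // Compare the order of the elements inside each xs:sequence.
        Console.WriteLine(SchemaDump.For(typeof(V1.Person)));
        Console.WriteLine(SchemaDump.For(typeof(V2.Person)));
    }
}
```

If the two dumps list Name and Age in a different order inside the sequence, then a client built against the first schema is looking at a different contract than the one the second version advertises.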

Which really was my whole point: you need to be EXCEEDINGLY careful with your types unless you implement IXmlSerializable or use XmlElementAttribute.Order. Reordering type members is just too easy for a developer to do without thinking about it, and isn't the sort of thing that's easy to catch as critical even if your team reviews all source changes.

As Tim points out, implementing the read side of IXmlSerializable is a royal pain most of the time (you'll note I only said "consider" using it), but XmlElementAttribute.Order is pretty easy. Of course, it's only available in .NET 2.0. :p

Friday, April 14, 2006

FlexWiki 2.0 Progress - The Big Commit

One of the things I've been working on - a little at a time, usually about half an hour a day - is a fairly extensive rewrite of FlexWiki. We're calling it FlexWiki 2.0, and not coincidentally, it involves an upgrade to .NET/ASP.NET 2.0.

I've been working on it for what feels like a loooong time, and for various reasons (some good, some less so) I hadn't checked in until today. Well, today I checked in the code - woohoo! So now we have two branches in CVS, which will make life ever so slightly more complicated, but which will allow me to sleep better at night. I do nightly backups of all my data, but that's not really the same as having it under version control at SourceForge.

This partial rewrite was driven by the desire to give FlexWiki a better authorization model. I still haven't even started on that part of it - first I had to untangle the existing caching code from the storage engine, and separate out a bunch of special processing that deals with something called "backing topics". Don't ask.

Anyway, what I've got now is a much cleaner, more extensible design than what was there before. Of course, I still have to do performance testing once I finish coding the new security and caching features, and that will almost certainly drive some complexity back into the design. But I remain hopeful that the design will enable future work to be done without the sort of elbow-length-gloved-colonoscopy type of overhaul I did this time around.

In October, I started tracking the time I've put into developing this new version of FlexWiki. So far I'm at 75 hours, which is either a lot or not much depending on how you look at it. There's a lot left to do - the source in its current state doesn't even compile - but the main refactoring I was doing has unit tests that cover 94% of the class, so I've got a solid start. My to-do list activities include:

Getting the build reconfigured to deal with two branches.

Getting a build for the 2.0 code working.

A few more minor refactorings.

Re-implementing support for built-in topics.

Implementing security support.

Re-implementing caching.

Establishing some sort of performance testing regimen.

Documentation, cleanup, and generally making sure it works after everything else I've done.

Taking some sort of serious break from working on FlexWiki. :)

Anyway, today was a major milestone for me, so I thought I'd share. If anyone has any comments on the code or the design, I'd love to hear them.

Thursday, April 6, 2006

The Perils of WSDL First

Update: Added the bit about XmlElementAttribute.Order. Thanks Tim!

You might recall that a while back, I posted my views about the whole "contract first" or "code first" debate. I'm not here to stir that debate again, but to relate a story from the trenches that reinforced my opinion.

Recently, I've been doing some work for a client that involved writing a web service under ASP.NET. For a variety of good reasons, we couldn't go the easy route of letting the system generate our WSDL for us. Instead, we wrote it by hand, and used this technique to make it show up in the right place. It was definitely harder than just writing a bunch of types, but I was pretty pleased with the result in the end.

Then I made the mistake of taking a shower while thinking about the code. I should know better, having had too many good ideas while standing naked under hot water. (I will now pause to give you a moment to poke out your mind's eye.) What I realized made me want to yell "Stop the presses!" and indeed will likely delay the release of said web service.

The issue, in a nutshell, is this: when writing custom WSDL, it's very, very easy to produce schema-invalid responses, unless you implement IXmlSerializable on all your return types. The problem is one of ordering. Allow me to explain.

One of the nice things about XmlSerializer is that it doesn’t really care about order. Let's say you have a class that looks like this:

public class Person {
    public string Name;
    public int Age;
}

It doesn't really matter whether you deserialize this from XML that looks like this:

<Person>
    <Name>Craig</Name>
    <Age>...</Age>
</Person>

or this

<Person>
    <Age>...</Age>
    <Name>Craig</Name>
</Person>

In the end, XmlSerializer will happily work with either flavor. And 99.99% of the time, this is fine, because you don't care. The problem arises because the schema that corresponds to this looks something like this:

<xs:element name="Person">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="Name" type="xs:string" />
            <xs:element name="Age" type="xs:int" />
        </xs:sequence>
    </xs:complexType>
</xs:element>

Note that this uses a sequence, which means, "These elements must appear in this order." No big deal if the Person type is an input to our WebMethod, because we probably don't care in the code which one came first. But if Person is the return type of a WebMethod, then we've got trouble. Because if I ever saw that Person class, I'd rewrite it like this:

public class Person {
    public int Age;
    public string Name;
}

because I generally organize my files with members arranged alphabetically within visibility and member type groups. (Why? It makes it a snap to navigate the file in outline mode in VS.) But if I make that change and then serialize this type, it's going to serialize with the <Age> element first, and that's a schema violation.
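Here is a minimal sketch of that effect. The Original and Alphabetized namespaces and the ToXml helper are hypothetical names used only to keep the two versions of Person side by side, and the age value is arbitrary.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// The same type before and after an "innocent" alphabetical cleanup.
namespace Original { public class Person { public string Name; public int Age; } }
namespace Alphabetized { public class Person { public int Age; public string Name; } }

public static class Program
{
    // Serializes any object to a string using the default XmlSerializer behavior.
    public static string ToXml(object value)
    {
        StringWriter text = new StringWriter();
        new XmlSerializer(value.GetType()).Serialize(text, value);
        return text.ToString();
    }

    public static void Main()
    {
        Original.Person before = new Original.Person();
        before.Name = "Craig";
        before.Age = 30; // arbitrary illustrative value

        Alphabetized.Person after = new Alphabetized.Person();
        after.Name = "Craig";
        after.Age = 30;

        // The element order in each document follows the member order in the source.
        Console.WriteLine(ToXml(before));
        Console.WriteLine(ToXml(after));
    }
}
```

Nothing about the second class looks wrong in a code review, which is exactly why this is easy to miss.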

If all your clients use .NET, or other tools that don't care about order, great. But I think producing schema-invalid messages is just plain wrong, and likely to lead to hard-to-diagnose issues down the road as the tools evolve and your client base expands. Not to mention that if one of your clients tries to schema-validate, they're going to get an error.
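To see what a schema-validating client would run into, here is a minimal sketch that validates both element orders against the Person schema shown earlier in this post. The PersonSchema helper is a hypothetical name, the schema is wrapped in an xs:schema element with no target namespace so that it loads on its own, and the age value is arbitrary.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class PersonSchema
{
    // The schema from this post, wrapped in a schema element so it can be loaded.
    const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='Person'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='Name' type='xs:string' />
        <xs:element name='Age' type='xs:int' />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // Returns true only if the document validates against the schema above.
    public static bool IsValid(string xml)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(Xsd)));

        bool valid = true;
        settings.ValidationEventHandler +=
            delegate(object sender, ValidationEventArgs e) { valid = false; };

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { } // reading the document drives validation
        }
        return valid;
    }
}

public static class Program
{
    public static void Main()
    {
        // Name before Age matches the sequence in the schema.
        Console.WriteLine(PersonSchema.IsValid("<Person><Name>Craig</Name><Age>30</Age></Person>"));
        // Age before Name violates the sequence.
        Console.WriteLine(PersonSchema.IsValid("<Person><Age>30</Age><Name>Craig</Name></Person>"));
    }
}
```

XmlSerializer will deserialize both documents without complaint, but only the first one should pass this check.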

In short, writing web services that are fragile with respect to the order you define your members in your source code is a recipe for disaster. It's way too easy for someone to come along and make this sort of change. Of course, I think it's generally a bad idea to expose your domain entities directly via web services anyway, but even if you have the sort of serialization layer that avoids this problem, you still might get bit by this.

And for our system, the problem was even worse. We'd written the schema ourselves rather than letting the system generate it, so either we had to change the schema to match the serialization patterns of our types, or we had to change the serialization to match the schema.

Between these two choices, I believe the only realistic one is the latter. For one thing, you may not control the relevant parts of the schema…we didn't in the system I was working on. For another, there's no guarantee that Reflection (which the XmlSerializer relies on) will always return type members in the same order. It does now, and it would break a ton of things horribly if they ever did change it, but all the same the order is neither documented nor guaranteed to remain stable version-over-version. But the real reason is the one I already mentioned: you just simply shouldn't have to rely on the order of properties and fields in a source file to produce a correct system. Too fragile.

Given that the only realistic choice is to change the serialization, how do we go about it? Well, if you're developing for .NET 1.1 you have to implement IXmlSerializable, the interface that gives you full control of the serialization process. And that's usually a pretty reasonable choice: hand-generating XML via XmlWriter is pretty straightforward, if a little tedious. You can punt the implementation of ReadXml (which is usually much harder) if the type is only used as a return value from your WebMethods (as opposed to being an input), since ReadXml will never be called in this case.
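As a rough illustration, here is a minimal sketch of a write-only IXmlSerializable implementation for the Person type from earlier. It assumes the elements live in no namespace and that Person is never used as an input; the age value is arbitrary.

```csharp
using System;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

public class Person : IXmlSerializable
{
    // Members can be declared in any order; WriteXml decides the wire order.
    public int Age;
    public string Name;

    // No schema is supplied here, on the assumption that the WSDL is hand-written.
    public XmlSchema GetSchema() { return null; }

    // Punted on purpose: this sketch assumes Person is only ever a return value.
    public void ReadXml(XmlReader reader)
    {
        throw new NotSupportedException("Person is only used as a return value.");
    }

    // Writes the child elements in exactly the order the schema's sequence demands.
    public void WriteXml(XmlWriter writer)
    {
        writer.WriteElementString("Name", Name);
        writer.WriteElementString("Age", XmlConvert.ToString(Age));
    }
}

public static class Program
{
    public static void Main()
    {
        Person person = new Person();
        person.Name = "Craig";
        person.Age = 30; // arbitrary illustrative value

        // The serializer writes the enclosing Person element, then calls WriteXml.
        new XmlSerializer(typeof(Person)).Serialize(Console.Out, person);
    }
}
```

If your schema uses a target namespace, the WriteElementString overload that takes a namespace URI would be the place to supply it.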

If, on the other hand, you're working with .NET 2.0, you can make use of the Order property on XmlElementAttribute, like so:

// Requires: using System.Xml.Serialization;
public class Person {
    [XmlElement(Order=2)] public int Age;   // written second
    [XmlElement(Order=1)] public string Name;   // written first
}
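A minimal sketch to check the effect: serialize the attributed class and look at which element comes first. The ToXml helper is a hypothetical name and the age value is arbitrary.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Person
{
    // Declared alphabetically, but serialized in the order the attributes specify.
    [XmlElement(Order = 2)] public int Age;
    [XmlElement(Order = 1)] public string Name;
}

public static class Program
{
    // Serializes a Person to a string using XmlSerializer.
    public static string ToXml(Person person)
    {
        StringWriter text = new StringWriter();
        new XmlSerializer(typeof(Person)).Serialize(text, person);
        return text.ToString();
    }

    public static void Main()
    {
        Person person = new Person();
        person.Name = "Craig";
        person.Age = 30; // arbitrary illustrative value

        // Name should appear before Age despite the declaration order.
        Console.WriteLine(ToXml(person));
    }
}
```

As far as I know, once Order appears on one member, XmlSerializer expects it on all of the element members of that type, so plan on annotating every member rather than just the ones you care about.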

If I had to boil all this down to a set of rules, it would be these:

Don't serialize your domain objects directly. Use a separate set of serialization types. This is just a general best practice.

Consider implementing IXmlSerializable or using XmlElementAttribute.Order on all WebMethod return types.

You really, really should implement IXmlSerializable or use XmlElementAttribute.Order on WebMethod return types when you write the WSDL yourself.

All this just goes to show that you have to code "both first" - it is perilous to ignore either the XML or the tools. Take control!

Saturday, April 1, 2006

The Great Thing About the Definition of "Architecture"

The great thing about the definition of "architecture" is that everyone has one. Including me.

Responses to my post included comments from quite a few people - some private, some as comments on the blog post. Everyone seemed to have an opinion of what an architect does or what architecture means, although I don't think anyone actually contradicted my main (and, I guess, poorly-expressed) point: that the term "architecture" is very poorly understood.

What was interesting was that everyone seemed to agree that (a) one of an architect's responsibilities is design, and that (b) the key to being a good architect is having a solid connection with both the technologies involved and the customer. (In fact, Martin Fowler points out how this is exactly the role an architect should take in building construction. And often doesn't.)

Michael Platt breaks it down into seven (!) overlapping roles. I'm not denying that the project-management aspects of Michael's breakdown are important functions, but they seem to fall well outside the realm of software, since they would be true of any project of sufficient size of any type. Also, I should point out that I never meant to imply that I thought Michael's definition was wrong or that it somehow missed the point; just that it didn't increase my understanding of anything. Or, to put it another way, I'm still too stupid about exactly what "architecture" means to gain much from the fine points he put on it. :)

I think the main truths (such as they are) are these:

Nontrivial software needs a design.

There's a need for meta-design in larger organizations to constrain software designs to have commonality where this makes sense.

My only point was that I think "architecture" has come to mean both things. I think this makes it rather useless as a term, because the two are so different. But with these two things finally separate in my head, lots of other stuff makes more sense to me now.