Thursday, April 6, 2006

The Perils of WSDL First

Update: Added the bit about XmlElementAttribute.Order. Thanks Tim!

 

You might recall that a while back, I posted my views about the whole "contract first" or "code first" debate. I'm not here to stir that debate again, but to relate a story from the trenches that reinforced my opinion.

 

Recently, I've been doing some work for a client that involved writing a web service under ASP.NET. For a variety of good reasons, we couldn't go the easy route of letting the system generate our WSDL for us. Instead, we wrote it by hand, and used this technique to make it show up in the right place. It was definitely harder than just writing a bunch of types, but I was pretty pleased with the result in the end.

 

Then I made the mistake of taking a shower while thinking about the code. I should know better, having had too many good ideas while standing naked under hot water. (I will now pause to give you a moment to poke out your mind's eye.) What I realized made me want to yell "Stop the presses!" and indeed will likely delay the release of said web service.

 

The issue, in a nutshell, is this: when writing custom WSDL, it's very, very easy to produce schema-invalid responses, unless you implement IXmlSerializable on all your return types. The problem is one of ordering. Allow me to explain.

 

One of the nice things about XmlSerializer is that it doesn’t really care about order. Let's say you have a class that looks like this:

 

public class Person {

  public string Name;

  public int Age;

}

 

It doesn't really matter whether you deserialize this from XML that looks like this:

 

<Person>

  <Name>Craig</Name>

  <Age>34</Age>

</Person>

 

or this:

 

<Person>

  <Age>34</Age>

  <Name>Craig</Name>

</Person>

 

In the end, XmlSerializer will happily work with either flavor. And 99.99% of the time, this is fine, because you don't care. The problem arises because the schema that corresponds to this looks something like this:

 

<xs:element name="Person">

   <xs:complexType>

    <xs:sequence>

      <xs:element name="Name" type="xs:string" />

      <xs:element name="Age" type="xs:int" />

    </xs:sequence>

   </xs:complexType>

</xs:element>

 

Note that this uses a sequence, which means, "These elements must appear in this order." No big deal if the Person type is an input to our WebMethod, because we probably don't care in the code which one came first. But if Person is the return type of a WebMethod, then we've got trouble. Because if I ever saw that Person class, I'd rewrite it like this:

 

public class Person {

  public int Age;

  public string Name;

}

 

because I generally organize my files with members arranged alphabetically within visibility and member type groups. (Why? It makes it a snap to navigate the file in outline mode in VS.) But if I make that change and then serialize this type, it's going to serialize with the <Age> element first, and that's a schema violation.

 

If all your clients use .NET, or other tools that don't care about order, great. But I think producing schema-invalid messages is just plain wrong, and likely to lead to hard-to-diagnose issues down the road as the tools evolve and your client base expands. Not to mention that if one of your clients tries to schema-validate, they're going to get an error.
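
To make that concrete, here is a minimal sketch of the kind of check a validating client might run, using the XmlReader validation support in .NET 2.0. The PersonSchemaCheck class and its IsValid method are names I made up for illustration. The schema is the Person fragment above wrapped in a bare xs:schema element with no target namespace, which is an assumption; a real service schema would have one.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper, not part of any real service.
public static class PersonSchemaCheck
{
    // The Person schema from above, wrapped in a minimal xs:schema element
    // (no target namespace; that is an assumption for this sketch).
    private const string Xsd =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" +
        "<xs:element name='Person'><xs:complexType><xs:sequence>" +
        "<xs:element name='Name' type='xs:string' />" +
        "<xs:element name='Age' type='xs:int' />" +
        "</xs:sequence></xs:complexType></xs:element>" +
        "</xs:schema>";

    // Returns true when reading the XML raises no validation errors.
    public static bool IsValid(string xml)
    {
        XmlSchemaSet schemas = new XmlSchemaSet();
        schemas.Add(null, XmlReader.Create(new StringReader(Xsd)));

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas = schemas;

        bool valid = true;
        settings.ValidationEventHandler +=
            delegate(object sender, ValidationEventArgs e) { valid = false; };

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { } // validation happens as the reader advances
        }
        return valid;
    }

    public static void Main()
    {
        // Name first, as the schema's sequence requires.
        Console.WriteLine(IsValid("<Person><Name>Craig</Name><Age>34</Age></Person>"));
        // Age first, as the alphabetized class would serialize.
        Console.WriteLine(IsValid("<Person><Age>34</Age><Name>Craig</Name></Person>"));
    }
}
```

Run against the two documents shown earlier, I'd expect the Name-first document to pass and the Age-first document to fail. A check like this in a unit test is one way to catch an innocent member reordering before a client does.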

 

In short, writing web services that are fragile with respect to the order you define your members in your source code is a recipe for disaster. It's way too easy for someone to come along and make this sort of change. Of course, I think it's generally a bad idea to expose your domain entities directly via web services anyway, but even if you have the sort of serialization layer that avoids this problem, you still might get bit by this.

 

And for our system, the problem was even worse. We'd written the schema ourselves rather than letting the system generate it, so either we had to change the schema to match the serialization patterns of our types, or we had to change the serialization to match the schema.

 

Between these two choices, I believe the only realistic one is the latter. For one thing, you may not control the relevant parts of the schema…we didn't in the system I was working on. For another, there's no guarantee that Reflection (which the XmlSerializer relies on) will always return type members in the same order. It does now, and it would break a ton of things horribly if Microsoft ever changed it, but all the same the order is neither documented nor guaranteed to remain stable from version to version. But the real reason is the one I already mentioned: you simply shouldn't have to rely on the order of properties and fields in a source file to produce a correct system. Too fragile.

 

Given that the only realistic choice is to change the serialization, how do we go about it? Well, if you're developing for .NET 1.1 you have to implement IXmlSerializable, the interface that gives you full control of the serialization process. And that's usually a pretty reasonable choice: hand-generating XML via XmlWriter is pretty straightforward, if a little tedious. You can punt on implementing ReadXml (which is usually much harder) if the type is only used as a return value from your WebMethods (as opposed to being an input), since ReadXml will never be called in this case.
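
A rough sketch of what that might look like for Person, assuming the type is only ever returned and never accepted as input. The PersonDemo helper is hypothetical and exists only to show the result; inside a WebMethod the ASP.NET plumbing drives the serializer for you.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

public class Person : IXmlSerializable
{
    // Members stay alphabetized; their order no longer affects the output.
    public int Age;
    public string Name;

    // Returning null here is the usual practice for this method.
    public XmlSchema GetSchema() { return null; }

    // Punted: this type is assumed to be output-only.
    public void ReadXml(XmlReader reader)
    {
        throw new NotSupportedException("Person is only used as a return value.");
    }

    // The serializer writes the wrapping <Person> element itself;
    // we write the children in the order the schema's sequence demands.
    public void WriteXml(XmlWriter writer)
    {
        writer.WriteElementString("Name", Name);
        writer.WriteElementString("Age", XmlConvert.ToString(Age));
    }
}

// Hypothetical helper for trying the type out from a console app.
public static class PersonDemo
{
    public static string Serialize(Person p)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Person));
        StringWriter output = new StringWriter();
        serializer.Serialize(output, p);
        return output.ToString();
    }

    public static void Main()
    {
        Person p = new Person();
        p.Name = "Craig";
        p.Age = 34;
        Console.WriteLine(Serialize(p));
    }
}
```

The element order now lives in WriteXml, next to the element names, instead of being an accident of how the source file happens to be laid out.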

 

If, on the other hand, you're working with .NET 2.0, you can make use of the Order property on XmlElementAttribute, like so:

 

public class Person {

  [XmlElement(Order=2)] public int Age;

  [XmlElement(Order=1)] public string Name;

}
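
Here's a small sketch that serializes the type to check the result. OrderDemo is a hypothetical helper. One caveat, as far as I can tell: once you set Order on one member, XmlSerializer expects it on every member that serializes as an element and complains when the serializer is created otherwise, so it's all or nothing.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Person
{
    // Alphabetical in source, but Order dictates the serialized sequence.
    [XmlElement(Order = 2)] public int Age;
    [XmlElement(Order = 1)] public string Name;
}

// Hypothetical helper for trying the type out from a console app.
public static class OrderDemo
{
    public static string Serialize(Person p)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Person));
        StringWriter output = new StringWriter();
        serializer.Serialize(output, p);
        return output.ToString();
    }

    public static void Main()
    {
        Person p = new Person();
        p.Name = "Craig";
        p.Age = 34;
        // <Name> should come out ahead of <Age> despite the field order.
        Console.WriteLine(Serialize(p));
    }
}
```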

 

If I had to boil all this down to a set of rules, it would be these:

 

  1. Don't serialize your domain objects directly. Use a separate set of serialization types. This is just a general best practice.

  2. Consider implementing IXmlSerializable, or setting XmlElementAttribute.Order, on all WebMethod return types.

  3. You really, really should do one or the other on WebMethod return types when you write the WSDL yourself.

 

All this just goes to show that you have to code "both first" - it is perilous to ignore either the XML or the tools. Take control!

7 comments:

  1. As you may expect, WSCF 0.6 enables you to have the Order property generated from a WSDL for you.

    And of course you may want to model the WSDL with WSCF itself :)

  2. Nice.



    Of course, it wouldn't have helped in this case because we couldn't generate anything from the WSDL. The WSDL modeling piece might have been nice, though.

  3. Debugging the ASP.NET worker process running at 100% [Via: scott@hanselman.com (Scott Hanselman)...

  4. If you'd gone code-first in this case, wouldn't the re-ordering of the fields have altered the WSDL anyway?



    Another possibility that could be applied in simple situations is to use <xs:all> sections in your WSDL's XSD instead of <xs:sequence> - this allows the elements to appear in any order.



    If I remember right, wsdl.exe in 1.1 generates <xs:sequence> instead of <xs:all> sections so moving "Age" up would have altered the code-first interface too. Does it do the same in 2.0 or only when every field/property has an XmlElementAttribute.Order applied?

  5. Yes - this is an issue even if you go code-first.



    I considered using <xs:all>, but it's best to avoid anything beyond the absolute core of XSD if you care about interop. ASP.NET 2.0 uses sequence just like ASP.NET 1.1, probably for the same reasons.

  6. <sarcasm>It's great to see how all this new technology makes life easier for us.. </sarcasm>

  7. WSDL being only one representation of the logical contract - a rather poor one since it doesn't support duplex, or pub/sub (rather pub - sub is an implementation detail). This post sparked by Craig Andera (http://pluralsight.com/blogs/craig/archive/2006/04/06/21176.aspx).
