Thursday, July 8, 2004

XmlInclude - Not Quite as Useless as I Thought

Simon Horrell pointed something out to a bunch of us the other day that I thought was quite interesting. It has to do with the [XmlInclude()] attribute that's part of the System.Xml.Serialization stuff. Ordinarily, you use it on a base type to indicate that when serializing instances of that type, they might really be instances of one or more subtypes. This allows the serialization engine to emit a schema that reflects the possibility of really getting a Derived when the type signature is Base. For example, consider the following web service:

<%@ WebService class="Service" Language="C#" %>

public class Animal {
  public int legs;
}

public class Mammal : Animal {
  public int nipples;
}

public class Service {
  [System.Web.Services.WebMethod]
  public Animal GetAnimal() {
    return new Mammal();
  }
}

Note that GetAnimal is typed to return an Animal, but actually returns a Mammal. Well, if you were to look at the XML that came back from this method, you might be surprised to see that it looks like this:

<Animal>
  <legs>2</legs>
</Animal>

You'll observe that this result does not contain any mention of nipples - the “Mammalness” has been lost. Which makes sense - from an XML standpoint, you told me that the operation returns a type that just has legs.

One way we can “fix” this to give a result more in keeping with our OO expectations is to use the XmlInclude attribute on the base type, to list all the types that we might wind up substituting. It looks like this:

[System.Xml.Serialization.XmlInclude(typeof(Mammal))]
public class Animal {
  public int legs;
}

And if we run our web service now, we get this XML back:

<Animal xsi:type="Mammal">
  <legs>2</legs>
  <nipples>3</nipples>
</Animal>

Note the presence of the xsi:type attribute, which says, “This thing is actually a Mammal.” If you're generating a .NET proxy using VS.NET's Add Web Reference or the wsdl.exe command line tool, you'll even find that when you check the type of the object that comes back from the web service call, it's a Mammal, which derives from Animal. This is possible because in the schema for the web service, there's a bit that says that a Mammal is an extension (an XML schema term) of Animal.
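
The same effect can be reproduced outside of ASMX by calling XmlSerializer directly. What follows is only a minimal sketch under that assumption: the XmlIncludeDemo class and its helper methods are made-up names for illustration, and the legs and nipples values are set by hand so there is something to see in the output.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Same shapes as the types above, with the attribute on the base type.
[XmlInclude(typeof(Mammal))]
public class Animal {
  public int legs;
}

public class Mammal : Animal {
  public int nipples;
}

// Hypothetical helper class, not part of the original service.
public static class XmlIncludeDemo {
  // Serializes through the base type, the way a method typed to return Animal does.
  public static string SerializeAsAnimal(Animal animal) {
    XmlSerializer serializer = new XmlSerializer(typeof(Animal));
    using (StringWriter writer = new StringWriter()) {
      serializer.Serialize(writer, animal);
      return writer.ToString();
    }
  }

  // Reads the XML back, again asking only for an Animal.
  public static Animal DeserializeAsAnimal(string xml) {
    XmlSerializer serializer = new XmlSerializer(typeof(Animal));
    using (StringReader reader = new StringReader(xml)) {
      return (Animal)serializer.Deserialize(reader);
    }
  }

  public static void Main() {
    Mammal mammal = new Mammal();
    mammal.legs = 2;
    mammal.nipples = 3;

    string xml = SerializeAsAnimal(mammal);
    Console.WriteLine(xml);  // the root element carries xsi:type="Mammal"

    Animal roundTripped = DeserializeAsAnimal(xml);
    Console.WriteLine(roundTripped.GetType().Name);
  }
}
```

Reading the XML back through the base type hands you a Mammal instance, which mirrors what the generated proxy does on the client side.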

If you're an OO person, however, you probably don't like the fact that you have to hang that XmlInclude attribute off the base type. The idea that a base type has to know about its derived types in advance is, well, just plain weird. This is where Simon comes in. He pointed out that you can actually put the XmlInclude attribute on the method instead of the base type. So if we change our web service to look like this instead:

<%@ WebService class="Service" Language="C#" %>

public class Animal {
  public int legs;
}

public class Mammal : Animal {
  public int nipples;
}

public class Service {
  [System.Web.Services.WebMethod]
  [System.Xml.Serialization.XmlInclude(typeof(Mammal))]
  public Animal GetAnimal() {
    return new Mammal();
  }
}

We'll get the same result (i.e. it'll have legs and nipples and use the xsi:type attribute) as putting the attribute on the Animal type. But putting it on the method is much more attractive from a code point of view, since it doesn't bake assumptions about type derivation into the base type. Rather, you put them on the method, where you probably have a good idea about what exact types are being returned.
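
To get a feel for why the attribute can live on the method, here is a rough sketch that reads [XmlInclude] off a method with reflection and hands the listed types to XmlSerializer. This is not a description of how ASMX works internally; MethodIncludeDemo and its helpers are invented for illustration, and the [WebMethod] attribute is left off so the sample does not depend on System.Web.Services.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;
using System.Xml.Serialization;

public class Animal {
  public int legs;
}

public class Mammal : Animal {
  public int nipples;
}

public class Service {
  // The base type knows nothing about Mammal; only the method does.
  [XmlInclude(typeof(Mammal))]
  public Animal GetAnimal() {
    Mammal mammal = new Mammal();
    mammal.legs = 2;
    mammal.nipples = 3;
    return mammal;
  }
}

// Hypothetical stand-in for a host that honors method-level XmlInclude.
public static class MethodIncludeDemo {
  // Collects the types named by [XmlInclude] attributes on a method.
  public static Type[] IncludedTypes(MethodInfo method) {
    List<Type> types = new List<Type>();
    foreach (XmlIncludeAttribute include in
             method.GetCustomAttributes(typeof(XmlIncludeAttribute), false)) {
      types.Add(include.Type);
    }
    return types.ToArray();
  }

  // Calls a parameterless method and serializes the result as its declared
  // return type, with the included types registered as extra known types.
  public static string InvokeAndSerialize(object service, string methodName) {
    MethodInfo method = service.GetType().GetMethod(methodName);
    object result = method.Invoke(service, null);
    XmlSerializer serializer =
        new XmlSerializer(method.ReturnType, IncludedTypes(method));
    using (StringWriter writer = new StringWriter()) {
      serializer.Serialize(writer, result);
      return writer.ToString();
    }
  }

  public static void Main() {
    Console.WriteLine(InvokeAndSerialize(new Service(), "GetAnimal"));
  }
}
```

The point of the sketch is only that the list of substitutable types is metadata that can be gathered wherever it is most natural to declare it, here at the method that knows what it returns.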

Now, that said, I wouldn't write web services using this. Why not? Because if you care about interoperability, writing schemas that make use of extension elements and xsi:type isn't exactly “polite”. You don't know what technology your clients will be written in, and not all languages even have type derivation. Those that do don't necessarily take the same view of it as C#. And even if they do, not all toolkits are smart enough yet to both map the XSD extension correctly and to look for the xsi:type attribute.

29 comments:

  1. The downside of this approach is that if you have multiple webmethods that return Animals, you need to be sure you annotate each one (it can be easy to forget - speaking from experience).

    Also, if you want to XMLSerialize the types directly for some reason (rather than using the ASMX framework to do it), including the XmlInclude on the base class is necessary.

    As for whether this approach is "polite", I'm not entirely convinced. Can't you assume that the client has full XML schema support? If not, which subsets are OK and which aren't?

  2. Agreed - obviously there comes a point where it's harder to annotate the methods than it is to annotate the base type. But as I said, I'm no fan of this attribute in any form - at some point you cross the line that separates XML that's easy to convert to/from objects, and you're better off doing it by hand.

    Can you assume the client has full XSD support? Hell no. But which subsets? That's still shaking out. Some of the WS Basic Profile work is attempting to codify what's expected, but even there you've got lots of toolkits, and not all are conforming. It's a huge mess, and the rule remains "Know your clients". I don't expect that to change any time soon, so I tend to design web services that use pretty simple XML if I care about interoperability.

  3. Just out of curiosity, do you know of any tools specifically that can't handle extensions?

  4. Nope, nothing specifically.

  5. One reason to use this technique is if you don't have access to the base type. That is, you want to take someone else's type and extend it, then write a WebMethod that can return either the base type or your extended type. If you don't have access to the base type source you can't add the [XmlInclude] attribute to it. But you can add it to your WebMethod.

  6. Do you know whether putting a long row of XmlInclude attributes on a class costs a lot of performance?

    Regards,
    Henrik

  7. Performance questions are always answered the same way: it depends. :)

    In this case, it depends on how often you're using it, since there's extensive caching in the XmlSerializer infrastructure, and high setup costs.

    I'm not aware of any specific issues with XmlInclude, although I've never measured it. I'd be a bit surprised if it added significantly to the cost of using XmlSerializer.

    But there's no way around simply measuring the performance in your particular scenario. You just have to do it. Unless you're seeing massive slowdowns and are trying to figure out if this is the problem?

  8. Can someone explain how I can get a class that is defined in an external assembly to be generated in the WSDL? I was experiencing a problem where none of the data encapsulated in this external class was being serialized down to the client. Looking at the WSDL shows an empty complexType node with the name of the external class. I've tried a combination of things to get it to serialize properly: added an XmlType attribute to the external class to match the same namespace as the webservice; edited the asmx to make a reference to the assembly; used XmlInclude on the webservice method to try to force it to spit out the schema for the external type... NOTHING! Anyone have any pointers?

  9. If I understand what you're trying to do, it's something I do all the time.

    Have you tried writing a simple test program that uses XmlSerializer to simply turn an instance of that class into XML? What happens when you do?

  10. Craig,

    Thank goodness this may be something you do all the time! I'm sorry I haven't been able to get back to this blog, the weekend had me buried in snow. I'm going to write a quick program that will take an instance of the class and generate the serialized XML and I'll report back my findings.

    -Chris

  11. Oh...my...god. The problem, it appears, isn't that it is in an external assembly. Rather, it's that the entire set of properties on the class is ReadOnly (only getters, no setters). I am guessing that since there is no support for denoting an attribute as read only in XSD, it doesn't translate well to web services. Mystery solved. Thanks for all your help.

    -Chris

  12. Yes, XmlSerializer works by instantiating an instance of your class and then calling the property getters and setters. So if no setter exists, then XmlSerializer (and therefore WebServices) won't work.

    There is a way around this, however. You can implement an interface called IXmlSerializable, and take care of the serialization yourself. Depending on how complex the XML and the class are, this may be easy or difficult.

  13. Brilliant, thank you so much. I've been working on this for a while now.

  14. Hi, I'm passing a parameter to a webservice function which is like this

    [WebMethod]
    public void myfunc(MyClass abc)
    {
    .......
    }

    Now, when I create an instance of "MyClass" and try to call the myfunc method from a Windows app (after adding a reference to the webservice), I get the error saying "there was an error generating the xml document". I tried all the methods you have mentioned, but I still get the error. What should I do?

  15. The two likely causes are

    1) A misconfiguration of your web server. Be sure that you can hit it with a browser from the machine where you're running the windows app.
    2) A problem with the serialization of MyClass. You can test this by using XmlSerializer directly against an instance of MyClass, without going through web services. I think Chris Sells has a tool that will help you do this.

  16. It strikes me that it would be far more useful to be able to annotate the derived class as opposed to the base class - the whole idea of a base class is for it to be unaware of its subclasses.

    A better syntax / concept would be:

    public class Mammal : Animal {
      [System.Xml.Serialization.XmlInclude(typeof(Animal))]
      public int nipples;
    }

    Then the serializer would be able to check and see if this subclass was allowed to be serialized in the context of serializing the superclass.

  17. The biggest problem I can see with this is that it screws up schema generation. If you're writing a WebMethod, and you want to be able to generate code at the other end that will deserialize as either a Mammal or an Animal, then the signature of the WebMethod has to include information about what other types can be returned. Which means annotating the base type.

  18. I agree that decorating a base class with XMLInclude for all classes that inherit from it defeats the purpose of inheritance.

    Turns out you can use the XMLInclude attribute to decorate the WebService class, that way you don't have to decorate each method. Also, if you have a webmethod that either returns the base type, or accepts it as a parameter, you don't even have to mess with XMLInclude.

  19. I don't know if anyone reads this, but I ran into problems with xmlinclude so here is my little type hack:

    //serialize data
    string str = Serialize(content, types);

    //create xml document
    XmlDocument doc = new XmlDocument();
    //load xml document from string data
    doc.LoadXml(str);
    if (doc.DocumentElement.Name != content.GetType().Name)
    {
      //create attribute
      doc.DocumentElement.SetAttributeNode("type", "http://www.w3.org/2001/XMLSchema-instance");
      //add xsi prefix
      doc.DocumentElement.Attributes["type"].Prefix = "xsi";
      //set type value (content is the object we are serializing)
      doc.DocumentElement.Attributes["xsi:type"].Value = content.GetType().Name;
    }

    and you end up with xsi:type="myobjtype" in your xml header

  20. Nipples, that's funny....

  21. You can milk anything with nipples.

  22. Thanks for this, and for the caption. When I saw 'not quite as useless as I thought' in the Google results, I immediately knew it addressed the problem I'd just come up against.

    cheers,

    Ian.

  23. This does not work if the "Animal" class inherits from CollectionBase or List.

    Any idea what to do if that is the case?

  24. I've had so many problems trying to XmlSerialize a List<T> collection that I usually derive from [...] instead.

  25. I intend to use the same solution. When I was searching for one I didn't realise you can actually put <XmlInclude(...)> in front of a class. My solution incorporates it on the WebMethod itself.

    With all that being solved - did anyone measure the performance (a comment from VisualCron above)?

  26. I think there is another way to do this:
    1) you create a new XmlIncludeAttribute:
    XmlIncludeAttribute incAttribute = new XmlIncludeAttribute(typeof(DerivedClass));
    2) you create a new XmlAttributeOverrides:
    XmlAttributeOverrides ovsAttributes = new XmlAttributeOverrides();
    3) you add the IncludeAttribute to the attributes in ovsAttributes
    ovsAttributes.Add(typeof(BaseClass), incAttribute);
    4) XmlSerializer ser = new XmlSerializer(typeof(classToBeSerialized), ovsAttributes);

  27. Sorry, I was wrong. This doesn't work... It is useful for other types of attributes. However, you can pass all the derived classes (that you probably don't know when you compile the base class) to the constructor of the XmlSerializer class.

    For example:
    XmlSerializer serializador = new XmlSerializer(typeof(OneClass), null, new Type[] { typeof(DerivedClass1), typeof(DerivedClass2) }, null, null);

  28. Hey Craig, funny I was just googling this XMLSerializer error and your page turned up in the first page of results. Small world.

    I'm running across this issue just dumping an object Graph to XML. Unfortunately, neither annotating the base class nor a method will help me. Martin's solution is the only one available to me. Unfortunately something in the obnoxious business model library I'm forced to use is overriding the behavior. As it's just a piece of my test harness, it's not worth too large a time investment.

    In the troubleshooting process however, I discovered the reference below that describes how to turn on preservation of temporary serialization assemblies, and debug through them. Might help somebody else that ends up here.

    Cheers,
    Rob

    http://msdn.microsoft.com/en-us/library/aa302290.aspx

  29. Have you guys tried JSON serialization? This might not be an issue if you use that serialization mechanism instead of XML. As for myself, I have always dealt with this issue in XML serialization the same way (anticipating all the derived types in the base class attributes is certainly not very elegant), but I've never verified whether JSON has the same issue. That's just an idea...

    Regards.

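
For readers who hit the read-only property problem from comments 11 and 12, the IXmlSerializable workaround mentioned there might look roughly like the sketch below. Temperature, Degrees and ReadOnlyDemo are made-up names used only for illustration, and the XML layout is an arbitrary choice, not something the thread prescribes.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

// Hypothetical class whose only property has a getter and no setter.
public class Temperature : IXmlSerializable {
  private double degrees;

  // XmlSerializer still needs a public parameterless constructor.
  public Temperature() { }

  public Temperature(double degrees) {
    this.degrees = degrees;
  }

  public double Degrees {
    get { return degrees; }
  }

  // Returning null means no schema is supplied for this type.
  public XmlSchema GetSchema() {
    return null;
  }

  // The serializer writes the wrapper element; we write only its content.
  public void WriteXml(XmlWriter writer) {
    writer.WriteElementString("Degrees", XmlConvert.ToString(degrees));
  }

  // The reader starts on the wrapper element; we consume it completely.
  public void ReadXml(XmlReader reader) {
    reader.ReadStartElement();
    degrees = XmlConvert.ToDouble(reader.ReadElementString("Degrees"));
    reader.ReadEndElement();
  }
}

public static class ReadOnlyDemo {
  public static string Serialize(Temperature value) {
    XmlSerializer serializer = new XmlSerializer(typeof(Temperature));
    using (StringWriter writer = new StringWriter()) {
      serializer.Serialize(writer, value);
      return writer.ToString();
    }
  }

  public static Temperature Deserialize(string xml) {
    XmlSerializer serializer = new XmlSerializer(typeof(Temperature));
    using (StringReader reader = new StringReader(xml)) {
      return (Temperature)serializer.Deserialize(reader);
    }
  }

  public static void Main() {
    string xml = Serialize(new Temperature(21.5));
    Console.WriteLine(xml);
    Console.WriteLine(Deserialize(xml).Degrees);
  }
}
```

The trade-off is that the class now owns its XML format by hand, so, as comment 12 says, how painful this is depends on how complex the class and the XML are.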