Wednesday, April 21, 2004

Constructor or Factory

I had an aha moment the other day. It's nothing that lots of people haven't already figured out, but it helped me see things more clearly, so I'm posting it here.


We were doing some design on some classes for the MSDN project I'm working on, and we were debating how to model a piece of content moving through the system. In the system we're working on, a content set (our term for a page and any other associated files or metadata) flows from the content creator, through a publishing pipeline, and into a database. From there, it's rendered out to the MSDN website.


The problem is that, when content is coming into the system, it starts life as a lump of XML. But when the content is being rendered, it starts life as a database record. Further, there are some properties of the content set that are only present at render time, and some that are only present when the content is being loaded into the database. We were torn over whether to have something like this:


public class ContentSetBase { }
public class DbContentSet : ContentSetBase {
  public DbContentSet() { }
}
public class XmlContentSet : ContentSetBase {
  public XmlContentSet() { }
}
public class ContentSetFactory {
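  // "db args" and "xml args" stand in for whatever each source actually needs
  public static DbContentSet CreateContentSetFromDatabase(/* db args */) { return new DbContentSet(); }
  public static XmlContentSet CreateContentSetFromXml(/* xml args */) { return new XmlContentSet(); }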
}


or something like this:


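public class ContentSetBase { }
public class DbContentSet : ContentSetBase {
  // "db args" stands in for whatever the database source actually needs
  public DbContentSet(/* db args */) { }
}
public class XmlContentSet : ContentSetBase {
  // "xml args" stands in for whatever the XML source actually needs
  public XmlContentSet(/* xml args */) { }
}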


The debate was interesting. I was all for the second approach, since I didn't like mixing XML and Database goo in the same class - it just feels like bad separation of concerns. But there was that word...Factory...and it seemed like that's what it was, so it sort of made sense.


I'll admit, I'm not even remotely knowledgeable about patterns, but the one I do have sort of a clue about is the factory pattern. You see it all over the place, and it's hugely useful. But here's the trick: it doesn't apply here. Why not? Because the calling code already knows what type of object it wants. Our database loading code always creates an XmlContentSet, and our web page code always creates a DbContentSet. The factory would only be really useful if we were doing something like this:


ContentSetBase cs = ContentSetFactory.Create(args);


and we didn't care whether we got back a DbContentSet or an XmlContentSet. In other words, an important part of the factory pattern is that the factory makes the decision about what type of object to create. In the factory above, the caller was making the decision by calling one or the other method.
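
To make that concrete, here is a rough sketch of a factory that really does make the decision. This is not what we built. The string argument, the "starts with an angle bracket" rule, and the Describe method are all made up for illustration, and I've made the base class abstract here, which the snippets above don't do:

```csharp
using System;

public abstract class ContentSetBase
{
    // Hypothetical member, here only so the caller can see what it got back.
    public abstract string Describe();
}

public class DbContentSet : ContentSetBase
{
    public override string Describe() { return "loaded from a database record"; }
}

public class XmlContentSet : ContentSetBase
{
    public override string Describe() { return "loaded from XML"; }
}

public static class ContentSetFactory
{
    // Assumed rule, for illustration only: input that starts with '<' is
    // treated as XML, anything else as a database key. The point is that
    // the factory picks the concrete type, not the caller.
    public static ContentSetBase Create(string source)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (source.TrimStart().StartsWith("<", StringComparison.Ordinal))
            return new XmlContentSet();
        return new DbContentSet();
    }
}

public static class Program
{
    public static void Main()
    {
        // The caller only ever sees ContentSetBase.
        ContentSetBase cs = ContentSetFactory.Create("<page/>");
        Console.WriteLine(cs.Describe());
    }
}
```

The caller here never names DbContentSet or XmlContentSet, which is exactly the situation we were not in.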


Now, that's not to say that the code wouldn't have worked either way: it would have. But it makes more sense to me the second way, and realizing that rule is going to help me figure out in the future when a factory is appropriate, and when it isn't. And that's a useful trick to have in my pocket.


Oh, and in case you were wondering, in the end we wound up doing something else entirely, after figuring out a bunch of other constraints. :)

9 comments:

  1. Craig

    You just gave a new jolt of life to this poor pattern-challenged soul. All along in my 7+ years of programming I tended to use only the factory and singleton patterns heavily. The way the glitterati of the IT world talk about patterns, I felt like I would be relegated to some leper colony. Fortunately I have found someone who is courageous enough to admit knowing not-so-much about it and who has been in the trenches much longer than I have. Thanks for lighting a fire in my life :-)

  2. Hi Craig,

    VERY interesting post! I wrote some about my factory-"adventures" the other day here:
    http://www.jnsk.se/weblog/posts/decadefactories.htm

    Not that I know all that much about patterns, but as I see it, a factory isn't *only* useful to hide the concrete type. For example, it's very useful for setting up complex types.

    And talking about factories, I like the view Eric Evans presents in his book "Domain-Driven Design", where he says that factories are for creating new entities, while repositories are for getting entities back after they have been saved to XML or a database, for example. It's just another matter of separation of concerns: creation versus "fetching".

    Best Regards,
    Jimmy
    www.jnsk.se/weblog/

  3. Dilip - Yeah, I'm pretty much a pattern illiterate. :) One of these days I hope to learn more.

    Jimmy - thanks for the pointers. Looks like interesting reading!

  4. Great Post. It was definitely an Aha moment for me too :)

  5. Glad you enjoyed it. :)

  6. Another benefit of factories is that they give you control over how objects are allocated. Whereas a constructor always allocates a single new object, a factory may decide to preallocate objects in blocks or pull objects from a pool.

    FYI, I believe that Joshua Bloch's "Effective Java" book discusses the pros/cons of factories and ctors. (Not sure, but it might be the first item in the book.)

  7. Very true. Another good rule for the ol' mental rulebook.

    I'd describe it as a "property" of a factory, rather than a "benefit", but I've been accused of being pedantic before. ;)

    It seems that the DbContentSet and XmlContentSet are used in two different stages of your pipeline and really don't interact with each other. If this is true, what benefit do DbContentSet and XmlContentSet get from inheriting from ContentSetBase, especially if the calling code already knows what type of object it wants?
    The only suitable use I could possibly see is code reuse. But inheritance just for code reuse introduces unnecessarily tight coupling between your classes. Changes to the interface of the DbContentSet might propagate to the ContentSetBase, which in turn propagates changes to the XmlContentSet. Maybe you should look into using partial classes, consolidating the shared code into one file that both classes keep separate from the rest of their functionality.

  9. Wow, John - a guitar god *and* a coder. Very impressive! :)

    A few comments on your comment:

    1) We ultimately went a different direction that involves a single ContentSet class for data storage and two completely separate "factories" that have no relationship to each other.
    2) Inheritance is a perfectly reasonable way to get code reuse. The fragile base class problem you're describing is generally only an issue when you don't control all the code - i.e. you're letting someone else derive from a base class you wrote, or you're deriving from a base class that someone else wrote. In this case, we're writing all the code.
    3) Using partial classes seems like a massive hack as - among other things - it requires that we keep the name of the class the same in both assemblies. That's pretty fragile.

    But keep up the good work in Dream Theater. ;)
