Saturday, June 7, 2003

Side Effects and Code Generation

I’ve been working with Tim Ewald this week here in not-so-sunny Minnesota. As is usual when I hang out with Tim, my brain hurts from trying to keep up.

We had an interesting discussion over lunch yesterday. We were discussing – among other things – how the CLR has made things so much easier than they used to be that programmers have fallen into a death spiral of trying to abstract everything away. The fact that you can write the same application with about 4x less code leads us to try to cut things even further: a whole app in zero lines of code being the ultimate experience. Obviously I’m exaggerating, but I think the phenomenon of seeking to do things with fewer and fewer lines of code is a real one. And not a new one – if you’ve ever seen a C++ program that got a little bit too clever with preprocessor macros, you know what I mean.

The latest incarnation of “less is more” is the declarative programming model popularized (in the Microsoft world) by MTS and then COM+. The idea there was that you’d simply write your code, flip a few switches, and suddenly all sorts of magical services would appear at your fingertips. Transactions! Synchronization! Security! Just check a box. Or, if you’re programming in the CLR, just add an attribute. Clemens Vasters has been making the rounds lately showing how he has embraced this model to allow you to do all sorts of things by simply adding attributes to your code. It sounds like a great idea.

The problem is, it doesn’t work. It never has.

For a while, I tried to take the other side against Tim. Probably mostly because I’d recently done a bunch of this “hide everything from the programmer” work for a client of mine. But Tim has traveled this road for too long – after all, he wrote the best COM+ book there is. He made mincemeat out of my arguments. Which is hardly surprising – I studied COM+ under Tim for years, and already agreed with him. I just needed to be reminded. Tim pointed out a few things I already knew:

· How COM+ is fundamentally broken around synchronization and STAs

· How switching off one service in COM+ often requires you to switch off several others (e.g. transactions and synchronization)

The basic problem here is that services are not generally orthogonal. What that means is that it’s impossible to simply slap a new behavior onto code without understanding all the other behaviors that are already there. Don’t believe me? Then ask yourself why the CLR’s context infrastructure has IsContextOK. This is the method that – when implementing a new service – lets you look around your environment and decide if you’re compatible with all the other services that are already present. This has two basic problems:

1. How do you know if you’re compatible with a given service that didn’t even exist when you were created?
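To make that first problem concrete, here is a minimal sketch in Python rather than CLR code. It borrows only the name of IsContextOK; the `Service` class, the `known_compatible` set, and the `trust_unknown` flag are all hypothetical and exist purely for illustration. The point it tries to show is that a service can only reason about neighbors it was written against, so an unfamiliar service forces a choice between refusing it and trusting it blindly.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Service:
    """A hypothetical declarative service (not a CLR type)."""
    name: str
    # Services this one was explicitly designed and tested against.
    known_compatible: frozenset = field(default_factory=frozenset)

    def is_context_ok(self, context, trust_unknown=False):
        """Decide whether this service can join a context.

        `context` is the list of services already present. Any service not in
        `known_compatible` is unknown to us, so the only options are to refuse
        it or to accept it blindly (`trust_unknown=True`).
        """
        for other in context:
            if other.name not in self.known_compatible and not trust_unknown:
                return False
        return True


transactions = Service("transactions", frozenset({"synchronization"}))
synchronization = Service("synchronization")
auditing = Service("auditing")  # assumed to be written after `transactions` shipped

known_case = transactions.is_context_ok([synchronization])
unknown_refused = transactions.is_context_ok([auditing])
unknown_trusted = transactions.is_context_ok([auditing], trust_unknown=True)
```

Neither answer for `auditing` is informed: refusing may reject a perfectly compatible service, and trusting may accept one that quietly breaks the guarantees `transactions` is supposed to provide.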


  1. I'm on the codegen train these days, but primarily for generating the low-level parts of my data architecture. In my case, I'm using it to tackle my object/relational mapping. Instead of writing thousands of lines of code for the object definitions and to handle moving my objects between the CLR and SQL worlds, I write hundreds of lines of XML describing my objects and use a script to generate the code. My own codegen engine is written with extensibility in mind - if you want different behavior in a couple of places, you can instruct the codegen engine to use a custom implementation (in my case, it's a static method on a type that is local to the assembly being generated and resides in a manually-generated code file). I have already seen the benefit of this approach first-hand: when we wanted to change the error-handling in the data access layer, I simply made a change to the codegen script and re-genned the code.

    I have considered using this approach to also generate the business-logic layer, but I am currently of the opinion that the business rules require more flexibility than codegen can realistically provide. Business rules tend to be app-specific, and you'd probably spend as much time writing the codegen code to handle all of the possible combinations of rules as you would just writing the code. There is an ease-of-maintenance case to be made, though, so I'm not closing the door on it.

    As far as the "the codegenned code had problems and we had to fix them by hand" argument goes, I can speculate: this was using a third party's "black box" code generation tool. My golden rule is: never ever hand-modify automatically generated code. Doing so is 1) a recipe for a very unhappy developer and 2) very inefficient. If you find that the generated code needs fixing, fix the engine used to generate the code. If you don't personally have the source for the engine, rain fire upon the developer to fix the issue. I personally favor writing the engine yourself. If you are evaluating codegen engines, be very cautious about using shrink-wrapped packages - there's a good chance that you'll be stuck shoehorning your problem into *their* idea of how the problem should be solved.

    Incidentally, templates (aka generics) will still have their place - there will always be some low-level stuff (custom typed collections being the classic example) that is served well by templates. In general, I'll be using them to make my codegen scripts shorter.

    And yes, I believe you can do a LOT more with this approach to solve the problems that AOP is trying to solve.

    Whoa. This comment got a lot longer than I anticipated it would.
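The workflow described in the comment above (objects described in XML, a script that emits the mapping code, and fixes made in the generator rather than in its output) might look roughly like the following Python sketch. The XML format, the `generate_class` and `generate` helpers, and the shape of the emitted class are all assumptions made for illustration; the commenter's actual engine and schema are not shown anywhere in the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical object description; the real XML format is not shown in the comment.
DESCRIPTION = """
<objects>
  <object name="Customer" table="customers">
    <field name="id" type="int"/>
    <field name="name" type="str"/>
  </object>
</objects>
"""


def generate_class(obj):
    """Emit Python source for one <object> element."""
    name = obj.get("name")
    table = obj.get("table")
    fields = [(f.get("name"), f.get("type")) for f in obj.findall("field")]
    columns = ", ".join(n for n, _ in fields)
    params = ", ".join(f"{n}: {t}" for n, t in fields)
    lines = [
        "# GENERATED FILE: change the generator, not this code.",
        f"class {name}:",
        f"    def __init__(self, {params}):",
    ]
    lines += [f"        self.{n} = {n}" for n, _ in fields]
    lines += [
        "",
        "    @classmethod",
        "    def from_row(cls, row):",
        "        return cls(*row)",
        "",
        "    @staticmethod",
        "    def select_sql():",
        f"        return 'SELECT {columns} FROM {table}'",
    ]
    return "\n".join(lines) + "\n"


def generate(xml_text):
    """Generate source for every <object> in the description."""
    root = ET.fromstring(xml_text.strip())
    return "\n\n".join(generate_class(o) for o in root.findall("object"))


source = generate(DESCRIPTION)
namespace = {}
exec(source, namespace)  # load the generated code, for demonstration only
customer = namespace["Customer"].from_row((7, "Ada"))
```

A change such as different error handling would go into `generate_class` once, and every described object would pick it up on the next regeneration.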

  2. I couldn't agree more, Craig. Here's a manual TrackBack:

    AOP and Code Generation


  4. The trouble I have with the current practice of code generation is that it seems to often lead to those solutions which...require the most code! That is not to say that it can't be a very effective tool - generating CRUD stored procedures seems to be a good fit, for instance. But I'm wary of various code-generation tools that suggest "tens of thousands of lines of code" can be generated for you. Often, when you see the generated code, you will ask the question "Would I have solved this problem in this way? In a way that would require this much code? Even if generation of the code was 'free'?"

  5. Wow! Great comments. I was going to respond here, but I think I need a followup post on the main blog to address these.

  6. Ping..

  7. I, too, cannot wait to hear what Neward has to say. My thoughts are still forming on this subject, and I have yet to hear a good counterargument. I hope that Ted can provide one.

  8. I've gone from code gen to AOP, and now you've got me pondering codegen again. Ack!!!! :) ... I've had good success with AOP, even with orthogonal AOP, by keeping aspects transparent to each other, though it's not as simple as slapping some attributes on a class interface. Basically, I create an aspect "pipeline" using a context-bound object and a corresponding attribute. This AopPipleineAttribute is the only ContextAttribute I use. The pipeline chains the aspects together and initiates the interception process in pretty much the same way .NET does, except that I added the ability for the pipeline to control the processing order of the different aspects via a tiny XML configuration file cached in memory. I have yet to see a real-world scenario that this configuration cannot handle. Even if I need to customize it, I can always inject the aspect into the pipeline and expose overriding capabilities of the pipeline manager. I've done tons of code gen, and I have to agree that the number of code files I can replace with a touch of reflection and a well-thought-out aspect makes investing time in evolving my aspect pipeline worthwhile by far, especially considering that the orthogonality issues and contextual significance of individual aspects can be dealt with by a knowledgeable installer or administrator using a configuration tool.

    Of course, that's just my opinion and I could be wrong ;)
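The idea in the comment above, a single pipeline that chains aspects and takes their processing order from a small XML file, can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses plain function wrapping instead of CLR context interception, and the `make_aspect` and `build_pipeline` helpers, the `<pipeline>` format, and the aspect names are all hypothetical rather than taken from the commenter's implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical ordering file: aspects run in the order they are listed.
ORDER_CONFIG = "<pipeline><aspect name='security'/><aspect name='logging'/></pipeline>"


def make_aspect(name, trace):
    """Build a hypothetical aspect that records when it runs."""
    def aspect(next_call):
        def intercepted(*args, **kwargs):
            trace.append(f"{name}:before")
            result = next_call(*args, **kwargs)
            trace.append(f"{name}:after")
            return result
        return intercepted
    return aspect


def build_pipeline(target, aspects, config_xml):
    """Wrap `target` so that aspects run in the order listed in the config."""
    order = [a.get("name") for a in ET.fromstring(config_xml).findall("aspect")]
    call = target
    # Wrap from the last configured aspect inward so the first one runs first.
    for name in reversed(order):
        call = aspects[name](call)
    return call


trace = []
aspects = {n: make_aspect(n, trace) for n in ("logging", "security")}


def transfer(amount):
    """Stand-in for a business method being intercepted."""
    trace.append("transfer")
    return amount * 2


piped = build_pipeline(transfer, aspects, ORDER_CONFIG)
result = piped(21)
```

Reordering the `<aspect>` elements changes which aspect sees the call first without touching any aspect's code, which is the property the comment relies on. Whether a given order is actually safe is still something a person has to decide.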


  9. In reality, I use both code gen and AOP... the code gen creates the pipeline objects, attributes, and mappings. I use a custom drag-and-drop designer to map entities from schemas/db tables to class objects. It then creates the completed abstract classes for the business model and the concrete classes for overriding business logic. The model is a self-contained business layer that is free from specific bindings. The aspects and attributes control the security, logging, workflow bindings, validation, and persistence of the object model.

  10. I know, this is a *bit* late.

    Anyway, here is a blog post, commenting on your post: