Monday, June 30, 2003

Clemens Weighs in on AOP

I was out at The Daily Grind today (ego surfing my referrers log, of course) when I noticed this link to Clemens Vasters' blog. In it, he discusses his experiences over the last year with his COM+-based aspect-like framework. A lot of it is about the guts of what he did, but towards the end he has some very interesting things to say about how hard it is to compose non-orthogonal services in an orthogonal manner. Sound familiar? :)

"I told you so" implications aside, I have to say I find it very interesting to hear Clemens' viewpoint, since he's been hammering so hard on this problem for a year. I think his ideas around providing rollback semantics for aspects sounds like COM+ all over again, but he's been spending a lot of time in this space, so I'll be curious to see what he comes up with. Good luck Clemens!

Sign then Encrypt or Encrypt then Sign?

I was doing some research online at work today, trying to optimize some of the security infrastructure I’ve been working on for my client. I ran across this article in my search to remember whether, when both signing and encrypting a message, one is supposed to sign first or encrypt first. I knew that doing it one way was bad, but couldn’t remember which was which.

The article goes on at length about the many issues around signature and encryption, but I mostly want to point out the first part, where it talks about why you should always sign first, then encrypt second.
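To make the ordering concrete, here’s a minimal C# sketch of sign-then-encrypt. It’s illustrative only – the framing (just appending the signature to the message) and the key handling are made up, and a real system would use an established protocol like S/MIME or SSL rather than rolling its own:

using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

public class SignThenEncrypt
{
  public static void Main()
  {
    byte[] message = Encoding.UTF8.GetBytes("Hello, world");

    // Step 1: sign the plaintext with the sender's private key.
    RSACryptoServiceProvider senderKey = new RSACryptoServiceProvider();
    byte[] signature =
      senderKey.SignData(message, new SHA1CryptoServiceProvider());

    // Step 2: encrypt message and signature together. A real exchange
    // would also need to get the symmetric key to the recipient, e.g.
    // by encrypting it with the recipient's public key.
    SymmetricAlgorithm symm = Rijndael.Create();
    MemoryStream ms = new MemoryStream();
    CryptoStream cs =
      new CryptoStream(ms, symm.CreateEncryptor(), CryptoStreamMode.Write);
    cs.Write(message, 0, message.Length);
    cs.Write(signature, 0, signature.Length);
    cs.FlushFinalBlock();

    Console.WriteLine("Ciphertext is {0} bytes", ms.ToArray().Length);
    cs.Close();
  }
}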



Sunday, June 29, 2003

I started a new book today

Don has started a new book. His first sentence is a good one. I have some idea what he’s writing about (but I’ll let him reveal that in his own time), and I think it has the potential to shake up the way a lot of people think. I’m looking forward to reading it!

Here's the first sentence:

Software lives at the boundary between objective and subjective reality.

More to follow.

[Don Box's Spoutlet]




Escaping and Learning

The discussion over on Sam’s Wiki about the future of RSS continues. I wish I had the cycles to follow this more closely, but Tim Bray has this writeup about one of the interesting issues being discussed:

This entry is specifically about a particular technical issue in the next-gen syndication-format design exercise, but more generally about the wonderful experience of getting a better understanding of complicated things. The lesson continues... update #1 (skip to end)....

[ongoing]




Friday, June 27, 2003

NGEN Improvements

Jim Hogg of Microsoft posted this little nugget on the DOTNET-CLR list in response to a question about how to get NGEN running as part of a setup script.

NGEN stores its results into the "Native Image Cache" -- physically, a collection of directories below \windows\assembly -- it's not actually the GAC, but brought to us by the same dev team.

Note that there's a couple of items to watch out for with using Custom Actions to trigger ngen: they need to run during a setup phase where it has SysAdmin privs; and as a consequence, there's a gotcha if the user disables rollback, yet ngen fails.  (I don't understand the details: my info is second-hand).

Also, beware of "servicing" issues.  For example, the user upgrades his machine; or changes security policy.  In these cases, your NGEN'd images may be rendered unusable.  In which case, the CLR will silently fall back to JIT'ing -- so it's not a fatal disaster -- all your Apps still work.  To guard against this, you might think about providing a small command file that the user can run to re-NGEN your assemblies, putting Humpty Dumpty back together again.

(As use/importance of NGEN grows, so we're working on making NGEN "self-healing" in the face of servicing -- for the Whidbey release)

I think the most interesting part about his (very nice) response is the last paragraph. Seeing hints like this about Whidbey is always interesting.


What’s especially interesting in this case is that it sounds like they’re about to make NGEN (potentially) a whole lot more useful. If you don’t already know, NGEN is the tool that pre-compiles your assemblies, to save the overhead of the JIT compilation that occurs at runtime. Today, NGEN isn’t very interesting, because any change to the system rightfully invalidates the compiled image and causes it to fall back on the regular JIT compiler. If NGEN images were self-recreating, it could provide a performance improvement, especially around startup times, which is where most of the JIT cost is. The tradeoff is a larger working set, but that might be worthwhile in some scenarios.
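Jim’s Humpty Dumpty suggestion could be as simple as shipping a little command file like the following sketch. The paths are made up, and ngen.exe needs to be on the path (e.g. by running from a Visual Studio command prompt):

rem Recreate the native images for our assemblies after a servicing
rem event (OS upgrade, security policy change) invalidates them.
ngen "c:\Program Files\MyApp\MyApp.exe"
ngen "c:\Program Files\MyApp\MyLibrary.dll"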

Thursday, June 26, 2003

Pair Programming

Sam posts a link to this article which defends the use of pair programming, where two programmers collaborate in front of a single computer to write software. I’ve never tried this formally in a professional setting, but I do have some experience with the phenomenon: as an instructor I routinely write small programs in front of an audience. Anyone who’s done this knows that the viewers don’t have to know the technology you’re using very well to catch mistakes that you make.

In addition to this, I’m involved in a project right now where the team is geographically distributed. I spend a lot of time on the phone, and even with that relatively poor communication mechanism, we’ve had good luck working through code together, both authoring and debugging. We got together in Minnesota a few weeks ago, and had even better luck when we did the same thing with the code up on a projection screen.

Of course, I think one of the contributors to the productivity increase is that it makes you focus more. If I could write code for two hours straight on my own without checking email or the web – something I wouldn’t do when I’m using someone else’s time, too – I suspect I’d get more done. The paper Sam references has some quantitative results, but I’d be curious to see something more rigorous. And to try this approach myself some time, on some real project, end-to-end.



Tuesday, June 24, 2003

I Like Pie

Tim Bray posted this recently:

Sam Ruby has, over the last week or so, been quietly at the center of a lot of intense discussion with the goals of clarifying what a “log entry” is, and now building a roadmap around it. Now they’re asking people to put up their hands and say whether they support this or not. I support it strongly, with (a typically lengthy list of) caveats, amplifications and digressions....

[ongoing]



The gist of it is, it might be time to pitch RSS, take the best ideas out of it, and start over. Given the tarball that RSS is right now, I don’t think this is a terrible idea. Had we done the same thing with HTML in about 1995 (and perhaps again around 1998), we might be in better shape today.

Certainly, the discussion on Sam’s Wiki is interesting enough to be worth noting.

Sunday, June 22, 2003

Managed Direct3D: Z-Buffers

It’s been a long hiatus, but now that I have my writing infrastructure working, I’m able to continue my managed Direct3D tutorial series.

The latest issue is about z-buffers, a technique for making sure that things that are behind other things look like they’re behind those things. You can find it here.  

I’ll hopefully be writing the next in the series this week, when I’ll deal with meshes. I’m on the hook to write more of these, as the series has been picked up by the German magazine DotNetPro. It was pretty funny to receive the issue in the mail, and not be able to read a word of the article.
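For a taste of what the article covers: enabling a z-buffer in Managed Direct3D boils down to something like the following fragment, where presentParams and device are the usual setup objects. This is from memory, so treat it as a sketch rather than cut-and-paste code:

// Ask for a depth (z) buffer when creating the device...
presentParams.EnableAutoDepthStencil = true;
presentParams.AutoDepthStencilFormat = DepthFormat.D16;

// ...turn on depth testing...
device.RenderState.ZBufferEnable = true;

// ...and clear the z-buffer along with the render target every frame.
device.Clear(ClearFlags.Target | ClearFlags.ZBuffer, Color.Blue, 1.0f, 0);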

CraigWriter.Write 2.1 in place

I’ve been working on completely overhauling my long-form writing infrastructure. It’s taken a lot longer than I wanted it to, mostly due to my desire to make use of CSS, and my inexperience with that technology. But I finally have something that I think will work for me and for you (for now).

You’ll notice a new link on the left: Writing 2.1. This will take you to where I’ll be putting all of my long-form writing. I’ve already moved the managed Direct3D tutorial there. Although the old copy will stay where it is, all future updates and bugfixes will go to the new site.

At some point in the near future, I’ll post the code that I’m using to do this, but I’ve already spent way too much of my Sunday getting this stuff deployed. Still, I’m pretty happy with it – it already integrates with BlogX’s referrer tracking, and soon I hope to add features like RSS feeds for the long-form writing and deeper integration with the BlogX infrastructure.

If you see anything wrong with the HTML I’m producing (and I know you will – it’s got issues) please email me at candera@develop.com and I’ll try to fix it. I’d be especially appreciative if those of you who are more HTML-savvy than I could give it a once-over and tell me where I’m doing something stupid.

The XML Big Picture

If you were confused about XML, this picture will help…or make it worse. :)

But the cool thing is, the pictures are linked to the specs.

By way of Cedric Beust's weblog.



Friday, June 20, 2003

Tim's Thoughts on AOP

I’ve been bothering Tim Ewald for a while to post a followup to my comments about AOP and code generation, since he was the one that inspired me to write about it in the first place. Well, he’s having trouble with his blog, so I’m posting it here. Now if only Ted Neward would get off his butt and post his response. :)

From Tim Ewald:

I've been hanging out with Craig a lot lately, he's doing some work for our team. He posted his thoughts on AOP a couple weeks ago and mentioned my work in that area. We've been carrying on the conversation ever since, and I promised him an online response.


I did a ton of work with AOP when I wrote Transactional COM+. By the end of that project, I had reached one main conclusion: call-interception is not a good, general purpose extensibility point. COM+ (at the time) was an entirely closed system with a fixed set of services that could be used in conjunction with one another. Even in that very finite universe, the interplay between the services is quite complex. Making that universe infinitely large by allowing third parties to mix in additional interceptors with arbitrary behaviors is a mistake. The complexity goes through the roof, because, as Craig noted, the services typically are not orthogonal.


Is interceptor-based extensibility any worse than other types of extensibility? I think so. Method-level interceptors change the path from a call-site to a method's code. It's pretty hard to write reasonable calling code if you don't know how method invocation actually works and it varies on a case-by-case basis.


Probably the best example of the potential pitfalls is error handling. I want to get the exceptions thrown by the methods I call. If there are n interceptors between me and the method, that may or may not happen. Maybe an interceptor will catch the original exception, and throw a new exception. If I don't know that happened, how will I ever track down the real error? And what if the original function didn't cause an exception at all, it just left the world in a state that some interceptor didn't like? How do I know what happened? The path from a call-site to method code should be sacred. Interceptors can alter it in ways that are not clear (and often are not documented), in essence encouraging "programming by side effect", which is a really bad idea.


COM+ 1.5 added support for "Services without Components", aka CoEnterServiceDomain, and Everett exposes this new functionality as managed code. No components means no interception, which I like. These APIs were really designed to allow COM+ services to be deeply integrated with low-level plumbing, but I like to use them in my own application code. Craig and I wrote an MSDN Magazine piece about how this works. However, judging by the ratings, most readers don't find the idea too attractive (or they just don't like our prose ;-).
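For the curious, the managed version looks roughly like this – a sketch, assuming .NET 1.1 with a reference to System.EnterpriseServices.dll, running on an OS with COM+ 1.5 (e.g. Windows Server 2003):

using System;
using System.EnterpriseServices;

public class ServiceDomainDemo
{
  public static void Main()
  {
    // Describe the services we want -- here, just a transaction.
    ServiceConfig config = new ServiceConfig();
    config.Transaction = TransactionOption.RequiresNew;

    // Everything between Enter and Leave runs inside the configured
    // COM+ context: no component, no interception.
    ServiceDomain.Enter(config);
    try
    {
      // ... do transactional work here ...
      ContextUtil.SetComplete();  // vote to commit
    }
    finally
    {
      TransactionStatus status = ServiceDomain.Leave();
      Console.WriteLine("Transaction status: {0}", status);
    }
  }
}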

Thursday, June 19, 2003

A Different LOC Metric

I had dinner with a good friend of mine here in Boston tonight. He’s technical, too, so we got to talking about code. I mentioned something about how I’ve been programming a lot lately, and it detoured us into a discussion about lines of code (LOC) metrics.

He brought up an interesting one that he’d heard recently. The idea is that it’s not so much a good idea to measure how many lines of code a developer is producing (e.g. 7 per day), but rather how many lines of code they are responsible for. And that there’s an absolute maximum. In other words, once I’ve got 50KLOC, I’m done – all my time is going to be spent in maintenance.

I’m not saying 50K is the number – in fact the number will be different for different developers – but the idea that a single developer has a capacity resonates with me. I know how hard it is to go back to things I haven’t looked at for even a week when I’ve got a lot of stuff going. And forget about supporting code that I wrote two years ago.

What does this mean? A few things, I think:

1) Refactoring is super important. The fewer LOC you have, the more functionality you can support.

2) As you write code, you need to hire more people. Duh, but if someone can figure out the numbers, it may help plan. Or motivate refactoring.

Also, in light of recent discussion, it raises the question: does a line of code spit out ten times by a code generator get counted once or ten times? I’d guess the former, but it’s an area I’d love to see experimental data on.

Monday, June 16, 2003

Do Not Touch The Tools

OK, reading this one actually hurt – both the thought of it, and the laughing about it.

"Do not touch the tools."

[BitWorking]




CryptoGram

Bruce Schneier just published the latest issue of his newsletter “CryptoGram”, available here. My favorite bit was his link to this story, wherein an Idaho police department gave their officers laptops with wireless access to the police network. As you may know, it’s basically trivial to crack an 802.11 network, even with WEP encryption enabled. To get around this, the police department

added security by using a hard-to-crack proprietary encryption protocol

This is a huge red flag – secure protocols are generally not proprietary, but rather developed in the public eye where they can undergo scrutiny by a community of experts.

I’ve been down this road myself – I’ve invented security protocols for clients only to invariably discover serious flaws in them months later when I came to understand the problem better. When I went back to fix the problems, I almost always converged on something that already existed – SSL, Kerberos, whatever. Just look at WEP, the wireless encryption standard – even using “128-bit encryption” it can be cracked in a matter of hours by freely available tools. And even protocols that are thought by most experts to be secure can be cracked trivially when they are implemented poorly.

Rule number one in security is: don’t invent your own. We’ll see how long it is before that police department gets bit by their mistake.



Friday, June 13, 2003

Lost at 10,000 Feet

There we were, lost in the middle of the forest at 10,000 feet. We’d hiked all day the day before, having flown in from sea level just the day before that, so we were tired. We’d followed the trail easily enough for the first two hours of our hike from Ouzel Lake, but now, just a mile from Thunder Lake, it sort of disintegrated into a hundred different sets of footsteps wandering off across the thick snow cover in all directions. We did our best to pick out the trail, but pretty soon we found ourselves standing in the middle of the woods saying, "Now what?"

It might not have been so bad if it hadn't been raining. And snowing. And hailing. It might not even have been so bad if there had been less than four feet of snow on the ground. Or if we were able to take more than twenty steps across said snow without punching through to our ankles, to our knees, or even occasionally to our hips. Or if our packs had weighed a little bit less, enabling us to crawl out of the little snow sinkholes more easily.

I've done a fair amount of backpacking before, and I'd have to say this was the closest I've ever been to worried. We were fortunate that we had a decent map and knew how to read it. We were lucky that we happened to know we were between two streams, and that the trail was somewhere downhill. And we were damn happy when we finally found the campsite and were able to stop walking.

All in all it was a great trip. And it's getting better. Like most backpacking trips, I'm often wondering "why do I do this to myself" when I'm out there, but by the time I get back it's already turning into a fond memory. Some pictures here.


Thursday, June 12, 2003

Keith's New Book Developing Online

Keith Brown is writing his new book out in the open – he’s posting it here on his website as he writes it. The tentative title is A .NET Developer's Guide to Windows Security. If you’ve read Keith’s Programming Windows Security, you know this is something to be excited about. Read it! Send him feedback!

Optimization - It's All the Rage

Quite a few people have been talking about optimization lately. Notably Sam, Don, and Tim. I particularly like what Tim has to say:

1. Design and code your app, trying hard not to do anything really stupid, and striving for flexibility.

2. If it’s fast enough, don’t worry any more.

3. If it’s slow, get out your profiler and measure things until you understand where the problem is.

4. Fix the problem, which may well require major refactoring, but that’s OK because that’s probably coming at you pretty soon anyhow with the next batch of requirements. Furthermore, you couldn’t have avoided it because nobody is smart enough to predict where the bottlenecks will be in a complex application before it’s running.

I can’t even count how many times I’ve seen people ask questions on the mailing lists that show they’re trying to optimize their system at a micro level before they’re even done coding. It’s soooooo tempting – I find myself constantly fighting the battle with myself (and last week, with my coworkers). I often lose. And then I find something like this is the slowest thing in my application.

Why do we do this to ourselves? Is it because we measure our professional worth by how cleverly we program? Is it because CS degrees have at least some focus on algorithms?

schtasks

Keith Brown just pointed me to schtasks, the new (as of XP, apparently) command-line task scheduler. It’s the replacement for the venerable at command, and running schtasks /? from the command line shows that it has a boatload of options, including the abilities to run tasks under arbitrary credentials and to terminate tasks at a specific time. Cool.

I’ve been meaning to get my backup story whipped into shape…this may be just the ticket.
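For instance, creating a nightly backup task might look something like this. The task name, script path, and account are invented, and I haven’t verified every switch against schtasks /?, so consider it a sketch:

rem Run a (hypothetical) backup script every night at 1 AM under
rem specific credentials; schtasks prompts for the password.
schtasks /create /tn "NightlyBackup" /tr "c:\tools\backup.cmd" /sc daily /st 01:00:00 /ru MYDOMAIN\backupuser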



Saturday, June 7, 2003

Gone for a few

My wife and I are about ten hours away from walking off into the Rocky Mountains. We’ll be spending a few days camping here in Colorado. The weather looks to be beautiful, although at higher elevations it seems we’ll be hiking over, and camping on, snow. I’m looking forward to it!



Side Effects and Code Generation

I’ve been working with Tim Ewald this week here in not-so-sunny Minnesota. As is usual when I hang out with Tim, my brain hurts from trying to keep up.

We had an interesting discussion over lunch yesterday. We were discussing – among other things – how the CLR has made things so much easier than they used to be that programmers have fallen into a death spiral of trying to abstract everything away. The fact that you can write the same application with about 4x less code leads us to try to cut things even further: a whole app in zero lines of code being the ultimate experience. Obviously I’m exaggerating, but I think the phenomenon of seeking to do things with fewer and fewer lines of code is a real one. And not a new one – if you’ve ever seen a C++ program that got a little bit too clever with preprocessor macros, you know what I mean.

The latest incarnation of “less is more” is the declarative programming model popularized (in the Microsoft world) by MTS and then COM+. The idea there was that you’d simply write your code, flip a few switches, and suddenly all sorts of magical services would appear at your fingertips. Transactions! Synchronization! Security! Just check a box. Or, if you’re programming in the CLR, just add an attribute. Clemens Vasters has been making the rounds lately showing how he has embraced this model to allow you to do all sorts of things by simply adding attributes to your code. It sounds like a great idea.

The problem is, it doesn’t work. It never has.

For a while, I tried to take the other side against Tim. Probably mostly because I’d recently done a bunch of this “hide everything from the programmer” work for a client of mine. But Tim has traveled this road for too long – after all, he wrote the best COM+ book there is. He made mincemeat out of my arguments. Which is hardly surprising – I studied COM+ under Tim for years, and already agreed with him. I just needed to be reminded. Tim pointed out a few things I already knew:

· How COM+ is fundamentally broken around synchronization and STAs

· How switching off one service in COM+ often requires you to switch off several others (e.g. transactions and synchronization)

The basic problem here is that services are not generally orthogonal. What that means is that it’s impossible to simply slap a new behavior onto code without understanding all the other behaviors that are already there. Don’t believe me? Then ask yourself why the CLR’s context infrastructure has IsContextOK. This is the method that – when implementing a new service – lets you look around your environment and decide if you’re compatible with all the other services that are already present. This has two basic problems:

1. How do you know if you’re compatible with a given service that didn’t even exist when you were created?
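To make the IsContextOK point concrete, here’s a minimal sketch of the hook in question. The attribute name and context property are hypothetical; the plumbing is the CLR’s System.Runtime.Remoting.Contexts infrastructure:

using System;
using System.Runtime.Remoting.Activation;
using System.Runtime.Remoting.Contexts;

// A made-up service attribute. At activation time the runtime calls
// IsContextOK to ask whether the creator's context is acceptable --
// exactly the point where a service author has to reason about every
// other service that might already be present.
[AttributeUsage(AttributeTargets.Class)]
public class MyServiceAttribute : ContextAttribute
{
  public MyServiceAttribute() : base("MyService") { }

  public override bool IsContextOK(Context ctx,
    IConstructionCallMessage ctorMsg)
  {
    // Only happy in a context that already carries our property;
    // otherwise the runtime builds a new context for the object.
    return ctx.GetProperty("MyService") != null;
  }
}

[MyService]
public class Worker : ContextBoundObject
{
  // Instances of Worker always live in a context that satisfied
  // MyServiceAttribute.IsContextOK.
}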

Wednesday, June 4, 2003

runas Magic


If you have followed any of the “Running as non-admin” traffic that’s been fairly prevalent of late, you’ve probably heard of the runas command. Runas lets you launch a process with alternate credentials in the current window station. Generally, you use this to do things like fire up a new instance of Visual Studio under administrative credentials so you can debug ASP.NET applications or something.

Today I ran across an entirely new option: the /netonly switch. Using it means that the credentials you supply don’t have to be valid on the machine you’re running it on, but will still be passed on when remote calls are made! So cool. Why? Because I’m doing work with Microsoft, and I need to do things against their servers that require authentication. I don’t want to join my machine to their domain, which means I can’t get a process running under my Microsoft domain account. However, using this switch, I can make a process look to remote systems as if it were running under my Microsoft domain account. This turned out to be crucially important for getting our build process working on my machine.
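The invocation looks something like this (the domain and account are made up):

rem Start Visual Studio with local credentials, but present the
rem REDMOND account to remote machines when calls leave this box.
runas /netonly /user:REDMOND\craiga devenv.exe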

The one caveat is that since it doesn’t do an actual login, it’ll take whatever password you throw at it. Even if it’s wrong – you won’t find out until you try to actually use those credentials.



Tuesday, June 3, 2003

Public Domain Enhancement Act

This sounds like a good idea to me. I signed.

From Larry Lessig:

We have launched a petition to build support for the Public Domain Enhancement Act. That act would require American copyright holders to pay $1 fifty years after a work was published. If they pay the $1, the copyright continues. If they don't, the work passes into the public domain. Historical estimates would suggest 98% of works would pass into the public domain after 50 years. The Act would do a great deal to reclaim a public domain.

This proposal has received a great deal of support. It is now facing some important lobbyists' opposition. We need a public way to begin to demonstrate who the lobbyists don't speak for. This is the first step.

If you are an ally in at least this cause, please sign the petition. Please blog it, please email it, please spam it, please buy billboards about it -- please do whatever you can. And most importantly, please help us explain its importance. There is a chance to do something significant here. But it will take a clearer, simpler voice than mine.

Now go and sign that petition.

Never doubt that a small group of thoughtful committed people can change the world: indeed it's the only thing that ever has.

-- Margaret Mead

[BitWorking]




In MN This Week

My work with MSDN has taken me to the Minneapolis/St. Paul area this week. It turned out to be convenient because one of the team members lives here (the rest of us are scattered elsewhere) and MSFT has a big empty office we can work in. You won't hear me complaining about making the trip: I lived here until a year ago, and still have a lot of family and friends in the area.

Nothing like a free trip home.

Sunday, June 1, 2003

These Are a Few of My Favorite Things

I was writing some code Friday, and I realized I was working with three of my favorite namespaces all at once: System.IO, System.Xml, and System.Diagnostics. Most people will have used all of these namespaces at least once, but there are a bunch of hidden gems that not everyone will have seen.

The task I was trying to pull off was to dump an XML document for debugging purposes. I wanted to send it both to the console and to a logfile, to give me a real-time idea of what was going on as well as a permanent record I could go back and analyze later.

The ability to write a piece of information to more than one place for diagnostic purposes just cries out for System.Diagnostics.Trace. So I started my program off with the following lines of code:

using System;
using System.IO;
using System.Diagnostics;

public class App
{
  public static void Main()
  {
    // Send trace output to the console...
    Trace.Listeners.Add(
      new TextWriterTraceListener(Console.Out));

    // ...and to a logfile named for the current date and time.
    string logfile =
      DateTime.Now.ToString("yyyy-MM-dd-hh-mm-ss") + ".txt";

    Trace.Listeners.Add(
      new TextWriterTraceListener(logfile));

    // ... the rest of the program goes here ...
  }
}

which sets it up so that any time I call Trace.WriteLine, a piece of text is written both to the command console and to a logfile whose name is derived from the current date and time. The overload of ToString that lets you specify the exact format for the date and time is pretty cool, I think.

Later, when I wanted to actually dump the XML, I did this:

// XmlTextWriter and Formatting live in System.Xml.
StringWriter sw = new StringWriter();
XmlTextWriter wtr = new XmlTextWriter(sw);
wtr.Formatting = Formatting.Indented;

// Serialize the document, indented, into the StringWriter.
xml.WriteTo(wtr);

Trace.WriteLine(sw.ToString());
wtr.Close();

where xml is the XmlDocument that I wanted to dump out.

There are a number of cool things in these few lines of code:

· The StringWriter class gives you a TextWriter that you can use anywhere you’d use a normal TextWriter (e.g. the constructor of XmlTextWriter), but it writes to an internally maintained StringBuilder instead. Handy.

· The Formatting property of XmlTextWriter, to get that nice “every nested element gets indented one level” look. Makes for much more readable output.

· Of course, the call to Trace.WriteLine to actually spew this same XML to more than one place.

A few caveats:

· You need to define the symbol TRACE in your build (Project Properties->Configuration Properties->Build->Conditional Compilation Constants – just type in TRACE) or Trace.WriteLine will not even make it into the compiled program.

· I found that I needed to go through and flush each trace listener independently, or my log file would get truncated. That was easy, though: just a foreach loop over the Trace.Listeners collection, calling Flush and Close on each TraceListener.
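In code, that cleanup looks like this:

// Flush and close every registered listener so that any buffered
// output actually makes it to the logfile before the process exits.
foreach (TraceListener listener in Trace.Listeners)
{
  listener.Flush();
  listener.Close();
}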