CraigBlog: April 2007

Friday, April 27, 2007

Thinking Like Tim

I talk to Tim on a regular basis. So when he outed himself as a REST convert lately, it actually wasn't a huge surprise to me - we've been talking a lot lately about what the last few years have taught us about web service programming, and how more and more SOAP/WS-* just doesn't seem to be all that and a bag of chips. Still, the key part of the analysis in his latest post - that you can use HTTP as a way to publish application states and the transitions between them - really struck me as something significant, and brought a lot more clarity to the whole problem.

Here's where my thinking is at: Basically every SOAP web service I've written in the last four years could have been done just as well as a straight HTTP interface. And several of them would have been significantly better. The ones that couldn't are very clearly RPC style, for better or for worse. I've been feeling this especially acutely lately as I try to optimize one of the services I've written. I found myself implementing HTTP over SOAP, which just seems stupid, not to mention clumsy.

For me it's really not about whether the thing is "REST" or not. I find the term REST to be too laden to be useful. I think more in terms of "SOAP via toolkits" (SVT?) versus Plain Old XML over HTTP (POX). After all, I'm perfectly willing to use a SOAP Envelope as long as I still get to use HTTP GET. But at the end of the day, ignoring particular terminology, I agree with Tim's analysis: SVT is for stuff I'd have used DCOM for in the old days, not for putting up large-scale, public-facing, web-based programmatic interfaces.

I realize that programmers love their tools and toolkits, but honestly, when you build big giant systems sometimes you have to do things the "hard" way. It always surprises me when my clients complain that complex problems contain complexity in the solution. But in any case, I'm not even convinced that throwing away toolkits like WCF will raise the complexity level - there's a lot to be said for being closer to the wire. Plus, things like Linq for XML look really tasty when you consider how they might help with this style of programming.

I do really like the analysis of REST as Representational State *Transition*, as opposed to the T being for "Transfer". To me, viewing the design exercise as figuring out how to expose a connected state graph (i.e. both transitions and states are explicit) brings a lot of mental order to what I already knew I wanted to do anyway. In particular, it helped me understand how a POX interface is useful for writes...because for reads it's dead obvious: GET kicks ass, especially if you care at all about performance. For a long time I was hung up on trying to figure out how to use SVT on write, and GET on read, because that seemed to make the most sense. But if the thing is just your application state where each state has an address, it's pretty obvious that it's okay to POST to a URL to cause a state transition. Or even for that state transition to significantly alter the URL space.
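
As a rough sketch of the "each state has an address" idea, and not code from any real service: the block below models a tiny state graph in memory. GET only reads the state at a URL, while POST to a transition URL moves the resource along one edge and can add or remove addresses. `OrderService`, `Response`, the `/orders/1` URLs, and the state names are all invented for illustration, and a `std::map` stands in for a real HTTP server.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical in-memory stand-in for an HTTP service: every application
// state has an address (URL path), GET reads it, POST triggers a transition.
struct Response {
    int status;        // HTTP-style status code
    std::string body;  // representation of the state, or an error note
};

class OrderService {
public:
    OrderService() { states_["/orders/1"] = "draft"; }

    // GET is a pure read: it never changes the state graph.
    Response get(const std::string& url) const {
        auto it = states_.find(url);
        if (it == states_.end()) return {404, "not found"};
        return {200, it->second};
    }

    // POST to <resource>/<transition> moves the resource along one edge
    // of the state graph, if that edge exists from the current state.
    Response post(const std::string& url) {
        const auto slash = url.rfind('/');
        assert(slash != std::string::npos && "expected <resource>/<transition>");
        const std::string resource = url.substr(0, slash);
        const std::string transition = url.substr(slash + 1);

        auto it = states_.find(resource);
        if (it == states_.end()) return {404, "not found"};

        if (it->second == "draft" && transition == "submit") {
            it->second = "submitted";
            // The transition also alters the URL space: a new address appears.
            states_[resource + "/receipt"] = "receipt for " + resource;
            return {200, it->second};
        }
        if (it->second == "submitted" && transition == "cancel") {
            it->second = "cancelled";
            // Cancelling removes the address that submitting created.
            states_.erase(resource + "/receipt");
            return {200, it->second};
        }
        return {409, "transition not allowed from " + it->second};
    }

private:
    std::map<std::string, std::string> states_;  // URL -> current representation
};
```

With this in place, a client that GETs `/orders/1`, POSTs to `/orders/1/submit`, and then GETs `/orders/1/receipt` is walking the state graph purely through addresses, with no toolkit-generated proxy in between.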

Of course, I have yet to build a large, real system using this new approach. I'd love to, though, because more and more it seems like the right thing. And no battle plan survives first contact with the enemy - witness SVT. The big question is whether the big concepts will prove out for me personally. Hence my eagerness to put my customers' money where my mouth is :).

And before you point it out in the comments, I should say that I'm aware of the efforts to make WCF more amenable to straight HTTP-style programming. It's just that I'm dubious about the viability of exposing transport semantics explicitly via an API that's all about abstracting transport semantics. It seems to me that if you're going to remove a layer of abstraction, you should actually remove a layer of abstraction. But again, I'll need to build something real with it before I can draw any supportable conclusions.

Tuesday, April 17, 2007

Converting Messages to Text in Outlook 2007

In Outlook 2003, you could open an email message and convert it from HTML to text by hitting Ctrl-Shift-O. That was handy, because I generally prefer to bottom-quote, and that's a pain in the ass when the email is HTML. So when I installed Outlook 2007 I was annoyed that Ctrl-Shift-O didn't work any more.

After a bunch of fumbling around, I finally discovered that if you open a message and edit it (Message->Other Actions->Edit), you can then change it to plain text (Options->Plain text). Well, being a keyboard guy, I got pretty tired of doing that, so now I have the following key sequence committed to muscle memory:

Enter (open message)

Alt H X E (edit message)

Alt P L (convert to plain text)

Alt H R P (reply)

Which, honestly, sucks, but is maybe 100 times as fast as doing the same thing with the mouse.

Wednesday, April 11, 2007

Fix Debugging QIs in ATL Code Under Vista

I think I have whiplash: in the space of a few hours I've gone from writing web services in C# to implementing COM stuff in C++. I don't do much C++ any more, so the adjustment has been ~~somewhat~~ extremely painful.

See, I'm trying to implement a new protocol handler to integrate with Windows Search. I have some content in a database, and I want to surface it along with other search results using Windows Desktop Search or the native Vista stuff. In the process, this set of posts has been an excellent resource. In fact, it's nearly the only resource for writing protocol handlers as far as I can tell.

While I was reading the posts and (slowly) implementing away, I decided that it would be a good idea to follow the suggestion to turn on ATL QueryInterface debugging. Especially given that my code appeared to be doing nothing. So I was a bit annoyed to see output like this in DebugView:

[3552] CProtocolHandler
[3552] -
[3552] - failed

Those dashes are supposed to have interface names or IDs before them, and they're supposed to tell me what interfaces are being QueryInterfaced for. What's worse is that when I followed the link in the articles and read this, it looked as if the problem was fixed in VS2005, which I'm using. But it obviously wasn't.

Soon enough, I was down in the bowels of the ATL code (atlbase.h, no less), trying to figure out what's up. Well, eventually I spotted it: the code change they made to AtlDumpIID to fix the problem in VS2003 is still broken in the case where the registry keys can't be read. In that case, instead of dumping the raw ID, they dump…nothing.

I'm guessing that this is a Vista (which I'm using) problem that has to do with security being a lot more locked down. At any rate, if you encounter a similar problem, you'll want to change AtlDumpIID to something like the code below. I'm sure you can do better - this sometimes prints IDs twice, but it was the smallest change I could make to get the effect I wanted.

#if defined(_ATL_DEBUG_INTERFACES) || defined(_ATL_DEBUG_QI)
__forceinline HRESULT WINAPI AtlDumpIID(REFIID iid, LPCTSTR pszClassName, HRESULT hr) throw()
{
    USES_CONVERSION_EX;
    CRegKey key;
    TCHAR szName[100];
    DWORD dwType;
    DWORD dw = sizeof(szName);

    LPOLESTR pszGUID = NULL;
    if (FAILED(StringFromCLSID(iid, &pszGUID)))
        return hr;

    OutputDebugString(pszClassName);
    OutputDebugString(_T(" - "));

    LPTSTR lpszGUID = OLE2T_EX(pszGUID, _ATL_SAFE_ALLOCA_DEF_THRESHOLD);
#ifndef _UNICODE
    if (lpszGUID == NULL)
    {
        CoTaskMemFree(pszGUID);
        return hr;
    }
#endif
    // Attempt to find it in the interfaces section
    if (key.Open(HKEY_CLASSES_ROOT, _T("Interface"), KEY_READ) == ERROR_SUCCESS)
    {
        if (key.Open(key, lpszGUID, KEY_READ) == ERROR_SUCCESS)
        {
            *szName = 0;
            if (RegQueryValueEx(key.m_hKey, (LPTSTR)NULL, NULL, &dwType, (LPBYTE)szName, &dw) == ERROR_SUCCESS)
            {
                OutputDebugString(szName);
            }
            else
            {
                // Name could not be read: dump the raw ID instead of nothing
                OutputDebugString(lpszGUID);
            }
        }
        else
        {
            // No key for this ID: dump the raw ID instead of nothing
            OutputDebugString(lpszGUID);
        }
    }
    // Attempt to find it in the clsid section
    if (key.Open(HKEY_CLASSES_ROOT, _T("CLSID"), KEY_READ) == ERROR_SUCCESS)
    {
        if (key.Open(key, lpszGUID, KEY_READ) == ERROR_SUCCESS)
        {
            *szName = 0;
            // Reset the buffer size: an earlier query overwrites dw with the bytes it returned
            dw = sizeof(szName);
            if (RegQueryValueEx(key.m_hKey, (LPTSTR)NULL, NULL, &dwType, (LPBYTE)szName, &dw) == ERROR_SUCCESS)
            {
                OutputDebugString(_T("(CLSID\?\?\?) "));
                OutputDebugString(szName);
            }
            else
            {
                OutputDebugString(lpszGUID);
            }
        }
        else
        {
            OutputDebugString(lpszGUID);
        }
    }
    else
    {
        // The CLSID section itself could not be opened (the locked-down case)
        OutputDebugString(lpszGUID);
    }

    if (hr != S_OK)
        OutputDebugString(_T(" - failed"));
    OutputDebugString(_T("\n"));
    CoTaskMemFree(pszGUID);

    return hr;
}
#endif // _ATL_DEBUG_INTERFACES || _ATL_DEBUG_QI
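
One possible way to get rid of the double printing is to work out the label once and only then write it. The sketch below shows just that fallback order (interface name, then CLSID name, then the raw ID) in portable C++, so it is not a drop-in replacement for the ATL function. `DescribeQuery` and `NameTable` are hypothetical names, and the two `std::map` tables are an assumption standing in for the `HKEY_CLASSES_ROOT\Interface` and `HKEY_CLASSES_ROOT\CLSID` lookups.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the registry: maps a GUID string to a friendly name.
using NameTable = std::map<std::string, std::string>;

// Builds the single line of debug output for one QueryInterface call.
// The ID is resolved exactly once, so it can never be printed twice
// and can never be omitted.
std::string DescribeQuery(const std::string& className,
                          const std::string& guid,
                          const NameTable& interfaces,
                          const NameTable& clsids,
                          bool succeeded) {
    assert(!guid.empty() && "caller must pass an already formatted GUID");

    std::string label;
    auto it = interfaces.find(guid);
    if (it != interfaces.end()) {
        label = it->second;                       // known interface name
    } else if ((it = clsids.find(guid)) != clsids.end()) {
        label = "(CLSID\?\?\?) " + it->second;    // escaped to avoid a trigraph
    } else {
        label = guid;                             // fall back to the raw ID
    }

    std::string line = className + " - " + label;
    if (!succeeded) line += " - failed";
    line += "\n";
    return line;
}
```

In the real function the resulting string would go to `OutputDebugString` in one call; passing empty tables reproduces the case where the registry keys can't be read, and the raw ID still shows up.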

Caveat Abstractor

Keith's recent post shows off an excellent use of anonymous delegates. There is, however, the fact that this is a bit weird to debug - as you step through you wind up jumping back and forth between contexts. Nothing that an experienced developer can't deal with, but anything that adds to the cognitive load required to maintain a piece of code needs to be carefully considered. The older I get, the more I realize that maintainability is the most consistently underserved of the five design tenets I like to remember. There's something to be said for having all the code right there in your face.

Syntactically, this gets a bit better with C# 3.0 due to the cleaner grammar for anonymous delegates, but I think it still has some of the same maintainability problems.

Of course, I've done this sort of thing myself. But I always think hard before I do it, and sometimes I feel a little dirty afterwards. Overused, this is a real maintenance nightmare.
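
To make the tradeoff concrete, here is a rough sketch of the general shape being discussed, written with a C++ lambda rather than a C# anonymous delegate. Keith's actual code isn't reproduced here, so `WithResource`, `SaveAbstracted`, and `SaveInline` are hypothetical names and the "open"/"close" log entries are placeholders for whatever boilerplate is being factored out.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: wraps the caller's work in shared setup and teardown
// so the boilerplate is written once. The log records the order of events.
void WithResource(std::vector<std::string>& log,
                  const std::function<void()>& body) {
    assert(body && "body must be callable");
    log.push_back("open");
    try {
        body();
    } catch (...) {
        log.push_back("close");  // teardown still runs if the body throws
        throw;
    }
    log.push_back("close");
}

// Abstracted version: short, but a debugger steps from here into
// WithResource and then back into the lambda.
void SaveAbstracted(std::vector<std::string>& log) {
    WithResource(log, [&log] { log.push_back("save"); });
}

// Inline version: repeats the boilerplate, but reads top to bottom.
void SaveInline(std::vector<std::string>& log) {
    log.push_back("open");
    log.push_back("save");
    log.push_back("close");
}
```

Both functions leave the same three entries in the log. The difference is only in where the reader (or the debugger) has to look to see all three steps.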

Before someone jumps down my throat with a "so we should just repeat all that code, then?" I'll preemptively respond with two points:

Well, it's what we did in .NET 1.1. How bad was it then?

Every decision involves tradeoffs. Sometimes the tradeoffs involved make the decision easy. Sometimes they only seem to.

Monday, April 9, 2007

Yo, Package This!

I note that the cheekily-named Package This just went up on CodePlex. I had nothing to do with writing it, other than answering a couple of questions from the author.

The description from the project page:

Package This is a GUI tool written in C# for creating help files (.chm and .hxs) from the content obtained from the MSDN Library via the MSDN Content Service. You select the content you want from the table of contents, build a help file, and use the content offline. You are making personalized ebooks of MSDN content. Both help file formats also give full text search and keyword search.

The code illustrates how to use the MSDN Content Service to retrieve documentation from MSDN. It also shows how to build .hxs files and .chm files programmatically.

It's great to see interesting new applications making use of the web service!