I spend a lot of time dealing with collections when I code: should this thing be a list? An enumeration? A hashtable? Decisions about data structures abound. Which is one of the reasons I like Whidbey so much - the new support for generics makes life in the collection world ever so much nicer. But the other day I was poking around and came across System.Collectons.ObjectModel - a sister namespace to System.Collections.Generic, and one I hadn't heard much about. But it makes the story even better.
There's not a whole lot in the namespace at the moment. Just three classes: Collection<T>, ReadOnlyCollection<T>, and KeyedCollection<T>. Collection<T> and ReadOnlyCollection<T> are pretty obvious - they provide simple collection and read-only collection wrapper functionality. Handy, but my favorite is KeyedCollection<T>.
KeyedCollection<T> is an abstract base class that gives you that Hashtable/ArrayList crossover class you've probably written at least once in your life. That is, it gives you list semantics, but adds a non-integer indexer for doing lookups in the collection. So, for example, if you have a Person class that looks like this:
class Person {
private int _age;
private string _name;
public int Age { get { return _age; } set { _age = value; } }
public int Name { get { return _name; } set { _name = value; } }
}
All you need to do is create a collection like this
class People : KeyedCollection<string, Person> {
protected override string GetKeyForItem(Person item) {
return item.Name;
}
}
where the implementation of GetKeyForItem is whatever makes sense. It's the only method you need to override, because without it, the collection has no idea what the key is. But once you do, you can now access the collection as follows:
People instructors = Pluralsight.Instructors; // Acquire a collection from somewhere
Person craig = instructors[7]; // Can access by index
Person keith = instructors["Keith"]; // Or by name
In addition to being convenient, the documentation also suggests that KeyedCollection was implemented to provide roughly constant-time lookups by either index or key regardless of the size of the collection. Whether that means "consistently fast" or "consistently slow" I haven't measured yet. :) But I think the best part might just be that the class you derive from KeyedCollection is serializable via XmlSerializer, despite having dictionary-like semantics. That's basically the end of the old "Damn! I can't serialize a Hashtable" problem.
Thanks for the post. It would interesting to measure the performance.
ReplyDeleteVery cool! I've been using generic Dictionary as a substitute for Hashtable. I'll have to take a look at KeyedCollection now. Thanks for sharing.
ReplyDeleteI did a little work with KeyedCollection, and it's very cool, with the exception that you must derive from it. So I wrote this:
ReplyDeletepublic class GenericKeyedCollection< K, T > : KeyedCollection< K, T >
{
private Converter<T, K> keyRetriever;
public GenericKeyedCollection( Converter<T,K> keyRetriever )
{
this.keyRetriever = keyRetriever;
}
protected override K GetKeyForItem(T item)
{
return keyRetriever(item);
}
}
This way you don't need to provide a separate class for each different key retrieval method.
Very handy.
ReplyDeleteI'm stuck in a conference room and our marketing team is presenting. Fortunately, your blog entry gave me something to do to pass the time!
ReplyDeleteI generated 1,000,000 Person records, stuck them into a People instance and randomly accessed 1,000,000 entries by key -- 1442ms total and then by index -- 120ms total.
By comparison, a Hashtable looked up a million entries in 1742ms and an ArrayList in 160ms. Not bad! I'm on a 1.5GHz laptop, BTW.
Oops. Got to start paying attention, now.
Interesting. If you get bored again, compare it with Dictionary<K, T> and List<T> instead...that's probably a more fair comparison.
ReplyDeleteSweet - I hadn't noticed that one yet. I've used NameObjectCollectionBase for XmlSerializable keyed collections on 1.1 (though it has a dopey enumerable implementation - hope they've fixed that in KeyedCollection). This looks like a nice generic replacement.
ReplyDeleteI´m wondering if you need to override GetHashCode method in order to obtain a better perfomance. If we have to ...Do you know how to define this method for this case?
ReplyDeleteThanks a bunch!
Silvana
Hmm. I'm not sure if that would help. I guess it would depend on
ReplyDelete1) Your implementation of GetHashCode, which would surely be specific to the type of object you were containing,
2) What your performance requirements are.
Generally, this sounds like a case of premature optimization. Write your code, profile it, and if KeyedCollection is the slowest thing (unlikely) then look at how to improve performance.
I am really interested in Chris Tavares' implementation. My only question is how would you use the System.COnverter and how do you implement it?
ReplyDeleteTHanks
Does anyone know how to convert a generic collection to a generic KeyedCollection? I need the benefits of the KeyedCollection to be able to get items by index as well as by key. Now the problem is that my middle tier is returning generic collections. I would like to create a KeyedCOllection class that has a constructor that you pass it a Collection like List and Collections have. Anyone have any ideas on how to go about doing this?
ReplyDeleteDerive a class from KeyedCollection<K, T>. Write a constructor on your derived class that takes whatever you want. Use derived class.
ReplyDeleteGreat post! I noticed the KeyedCollection but never tried it... I'm switching now!
ReplyDelete