Last year we looked at the
containers library - a library that provides some of the most essential data structures in programming. While
containers are extremely efficient, sometimes you’re just need to go the extra mile, and that’s where
unordered-containers comes in.
If you have a look at the
insert methods in
containers, you’ll see type signatures like:
The only requirement on insertions is that we can order the keys in a map, or the elements of a set, respectively. This gives us an indication of the underlying algorithm - the ordering is probably exploited to form some type of balanced tree. However, there’s another way to build up these data structures, and that’s via the familiar process of hashing.
Hashing is available to us in Haskell via the
hashable library, currently maintained by Johan Tibell.
hashable gives us a new type class -
Hashable - which does what it says on the tin. Given some value that has a
Hashable instance, we can turn those values into hashes.
unordered-containers then builds on top of this to provide us with hash-maps and hash-sets.
Hashable, working with hash-maps and hash-sets in
unordered-containers is a breeze. For example, we could easily store a hash-map associating children with a list of presents they want for Christmas. First, we define our data types:
Off the bat, we can’t use these data types in any maps. They don’t have
Ord instances, and they also don’t have
Hashable instances. Thankfully, it’s a doddle to add a
Hashable instance thanks to GHC’s generic deriving support1:
Now we can populate our hash-map:
And query it just as you would query a map in
Which produces the output:
ocharles wants... Artisan Coffee Dependent Types in Haskell Lambda Fridge Magnets
A modest wishlist, I’m sure you’ll agree.
The API provided by
unordered-containers is not quite as extensive as what we have available to us in
containers - but it’s still perfectly usable. The real benefit from
unordered-containers is, of course, in the numbers. Johan has already done a good deal of benchmarking and comparison, and wrote up his findings on his blog, and hash-maps consistently out-perform regular
As to which you should use, my personal preference is still
containers as that API spoils me. However, sometimes these data structures are at the heart of what I’m building and entirely internal - in these cases I’m happy to know that a
Hashable instance exists and I might as well make use of it! Another consideration is that
Ord instances, and sometimes ordering just doesn’t make sense for the data I’m working with. In these cases, I can usually write a
Hashable instance, so in that situation the choice is clear.
hashable >= 1.2↩
You can contact me via email at firstname.lastname@example.org or tweet to me @acid2. I share almost all of my work at GitHub. This post is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.
I accept Bitcoin donations:
14SsYeM3dmcUxj3cLz7JBQnhNdhg7dUiJn. Alternatively, please consider leaving a tip on