RFC: Adding a Hashable type class and HashMap/HashSet data types to HP

Daniel Peebles pumpkingod at gmail.com
Thu Nov 18 18:39:14 EST 2010


I like this idea. As I mentioned on IRC, I'd call the class Hash rather than
Hashable. I'm also with you on the Word return type. It may be less
convenient but maybe this will be a tiny step towards the "great Word
revolt" (in which all conceptually unsigned things in the prelude and
standard libraries actually become unsigned) that I hope will occur sometime
in the near future.


On Thu, Nov 18, 2010 at 5:05 PM, Johan Tibell <johan.tibell at gmail.com>wrote:

> Hi all,
>
> This is a request for early feedback on something that I'd like to
> propose for the next major HP release.
>
> Milan Straka showed [1] that it's possible to design a persistent set
> (and thus also map) for string-like keys that's about twice as fast as
> the current Data.Set using hashing and a Patricia tree (i.e. IntMap).
> The Clojure community has also managed to create a fast persistent
> map, based on Bagwell's hash array mapped trie.
>
> I'd like to propose that we get the necessary infrastructure in place
> for creating fast and easy to use hash-based data structure. Here's
> what I think is needed:
>
> * A type class for things that can be hashed, Hashable.
>
> class Hashable a where
>    hash :: a -> Word
>
> An alternative would be to have `hash` return an Int, which would make
> the method easier to use with Haskell data structures, which are
> indexed using Ints, but Word feels like the more correct type to me.
>
> The Hashable type class would have to go in either base, containers,
> or a new package (there's currently a hashable package on Hackage.)
>
> * Instances of Hashable for common types, in particular ByteString and
> Text.
>
> I already have fast instances for Text and ByteString, based on MurmurHash2
> [2].
>
> * Two new data types in the containers package, Data.HashMap and
> Data.HashSet. These would be persistent data structures, initially
> based on IntMap, and have the same or similar API to Data.Map and
> Data.Set..
>
> In the future we could also add mutable HashTables, operating in the
> ST monad, or other hash-based data types (persistent or mutable).
>
> Any comments?
>
> Johan
>
> 1. http://research.microsoft.com/~simonpj/papers/containers/containers.pdf
> 2. http://sites.google.com/site/murmurhash/
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://www.haskell.org/mailman/listinfo/libraries
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/libraries/attachments/20101118/2af9dcda/attachment.html


More information about the Libraries mailing list