[Haskell-cafe] Pure hashtable library

Jan-Willem Maessen jmaessen at alum.mit.edu
Thu Aug 28 09:44:11 EDT 2008


On Aug 27, 2008, at 4:31 PM, Bulat Ziganshin wrote:

> Hello Jan-Willem,
>
> Wednesday, August 27, 2008, 4:06:11 PM, you wrote:
>
>> One obvious way to make non-modifiable hash tables useful is to "eat
>> your own tail" non-strictly--- aggregate a set of hash table entries,
>> construct a hash table from them, and plumb the resulting hash table
>> into the original computation by tying the knot.  This works really
>> well if you can construct the bucket lists lazily and if you specify
>> the table size up front.
>
> i think it's impossible since you need to scan whole assoclist to
> build list of entries for any value of hash function.

I think this is because I wasn't quite clear enough about the problem  
I was solving.  I think you'll agree that we can use an assocList non- 
strictly in the following sense:
   * We can do lookups that succeed so long as we can compute all keys  
up to and including the key that matches.
   * We never do non-strict lookups that fail, as that would require  
examining the entire assocList.

Now I can build a hashtable with the same property from an assocList,  
albeit very inefficiently, using code like this (untested):

lazyArray :: (Ix i) => (i,i) -> [(i,v)] -> Array i [v]
lazyArray bnds kvs =
     array bnds [ (k, map snd . filter ((k==) . fst) $ kvs) | k <-  
range bnds ]

makeHash :: (Eq k, Hashable k) => [(k,v)] -> Array Int [(k,v)]
makeHash assocList =
     lazyArray (0,n-1) labeledAssocList
   where labeledAssocList = [ (hashToSize n k, (k,v)) | (k,v) <-  
assocList ]

We label each assocList element with its corresponding hash bucket  
(labeledAssocList); each bucket then contains exactly the elements of  
assocList that map to that bucket, in exactly the order in which they  
occurred in assocList.

The LazyArray library in hbc essentially did exactly what the  
lazyArray function I've shown above does, only the input list is  
traversed once rather than having a separate traversal for each  
bucket.  This can actually be implemented using the ST monad augmented  
by unsafeFreezeSTArray and unsafeInterleaveST; indeed the "State in  
Haskell" paper by Peyton Jones and Launchbury gives the implementation  
of a very similar function.

I have code for LazyArray based on the above paper that works with  
GHC, but I haven't needed it in a while.

-Jan



More information about the Haskell-Cafe mailing list