[Haskell-cafe] Re: Haskell Speed

Jan-Willem Maessen jmaessen at alum.mit.edu
Wed Jan 4 09:47:24 EST 2006


On Jan 4, 2006, at 5:30 AM, Simon Marlow wrote:

> On 30 December 2005 01:23, Jan-Willem Maessen wrote:
>
>> Probably.  The minimum table chunk size was rather large.  I have
>> been experimenting (tests are running even as I type) with alternate
>> implementations of Data.HashTable.  So far the winning implementation
>> is one based on multiplicative hashing and simple table doubling.
>> This seems to be consistently competitive with / faster than
>> Data.Map.  At this point my inclination is to make the whole suite
>> available:
>>
>> Data.HashTable.Class
>> Data.HashTable.DataMap
>> Data.HashTable.Doubling
>> Data.HashTable.Flat
>> Data.HashTable.Freeze
>> Data.HashTable.Modulus
>> Data.HashTable.Multiplicative
>> Data.HashTable.Original
>> Data.HashTable.Test
>> Data.HashTable
>>
>> I've separated out the hashing technique (Modulus, Multiplicative)
>> from the hash table implementation.  Note that a few of the above are
>> bogus, and this is only a fraction of the implementations tried thus
>> far.
>>
>> I need to get distribution clearance for a bunch of this code from my
>> employer, and figure out how to package it.  The latter may be
>> tricky, as Data.Hashtable is currently rather intimately tied to a
>> bunch of rather tricky bits of GHC.
>
> I wonder if you could put together a drop-in replacement for
> Data.HashTable that we can incorporate?  There's not much point in us
> still providing the inefficient version as standard after you've done
> all this work to figure out how to do it better.

That'd be good---though what qualifies as the "efficient version"  
will, I think, change based on the GC changes you said you made (I  
haven't had the time/patience to try to bootstrap the development  
version of GHC).  This is one of the reasons I've been saving so many  
bread crumbs along the way.  All of the above will work as drop-in  
Data.HashTable replacements, with Data.HashTable simply re-exporting  
multiplicative hashing and table doubling.

The tricky bit, as you probably know, is the rather intimate ties  
between Data.HashTable and the rest of the builtin code.  This  
requires the shipping Data.HashTable library to import most things  
from the source, rather than from the place where they're made  
publicly available.  Do you have any suggestions as to how I might go  
about testing that integration?

-Jan

>
> Cheers,
> 	Simon



More information about the Haskell-Cafe mailing list