Data.HashTable.hashInt seems somewhat sub-optimal

Jan-Willem Maessen jmaessen at alum.mit.edu
Thu Aug 30 08:37:16 EDT 2007


Oh, yes, in case others are interested in playing with the Jenkins  
hash, here's Haskell code for it.  If you think I should have stuck  
more "seq's" and "!"s into it, by all means go to town (I compile  
with optimization and everything is strict).

-Jan

-- | The burtleburtle.net hash function devised by Bob Jenkins and
-- used in perl et al.  This is written gracefully in a very
-- imperative way, and looks quite ugly when functionalized.
mix :: Int32 -> Int32 -> Int32 -> (Int32 -> Int32 -> Int32 -> a) -> a
mix a0 b0 c0 k0 =
   let mixR k a b c = (a-b-c) `xor` (c `shiftR` k)
       mixL k b c a = (b-c-a) `xor` (a `shiftL` k)
       mix3 k1 k2 k3 k a  b  c  =
           let a' = mixR k1 a  b  c
               b' = mixL k2 b  c  a'
               c' = mixR k3 c  a' b'
           in k a' b' c'
   in  (mix3 13 8 13 $ mix3 12 16 5 $ mix3 3 10 15 $ k0) a0 b0 c0

golden :: Int32
golden = -1640531527

hashInt :: Int -> Int32
hashInt x = mix golden 0 (fromIntegral x) $ \_ _ r -> r

-- | A hash function for Strings based on a slightly modified version
-- of the burtleburtle string hash.  We use the same mix, but we mix
-- every 3 Chars (not 12) since Haskell Chars are unicode.  That does
-- make this hash 4x more expensive in the common case.
--
--
hashString :: String -> Int32
hashString str = hs str golden 0 0
   where hs (a':b':c':str) a b c = mix (a + orrd a') (b + orrd b') (c  
+ orrd c') $
                                   hs str
         hs [b',c'] a b c = mix a (b + orrd b') (c + orrd c') $ \_ _  
r -> r
         hs [c'] a b c = mix a b (c + orrd c') $ \_ _ r -> r
         hs [] _ _ c = c
         orrd :: Char -> Int32
         orrd =  fromIntegral . fromEnum




More information about the Libraries mailing list