<div dir="ltr">>Adrian Hey wrote:<br>><br>> Using explicit seqs rather than strict data types is actually faster,<br>> for reasons that are a bit of a mystery to me. I'm not sure what cost<br>> Bertram is talking about, but AFAIK ghc uses the same info pointer<br>
> mechanism for all heap records, including unevaluated thunks (although<br>> the info pointers will point to different things of course). But the<br>> cost of pattern matching on *evaluated* AVL nodes should be independent<br>
> of strictness annotation AFAICS.<br><br>Thanks for chiming in Adrian. Just to get started I removed the strictness<br>annotations from the Data.Map Bin constructor, made no other changes, and ran a<br>silly benchmark (at the end of this email). The version without bangs is<br>
actually faster than the version currently shipping. I get about 10.5 sec for<br>the lazy version and 11.5 sec for the strict version (2.1Ghz Intel Core)<br><br>I'll repeat that in bold for people just skimming this thread: <br>
<br>__Removing Strictness Annotations Makes It Go Faster__<br><br>The reason I think is that the helper functions bin, join and balance already <br>provide just enough strictness, as they need to inspect the size field. The <br>
strictness analyzer can then do its job. The case for IntMap is tricker,<br>as there is no implicit strictness in the code so removing the bangs causes<br>stack overflows. Still working on that one.<br><br>Here are the benchmarks. The lazy version also evaluates "keySum dmap" slightly<br>
faster (repeated inserts) and its a tie for "keySum smap" (sequential inserts).<br>I admit this benchmark is goofy, if you have a better one please share.<br><br>Scott<br><br><br><br>import qualified Data.Map as Map<br>
import Data.List as List<br><br>n = 1000000<br>rkeys = [ (i*122789) `mod` 1006471 | i<-[0..] ] :: [Int]<br>dkeys = map (`div`1000) rkeys :: [Int]<br>skeys = [0..] :: [Int]<br><br>shuffle (a:b:c:d:e:f:g:h:rest) = e:a:h:d:c:b:g:f: shuffle rest<br>
<br>keySum = List.foldl' (+) 0 . Map.keys <br><br>rmap = Map.fromList . take n . shuffle $ rkeys `zip` [0..]<br>dmap = Map.fromList . take n . shuffle $ dkeys `zip` [0..]<br>smap = Map.fromAscList . take n . shuffle $ skeys `zip` [0..]<br>
<br>mix (x:xs) (y:ys) = x : y : mix xs ys<br>mix _ [] = []<br>mix [] _ = []<br><br>rkeys2 = [ (i*122789) `mod` 1006471 | i<-[0..] ] :: [Int] <br>rlooks = [ (i*122819) `mod` 1006471 | i<-[0..] ] :: [Int]<br><br>rlook = <br>
List.foldl' <br> (\k s -> case Map.lookup k rmap of Nothing -> s; Just x -> s+x)<br> 0 (take n $ rkeys2 `mix` drop 1000 rlooks)<br><br>main = print rlook -- or print (keySum dmap) or whatever<br><br>
<br></div>