[Haskell-cafe] Efficient functional idiom for histogram

Daniel Fischer daniel.is.fischer at web.de
Sat Aug 1 14:31:32 EDT 2009

Am Samstag 01 August 2009 15:44:39 schrieb Paul Moore:
> 2009/7/31 Paul Moore <p.f.moore at gmail.com>:
> > 2009/7/31 Gregory Collins <greg at gregorycollins.net>:
> Hmm, I'm obviously still mucking up the performance somehow. My full
> program (still a toy, but a step on the way to what I'm aiming at) is
> as follows. It's rolling 3 6-sided dice 100000 times, and printing a
> summary of the results.
> import System.Random
> import qualified Data.Map as Map
> import Data.Map (Map)
> import Data.List
> dice :: Int -> Int -> IO Int
> dice 0 n = return 0
> dice m n = do
>   total <- dice (m - 1) n
>   roll <- randomRIO (1, n)
>   return (total + roll)

Don't do too much in IO, it's better to separate the pure parts from the IO.
IMO, this would better have signature

dice :: RandomGen g => Int  -> Int -> g -> (Int,g)
dice 0 _ g = (0,g)
dice m n g = case dice (m-1) n g of
              (total,g1) -> case randomR (1,n) g1 of
                            (roll,g2) -> (total+roll,g2)

or, better still be in a State monad or the Random monad ( 
http://hackage.haskell.org/package/MonadRandom )

die :: RandomGen g => Int -> State g Int
die n = State $ randomR (1,n)

dice :: RandomGen g => Int -> Int -> State g Int
dice m n = liftM sum $ replicateM m (die n)

-- the "do" is superfluous
> simulate count m n = do
>   mapM (dice m) (replicate count n)

Ouch, that hurts (not yet so incredibly much for 100000 rolls, but if you try 1000000, 
it'll really hurt).
Since you're doing it in IO, the whole list must be built before any further processing 
can begin, so you're building up a largish list, only to destroy it immediately 
afterwards, much work for the garbage collector. If you can accumulate the scores as they 
come, the intermediate list can be fused away and the garbage collector is kept idle.

If you absolutely have to do it in IO, use

import System.IO.Unsafe

simulate 0 _ _ = return []
simulate count m n = unsafeInterleaveIO $ do
    val <- dice m n
    rst <- simulate (count-1) m n
    return (val:rst)

to avoid building a large list. If you use the (lazy) State monad, that's automatically 
done :).

simulate count m n = replicateM count (dice m n)  -- now in State

histogram :: Ord a => [a] -> [(a,Int)]
histogram = Map.assocs . foldl f Map.empty
    f m k = Map.insertWith (+) k 1 m

-- simulation :: RandomGen g => State g [(Int,Int)]
simulation = do
  lst <- simulate 1000000 3 6
  return (histogram lst)

main = do
    sg <- getStdGen
    print $ evalState simulation sg

much faster, still not very fast, since StdGen isn't a particularly fast PRNG.

Another method is to create an infinite list of random numbers and use it as needed:
module Main (main) where

import System.Random
import Data.Array.Unboxed
import Data.List
import System.Environment (getArgs)

dice :: RandomGen g => g -> Int -> [Int]
dice g mx = randomRs (1,mx) g

splits :: Int -> [a] -> [[a]]
splits l = unfoldr f
        f xs = case splitAt l xs of
                r@(h,t) | null t    -> Nothing
                        | otherwise -> Just r

simulation :: RandomGen g => g -> Int -> Int -> Int -> UArray Int Int
simulation g rep dn df = accumArray (+) 0 (dn,dn*df) lst
        rls = dice g df
        scs = splits dn rls
        lst = take rep [(sum rll,1) | rll <- scs]

main :: IO ()
main = do
    (rp:dn:df:_) <- getArgs
    sg <- getStdGen
    print $ assocs $ simulation sg (read rp) (read dn) (read df)

Using an unboxed array instead of a Map gives a little extra speed, but not much.

> histogram :: Ord a => [a] -> [(a,Int)]
> histogram = Map.assocs . foldl f Map.empty
>   where
>     f m k = Map.insertWith (+) k 1 m

For some reason it doesn't make much difference here, but it should be the strict 
versions, foldl' and insertWith' in general.

> simulation = do
>   lst <- simulate 100000 3 6
>   return (histogram lst)
> main = do
>   s <- simulation
>   putStrLn (show s)
> When compiled, this takes over twice as long as a naively implemented
> Python program.
> What am I doing wrong here? I'd have expected compiled Haskell to be
> faster than interpreted Python, so obviously my approach is wrong. I'm
> expecting the answer to be that I've got unnecessary laziness

Quite on the contrary, it's unnecessary strictness here :D

> - which is fine, but ultimately my interest is in ease of expression and
> performance combined, so I'm looking for beginner-level improvements
> rather than subtle advanced techniques like unboxing.

Nothing advanced with using unboxed arrays.

> Thanks,
> Paul.
> PS I know my code is probably fairly clumsy

Actually, the style is rather good, I think (mine's worse, usually).
You shouldn't use IO so much, though, and your code betrays a certain level of 
unfamiliarity with strictness/performance characteristics of the libraries. But that's 

> - I'd appreciate style
> suggestions, but my main interest here is whether a beginner, with a
> broad programming background, a basic understanding of Haskell, and
> access to Google, put together a clear, efficient, program (ie, the
> case where my usual scripting language is too slow and I want to knock
> something up quickly in a high-level, high-performance language).

Performance is a nontrivial thing, it takes some experience to know which data structures 
to use when. And, as said above, Haskell's StdGen isn't fast, the above programme spends 
about 90% of the time creating pseudo random numbers.

More information about the Haskell-Cafe mailing list