[Haskell] optimisation of State monad

Simon Peyton-Jones simonpj at microsoft.com
Wed Sep 29 07:52:13 EDT 2004


I took a quick look.  My bet is that profiling is interfering with
optimisation (which it certainly does).  I also used -G1 to force
single-generation GC, which gives more accurate residency numbers.

Without profiling, but all compiled with -O.

iterR (take 100000 (repeat 1)) (Foo 0 0.0)
  ./test1 +RTS -K64M -sstderr -G1 
Foo {foo = 0, bar = 0.0}
  6,923,484 bytes allocated in the heap
      3,288 bytes copied during GC
        192 bytes maximum residency (18 sample(s))

iterL (take 100000 (repeat 1)) (Foo 0 0.0)
    ./test2 +RTS -K64M -sstderr -G1 
Foo {foo = 0, bar = 0.0}
  4,847,872 bytes allocated in the heap
     22,032 bytes copied during GC
      1,224 bytes maximum residency (18 sample(s))


iterR' (take 100000 (repeat 1)) (Foo 0 0.0)
  [iterR' is your iterR calling t1']
   ./test3 +RTS -K64M -sstderr -G1 
Foo {foo = -10, bar = 0.0}
 41,046,244 bytes allocated in the heap
     29,756 bytes copied during GC
        212 bytes maximum residency (148 sample(s))

        148 collections in generation 0 (  0.18s)


These all seem reasonable to me.

I didn't look at the state stuff.

Simon

| -----Original Message-----
| From: haskell-bounces at haskell.org [mailto:haskell-bounces at haskell.org]
On Behalf Of Georg Martius
| Sent: 21 September 2004 11:15
| To: haskell
| Subject: [Haskell] optimisation of State monad
| 
| Hi folks,
| 
| I have two questions to tail recursion, optimisation(ghc) and the
State monad. Sorry about bothering
| you with efficiency issues, but they become crusual to me since my
programm needs more memory
| than I have :-(
| 
| I compiled the following small examples with
| ghc -O6 -Wall -prof -auto-all -o test Test.hs  (ghc 6.2.1)
| and ran them with
| ./test +RTS -K64M -sstderr
| 
| > data Foo = Foo { foo :: !Integer               , bar :: Double
| >                } deriving Show
| > type Transformation a = a -> a
| >
| > addX :: Integer -> Transformation Foo
| > addX x f = f { foo = (foo f) + x }
| >t1 :: Integer -> Transformation Foo
| > t1 x = (addX x . addX (-x))
| >iterR :: [Integer] -> Transformation Foo
| > iterR list f =  foldr t1 f list
| > iterL :: [Integer] -> Transformation Foo
| > iterL list f =  foldl (\ f' i -> t1 i f') f listiterR (take 100000
(repeat 1)) (Foo 0 0.0)
|   16,151,796 bytes copied during GC
|   11,546,120 bytes maximum residency (5 sample(s))
| Productivity  20.3% of total user, 19.7% of total elapsed
| 
| > iterL (take 100000 (repeat 1)) (Foo 0 0.0)
|        3,020 bytes copied during GC
|       25,904 bytes maximum residency (1 sample(s))
| Productivity 100.0% of total user, 80.0% of total elapsed
| 
| 
| Okay, foldl is tail recursive and foldr not. Fine!
| Now one weird thing: If I define t1 as:
| > t1' :: Integer -> Transformation Foo
| > t1' x f = let newfoo = foldr addX f (take 10 $ repeat x)
| >               in newfoo {foo = foo f - foo newfoo}
| 
| > iterR (take 100000 (repeat 1)) (Foo 0 0.0)
|       22,476 bytes copied during GC
|    2,113,876 bytes maximum residency (3 sample(s))
| Productivity  60.0% of total user, 59.0% of total elapsed
| 
| Why is the compiler able to optimise the call of t1' and not the one
of t1?
| 
| My second question belongs to the State monad:
| > withFoo :: Foo -> State Foo (a) -> Foo
| > withFoo state monad = execState monad state
| >addXM :: Integer -> State Foo ()
| > addXM x = modify (\ f -> f { foo = (foo f) + x })
| >iterM :: [Integer] -> State Foo ()
| > iterM list    =  sequence_ $ map t1M list
| >t1M :: Integer -> State Foo ()
| > t1M x = do addXM x
| >            addXM (-x)
| 
| > withFoo (Foo 0 0.0) $ do iterM (take 100000 (repeat 1))
|   48,989,348 bytes copied during GC
|   14,037,156 bytes maximum residency (7 sample(s))
| Productivity  15.5% of total user, 14.7% of total elapsed
| 
| and if I define t1M as:
| t1M' :: Integer -> State Foo ()
| t1M' x = do old <- get
|              sequence_ $ map addXM (take 10 $ repeat x)
|              modify (\ f -> f {foo = foo f - foo old})
| then the memory consumption is awfully high:
| 
| > withFoo (Foo 0 0.0) $ do iterM (take 100000 (repeat 1))
| 172,468,996 bytes copied during GC
|   48,522,520 bytes maximum residency (10 sample(s))
| Productivity   2.6% of total user, 2.5% of total elapsed
| 
| Is there a way to optimise the State monad version?
| 
| Any help would be appreciated.
| 
|   Georg
| _______________________________________________
| Haskell mailing list
| Haskell at haskell.org
| http://www.haskell.org/mailman/listinfo/haskell


More information about the Haskell mailing list