[Haskell-cafe] Memory allocation in extractor function (newbie question)

Alexander Kireyev alexander.kireyev+haskell.cafe at gmail.com
Mon Apr 7 06:55:57 EDT 2008


Hello,

While trying to write a program for the countPaths Code Jam problem I
ran into what seems to me as a weird behaviour in terms of memory
allocation.

The task is to count the number of way you can spell a certain "word"
by walking some path on a board of letters.

Being a newbie I started with a very straightforward recursive
approach, which is not effective, but still helps to improve
understanding of Haskell basics. Quite naturally I quickly hit the
point where the program becomes too slow, so I tried to profile and
optimize the code.

I used 3 data types:

data GridEntry = GridEntry {letter :: Char, neighbors :: [Point]}
data Area = Area {letters :: (Array (Int, Int) GridEntry), width ::
Int, height :: Int}
data Point = Point { x :: Int, y :: Int} deriving (Show, Eq)

With the recursive approach I had to retrieve GridEntries (a letter in
a certain point of grid together with the list of its neighbours on
the grid) a lot. Quite surprisingly I saw that this operation was
overly expensive in my program. The most time was spent in these 2
functions:

countPathsFrom :: Area -> Int -> String -> Point -> Int
countPathsFrom _ count [] _ = count
countPathsFrom area count (l:[]) point =
   count +
      if (letter (letterAt area point) == l)
      then 1
      else 0
countPathsFrom area count (l:ls) point =
  let gridPt = letterAt area point
      points = neighbors gridPt
      in
        if (letter gridPt == l)
        then countForPoints area ls count points
        else count

letterAt :: Area -> Point -> GridEntry
letterAt area (Point x y) = letters area ! (x, y)

The profiling log (./paths +RTS -P) shows the following time/space
behaviour for them:

      countPathsFrom    Main
      181    11830476  37.8   52.3    59.5   82.2     28 1325013312
       letterAt         Main
      184    11830476  21.6   29.9    21.6   29.9     16 757150464

What puzzles me here is how the function letterAt generates 64 bytes
of garbage per call. Since there are no side-effects in this section
of program - and noone can damage the 'area' structure - I would
expect the compiler/runtime to use a reference to the data in the
'area' structure when returning values from letterAt. Instead it seems
to make a copy of GridEntry and return that. The question here is
whether (why?) this behaviour is normal and if it is possible to
optimize the code in such cases.

Thanks in advance,
-Alexander


More information about the Haskell-Cafe mailing list