[Haskell-cafe] Allocating enormous amounts of memory and wondering why

Jefferson Heard jeff at renci.org
Sun Jul 8 17:26:18 EDT 2007


I'm using the Data.AltBinary package to read in a list of 4.8 million
floats and 1.6 million ints.  Doing so causes the memory footprint to
blow up to more than 2 GB, which on my laptop simply crashes the
program.  I can do it on my workstation, but I'd really rather not,
because I want my program to be fairly portable.

The file that I wrote out in packing the data structure was only 28 MB,
so I assume I'm just using the wrong data structure, or I'm using full
laziness somewhere I shouldn't be.
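My back-of-the-envelope arithmetic for suspecting the list itself, assuming
GHC's usual 3-word cons cells and 2-word boxed Floats on a 64-bit build
(please correct me if those sizes are off):

-- rough heap cost of the boxed [Float] alone, before the [Int],
-- the Rational conversion, or the decoder's own working memory
approxListBytes :: Integer
approxListBytes = 4800000 * (3 + 2) * 8   -- = 192,000,000, i.e. ~190 MB

That's already a long way from the 28 MB on disk, and it probably isn't
even the whole story.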

I've tried compiling with profiling enabled, but I wasn't able to,
because the Streams package doesn't seem to have an option for compiling
with profiling.  I'm also a newbie to Cabal, so I'm probably just
missing something.  

The fundamental question, though, is "Is there something wrong with how I
wrote the following function?"

binaryLoadDocumentCoordinates :: String -> IO (Ptr CFloat, [Int])
binaryLoadDocumentCoordinates path = do
  pointsH <- openBinaryFile (path ++ "/Clusters.bin") ReadMode
  coordinates <- get pointsH :: IO [Float]
  galaxies <- get pointsH :: IO [Int]
  coordinatesArr <- mallocArray (length coordinates)
  pokeArray coordinatesArr (map (fromRational . toRational) coordinates)
  return (coordinatesArr, galaxies)

I suppose in a pinch I could write a C function that serializes the
data, but I'd really rather not.  What I'm trying to do is load a bunch
of coordinates into a vertex array for OpenGL.  I did this for a small
30,000-item vertex array, but I need to be able to handle several
million vertices in the end.  
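The closest I've come to a plan that avoids the list entirely is to read the
raw bytes straight into the buffer I hand to OpenGL.  Something along these
lines is what I have in mind, assuming I also rewrite the writer to dump a
plain Int32 element count followed by a flat block of 4-byte host-order
floats (so this is not AltBinary's on-disk format, just a sketch of the
idea):

import System.IO
import Data.Int (Int32)
import Foreign.C.Types (CFloat)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Marshal.Array (mallocArray)
import Foreign.Ptr (Ptr, castPtr)
import Foreign.Storable (peek, sizeOf)

loadRawCoordinates :: FilePath -> IO (Ptr CFloat, Int)
loadRawCoordinates path = do
  h <- openBinaryFile path ReadMode
  -- element count written by the (rewritten) serializer
  n <- alloca $ \p -> do
         _ <- hGetBuf h p (sizeOf (undefined :: Int32))
         fmap fromIntegral (peek (p :: Ptr Int32))
  buf <- mallocArray n :: IO (Ptr CFloat)
  -- pull the floats directly into the foreign buffer, no [Float] in between
  _ <- hGetBuf h (castPtr buf) (n * sizeOf (undefined :: CFloat))
  hClose h
  return (buf, n)

That sidesteps the library completely, though, which I'd rather not do if
there's a proper fix.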

If I serialize an unboxed array instead of a list or if I do repeated
"put_" and "get" calls, will that help with the memory problem?


