[Haskell-cafe] Program with ByteStrings leads to memory exhaust.

Grigory Sarnitskiy sargrigory at ya.ru
Mon Sep 14 10:05:41 EDT 2009


I have a simple program that first generates a large (~ 500 mb) file of random numbers and then reads the numbers back to find their sum.
It uses Data.Binary and Data.ByteString.Lazy.

The problem is when the program tries to read the data back it quickly (really quickly) consumes all memory.

The source: http://moonpatio.com/fastcgi/hpaste.fcgi/view?id=3607#a3607

or:

module Main where

import Data.Binary
import Data.Int
import System.Random
import qualified Data.ByteString.Lazy as BL

encodeFileAp f = BL.appendFile f . encode
path = "Results.data"
n = 20*1024*1024 :: Int

getBlockSize :: BL.ByteString -> Int64
getBlockSize bs = round $ (fromIntegral $ BL.length bs) / (fromIntegral n)

fillFile :: StdGen -> Int -> IO ()
fillFile _ 0 =return ()
fillFile gen i = do
    let (x, gen') = random gen :: (Double, StdGen)
    encodeFileAp path x
    fillFile gen' (i-1)

processFile :: BL.ByteString -> Int64 -> Int -> Double -> Double
processFile bs blockSize 0 sum = sum
processFile bs blockSize i sum = let
    tmpTuple = BL.splitAt blockSize bs
    x = decode $ fst $! tmpTuple
    in processFile (snd tmpTuple) blockSize (i-1) $! sum + x

main = do
    fillFile (mkStdGen 42) n
    results <- BL.readFile path
    putStrLn $ show $ processFile results (getBlockSize results) n 0


More information about the Haskell-Cafe mailing list