[Haskell-cafe] two problems with Data.Binary and Data.ByteString

Don Stewart dons at galois.com
Tue Aug 12 20:18:20 EDT 2008


newsham:
> I have a program that read in and populated a large data structure and
> then saved it out with Data.Binary and Data.ByteString.Lazy.Char8:
> 
>    saveState db = B.writeFile stateFile =<<
>        encode <$> atomically (readTVar db)
> 
> when I go to read this in later I get a stack overflow:
> 
> loadState db = do
>     d <- decode <$> B.readFile stateFile
>     atomically $ writeTVar db d
> 
>   Stack space overflow: current size 8388608 bytes.
>   Use `+RTS -Ksize' to increase it.
> 
> or from ghci:
> 
>     d <- liftM decode
>       (Data.ByteString.Lazy.Char8.readFile
>          "savedState.bin") :: IO InstrsDb
> 
>     fromList *** Exception: stack overflow
> 
> The data type I'm storing is a Map (of maps):
> 
>    type DailyDb = M.Map Date Daily
>    type InstrsDb = M.Map String DailyDb
> 
> What's going on here?  Why is the system capable of building and saving
> the data but not in reading and umarhsalling it?  What is the proper way
> to track down where the exception is happening?  Any debugging tips?

So a big Map is serialised as a huge list.

    instance (Ord k, Binary k, Binary e) => Binary (Map.Map k e) where
        put = put . Map.toAscList
        get = liftM Map.fromDistinctAscList get

so that fromAscList's the result of parsing the map as a list, via,

    instance Binary a => Binary [a] where
        put l  = put (length l) >> mapM_ put l
        get    = do n <- get :: Get Int
                    replicateM n get

so that's a length-prefixed list, strictly. Which is possibly where the
stack's being consumed. Does just bumping the stack size a bit help?

Alternatively, you could consider serialising the Map in some other
format (i.e. newtype the list, and serialise that say, in a lazy/chunked
encoding).
  
> I also noticed another issue while testing.  If my program loads
> the data at startup by calling loadState then all later calls to
> saveState give an error:
> 
>   Log: savedState.bin: openFile: resource busy (file is locked)
> 
> this does not occur if the program wasnt loaded.  My best guess here
> is that B.readFile isnt completing and closing the file for some
> reason.  Is there a good way to force this?

Lazy IO. So force the result to be evaluated, and then close the handle,
or use strict bytestring reading.

-- Don


More information about the Haskell-Cafe mailing list