[Haskell-cafe] (general question) Approaches for storing a large amount of simple data structures

Don Stewart dons at galois.com
Thu Nov 15 18:28:58 EST 2007


bbrown:
> I have a project where I want to store a data structure in a file,
> binary or ASCII, and I want to use Haskell to read and write the
> file. I will have about half a million records, so it would be nice if
> the format loaded quickly. I guess I could use XML, but I kind of
> want to avoid it.
> 
> I have the following structure in pseudocode:
>       URL
>       -> keywords associated with that URL
>       -> title associated with that URL
>       -> links contained in that URL (0 ... N)
> 
> What is an approach for saving 500 thousand of those records such
> that I can load the data back into a Haskell data type?

Data.Binary is the standard approach for high-performance serialisation
of large data to and from Haskell types. It composes well with the gzip
library too, so you can compress the stream on the way out, and
decompress lazily on the way in.

   http://hackage.haskell.org/cgi-bin/hackage-scripts/package/binary-0.4.1
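
For example, here's a quick, untested sketch of that composition (it
assumes Codec.Compression.GZip from the zlib package, together with the
encode/decode functions shown below):

    import Data.Binary (Binary, encode, decode)
    import qualified Data.ByteString.Lazy as L
    import qualified Codec.Compression.GZip as GZip

    -- Encode a value and gzip-compress it on the way to disk.
    writeCompressed :: Binary a => FilePath -> a -> IO ()
    writeCompressed path = L.writeFile path . GZip.compress . encode

    -- Read it back, decompressing and decoding lazily.
    readCompressed :: Binary a => FilePath -> IO a
    readCompressed path = fmap (decode . GZip.decompress) (L.readFile path)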

The interface is really simple:

    encode :: Binary a => a -> ByteString
    decode :: Binary a => ByteString -> a

These marshal a Haskell type 'a' into a lazy ByteString and back.
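
To use it with a structure like yours, you write a Binary instance for
your record type. A rough, untested sketch (the 'Page' type and its
field names are just made up to match your description):

    import Data.Binary (Binary(..), encodeFile, decodeFile)

    -- Illustrative record: one entry per URL.
    data Page = Page
        { pageUrl      :: String
        , pageKeywords :: [String]
        , pageTitle    :: String
        , pageLinks    :: [String]
        }

    instance Binary Page where
        put (Page u k t l) = put u >> put k >> put t >> put l
        get = do
            u <- get
            k <- get
            t <- get
            l <- get
            return (Page u k t l)

    -- Half a million records can go out and come back as a plain list:
    savePages :: FilePath -> [Page] -> IO ()
    savePages = encodeFile

    loadPages :: FilePath -> IO [Page]
    loadPages = decodeFile

encodeFile/decodeFile are just encode/decode plus the lazy file IO, so
you can swap in gzip-compressing versions like the ones above if disk
space matters.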

-- Don

