binary files in haskell

Simon Marlow simonmar@microsoft.com
Tue, 6 Feb 2001 04:50:25 -0800


> > How about this slightly more general interface, which works with the
> > new FFI libraries, and is trivial to implement on top of the
> > primitives in GHC's IOExts:
> > 
> >         hPut :: Storable a => Handle -> a -> IO ()
> >         hGet :: Storable a => Handle -> IO a
> 
> What about endianness? In which format are Floats or even just Bools
> stored? For a file which will probably be read from different machines
> this is not clear at all.

The behaviour is defined by the Storable instance for each type.  Writing
an Int32, for instance, would use the byte order of the host
architecture.  If you want to work with just bytes, you can always use
hPut and hGet at type Word8.
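
As a rough illustration of the idea, here is a minimal sketch of the two
operations layered on the standard hPutBuf and hGetBuf from System.IO
(an assumption on my part; it does not use the IOExts primitives
mentioned above).  It uses the hPutStorable and hGetStorable names
suggested later in this message; roundTrip, pointeeSize and pointee are
hypothetical helpers for the example only.  The bytes written are
whatever the Storable instance pokes into memory, so they are in host
byte order.

```haskell
import Data.Word (Word32)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Marshal.Utils (with)
import Foreign.Ptr (Ptr)
import Foreign.Storable (Storable, peek, sizeOf)
import System.IO

-- Write the in-memory (host byte order) representation of a value.
hPutStorable :: Storable a => Handle -> a -> IO ()
hPutStorable h x = with x $ \ptr -> hPutBuf h ptr (sizeOf x)

-- Read one value back; raises an IOError if too few bytes are available.
hGetStorable :: Storable a => Handle -> IO a
hGetStorable h = alloca $ \ptr -> do
  let size = pointeeSize ptr
  n <- hGetBuf h ptr size
  if n < size
    then ioError (userError "hGetStorable: unexpected end of input")
    else peek ptr
  where
    -- sizeOf does not inspect its argument for the standard instances,
    -- so passing undefined is the usual trick to recover the size.
    pointeeSize :: Storable b => Ptr b -> Int
    pointeeSize p = sizeOf (pointee p)
    pointee :: Ptr b -> b
    pointee _ = undefined

-- Hypothetical round trip: write two values to a file opened in binary
-- mode, then read them back in the same order.
roundTrip :: FilePath -> IO (Word32, Double)
roundTrip path = do
  withBinaryFile path WriteMode $ \h -> do
    hPutStorable h (0x01020304 :: Word32)
    hPutStorable h (2.5 :: Double)
  withBinaryFile path ReadMode $ \h -> do
    w <- hGetStorable h
    d <- hGetStorable h
    return (w, d)
```

Reading the file back on the machine that wrote it returns the original
values; nothing here makes the file portable to a machine with a
different byte order or a different representation of Double.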

Overloading with Storable gives you more flexibility, since if you have
a way to serialise an object in memory for passing to a foreign
function, you also have a way to store it in binary format in a file
(modulo problems with pointers, of course).

In the long term, we'll want to be able to serialise more than just
Storable objects (cf. the other overloaded binary I/O libraries out
there), and possibly make the output endian-independent - but after all
there's no requirement that Haskell's Int has the same size on all
implementations, so there's no guarantee that binary files written on
one machine will be readable on another, unless they only use explicitly
sized types or Integer.
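
To make the endian-independent option concrete, here is one possible
sketch for a single explicitly sized type: a Word32 is split into four
Word8s, most significant byte first, so the byte sequence is the same
whatever the host architecture.  The function names are hypothetical,
and the choice of big-endian order is just an assumption for the
example; the resulting bytes could then be written one at a time with
whatever byte-level primitive is adopted.

```haskell
import Data.Bits (shiftL, shiftR, (.|.))
import Data.Word (Word32, Word8)

-- Split a Word32 into four bytes, most significant byte first.
-- fromIntegral truncates to the low 8 bits after each shift.
word32ToBytesBE :: Word32 -> [Word8]
word32ToBytesBE w = [fromIntegral (w `shiftR` s) | s <- [24, 16, 8, 0]]

-- Reassemble a Word32 from exactly four bytes, most significant first.
-- Any other number of bytes gives Nothing.
bytesToWord32BE :: [Word8] -> Maybe Word32
bytesToWord32BE bs@[_, _, _, _] = Just (foldl step 0 bs)
  where
    step acc b = (acc `shiftL` 8) .|. fromIntegral b
bytesToWord32BE _ = Nothing
```

For example, word32ToBytesBE 0x01020304 gives [1,2,3,4] on any host, and
bytesToWord32BE [1,2,3,4] gives Just 0x01020304.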

Perhaps these should be called hPutStorable and hGetStorable so as not
to prematurely steal the best names.

> I think John is right that there needs to be a primitive interface for
> just writing bytes. You can then build anything more complicated on top
> (probably different high-level ones for different purposes).
> 
> I just see one problem with John's proposal: the type Byte. It is
> completely useless if you don't have operations that go with it:
> bit operations and conversions to and from Int. The FFI already defines
> such a type: Word8. So I suggest that the binary IO library explicitly
> reads and writes Word8s.

Yup, that's what I had in mind.

Cheers,
	Simon