using ghc with make

Bulat Ziganshin bulat.ziganshin at gmail.com
Wed Apr 19 14:17:29 EDT 2006


Hello Simon,

Wednesday, April 19, 2006, 7:28:23 PM, you wrote:

> GHC might well be able to make use of such stuff too.  In general,
> one would like to be able to treat a file much like a database, as
> you suggest, with binary serialisation of data structures into it.

what do you mean by "database"? what operations do you need, in
addition to sequential read and write?

> GHC's serialisation also includes a simple commoning-up mechanism
> for "leaves", especially strings.  We build a kind of dictionary, to
> avoid repeatedly re-serialising the same string.  I guess that any
> good binary serialisation will want to do something similar.  (Or
> something more dynamic, a la arithmetic coding.)

arithmetic coding in Haskell? :) it will be MUCH faster to use the
simplest form of serialization and then call a C compression library
such as ziplib

i just scanned ghc's Binary library and can say which features i
haven't implemented in my lib:

1) lazyGet/lazyPut. it's no problem to copy your implementation, but i
still don't understand how lazyGet should work: it shares the same
buffer pointer as the one used in `get`, so `get` and consuming the
structure returned by lazyGet should interfere
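
the worry in point 1 can be sketched like this. everything here is an
assumption made for illustration: `Reader`, `getByte`, `lazyGetNaive` and
`lazyGetAt` are hypothetical names, not GHC's Binary API, and the buffer
is modelled as a plain list with one shared mutable offset. the last
function shows one possible way around the interference (remember the
position up front), which is not necessarily what GHC's lazyGet does.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import Data.Word (Word8)
import System.IO.Unsafe (unsafeInterleaveIO)

-- Hypothetical reader: an immutable buffer plus one shared, mutable offset.
data Reader = Reader { buffer :: [Word8], offset :: IORef Int }

newReader :: [Word8] -> IO Reader
newReader bytes = Reader bytes <$> newIORef 0

-- Strict read: take the byte at the shared offset and advance the offset.
getByte :: Reader -> IO Word8
getByte r = do
  i <- readIORef (offset r)
  writeIORef (offset r) (i + 1)
  pure (buffer r !! i)

-- Naive lazy read: the read is deferred, so it uses whatever offset is
-- current at the moment the result is finally forced.
lazyGetNaive :: Reader -> IO Word8
lazyGetNaive r = unsafeInterleaveIO (getByte r)

-- Position-saving lazy read: record the offset now, step past the value,
-- and let the deferred read index the buffer without touching the offset.
lazyGetAt :: Reader -> IO Word8
lazyGetAt r = do
  i <- readIORef (offset r)
  writeIORef (offset r) (i + 1)
  unsafeInterleaveIO (pure (buffer r !! i))

-- Returns (naive lazy, strict, position-saving lazy, strict).
demo :: IO (Word8, Word8, Word8, Word8)
demo = do
  r1 <- newReader [10, 20, 30]
  lazyA   <- lazyGetNaive r1  -- nothing is read yet
  strictA <- getByte r1       -- reads 10, shared offset becomes 1
  r2 <- newReader [10, 20, 30]
  lazyB   <- lazyGetAt r2     -- position 0 is remembered
  strictB <- getByte r2       -- reads 20
  pure (lazyA, strictA, lazyB, strictB)
```

in this sketch, forcing the naive lazy value after the strict `getByte`
yields 20 instead of 10, because both went through the same offset; the
position-saving variant yields 10 no matter when it is forced.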

2) i don't think that dictionary sharing should be part of a general
Binary library, but i tried to design my lib so that it can be
implemented in user code. it seems that i failed, and i think that it
is Haskell's drawback :)  let's see: we want to use a dictionary in the
get/put_ functions for FastString, so that a large data structure that
includes strings can be serialized with just `put`. but `put` has
the following signature:

class Binary a where
  put :: OutByteStream IO h => h -> a -> IO ()

where OutByteStream is defined as

class OutByteStream m h where
  vPutByte :: h -> Word8 -> m ()

so `put` only has access to OutByteStream's functions (i.e. only
vPutByte) and can't deal with any data specific to a user-supplied
stream, including its dictionary. well, we can redefine Binary:

class OutByteStream m h => Binary m h a where
  put :: h -> a -> m ()

instance Binary IO StreamWithDict FastString where
  put = ...  -- now `put` can use functions specific for StreamWithDict
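
a compilable sketch of this redefinition, filling in the `...` with one
possible body. the names beyond those above are assumptions:
`FastString` here is a hypothetical newtype standing in for GHC's type,
the fields of `StreamWithDict` are made up, and the index is written as
a single byte, so the sketch only copes with up to 256 distinct
strings. writing the dictionary itself to the output is left out.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}

import Data.IORef (IORef, newIORef, readIORef, modifyIORef')
import Data.Word (Word8)

-- A byte sink `h` living in monad `m`, as in the message.
class OutByteStream m h where
  vPutByte :: h -> Word8 -> m ()

-- Binary indexed by monad and stream, so instances can be stream-specific.
class OutByteStream m h => Binary m h a where
  put :: h -> a -> m ()

-- Hypothetical stand-in for GHC's FastString.
newtype FastString = FastString String deriving (Eq, Show)

-- Hypothetical stream that carries its own string dictionary.
data StreamWithDict = StreamWithDict
  { outBytes :: IORef [Word8]       -- bytes written so far, newest first
  , dict     :: IORef [FastString]  -- strings seen so far; index = position
  }

instance OutByteStream IO StreamWithDict where
  vPutByte h b = modifyIORef' (outBytes h) (b :)

-- This `put` can use the stream's dictionary, not just vPutByte.
instance Binary IO StreamWithDict FastString where
  put h s = do
    seen <- readIORef (dict h)
    case lookup s (zip seen [0 ..]) of
      Just i  -> vPutByte h i  -- known string: write its index only
      Nothing -> do            -- new string: register it, then write index
        modifyIORef' (dict h) (++ [s])
        vPutByte h (fromIntegral (length seen))

-- Writes three strings and returns the bytes in the order written.
demo :: IO [Word8]
demo = do
  h <- StreamWithDict <$> newIORef [] <*> newIORef []
  mapM_ (put h . FastString) ["foo", "bar", "foo"]
  reverse <$> readIORef (outBytes h)
```

here `demo` produces the bytes [0,1,0]: the second "foo" is written as
the index of the first one.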

but again there is a catch:

instance (Binary m h a) => Binary m h [a] where
  put h = mapM_ (put h)

here, the internal call to `put` will again receive only the
OutByteStream dictionary! the instance for FastString will just not be
matched!!!


btw, Haskell type classes have many non-obvious peculiarities. for
example, it was not easy for me to understand that Haskell resolves all
overloading at compile time but finds which overloaded function to call
at runtime (well, i can't even describe this behavior). can you
recommend me a paper to read about using the Haskell class system?
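
one common way to picture this behavior is dictionary passing: the
compiler decides at compile time which instance applies, and the
overloaded function receives that instance's methods as an extra
argument at runtime. a hand-written sketch of the idea follows; the
class `Describe` and all the `...D`/`...Dict` names are made up for
illustration, and the real translation inside a compiler differs in
detail.

```haskell
-- An ordinary class with two instances.
class Describe a where
  describe :: a -> String

instance Describe Bool where
  describe b = if b then "yes" else "no"

instance Describe Int where
  describe n = "int " ++ show n

-- An overloaded function: works for any type with a Describe instance.
describeAll :: Describe a => [a] -> [String]
describeAll = map describe

-- The same thing written by hand: the class becomes a record of methods,
-- each instance becomes a value of that record, and the constraint
-- becomes an ordinary extra argument.
newtype DescribeDict a = DescribeDict { describeD :: a -> String }

boolDict :: DescribeDict Bool
boolDict = DescribeDict (\b -> if b then "yes" else "no")

intDict :: DescribeDict Int
intDict = DescribeDict (\n -> "int " ++ show n)

-- The dictionary is chosen by the caller and used here at runtime.
describeAllD :: DescribeDict a -> [a] -> [String]
describeAllD d = map (describeD d)

-- Both components hold the same list of strings.
demo :: ([String], [String])
demo = (describeAll [True, False], describeAllD boolDict [True, False])
```

at the call site `describeAll [True, False]` the element type is known
to be Bool, so the Bool instance is picked there, while `describeAll`
itself just calls whatever method it was handed.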


-- 
Best regards,
 Bulat                            mailto:Bulat.Ziganshin at gmail.com



More information about the Glasgow-haskell-users mailing list