[Haskell-cafe] Design suggestion for Data.Binary.Defer

Neil Mitchell ndmitchell at gmail.com
Mon Jun 16 12:43:04 EDT 2008


Hi,

I'm in the process of updating the Deferred Binary library,
http://www-users.cs.york.ac.uk/~ndm/binarydefer/. The idea is that it
does serialisation, but certain elements can be marked as deferred -
instead of being written in the current file stream, they are merely
pointed at and if needed, that pointer will be followed.

Example: ("hello","world"), where the first field is marked as deferred,
would be written out as:

[6]"world""hello"

i.e. [skip 6 characters if you want hello], then the "world" data, then
the "hello" data we previously promised to put here. When reading, the
reader only seeks to "hello" and reads it if it is actually needed.
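A minimal sketch of this layout, as a pure in-memory model rather than the
library's real file format. It assumes each string is stored as one length
character followed by its characters, which is one reading under which the
[6] above comes out (1 length character + 5 for "world"). The names
putString, getString, putPair and getPair are hypothetical and are not part
of the library.

```haskell
import Data.Char (chr, ord)

-- Assumed toy encoding: one length character, then the characters.
putString :: String -> String
putString s = chr (length s) : s

-- Read one string; return it together with the remaining input.
getString :: String -> (String, String)
getString []         = error "getString: unexpected end of input"
getString (n : rest) = splitAt (ord n) rest

-- Write a pair whose first field is deferred:
-- skip count, then the strict field, then the deferred field.
putPair :: (String, String) -> String
putPair (a, b) = chr (length strict) : strict ++ putString a
  where
    strict = putString b

-- Read the pair back. The deferred field is a lazy thunk, so the input
-- past the skip count is only inspected if that field is demanded.
getPair :: String -> (String, String)
getPair []         = error "getPair: unexpected end of input"
getPair (n : rest) = (a, b)
  where
    (b, _) = getString rest
    a      = fst (getString (drop (ord n) rest))
```

In GHCi, snd (getPair (take 7 (putPair ("hello","world")))) still yields
"world" even though the deferred data has been cut off, because the first
field is never forced.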

So, it's like binary, but some fields are lazy. The question is how to
indicate which fields are lazy. There are three schemes I can think
of:

== Simple Instances ==

put (a,b) = putDefer a >> put b
get = do a <- getDefer; b <- get; return (a,b)

If the put/get and putDefer/getDefer items do not line up perfectly it
will go very wrong at runtime - probably resulting in random values
being created. You also can't derive the instances automatically, with
something like Derive or DrIFT.
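To illustrate the hazard, here is a small sketch of a writer and reader
that disagree about which field is deferred. It uses an assumed toy
encoding (one length character followed by the characters), not the
library's real format, and all the function names are hypothetical.

```haskell
import Data.Char (chr, ord)

-- Assumed toy encoding: one length character, then the characters.
putString :: String -> String
putString s = chr (length s) : s

-- Read one string; return it together with the remaining input.
getString :: String -> (String, String)
getString []         = error "getString: unexpected end of input"
getString (n : rest) = splitAt (ord n) rest

-- Writer: the first field is deferred
-- (skip count, strict field, deferred field).
putPairDeferred :: (String, String) -> String
putPairDeferred (a, b) = chr (length strict) : strict ++ putString a
  where
    strict = putString b

-- Mismatched reader: treats both fields as strict, so the skip count
-- is misread as the length of the first string.
getPairStrict :: String -> (String, String)
getPairStrict input = (a, b)
  where
    (a, rest) = getString input
    (b, _)    = getString rest
```

Here getPairStrict (putPairDeferred ("hello","world")) raises no error. It
returns a pair whose first component is the length character followed by
"world" and whose second component is "hello", so the mistake only shows up
as wrong values.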

== Complex Instances ==

This is the current scheme, based on lazy pattern matching and
exceptions - very confusing, probably low performance.

deferBoth = [\ ~(a,b) -> unit (,) <<~ a << b]

Where <<~ a means write out the a field lazily, and << b means write
out the b field strictly. The advantage over the simple instances is
that a field being deferred is declared once.

== Lazy Data Type ==

Instead of customizing the instance, you can define a wrapper type,
data Defer a = Defer a, and then instead of the original tuple write:

(Defer "hello","world")

Now the code must unwrap the Defer before accessing "hello", but
the instance becomes much simpler, and can be derived.
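A minimal sketch of what the wrapper costs at use sites. Only the Defer
type comes from the description above; fromDefer, Greeting, greeting and
firstWord are hypothetical names used for illustration.

```haskell
-- The wrapper type: a field of type Defer a is the one to be deferred.
data Defer a = Defer a
  deriving (Eq, Show)

-- Hypothetical helper: every access to a deferred field goes through this.
fromDefer :: Defer a -> a
fromDefer (Defer x) = x

-- The tuple from the example, with the first field marked as deferred
-- in its type rather than in the instance.
type Greeting = (Defer String, String)

greeting :: Greeting
greeting = (Defer "hello", "world")

-- Code that wants the first field has to unwrap it explicitly.
firstWord :: Greeting -> String
firstWord (d, _) = fromDefer d

-- The strict field is accessed as before.
secondWord :: Greeting -> String
secondWord (_, w) = w
```

The deferral is declared once, in the type, so a serialisation instance for
Defer could carry all of the pointer handling. The price is the explicit
fromDefer call wherever the field is read.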

== The Question ==

Is there a simple way of tagging fields in a constructor as deferred,
just once for reading and writing, and ideally outside the instance
definition and not requiring additional code to unwrap? I can't think
of any, but there may be something I have missed.

Thanks

Neil
