[Haskell-cafe] Clean Dynamics and serializing code to disk

Claus Reinke claus.reinke at talk21.com
Thu Dec 6 08:36:27 EST 2007


if you want to make something like this for haskell (and
i'd very much like to use it!-), there are several issues,
including:

1 handling code: 
    - going for a portable intermediate representation, 
        such as bytecode, is most promising, especially
        if the code representation is fairly stable (if you're
        worried about performance, choose a representation
        that can be compiled further by the backend on
        the target machine) - a toy sketch of such a
        representation follows this list
    - shipping the code objects, and serializing only
        pointers into those, then linking into the objects
        dynamically on load, can work if platform and 
        compiler (ghc versions aren't abi-compatible) 
        don't change
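
    as a toy sketch only (a made-up instruction set, nothing to do
    with ghc's actual bytecode), such a stable representation can
    be as little as a plain data type with a fixed textual encoding:

        -- toy sketch: a made-up "portable code" format with a
        -- compiler-independent textual encoding (Show/Read here,
        -- purely for illustration)
        data Instr
          = PushI Integer       -- push a literal
          | Add                 -- add the two topmost stack values
          | CallGlobal String   -- reference into a shipped code object
          deriving (Show, Read, Eq)

        type Code = [Instr]

        saveCode :: FilePath -> Code -> IO ()
        saveCode path = writeFile path . show

        loadCode :: FilePath -> IO Code
        loadCode path = fmap read (readFile path)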

2 handling types: 
    - 'out e' needs a runtime representation of the static 
        type of e
    - 'e <- in' needs to compare e's type representation 
        at runtime against the static type expected by the 
        usage context (the Data.Dynamic sketch after this 
        list illustrates that check)
    - types evolve, type representations evolve
    - haskell's Dynamic doesn't handle polymorphic
        types, and its type representations are not
        standardised, so they can change even if
        the types do not (there's no guarantee that
        reading an expression written by another
        program will work)
    - doing a proper integration of Dynamic raises
        other issues wrt interactions with other type
        system features (see the Clean papers for 
        examples) or wrt parametricity (if any type
        can be wrapped in/extracted from a Dynamic,
        can you now have type-selective functions of
        type 'a->a'?)
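
    for monomorphic types, Data.Dynamic already provides that
    runtime check (a minimal sketch; 'outD'/'inD' are placeholder
    names, and note that a Dynamic still can't be written to disk,
    so this covers only the type-comparison half):

        import Data.Dynamic (Dynamic, toDyn, fromDynamic)
        import Data.Typeable (Typeable, cast)

        -- 'out': pair a value with a representation of its type
        outD :: Typeable a => a -> Dynamic
        outD = toDyn

        -- 'in': compare the stored representation against the
        -- type expected by the usage context, Nothing on mismatch
        inD :: Typeable a => Dynamic -> Maybe a
        inD = fromDynamic

        -- the parametricity worry: with wrap/unwrap available, a
        -- "type-selective" near-identity becomes writable (the
        -- Typeable constraint at least makes that visible here)
        notQuiteId :: Typeable a => a -> a
        notQuiteId x = case cast x :: Maybe Int of
          Just n  -> maybe x id (cast (n + 1))
          Nothing -> x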
 
3 handling graphs: in a typical functional language
    implementation, there are several parts that need
    to do this already, although they tend to be 
    specialised to their intended use, so they might
    not cover all needs for general serialization

    - a distributed implementation needs to ship 
        graphs to other nodes (but often ships code
        in a separate setup phase..)
    - the memory manager needs to move graphs
        between memory areas (but often does not
        touch code at all)

    graph representations evolve, so if you can reuse 
    code from one of the existing parts here, that will 
    not only save you initial time but, more importantly, 
    spare you a maintenance nightmare.
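
    to make the gap concrete: purely structural serialization (a
    made-up Term type and Show/Read below, both just stand-ins)
    keeps the value but loses the graph - sharing is duplicated,
    cycles make it diverge, and embedded functions can't be
    written out at all:

        -- toy illustration: a structural round trip via Show/Read
        data Term = Leaf Int | Node Term Term
          deriving (Show, Read, Eq)

        -- internal sharing: 'shared' is one heap object used twice
        sharedTerm :: Term
        sharedTerm = Node shared shared
          where shared = Node (Leaf 1) (Leaf 2)

        -- the round trip preserves equality but duplicates the
        -- shared subterm, so the copy can be much bigger than the
        -- original graph; a cyclic value would make 'show' loop,
        -- and a function-typed field would have no Show instance
        structuralCopy :: Term -> Term
        structuralCopy = read . show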

4 surviving evolution:

    if you get all of that working, the more interesting
    issues start popping up:

    - implementations move on, can your code keep up?
    - programs move on, can your system handle versions?
        (a minimal version-tag sketch follows this list)
    - the distinction between static/dynamic has suddenly
        become rather blurry, raising lots of interesting
        opportunities, but also pointing out limitations of
        tools that rely on a fixed single compile-time/
        runtime cycle
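
    a version tag in the serialized data is the least you can do
    about that (a minimal sketch, with made-up names and a
    Show/Read payload encoding standing in for whatever the real
    format is):

        -- sketch: wrap every serialized payload in a versioned
        -- envelope and check the tag on the way back in; a real
        -- implementation would parse the tag before touching the
        -- payload, and convert between versions instead of failing
        formatVersion :: Int
        formatVersion = 3

        writeVersioned :: Show a => FilePath -> a -> IO ()
        writeVersioned path x =
          writeFile path (show (formatVersion, x))

        readVersioned :: Read a => FilePath -> IO (Either String a)
        readVersioned path = do
          s <- readFile path
          let (v, x) = read s
          return $ if v == formatVersion
                     then Right x
                     else Left ("expected format version "
                                ++ show formatVersion
                                ++ ", found " ++ show v)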

    many of those issues were investigated long ago,
    in the area of orthogonally persistent systems 
    (including types and first-class procedures); see
    a thread from earlier this year for a few references:
    http://www.haskell.org/pipermail/haskell-cafe/2007-June/027162.html

non-trivial, but doable - you'd probably spend a
lot more time thinking, investigating current code, 
prototyping options, and sorting out issues, than 
on the final implementation. it is the kind of problem
where you either spend a lot of time preparing an
overnight success, or you start hacking right away
and never get anywhere!-)

claus

ps: i implemented first-class i/o for a pre-haskell
functional language (long gone now) - i experimented 
with dynamic linking of object code, but went with 
byte code; i didn't have to struggle with types, because 
the language had runtime types and checking anyway; 
and i was able to reuse code from the distributed 
implementation for storage/retrieval. and i saw only
the very early stages of 4 ;-)



