[Haskell-cafe] Re: [Haskell] ANNOUNCE: RefSerialize-0.2.1

Alberto G. Corona agocorona at gmail.com
Sun Nov 2 09:02:10 EST 2008


I uploadad RefSerialize<http://hackage.haskell.org/cgi-bin/hackage-scripts/package/RefSerialize>
to  Hackage .

Read, Show and Data.Binary do not check for repeated references to the same
data address. As a result, the data is serialized multiple times when
serialized. This is a waste of space in the filesystem  and  also a waste of
serialization time. but the worst consequence is that, when the serialized
data is read, it allocates multiple copies in memory for the same object
referenced multiple times. Because multiple referenced data is very typical
in a pure language such is Haskell, this means that the resulting data loose
the beatiful economy of space and processing time that referential
transparency permits.

This package allows the serialization and deserialization of large data
structures without duplication of data, with
the result of optimized performance and memory usage. It is also useful for
debugging purposes.

There are automatic derived instances for instances of Read/Show, lists and
strings. the deserializer contains a almos complete set of Parsec.Token
parsers for deserialization.

 Every instance of Show/Read is also a instance of Data.RefSerialize

 The serialized string has the form "expr( var1, ...varn) where
var1=value1,..valn=valueN " so that the
string can ve EVALuated.

 See demo.hs and tutorial. I presumably will add a entry in
haskell-web.blogspot.com

                     To develop: -derived instances for Data.Binary
                                 -serialization to/from ByteStings

I wrote this module because I needed to serialize lists of verisions of the
same data with slight modifications between each version.


This is a short tutorial (in tutorial.txt)



runW applies showp, the serialization parser of the instance Int for the
RefSerialize class

Data.RefSerialize>let x= 5 :: Int
Data.RefSerialize>runW $ showp x
"5"

every instance of Read and Show is an instance of RefSerialize.

rshowp is derived from showp, it labels the serialized data with a variable
name

Data.RefSerialize>runW $ rshowp x
" v8 where {v8= 5; }"

Data.RefSerialize>runW $ rshowp [2::Int,3::Int]
" v6 where {v6= [ v9,  v10]; v9= 2; v10= 3; }"

while showp does a normal show serialization

Data.RefSerialize>runW $ showp [x,x]
"[5, 5]"

rshowp variables are serialized memory references: no piece of data that
point to the same addrees is serialized but one time

Data.RefSerialize>runW $ rshowp [x,x]
" v9 where {v6= 5; v9= [ v6, v6]; }"

This happens recursively

Data.RefSerialize>let xs= [x,x] in str = runW $ rshowp [xs,xs]
Data.RefSerialize>str
" v8 where {v8= [ v10, v10]; v9= 5; v10= [ v9, v9]; }"

the rshowp serialized data is read with rreadp. The showp serialized data is
read by readp

Data.RefSerialize>let xss= runR rreadp str :: [[Int]]
Data.RefSerialize>print xss
[[5,5],[5,5]]

this is the deserialized data

the deserialized data keep the references!! pointers are restored! That is
the whole point!

Data.RefSerialize>varName xss !! 0 == varName xss !! 1
True


rShow= runW rshowp
rRead= runR rreadp

Data.RefSerialize>rShow x
" v11 where {v11= 5; }"


In the definition of a referencing parser non referencing parsers can be
used and viceversa. Use a referencing parser
when the piece of data is being referenced many times inside the serialized
data.

by default the referencing parser is constructed by:

rshowp= insertVar showp
   rreadp= readVar readp
but this can be redefined. See for example the instance of [] in
RefSerialize.hs

This is an example of a showp parser for a simple data structure.

data S= S Int Int deriving ( Show, Eq)

instance  Serialize S  where
    showp (S x y)= do
                    xs <- rshowp x  -- rshowp parsers can be inside showp
parser
                    ys <- rshowp y
                    return $ "S "++xs++" "++ys



    readp =  do
                    symbol "S"     -- I included a (almost) complete Parsec
for deserialization
                    x <- rreadp
                    y <- rreadp
                    return $ S x y

there is a mix between referencing and no referencing parser here:

Data.RefSerialize>putStrLn $ runW $ showp $ S x x
S  v23 v23 where {v23= 5; }


(I corrected some errors in this file here)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/haskell-cafe/attachments/20081102/cf532cf9/attachment.htm


More information about the Haskell-Cafe mailing list