[Haskell-cafe] broken IO support in uvector package, when using non primitive types

Malcolm Wallace malcolm.wallace at cs.york.ac.uk
Sat Mar 14 13:20:52 EDT 2009


> The main issue seems to be that although the semantics of UIO may be
> arbitrary, Wallace's patch actually broke deserialization for any
> production-based UArr, and I'm not sure the benefits are worthwhile
> (loading a file someone else sent you) given that endianness is
> already not taken into account when loading (so the chances of someone
> giving you a raw binary file that happens to contain values of the
> correct endianness is rather low, it seems).

In my experience, having written several libraries in Haskell for  
serialisation and deserialisation, it is highly problematic when a  
library writer decides that all data to be stored began its life in  
Haskell, and is only being serialised in order to be read back in  
again by the same Haskell library.  I have already made that mistake  
myself in two different libraries now, eventually regretting it (and  
fixing it).

The real utility of serialisation is when it is possible to read data  
from any arbitrary external source, and to write data according to  
external standards.  A library that can only read and write data in  
its own idiosyncratic format is not production-ready at all.

This is why I submitted the patch that enables the uvector library to  
read raw binary data that was not produced by itself.  I had 300GB of
data from an external source that I needed to deal with efficiently,  
and uvector was the ideal candidate apart from this small design  
flaw.  And yes, my code also had to deal with endianness conversion on  
this data.
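
To illustrate the kind of endianness handling mentioned above, here is a minimal sketch of decoding big-endian 32-bit words from a raw byte buffer. It is not the uvector patch, nor the code used on the data set described; the names word32BE and decodeWord32sBE are hypothetical, and only the bytestring and base packages are assumed.

```haskell
import qualified Data.ByteString as B
import Data.Bits (shiftL, (.|.))
import Data.Word (Word32)

-- Hypothetical helper: fold the bytes of a chunk into one Word32,
-- treating the first byte as the most significant (big-endian).
-- Intended for chunks of exactly four bytes.
word32BE :: B.ByteString -> Word32
word32BE = B.foldl' (\acc b -> (acc `shiftL` 8) .|. fromIntegral b) 0

-- Hypothetical helper: split a raw buffer into 4-byte chunks and
-- decode each one. Trailing bytes that do not fill a whole word
-- are silently dropped.
decodeWord32sBE :: B.ByteString -> [Word32]
decodeWord32sBE bs
  | B.length bs < 4 = []
  | otherwise =
      let (chunk, rest) = B.splitAt 4 bs
      in  word32BE chunk : decodeWord32sBE rest
```

For example, decodeWord32sBE (B.pack [0,0,0,1, 0,0,1,0]) evaluates to [1,256] regardless of the host machine's byte order, because the byte order is fixed by the decoder rather than inherited from memory layout.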

Regards,
     Malcolm


