using ghc with make

Bulat Ziganshin bulat.ziganshin at gmail.com
Tue Apr 18 12:40:11 EDT 2006


Hello Simon,

Tuesday, April 18, 2006, 3:02:20 PM, you wrote:

>> if that is due to the time of reading .hi files, my alternative Binary
>> library should help in some future

> I'd be suprised if you could improve on GHC's binary library.  Using 
> BinMem (reading/writing directly to memory), GHC's binary library is 
> about as fast as it gets.  I'm sure yours wins when dealing with files,
> though.

sorry, Simon, but when i found 10x difference in speed (6mb/s vs 60
mb/s) it was on membufs :)  although i can't say that the difference
anyway will be 10 times. for so fast lib the time required to traverse
lists is essential. there are also a lot of other possible problems. i
just asking - whether the time required for reading these (or any
other binary) files is essential for compilation speed?

(although i don't tested code in ghc 6.5, may be it is faster)

if you are interested, i can comment some bits of Binary module. i
optimized essentially the same code, so i know it's potential
bottlenecks :)

getWord8 (BinMem _ ix_r sz_r arr_r) = do
    ix <- readFastMutInt ix_r
    sz <- readFastMutInt sz_r
    when (ix >= sz)  $
        ioError (mkIOError eofErrorType "Data.Binary.getWord8" Nothing Nothing)
    arr <- readIORef arr_r
    w <- unsafeRead arr ix
    writeFastMutInt ix_r (ix+1)
    return w

here you use boxed reference for storing array. i uses unboxed Ptr
reference, that is of course faster. also, using ioError here means
that the whole function result can't be "unboxed". at least i seen
substantial slowdown when i added error processing here. of course,
it's hard to omit. also the following may be faster:

    writeFastMutInt ix_r (ix+1)
    unsafeRead arr ix

i also use one less variable here (only curptr and bufend)

instance Binary Word16 where
  get h = do
    w1 <- getWord8 h
    w2 <- getWord8 h
    return $! ((fromIntegral w1 `shiftL` 8) .|. fromIntegral w2)

first possible problem here is what getWord8 is not inlined. second -
using of checked arithmetic (operations on Ints may have additional
checks, unlike operations on Int#). third - it's better to make all
operations on Int and only then pack all data to Int16 constructor

btw, my lib use 32/64-bit independent binary files (i.e. you can write
the file with 64-bit program and then read it with the 32-bit one). to
achieve this, i use variable-size representation for Int and Word. i
also use compiler-independent representation for Integer (the same as
dor Int). may be it will be interesting for ghc and will allow to use
both (32 bit and 64 bit) compilers on the same project?




-- 
Best regards,
 Bulat                            mailto:Bulat.Ziganshin at gmail.com



More information about the Glasgow-haskell-users mailing list