large binaries

Malcolm Wallace Malcolm.Wallace@cs.york.ac.uk
Fri, 19 Jul 2002 11:34:09 +0100


> > > Is there some reason haskell binaries have to be statically linked?

It would not be fair to lay all the blame for large Haskell
binaries at the door of static vs. dynamic linking.
After all, the Haskell version is dynamically linked against exactly
the same shared libraries as the C version, at least on my machine:

    ldd Hello	(Hello.hs)
        libm.so.6 => /lib/libm.so.6 (0x40022000)
        libc.so.6 => /lib/libc.so.6 (0x40044000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)

Of course, it is static linking against the *Haskell* runtime system,
Prelude and Libraries that is the cause of binary bloat.  Quite simply,
lots of extra stuff is dragged in that isn't visible in the apparently
simple source program.  For instance, I can find all the following symbols
in the binary for "hello world" (compiled with nhc98):

    putStr, shows, showChar, showParen, showString, fromCString,
    toCString, hGetFileName, hPutChar, hPutStr, error, flip, id, init,
    length, not, putChar, putStrLn, seq, show, subtract, exitWith,
    instance Bounded Int (maxBound, minBound), instance Enum Ordering
    (succ, pred, toEnum, fromEnum, enumFrom, enumFromThen, enumFromTo,
    enumFromThenTo), instance Enum ErrNo (succ, pred, toEnum,
    fromEnum, enumFrom, enumFromThen, enumFromTo, enumFromThenTo),
    instance Monad IO (>>=, >>, return, fail), instance Eq ErrNo (==,
    /=), instance Eq Int (==, /=), instance Eq Ordering (==, /=),
    instance Num Int (+, -, *, negate, abs, signum, fromInteger),
    instance Ord Int (compare, <, <=, >=, >, max, min), instance Show
    ErrNo (show, showsPrec, showList), instance Show IOError (show,
    showsPrec, showList), instance Show Int (show, showsPrec, showList)

This is not the fault of any particular implementation - the ghc-built
binary has a similar collection - rather it is dictated by the nature
of the language and its standard libraries.  Because Prelude functions
are small and re-usable, they do get used all over the place in the
implementation of other parts of the Prelude, so you end up with a
huge dependency graph hiding underneath the simplest of calls.
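
To make the dependency-graph point concrete, here is a minimal sketch in
Haskell.  It is not how nhc98 or ghc decide what to link: the table
toyDeps is a hypothetical, hand-written set of "uses" edges, loosely
named after symbols from the list above, and reachable is an
illustrative helper that computes everything transitively needed from
one root.  It assumes the containers package (Data.Map, Data.Set) is
available.

```haskell
module Main (main) where

import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical direct-dependency table: each name maps to the names it
-- uses directly.  The edges are illustrative, not taken from any real
-- Prelude implementation.
type Deps = Map.Map String [String]

toyDeps :: Deps
toyDeps = Map.fromList
  [ ("main",     ["putStrLn"])
  , ("putStrLn", ["putStr", "putChar"])
  , ("putStr",   ["hPutStr"])
  , ("putChar",  ["hPutChar"])
  , ("hPutStr",  ["hPutChar", "error"])
  , ("hPutChar", ["error"])
  , ("error",    ["show"])
  , ("show",     ["showsPrec"])
  , ("length",   [])              -- not reachable from main in this toy table
  ]

-- All names reachable from the root (the root included), found by a
-- simple worklist traversal.  Names missing from the table are treated
-- as having no further dependencies.
reachable :: Deps -> String -> Set.Set String
reachable deps root = go Set.empty [root]
  where
    go seen [] = seen
    go seen (x:xs)
      | x `Set.member` seen = go seen xs
      | otherwise =
          go (Set.insert x seen) (Map.findWithDefault [] x deps ++ xs)

main :: IO ()
main = do
  let direct = Map.findWithDefault [] "main" toyDeps
      needed = reachable toyDeps "main"
  putStrLn ("direct dependencies of main: " ++ show (length direct))
  putStrLn ("names transitively required:  " ++ show (Set.size needed))
  mapM_ (putStrLn . ("  " ++)) (Set.toList needed)
```

Even in this toy table, a program with one direct dependency ends up
requiring nine names once the closure is taken; a real Prelude has far
more edges than this.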

In fact, most of the extra stuff in "Hello World" is there purely to
handle all possible error conditions in the I/O monad.  Several years
ago, Colin Runciman and I did the experiment of removing all the nice
error-handling stuff from the prelude (and eliminating a few classes
too I think), to see just how small we could squash "Hello World".
The idea was to target embedded systems where memory is a scarce
resource, and fancy error-reporting is pointless (a single red LED
would do).  IIRC, we managed to achieve a size of 25kb, compiled
with nhc98, whose runtime system, don't forget, includes a bytecode
interpreter.

Regards,
    Malcolm