[Haskell-cafe] hsffig: Unresolved symbols when importing a foreign library: what is the preferred way of handling?

Wed Jul 27 10:32:12 EDT 2005

I would like to collect opinions from interested people on this matter.

Background:

Hsffig parses include files (.h) and generates an FFI import
declaration for every function or external variable whose prototype
it encounters. The result is a .hsc file, which is converted into
Haskell source, compiled into an object file, and linked into an
executable along with the other application source files, using ghc
as the frontend (so far, hsffig targets ghc users only). This object
file contains references to all of the imported foreign functions.

Problem:

In some cases, an include file may contain prototypes of functions
that do not exist in the library it corresponds to. In my case,
<unistd.h> (Linux, glibc2) contains prototypes for __ftruncate and
pthread_atfork, which come up unresolved when the executable is
linked without additional libraries specified.

pthread_atfork is resolved when linking with -lpthread. I don't know
where __ftruncate is defined, and I do not really want to know
unless I need to use this function.

Possible solutions, sorted by expected amount of work necessary to
implement (please pick one you would prefer, or suggest yours):

- leave everything as is: the executable will not be linked until the
developer finds all the libraries to resolve all the missing symbols.
That's how it works (or doesn't work) now.

- tell ld only to warn about unresolved symbols: the executable will
be built anyway, but if an unresolved function is called from under
the hood, the most likely result is a segfault that is hard to trace
back to the function that caused it. Only a special linker option is
required.

- for each foreign function imported, generate a stub which, when
called, complains loudly and aborts the program. These stubs are
placed in a separate object file and linked after all possible
libraries, so library-defined symbols always take precedence. The
disadvantage is that the developer would not know which symbols
remained unresolved when the executable was linked; he would find out
only when the application crashes. I am trying to implement this now,
using hsc2hs' #def macro to define those stub functions.

- analyze the linker output and auto-correct the generated .hsc file
by excluding the FFI declarations for unresolved functions. Inelegant,
requires a second pass, and is highly platform- and tool-dependent.

- instead of generating a single object file referencing all the
functions, create one per function import and place them all in an
archive (.a). Then, when linking, only the functions actually used
will be picked from the archive. I am not sure how this would interact
with ghc --make: I got the impression that a .hi file is necessary for
every imported module (.hs/.hi/.o triple) unless it comes from a
package. If I have in my program:

import FOO_H

and there is neither FOO_H.hi nor FOO_H.hs, only libFOO_H.a, how
would ghc cope with this?

The last approach does not take much to implement in hsffig itself
(I already have a splitter which produces one object file per
structure/union). I am more concerned about how to make ghc cope with
it. Creating a package for each imported include file seems like
overkill.
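
To make the stub option above more concrete, here is a minimal sketch
in plain C of what one such stub might look like. It is not what
hsffig generates, and it does not use hsc2hs' #def: the names
HSFFIG_STUB, stub_message, stub_abort and hypothetical_missing_fn are
made up for illustration. The stub is given a void(void) signature on
the assumption that it is compiled in a file that does not include the
real header, and the sketch says nothing about how link order or
precedence over library definitions would be arranged.

```c
#include <stdio.h>
#include <stdlib.h>

/* Formats the diagnostic for a missing symbol into buf.
   Returns 0 on success, -1 if the message did not fit. */
int stub_message(char *buf, size_t size, const char *symbol)
{
    int n = snprintf(buf, size, "unresolved foreign symbol called: %s",
                     symbol);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Complains loudly on stderr, then aborts the program. */
void stub_abort(const char *symbol)
{
    char msg[256];
    stub_message(msg, sizeof msg, symbol); /* a truncated message is still printed */
    fprintf(stderr, "%s\n", msg);
    abort();
}

/* Hypothetical macro: defines a stub named `name` that reports its own
   name and aborts. The real prototype is deliberately not in scope here,
   so the void(void) signature does not conflict with it. */
#define HSFFIG_STUB(name)      \
    void name(void)            \
    {                          \
        stub_abort(#name);     \
    }

/* One stub per imported function; this name is illustrative only. */
HSFFIG_STUB(hypothetical_missing_fn)
```

A program that ends up calling hypothetical_missing_fn() through such
a stub would print the symbol name to stderr before aborting, so the
crash at least names the function that was missing.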

PS I believe that hsffig will be a useful utility. I noticed that
several people have downloaded nightly tarballs or checked out from
the DARCS repo. If they have found this program useful, would they
please suggest something on the matter I described.

-- 
Dimitry Golubovsky

Anywhere on the Web