Best practices for foreign imports

bgamari - 2021-07-12

tl;dr: When importing system libraries we strongly recommend that users use GHC’s capi calling convention. For details, see the recommendations section below.

One of Haskell’s strengths is its great foreign function interface: using time-tested foreign libraries or raw system calls is just a foreign import away. However, while syntactically simple, safely using foreign functions can be quite tricky. A few weeks ago we saw one facet of this problem in the keepAlive# post. This week we will look at another complexity which has recently caused us trouble: calling conventions.

Why this matters

With the increasing prevalence of ARM hardware following Apple’s recent releases, many latent bugs due to calling convention details are becoming more visible.

For instance, in #20079 it was noticed that GHCi crashes on AArch64/Darwin when the terminal window is resized. We eventually found that this was due to a bug in haskeline: ioctl, a variadic function, was imported using GHC’s ccall calling convention. The fix is straightforward: use the capi pseudo-calling convention introduced in GHC 7.6.1.

It turns out that incorrect ioctl imports are a rather common pattern among Hackage packages. Consequently, we thought it would be helpful to offer some explicit guidance for users.

Background: Foreign calling conventions

During a function call both the caller and the callee must agree on several operational details:

  • when the function is called:
    • which arguments can be passed in registers?
    • in what order are the remaining arguments pushed to the stack?
    • how are variadic functions handled?
    • must the stack be aligned?
    • where is the return address found?
  • when the function returns:
    • who is responsible for popping the arguments from the stack?
    • where are the return value(s) stored?

Together, these details are known as a calling convention and are typically implied by the operating system and target architecture. For instance, x86-64 Linux (and most other POSIX platforms) typically uses the System V amd64 calling convention whereas 32-bit Windows has no fewer than three commonly-used conventions.

When compiling C source, the C compiler determines a function’s calling convention using its signature, which typically appears in a header file. However, when GHC imports a function with the usual ccall calling convention, e.g.:

foreign import ccall "hello_world" helloWorld :: IO ()

it does not have the benefit of a signature; instead it must infer the calling convention from the type given by the import. This can break in two ways:

  • many calling conventions treat variadic functions (e.g. printf) differently from the corresponding non-variadic signature; while it is documented that ccall does not support variadic functions, this fact is not well-known by users.
  • the type provided by the user may be wrong (e.g. using Int instead of CInt)

Unfortunately, with the foreign import ccall mechanism the compiler has no way of catching such issues, potentially leaving the user with difficult-to-spot, platform-dependent soundness bugs.
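
As a minimal sketch of the second failure mode, consider importing the C standard library's abs, whose prototype is int abs(int). The names c_abs and c_absWrong are hypothetical and chosen only for illustration. The commented-out import is the kind of declaration GHC would accept without complaint even though Int is wider than C's int on most 64-bit platforms.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Foreign.C.Types (CInt (..))

-- Matches the C prototype `int abs(int)`: CInt is the Haskell view of C's int.
foreign import ccall unsafe "abs" c_abs :: CInt -> CInt

-- GHC would accept the following import just as readily, because with ccall
-- it never sees the C prototype. Int is typically 64 bits wide where C's int
-- is 32, so the behaviour of such an import is platform-dependent.
-- It is left commented out on purpose:
--
-- foreign import ccall unsafe "abs" c_absWrong :: Int -> Int

main :: IO ()
main = print (c_abs (-5))
```

Nothing in the compiler output distinguishes the two declarations; only the programmer's reading of the C header does.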

Safe foreign calls via CApiFFI

To help mitigate this class of bugs, GHC 7.10 introduced a new language extension, CApiFFI, which offers a more robust way to import foreign functions. Unlike ccall, capi requires that the user specify both the foreign function’s name and the name of the header file where its signature can be found. For instance, one can write:

foreign import capi "stdio.h puts" c_puts :: Ptr CChar -> IO CInt

To compile this, GHC will construct a C source file which #includes stdio.h and defines a stub function which performs the call:

#include "stdio.h"
HsInt32 ghczuwrapperZC0ZCmainZCHelloZCputs(void* a1) {
    return puts(a1);
}
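
For reference, a complete program built around the import above might look like the following sketch. It assumes only base; the message string and the error handling in main are illustrative choices, and CString is simply base's synonym for Ptr CChar.

```haskell
{-# LANGUAGE CApiFFI #-}
module Main (main) where

import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CInt (..))

-- Same import as in the text; CString is a type synonym for Ptr CChar.
foreign import capi "stdio.h puts" c_puts :: CString -> IO CInt

main :: IO ()
main = do
  -- withCString marshals the Haskell String into a temporary
  -- NUL-terminated buffer that is valid only inside the callback.
  rc <- withCString "hello from capi" c_puts
  -- puts returns a negative value (EOF) on failure.
  if rc < 0
    then ioError (userError "puts failed")
    else pure ()
```

Note that the CInt constructor must be in scope (hence the CInt (..) import) for GHC to accept the type in a foreign declaration.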

This approach brings a few advantages:

  • capi imports can be used to import functions defined using CPP
  • the calling convention is decided by the C compiler using the signature provided in the indicated header file, eliminating the potential for inconsistency
  • variadic functions “just work”
  • it removes the need to worry about which of Windows’ zoo of supported conventions is used (see #12890, #3052)
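
The first advantage can be illustrated with a short sketch. C99 specifies isnan as a macro rather than a function, so an import that relies on a linker symbol of that name is not portable, whereas capi goes through the header. The example also uses capi's value import form to read the INT_MAX macro. The Haskell-side names c_isnan and c_intMax are hypothetical.

```haskell
{-# LANGUAGE CApiFFI #-}
module Main (main) where

import Foreign.C.Types (CDouble (..), CInt (..))

-- isnan is a macro in C99; the generated C stub expands it like any
-- other C code would after including math.h.
foreign import capi unsafe "math.h isnan" c_isnan :: CDouble -> CInt

-- A `value` import evaluates a C expression (here a CPP constant)
-- instead of calling a function.
foreign import capi "limits.h value INT_MAX" c_intMax :: CInt

main :: IO ()
main = do
  print (c_isnan (0 / 0) /= 0)   -- is NaN classified as NaN?
  print (c_isnan 1.5 /= 0)       -- is an ordinary number classified as NaN?
  print (c_intMax == maxBound)   -- does INT_MAX agree with CInt's upper bound?
```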

Recommendations for users

As a rule, the easiest code to debug is the code that you don’t need to write. Consequently, users are encouraged to use existing bindings libraries (e.g. unix) instead of defining their own foreign imports when possible.

Of course, not all libraries have bindings available. In these cases we recommend that users use foreign import capi for imports of libraries not under their control (e.g. system libraries).
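
As an illustration of this recommendation, here is a sketch of a capi import of the variadic ioctl function, used to query the terminal size. It is not the actual haskeline patch. It assumes a POSIX system where sys/ioctl.h provides TIOCGWINSZ and where struct winsize consists of four unsigned short fields (rows, columns, and two pixel sizes); queryWindowSize is a hypothetical helper, and a real binding would more likely obtain the struct layout with a tool such as hsc2hs.

```haskell
{-# LANGUAGE CApiFFI #-}
module Main (main) where

import Foreign.C.Types (CInt (..), CULong (..), CUShort)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekElemOff)

-- ioctl is variadic in C; the capi stub lets the C compiler apply the
-- variadic calling convention using the prototype from the header.
foreign import capi "sys/ioctl.h ioctl"
  c_ioctl :: CInt -> CULong -> Ptr CUShort -> IO CInt

-- The request code is a CPP constant, so it is read with a value import.
foreign import capi "sys/ioctl.h value TIOCGWINSZ"
  c_TIOCGWINSZ :: CULong

-- Hypothetical helper: returns (rows, columns) for the given file
-- descriptor, or Nothing if the ioctl call fails (e.g. not a terminal).
queryWindowSize :: CInt -> IO (Maybe (Int, Int))
queryWindowSize fd =
  -- Assumption: struct winsize is four unsigned shorts, i.e. 8 bytes.
  allocaBytes 8 $ \ws -> do
    rc <- c_ioctl fd c_TIOCGWINSZ ws
    if rc /= 0
      then pure Nothing
      else do
        rows <- peekElemOff ws 0
        cols <- peekElemOff ws 1
        pure (Just (fromIntegral rows, fromIntegral cols))

main :: IO ()
main = queryWindowSize 1 >>= print
```

When standard output is not attached to a terminal (for example, when piped to a file), the call fails and the sketch reports Nothing rather than a size.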

Note, however, that capi does incur a small (arguably negligible) runtime cost due to the C stub. It is justifiable to use ccall to avoid this runtime cost in cases where the foreign function is shipped with a package’s cbits, where the calling convention is clear.