GHC/FAQ

From HaskellWiki

Revision as of 16:39, 15 April 2006

Please feel free to add stuff here.

This page is rather long. We've started to add some sub-headings, but would welcome your help in making it better organised.

1 GHC on particular platforms

1.1 How do I port GHC to platform X?

There are two distinct possibilities: either

  • The hardware architecture for your system is already supported by GHC, but you're running an OS that isn't supported (or perhaps has been supported in the past, but currently isn't). This is the easiest type of porting job, but it still requires some careful bootstrapping.
  • Your system's hardware architecture isn't supported by GHC. This will be a more difficult port (though by comparison perhaps not as difficult as porting gcc).

Both ways require you to bootstrap from intermediate HC files: these are the stylised C files generated by GHC when it compiles Haskell source. Basically the idea is to take the HC files for GHC itself to the target machine and compile them with gcc to get a working GHC, and go from there.

The Building Guide has all the details on how to bootstrap GHC on a new platform.


1.2 GHC on Linux

1.2.1 I can't run GHCi on Linux, because it complains about a missing libreadline.so.3.

The "correct" fix for this problem is to install the correct RPM for the particular flavour of Linux on your machine. If this isn't an option, however, there is a hack that might work: make a symbolic link from libreadline.so.4 to libreadline.so.3 in /usr/lib. We tried this on a SuSE 7.1 box and it seemed to work, but YMMV.

1.3 Linking a program causes the following error on Linux: /usr/bin/ld: cannot open -lgmp: No such file or directory

The problem is that your system doesn't have the GMP library installed. If this is a RedHat distribution, install the RedHat-supplied gmp-devel package, and the gmp package if you don't already have it. There have been reports that installing the RedHat packages also works for SuSE (SuSE doesn't supply a shared gmp library).

1.4 GHC on Solaris

1.4.1 Solaris users may sometimes get link errors due to libraries needed by GNU Readline.

We suggest you try linking in some combination of the termcap, curses and ncurses libraries, by giving -ltermcap, -lcurses and -lncurses respectively. If you encounter this problem, we would appreciate feedback on it, since we don't fully understand what's going on here.

The build fails in readline.

It has been reported that if you have multiple versions of the readline library installed on Linux, then this may cause the build to fail. If you have multiple versions of readline, try uninstalling all except the most recent version.

1.5 GHC on Windows

1.5.1 My program that uses a really large heap crashes on Windows.

For utterly horrible reasons, programs that use more than 128MB of heap won't work when compiled dynamically on Windows (they should be fine statically compiled).

1.5.2 I can't use readline under GHCi on Windows

In order to load the readline package under GHCi on Windows, you need to make a version of the readline library that GHCi can load. Instructions for GHC 6.2.2 are here.

1.5.3 Ctrl-C doesn't work on Windows

When running GHC under a Cygwin shell on Windows, Ctrl-C sometimes doesn't work. The workaround is to use Ctrl-Break instead.

1.5.4 How do I link Haskell with C++ code compiled by Visual Studio?

1.5.4.1 Prerequisites

It is assumed that the reader is familiar with the Haskell Foreign function interface (FFI), and is able to compile Haskell programs with GHC and C++ programs with Visual Studio.

1.5.4.2 Background

GHC has two modes of code generation. It either compiles Haskell straight into object code (native mode), or translates Haskell into intermediate C code, and uses a C compiler as backend.

The Windows distribution of GHC comes bundled with the GCC compiler, which is used as backend. That's why linking Haskell with Visual C++ is no different from linking GCC-generated code with the code generated by Visual C++.

One cannot statically link together object files produced by those two compilers, but they can be linked dynamically: an executable produced by Visual C++ can invoke a DLL produced by GCC, and vice versa. Likewise, we can link Haskell with Visual C++ in one of these ways.

Note: if GHC ever becomes able to use Visual C++ as a backend, we would simply list all source files (Haskell and C++) on the GHC command line.

1.5.4.3 Invoking a Haskell DLL from a C++ executable

  1. Make a Haskell DLL as explained in [1]
  2. Make a module definition file, such as
    LIBRARY Adder
    EXPORTS
        adder
  3. Create an import library using Visual Studio's lib.exe:
    lib /DEF:adder.def /OUT:adder.lib
  4. Link the C++ program against the import library.

1.5.4.4 Invoking a C++ DLL from a Haskell executable

  1. Make a DLL project in Visual Studio. It will create .vcproj and .sln files for you. Add your C++ source files to this project.
  2. Create a .def file for your DLL. It might look like
    LIBRARY MyDLL
    EXPORTS
        function1
        function2
    where function1 and function2 are the names of the C++ functions that you want to invoke from Haskell (there can be more of them, of course), and MyDLL is the name of your DLL.
  3. Create an import library that can be used by ghc:
    dlltool -d MyDLL.def -l libMyDLL.a
  4. Link your Haskell project, adding the library:
    ghc --make main.hs -optl-lMyDLL -optl-L.
    mind the dot at the end of the command line!
    (-optl switch passes its argument as an option to the linker).
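
The Haskell side of step 4 might look like the sketch below. It is not part of the original instructions. It assumes that function1 takes and returns a C int and that the C++ source declares it extern "C", so that the DLL exports the unmangled name listed in the .def file. The name c_function1 is made up for illustration.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.Types (CInt (..))

-- Assumed C++ declaration (an assumption, not taken from the text above):
--   extern "C" int function1(int x);
-- The string "function1" must match the name exported in MyDLL.def.
foreign import ccall "function1"
  c_function1 :: CInt -> IO CInt

main :: IO ()
main = do
  -- Call into the DLL and print whatever it returns.
  result <- c_function1 21
  print result
```

This only links when libMyDLL.a from step 3 is on the linker path, and MyDLL.dll must be findable at run time.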

1.6 Why isn't GHC available for .NET?

It would make a lot of sense to give GHC a .NET back end, and it's a question that comes up regularly. The reason that we haven't done it here, at GHC HQ, is because it's a more substantial undertaking than might at first appear (see below). Furthermore, it'd permanently add a complete new back-end platform for us to maintain. Given our rather limited development effort, we have so far not bitten the bullet, and we have no immediate plans to do so.

It would be a good, well-defined project for someone else to tackle, and we would love to see it done. There is some good groundwork already done:

  • Sigbjorn Finne did a simple interop implementation that allows a Haskell program to be compiled to native code (as now) but to call .NET programs via a variant of the FFI. I don't think this work is in active use, and I'd be surprised if it worked out of the box, but it could probably be revived with modest effort.
  • Andre Santos and his colleagues at UFPE in Brazil are working on a .NET back end that generates CLR IL, though I don't know how far they have got.
  • Nigel Perry and Oliver Hunt have a Haskell.NET prototype that works by using GHC to compile to Core, and then compiling Core to .NET. I'm not sure what stage it is at.
  • GHC.Net would be extra attractive if there were a Visual Studio integration for GHC. Substantial progress on this was made in 2004 by Simon Marlow, Krasimir Angelov, and Andre Santos and colleagues.

There may be others that I don't know of. If anyone wants to join in this effort, do contact the above folk. And please keep us informed!

Here's a summary of why it's a non-trivial thing to do:

  • The first thing is to generate native CLR Intermediate Language (IL). That's not really hard. Requires thinking about representations for thunks and functions, and it may not be particularly efficient, but it can surely be done. An open question is about whether to generate verifiable IL or not. The trouble here is that Haskell's type system is more expressive than the CLR's in some ways, notably the use of higher-kinded type variables. So, to generate verifiable IL one is bound to need some run-time casts, and it's not clear how to minimise these.

At first blush this is *all* you need do. But it isn't!

  • Next, you need to think about how to inter-operate with .NET libraries. You don't really want to write "foreign import..." for each and every import. You'd like GHC to read the CLR meta-data directly. But there are lots of tricky issues here; see the paper that Mark Shields and I wrote about "Object-oriented style overloading for Haskell".
  • Now you need to figure out how to implement GHC's primitive operations:
      • the I/O monad
      • arbitrary precision arithmetic
      • concurrency
      • exceptions
      • finalisers
      • stable pointers
      • software transactional memory
    Not all of these are necessary, of course, but many are used in the libraries. The CLR supports many of them (e.g. concurrency) but with a very different cost model.
  • Last, you have to figure out what to do for the libraries. GHC has a pretty large library, and you either have to implement the primops on which the library is based (see previous point), or re-implement it. For example, GHC's implementation of I/O uses mutable state, concurrency, and more besides. For each module, you need to decide either to re-implement it using .NET primitives, or to implement the stuff the module is based on.

These challenges are mostly broad rather than deep. But to get a production quality implementation that runs a substantial majority of Haskell programs "out of the box" requires a decent stab at all of them.


2 Running GHC

2.1 GHC doesn't like filenames containing +.

Indeed not. You could change + to p or plus.

2.2 Why does linking take so long?

Linking a small program should take no more than a few seconds. Larger programs can take longer, but even linking GHC itself only takes 3-4 seconds on our development machines.

Long link times have been attributed to using Sun's linker on Solaris, as compared to GNU ld which appears to be much faster. So if you're on a Sun box, try switching to GNU ld. This article from the mailing list has more information.

2.3 Why do I get errors about missing include files when compiling with -O or -prof?

Certain options, such as -O, turn on via-C compilation, instead of using the native code generator. Include files named by -#include options or in foreign import declarations are only used in via-C compilation mode. See Section 8.2.2.1, "Finding Header files" for more details.

2.4 How do I compile my program for profiling without overwriting the object files and hi files I've already built?

You can select alternative suffixes for object files and interface files, so you can have several builds of the same code coexisting in the same directory. For example, to compile with profiling, you might do this:

    ghc --make -prof -o foo-prof -osuf p.o -hisuf p.hi Main

See Section 4.6.4, "Redirecting the compilation output(s)" for more details on the -osuf and -hisuf options.


3 Syntax

3.1 I can't get string gaps to work

If you're also using CPP, beware of the known pitfall with string gaps mentioned in Section 4.10.3.1, "CPP and string gaps".


4 GHCi

4.1 GHCi complains about missing symbols like CC_LIST when loading a previously compiled .o file.

This probably means the .o files in question were compiled for profiling (with -prof). Workaround: recompile them without profiling. We really ought to detect this situation and give a proper error message.

4.2 When I try to start ghci (probably one I compiled myself) it says ghc-5.02: not built for interactive use

To build a working ghci, you need to build GHC 5.02 with itself; the above message appears if you build it with 4.08.X, for example. It'll still work fine for batch-mode compilation, though. Note that you really must build with exactly the same version of the compiler. Building 5.02 with 5.00.2, for example, may or may not give a working interactive system; it probably won't, and certainly isn't supported. Note also that you can build 5.02 with any older compiler, back to 4.08.1, if you don't want a working interactive system; that's OK, and supported.

4.3 I get an error message from GHCi about a "duplicate definition for symbol __module_registered"

An error message like this:

    GHCi runtime linker: fatal error: I found a duplicate definition for symbol
           __module_registered
        whilst processing object file
           /usr/local/lib/ghc-6.2/HSfgl.o

probably indicates that when building a library for GHCi (HSfgl.o in the above example), you should use the -x option to ld.


5 The Foreign Function Interface

5.1 When do other Haskell threads get blocked by an FFI call?

                      safe    unsafe
    -threaded         NO      YES
    no -threaded      YES     YES

The -threaded flag (given when linking; see the manual) allows other Haskell threads to run concurrently with a thread making an FFI call. This nice behaviour does not happen for foreign calls marked as `unsafe` (see the FFI Addendum).

There used to be another modifier, threadsafe, which is now deprecated. Use `safe` instead.
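
As a rough illustration of the two modifiers, here is a sketch that assumes a POSIX system where sin (math.h) and sleep (unistd.h) are available. The Haskell-side names c_sin and c_sleep are chosen for this example only.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CDouble (..), CUInt (..))

-- A short call that returns quickly. 'unsafe' has lower call overhead,
-- but other Haskell threads are blocked while it runs, with or without
-- -threaded (the "unsafe" column of the table above).
foreign import ccall unsafe "math.h sin"
  c_sin :: CDouble -> CDouble

-- A call that may take a long time. With 'safe' and a program linked
-- with -threaded, other Haskell threads keep running during the call
-- (the "NO" entry of the table above).
foreign import ccall safe "unistd.h sleep"
  c_sleep :: CUInt -> IO CUInt
```

A reasonable rule of thumb is to reserve unsafe for calls known to be short and unable to block, and to leave everything else as safe.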

5.2 When I use a foreign function that takes or returns a float, it gives the wrong answer, or crashes.

You should use the -#include option to bring the correct prototype into scope (see Section 4.10.5, "Options affecting the C compiler (if applicable)").
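
A header can also be named in the foreign import declaration itself. The sketch below assumes the C library provides sinf in math.h; the names c_sinf and sinFloat are made up for the example. It declares the import with CFloat, so that the Haskell type matches the C prototype float sinf(float).

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CFloat (..))

-- The header named in the entity string supplies the C prototype;
-- CFloat corresponds to C's 'float', not 'double'.
foreign import ccall unsafe "math.h sinf"
  c_sinf :: CFloat -> CFloat

-- Convenience wrapper working on Haskell's own Float type.
sinFloat :: Float -> Float
sinFloat = realToFrac . c_sinf . realToFrac
```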


6 Input/Output

6.1 If I print out a string using putStr, and then attempt to read some input using hGetLine, I don't see the output from the putStr.

The stdout handle is line-buffered by default, which means that output sent to the handle is only flushed when a newline (\n) is output, the buffer is full, or hFlush is called on the Handle. The right way to make the text appear without sending a newline is to use hFlush:

      import System.IO
      main = do
        putStr "how are you today? "
        hFlush stdout
        input <- hGetLine stdin
        ...

You'll probably find that the behaviour differs when using GHCi: the hFlush isn't necessary to make the text appear. This is because in GHCi we turn off the buffering on stdout, because this is normally what you want in an interpreter: output appears as it is generated.
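
If a compiled program should behave the way GHCi does here, one option is to turn off buffering on stdout once at start-up instead of flushing after every prompt. This is a minimal sketch; the helper names disableStdoutBuffering and ask are invented for illustration.

```haskell
import System.IO

-- Hypothetical helper: make output to stdout appear as soon as it is
-- written, so prompts without a trailing newline need no explicit hFlush.
disableStdoutBuffering :: IO ()
disableStdoutBuffering = hSetBuffering stdout NoBuffering

-- Print a prompt and read the reply. This relies on
-- disableStdoutBuffering having been called beforehand.
ask :: String -> IO String
ask prompt = do
  putStr prompt
  hGetLine stdin
```

A program would call disableStdoutBuffering first thing in main and then use ask "how are you today? " wherever it needs input.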

6.2 If I explicitly set the buffering on a Handle to NoBuffering I'm not able to enter EOF by typing "Ctrl-D".

This is a consequence of Unixy terminal semantics. Unix does line buffering on terminals in the kernel as part of the terminal processing, unless you turn it off. However, the Ctrl-D processing is also part of the terminal processing which gets turned off when the kernel line buffering is disabled. So GHC tries its best to get NoBuffering semantics by turning off the kernel line buffering, but as a result you lose Ctrl-D. C'est la vie.

6.3 When I open a FIFO (named pipe) and try to read from it, I get EOF immediately.

This is a consequence of the fact that GHC opens the FIFO in non-blocking mode. The behaviour varies from OS to OS: on Linux and Solaris you can wait for a writer by doing an explicit threadWaitRead on the file descriptor (gotten from Posix.handleToFd) before the first read, but this doesn't work on FreeBSD (although rumour has it that recent versions of FreeBSD changed the behaviour to match other OSs). A workaround for all systems is to open the FIFO for writing yourself, before (or at the same time as) opening it for reading.

6.4 When I foreign import a function that returns char or short, I get garbage back.

This is a known bug in GHC versions prior to 5.02.2. GHC doesn't mask out the more significant bits of the result. It doesn't manifest with gcc 2.95, but apparently shows up with g++ and gcc 3.0.

7 Optimization issues

7.1 My program spent too much time doing garbage collection

Add the "+RTS -A10m" option to the command line when you run your program. This sets the allocation area size used by the garbage collector to 10M, which should sufficiently decrease GC times (the default is 256K; see the section "Running a compiled program" in the users' guide). You can also add to your program a C module containing the statement

char *ghc_rts_opts = "-A10m";

to force your program to use this setting on each run.


7.2 Does GHC do common subexpression elimination?

In general, GHC does not do CSE. It'd be a relatively easy pass for someone to add, but it can cause space leaks. And it can replace two strictly-evaluated calls with one lazy thunk:

    let { x = case e of ....;  y = case e of .... } in ...
  ==>
    let { v = e; x = case v of ...; y = case v of ... } in ...

Now v is allocated as a thunk. (Of course, that might be well worth it if e is an expensive expression.)

Instead GHC does "opportunistic CSE". If you have

    let x = e in .... let y = e in ....

then it'll discard the duplicate binding. This can still cause space leaks but it guarantees never to create a new thunk, and it turns out to be very useful in practice.

Bottom line: if you care about sharing, do it yourself using let or where.
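
To make the bottom line concrete, here is a small sketch of doing the sharing by hand. The function expensive is a hypothetical stand-in for a costly computation.

```haskell
-- Hypothetical stand-in for an expensive computation.
expensive :: Int -> Integer
expensive n = sum [1 .. fromIntegral n]

-- The same expression is written twice. Whether the work is shared is
-- left to the compiler and should not be relied upon.
unshared :: Int -> Integer
unshared n = expensive n + expensive n

-- The result is bound once with 'where' and used twice, so the
-- sharing is explicit in the source.
shared :: Int -> Integer
shared n = v + v
  where
    v = expensive n
```

Both functions return the same value; only shared guarantees that expensive n is evaluated at most once per call.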


8 Other frequently asked questions

8.1 Do I have to recompile all my code if I upgrade GHC?

Yes. There are two reasons for this:

  • GHC does a lot of cross-module optimisation, so compiled code will include parts of the libraries it was compiled against (including the Prelude), so will be deeply tied to the actual version of those libraries it was compiled against. When you upgrade GHC, the libraries may change; even if the external interface of the libraries doesn't change, sometimes internal details may change because GHC optimised the code in the library differently.
  • We sometimes change the ABI (application binary interface) between versions of GHC. Code compiled with one version of GHC is not necessarily compatible with code compiled by a different version, even if you arrange to keep the same libraries.

8.2 Why doesn't GHC use shared libraries?

GHC does provide shared libraries, currently only on MacOS X. We are working on making shared libraries work on other platforms.

However, GHC-compiled libraries are very tightly coupled, which means it's unlikely you'd be able to swap out a shared library for a newer version unless it was compiled with exactly the same compiler and set of libraries as the old version.

8.3 My program is failing with head [], or an array bounds error, or some other random error, and I have no idea how to find the bug. Can you help?

Compile your program with -prof -auto-all (make sure you have the profiling libraries installed), and run it with +RTS -xc -RTS to get a "stack trace" at the point at which the exception was raised. See Section 4.14.4, "RTS options for hackers, debuggers, and over-interested souls" for more details.

8.4 How do I increase the heap size permanently for a given binary?

See Section 4.14.5, "'Hooks' to change RTS behaviour".

8.5 I'm trying to compile my program for parallel execution with the -parallel option, and GHC complains with an error like "failed to load interface file for Prelude".

GHC doesn't ship with support for parallel execution; that support is provided separately by the GPH project.

8.6 When is it safe to use unsafePerformIO?

We'll give two answers to this question, each of which may be helpful. These criteria are not rigorous in any real sense (you'd need a formal semantics for Haskell in order to give a proper answer to this question), but should give you a feel for the kind of things you can and cannot do with unsafePerformIO.

  • It is safe to implement a function or API using unsafePerformIO if you could imagine also implementing the same function or API in Haskell without using unsafePerformIO (forget about efficiency, just consider the semantics).
  • In pure Haskell, the value of a function depends only on the values of its arguments (and free variables, if it has any). If you can implement the function using unsafePerformIO and still retain this invariant, then you're probably using unsafePerformIO in a safe way. Note that you need only consider the observable values of the arguments and result.

For more information, see this thread.
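
As an illustration of both criteria, the following sketch (the name sumViaIORef is made up) uses a mutable reference internally, yet it could equally be written as plain sum, and its result depends only on its argument.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Sums a list using a local mutable accumulator. The IORef is created,
-- used and discarded inside the call, so no effect is observable from
-- outside: the function behaves like the pure 'sum'.
sumViaIORef :: [Int] -> Int
sumViaIORef xs = unsafePerformIO $ do
  acc <- newIORef 0
  mapM_ (\x -> modifyIORef' acc (+ x)) xs
  readIORef acc
```

By contrast, a function that read a file or a global counter inside unsafePerformIO would fail the second criterion, since its value could change between calls with the same arguments.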

8.7 I can't get finalizers to work properly. My program sometimes just prints <<loop>>.

Chances are that your program is trying to write a message to stdout or stderr in the finalizer. Handles have finalizers themselves, and since finalizers don't keep other finalized values alive, the stdout and stderr Handles may be finalized before your finalizer runs. If this happens, your finalizer will block on the handle, and probably end up receiving a NonTermination exception (which is printed as <<loop>>).

8.8 Does GHC implement any kind of extensible records?

No, extensible records are not implemented in GHC. Hugs implements TRex, one extensible record variant. The problem is that the record design space is large, and seems to lack local optima. And all reasonable variants break backward compatibility. As a result, nothing much happens.