
## 2. Using GHC

GHC is a command-line compiler: in order to compile a Haskell program, GHC must be invoked on the source file(s) by typing a command to the shell. The steps involved in compiling a program can be automated using the make tool (this is especially useful if the program consists of multiple source files which depend on each other). This section describes how to use GHC from the command-line.

## 2.1 Overall command-line structure

An invocation of GHC takes the following form:

 ghc [argument...] 

Command-line arguments are either options or file names.

Command-line options begin with -. They may not be grouped: -vO is different from -v -O. Options need not precede filenames: e.g., ghc *.o -o foo. All options are processed and then applied to all files; you cannot, for example, invoke ghc -c -O1 Foo.hs -O2 Bar.hs to apply different optimisation levels to the files Foo.hs and Bar.hs. For conflicting options, e.g., -c -S, we reserve the right to do anything we want. (Usually, the last one applies.)

## 2.2 Meaningful file suffixes

File names with "meaningful" suffixes (e.g., .lhs or .o) cause the "right thing" to happen to those files.

.lhs:

A "literate Haskell" module.

.hs:

An ordinary (non-literate) Haskell module.

.hi:

A Haskell interface file, probably compiler-generated.

.hc:

Intermediate C file produced by the Haskell compiler.

.c:

A C file not produced by the Haskell compiler.

.s:

An assembly-language source file, usually produced by the compiler.

.o:

An object file, produced by an assembler.

Files with other suffixes (or without suffixes) are passed straight to the linker.

## 2.3 Help and verbosity options

A good option to start with is the -help (or -?) option. GHC spews a long message to standard output and then exits.

The -v option makes GHC verbose: it reports its version number and shows (on stderr) exactly how it invokes each phase of the compilation system. Moreover, it passes the -v flag to most phases; each reports its version number (and possibly some other information).

Please, oh please, use the -v option when reporting bugs! Knowing that you ran the right bits in the right order is always the first thing we want to verify.

If you're just interested in the compiler version number, the --version option prints out a one-line string containing the requested info.

## 2.4 Running the right phases in the right order

The basic task of the ghc driver is to run each input file through the right phases (parsing, linking, etc.).

The first phase to run is determined by the input-file suffix, and the last phase is determined by a flag. If no relevant flag is present, then go all the way through linking. This table summarises:

| Phase of the compilation system | Suffix saying "start here" | Flag saying "stop after" | (Suffix of) output file |
|---|---|---|---|
| literate pre-processor | .lhs | - | - |
| C pre-processor (opt.) | - | - | - |
| Haskell compiler | .hs | -C, -S | .hc, .s |
| C compiler (opt.) | .hc or .c | -S | .s |
| assembler | .s | -c | .o |
| linker | other | - | a.out |

Thus, a common invocation would be: ghc -c Foo.hs

Note: What the Haskell compiler proper produces depends on whether a native-code generator is used (producing assembly language) or not (producing C).
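The table can be read as two small lookups: one from the input suffix to the first phase, one from the stop flag to the kind of output. Here is a minimal sketch of that reading, not the driver's actual code. The names Phase, startPhase and outputKind are made up for illustration, and the "last stop flag wins" rule is an assumption based on the remark that the last of several conflicting options usually applies.

```haskell
module Main where

import Data.List (isSuffixOf)

-- Phases from the table, in pipeline order (constructor names are made up).
data Phase = Unlit | Cpp | HsCompiler | CCompiler | Assembler | Linker
  deriving (Show, Eq, Ord)

-- First phase to run, chosen from the input file's suffix.
startPhase :: FilePath -> Phase
startPhase file
  | ".lhs" `isSuffixOf` file = Unlit
  | ".hs"  `isSuffixOf` file = HsCompiler
  | ".hc"  `isSuffixOf` file = CCompiler
  | ".c"   `isSuffixOf` file = CCompiler
  | ".s"   `isSuffixOf` file = Assembler
  | otherwise                = Linker      -- "other" goes to the linker

-- Kind of output produced, given the stop flag that takes effect.
-- Assumption: when several stop flags are given, the last one applies.
outputKind :: [String] -> String
outputKind args =
  case filter (`elem` ["-C", "-S", "-c"]) args of
    [] -> "a.out"                          -- no stop flag: go through linking
    fs -> case last fs of
      "-C" -> ".hc"
      "-S" -> ".s"
      _    -> ".o"

main :: IO ()
main = do
  print (startPhase "Foo.lhs")
  print (startPhase "Foo.o")
  putStrLn (outputKind ["-c", "Foo.hs"])
```

So ghc -c Foo.hs starts at the Haskell compiler and stops once a .o file exists.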

The option -cpp must be given for the C pre-processor phase to be run, that is, the pre-processor will be run over your Haskell source file before continuing.

The option -E runs just the pre-processing passes of the compiler, outputting the result on stdout before stopping. If used in conjunction with -cpp, the output is the code blocks of the original (literate) source after having put it through the grinder that is the C pre-processor. Sans -cpp, the output is the de-litted version of the original source.

The option -optcpp-E runs just the pre-processing stage of the C-compiling phase, sending the result to stdout. (For debugging or obfuscation contests, usually.)

## 2.5 Re-directing the compilation output(s)

GHC's compiled output normally goes into a .hc, .o, etc., file, depending on the last-run compilation phase. The option -o foo re-directs the output of that last-run phase to file foo.

Note: this "feature" can be counterintuitive: ghc -C -o foo.o foo.hs will put the intermediate C code in the file foo.o, name notwithstanding!

EXOTICA: But the -o option isn't of much use if you have several input files... Non-interface output files are normally put in the same directory as their corresponding input file. You may specify that they be put in another directory using the -odir <dir> option (the "Oh, dear" option). For example:

 % ghc -c parse/Foo.hs parse/Bar.hs gurgle/Bumble.hs -odir `arch` 

The output files, Foo.o, Bar.o, and Bumble.o, would be put into a subdirectory named after the architecture of the executing machine (sun4, mips, etc.), since the shell replaces `arch` with the output of the arch command. The directory must already exist; it won't be created.

Note that the -odir option does not affect where the interface files are put. In the above example, they would still be put in parse/Foo.hi, parse/Bar.hi, and gurgle/Bumble.hi.

MORE EXOTICA: The -osuf <suffix> option will change the .o file suffix for object files to whatever you specify. (We use this in compiling the prelude.) Similarly, the -hisuf <suffix> option will change the .hi file suffix for non-system interface files (see Section Other options related to interface files).

The -hisuf/-osuf game is useful if you want to compile a program with both GHC and HBC (say) in the same directory. Let HBC use the standard .hi/.o suffixes; add -hisuf g_hi -osuf g_o to your make rule for GHC compiling...
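To make the naming rules concrete, here is a rough sketch of where the object and interface files land for a given source file. It is a model of the description above, not GHC's own code. The function names are hypothetical, suffixes are passed without the leading dot (as in -osuf g_o), and the interface file is assumed to share the source file's base name.

```haskell
module Main where

import Data.Maybe (fromMaybe)

-- Directory part of a path, or "." if there is none.
dirName :: FilePath -> FilePath
dirName p = case dropWhile (/= '/') (reverse p) of
  []         -> "."
  (_ : rest) -> reverse rest

-- File name without its directory.
baseName :: FilePath -> String
baseName = reverse . takeWhile (/= '/') . reverse

-- Strip the last suffix (".hs", ".lhs", ...), if any.
dropSuffix :: String -> String
dropSuffix name = case break (== '.') (reverse name) of
  (_, _ : rest) -> reverse rest
  (_, [])       -> name

-- Object file: -odir (if given) replaces the directory, -osuf the suffix.
objectPath :: Maybe FilePath -> String -> FilePath -> FilePath
objectPath odir osuf src =
  fromMaybe (dirName src) odir ++ "/" ++ dropSuffix (baseName src) ++ "." ++ osuf

-- Interface file: -odir has no effect, only -hisuf changes the suffix.
interfacePath :: String -> FilePath -> FilePath
interfacePath hisuf src =
  dirName src ++ "/" ++ dropSuffix (baseName src) ++ "." ++ hisuf

main :: IO ()
main = do
  putStrLn (objectPath (Just "sun4") "o" "parse/Foo.hs")
  putStrLn (interfacePath "hi" "parse/Foo.hs")
  putStrLn (objectPath Nothing "g_o" "gurgle/Bumble.hs")
```

Here "sun4" stands in for whatever directory -odir names; the interface file stays next to the source regardless.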

FURTHER EXOTICA: If you are doing a normal .hs-to-.o compilation but would like to hang onto the intermediate .hc C file, just throw in a -keep-hc-file-too option. If you would like to look at the assembler output, toss in a -keep-s-file-too, too.

### Saving GHC's standard error output

Sometimes, you may cause GHC to be rather chatty on standard error; with -dshow-rn-trace, for example. You can instruct GHC to append this output to a particular log file with a -odump <blah> option.

### Redirecting temporary files

If you have trouble because of running out of space in /tmp (or wherever your installation thinks temporary files should go), you may use the -tmpdir <dir> option to specify an alternate directory. For example, -tmpdir . says to put temporary files in the current working directory.

Alternatively, use your TMPDIR environment variable. Set it to the name of the directory where temporary files should be put. GCC and other programs will honour the TMPDIR variable as well.

Even better idea: Set the TMPDIR variable when building GHC, and never worry about TMPDIR again. (see the build documentation).

## 2.6 Warnings and sanity-checking

GHC has a number of options that select which types of non-fatal error messages, otherwise known as warnings, can be generated during compilation. By default, you get a standard set of warnings which are generally likely to indicate bugs in your program. These are: -fwarn-overlapping-patterns, -fwarn-duplicate-exports, and -fwarn-missing-methods. The following flags are simple ways to select standard "packages" of warnings:

-Wnot:

Turns off all warnings, including the standard ones.

-w:

Synonym for -Wnot.

-W:

Provides the standard warnings plus -fwarn-incomplete-patterns, -fwarn-unused-imports and -fwarn-unused-binds.

-Wall:

Turns on all warning options.

The full set of warning options is described below. To turn off any warning, simply give the corresponding -fno-warn-... option on the command line.

-fwarn-name-shadowing:

This option causes a warning to be emitted whenever an inner-scope value has the same name as an outer-scope value, i.e. the inner value shadows the outer one. This can catch typographical errors that turn into hard-to-find bugs, e.g., in the inadvertent cyclic definition let x = ... x ... in.

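Consequently, with this option on, any let that rebinds an outer name is flagged, including deliberately recursive definitions. Being a warning, it does not stop compilation.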

-fwarn-hi-shadowing:

Warns you about shadowing of interface files along the supplied import path. For instance, assuming you invoke ghc with the import path -iutils:src and Utils.hi exists in both the utils and src directories, -fwarn-hi-shadowing will warn you that utils/Utils.hi shadows src/Utils.hi.

-fwarn-overlapping-patterns:

The compiler can warn you if a set of patterns is either incomplete (i.e., you're only matching on a subset of an algebraic data type's constructors), or overlapping, i.e.,

    f :: String -> Int
    f [] = 0
    f (_:xs) = 1
    f "2" = 2

    g [] = 2

where the last pattern match in f won't ever be reached, as the second pattern overlaps it. More often than not, a redundant pattern is a programmer mistake, so this option is enabled by default.

-fwarn-incomplete-patterns:

Similarly for incomplete patterns, the function g will fail when applied to non-empty lists, so the compiler will emit a warning about this when this option is enabled.

-fwarn-missing-methods:

This option is on by default, and warns you whenever an instance declaration is missing one or more methods, and the corresponding class declaration has no default declaration for them.

-fwarn-unused-imports:

Report any objects that are explicitly imported but never used.

-fwarn-unused-binds:

Report any function definitions (and local bindings) which are unused. For top-level functions, the warning is only given if the binding is not exported.

-fwarn-unused-matches:

Report all unused variables which arise from pattern matches, including patterns consisting of a single variable. For instance f x y = [] would report x and y as unused. To eliminate the warning, all unused variables can be replaced with wildcards.

-fwarn-duplicate-exports:

Have the compiler warn about duplicate entries in export lists. This is useful information if you maintain large export lists, and want to avoid the continued export of a definition after you've deleted (one) mention of it in the export list.

This option is on by default.
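A few of the warnings above are easier to see in code. The module below is a small sketch with one case each for -fwarn-name-shadowing, -fwarn-missing-methods and -fwarn-unused-matches. All the names in it (increment, Shape, Blob, constEmpty, and so on) are invented for the example. It compiles as it stands, since warnings are non-fatal.

```haskell
module Main where

-- -fwarn-name-shadowing ------------------------------------------------

-- Intended version: a fresh name for the incremented value.
increment :: Int -> Int
increment x =
  let y = x + 1        -- x is the function argument
  in y

-- Inadvertent cycle: the let-bound x shadows the argument, and because
-- let is recursive the x on the right-hand side is the let-bound x itself.
-- Evaluating (incrementBad 1) never yields a number, so main avoids it.
incrementBad :: Int -> Int
incrementBad x =
  let x = x + 1
  in x

-- -fwarn-missing-methods -----------------------------------------------

class Shape a where
  area  :: a -> Double      -- no default declaration
  label :: a -> String
  label _ = "shape"         -- default declaration

newtype Square = Square Double

instance Shape Square where
  area (Square s) = s * s   -- label omitted, but it has a default: no warning

data Blob = Blob

instance Shape Blob where
  label _ = "blob"          -- area omitted and has no default: warning

-- -fwarn-unused-matches ------------------------------------------------

-- x and y are bound but never used, so both would be reported.
constEmpty :: a -> b -> [c]
constEmpty x y = []

-- Wildcards bind nothing, so there is nothing to report.
constEmpty' :: a -> b -> [c]
constEmpty' _ _ = []

main :: IO ()
main = do
  print (increment 1)
  print (area (Square 2))
  putStrLn (label Blob)
  print (null (constEmpty 'a' True :: [Int]), null (constEmpty' 'a' True :: [Int]))
```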

If you would like GHC to check that every top-level value has a type signature, use the -fsignatures-required option.

If you're feeling really paranoid, the -dcore-lint option is a good choice. It turns on heavyweight intra-pass sanity-checking within GHC. (It checks GHC's sanity, not yours.)

## 2.7 Separate compilation

This section describes how GHC supports separate compilation.

### Interface files

When GHC compiles a source file F which contains a module A, say, it generates an object F.o, and a companion interface file A.hi.

NOTE: Having the name of the interface file follow the module name and not the file name means that working with tools such as make(1) becomes harder. make implicitly assumes that any output files produced by processing a translation unit will have file names that can be derived from the file name of the translation unit. For instance, pattern rules become unusable. For this reason, we recommend you stick to using the same file name as the module name.

The interface file for A contains information needed by the compiler when it compiles any module B that imports A, whether directly or indirectly. When compiling B, GHC will read A.hi to find the details that it needs to know about things defined in A.

Furthermore, when compiling module C which imports B, GHC may decide that it needs to know something about A --- for example, B might export a function that involves a type defined in A. In this case, GHC will go and read A.hi even though C does not explicitly import A at all.

The interface file may contain all sorts of things that aren't explicitly exported from A by the programmer. For example, even though a data type is exported abstractly, A.hi will contain the full data type definition. For small function definitions, A.hi will contain the complete definition of the function. For bigger functions, A.hi will contain strictness information about the function. And so on. GHC puts much more information into .hi files when optimisation is turned on with the -O flag. Without -O it puts in just the minimum; with -O it lobs in a whole pile of stuff.

A.hi should really be thought of as a compiler-readable version of A.o. If you use a .hi file that wasn't generated by the same compilation run that generates the .o file the compiler may assume all sorts of incorrect things about A, resulting in core dumps and other unpleasant happenings.

### Finding interface files

In your program, you import a module Foo by saying import Foo. GHC goes looking for an interface file, Foo.hi. It has a builtin list of directories (notably including .) where it looks.

-i<dirs>

This flag prepends a colon-separated list of dirs to the "import directories" list.

-i

resets the "import directories" list back to nothing.

-fno-implicit-prelude

GHC normally imports Prelude.hi files for you. If you'd rather it didn't, then give it a -fno-implicit-prelude option. You are unlikely to get very far without a Prelude, but, hey, it's a free country.

-syslib <lib>

If you are using a system-supplied non-Prelude library (e.g., the POSIX library), just use a -syslib posix option (for example). The right interface files should then be available. Section The GHC Prelude and Libraries lists the libraries available by this mechanism.

-I<dir>

Once a Haskell module has been compiled to C (.hc file), you may wish to specify where GHC tells the C compiler to look for .h files. (Or, if you are using the -cpp option, where it tells the C pre-processor to look...) For this purpose, use a -I option in the usual C-ish way.
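The effect of the -i<dirs> and -i flags on the "import directories" list can be sketched as a fold over the flags. This is an illustration only: the function names are hypothetical, the built-in list is taken to be just ["."], and flags are assumed to be processed left to right.

```haskell
module Main where

import Data.List (isPrefixOf)

-- Split "utils:src" into ["utils", "src"].
splitOnColon :: String -> [String]
splitOnColon s = case break (== ':') s of
  (d, [])       -> [d]
  (d, _ : rest) -> d : splitOnColon rest

-- Update the import directories list for one command-line flag.
applyFlag :: [FilePath] -> String -> [FilePath]
applyFlag dirs flag
  | flag == "-i"           = []                                  -- reset to nothing
  | "-i" `isPrefixOf` flag = splitOnColon (drop 2 flag) ++ dirs  -- prepend
  | otherwise              = dirs                                -- not an -i flag

-- Import directories after all flags, starting from a built-in list.
importDirs :: [FilePath] -> [String] -> [FilePath]
importDirs = foldl applyFlag

main :: IO ()
main = do
  print (importDirs ["."] ["-iutils:src"])
  print (importDirs ["."] ["-i", "-isrc"])
```

With -iutils:src the utils directory comes before src, which matches the -fwarn-hi-shadowing example where utils/Utils.hi shadows src/Utils.hi.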

### Other options related to interface files

The interface output may be directed to another file bar2/Wurble.iface with the option -ohi bar2/Wurble.iface (not recommended).

To avoid generating an interface file at all, use a -nohi option.

The compiler does not overwrite an existing .hi interface file if the new one is byte-for-byte the same as the old one; this is friendly to make. When an interface does change, it is often enlightening to be informed. The -hi-diffs option will make ghc run diff on the old and new .hi files. You can also record the difference in the interface file itself; the -keep-hi-diffs option takes care of that.

The .hi files from GHC contain "usage" information which changes often and uninterestingly. If you really want to see these changes reported, you need to use the -hi-diffs-with-usages option.

Interface files are normally jammed full of compiler-produced pragmas, which record arities, strictness info, etc. If you think these pragmas are messing you up (or you are doing some kind of weird experiment), you can tell GHC to ignore them with the -fignore-interface-pragmas option.

When compiling without optimisations on, the compiler is extra-careful about not slurping in data constructors and instance declarations that it will not need. If you believe it is getting it wrong and not importing stuff which you think it should, this optimisation can be turned off with -fno-prune-tydecls and -fno-prune-instdecls.

### The recompilation checker

In the olden days, GHC compared the newly-generated .hi file with the previous version; if they were identical, it left the old one alone and didn't change its modification date. In consequence, importers of a module with an unchanged output .hi file were not recompiled.

This doesn't work any more. In our earlier example, module C does not import module A directly, yet changes to A.hi should force a recompilation of C. And some changes to A (changing the definition of a function that appears in an inlining of a function exported by B, say) may conceivably not change B.hi one jot. So now...

GHC keeps a version number on each interface file, and on each type signature within the interface file. It also keeps in every interface file a list of the version numbers of everything it used when it last compiled the file. If the source file's modification date is earlier than the .o file's date (i.e. the source hasn't changed since the file was last compiled), and you give GHC the -recomp flag, then GHC will be clever. It compares the version numbers on the things it needs this time with the version numbers on the things it needed last time (gleaned from the interface file of the module being compiled); if they are all the same it stops compiling rather early in the process saying "Compilation IS NOT required". What a beautiful sight!

It's still an experimental feature (that's why -recomp is off by default), so tell us if you think it doesn't work.
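The decision the checker makes can be modelled in a few lines. The sketch below is a simplification of the description above, not GHC's implementation: entity names are plain strings, versions are Ints, and needsRecompile is a hypothetical name. The usage list stands for what the module's own interface file recorded last time; the current list stands for what the imported interfaces say now.

```haskell
module Main where

type Name    = String
type Version = Int

-- Decide whether a module must be recompiled.
--   sourceChanged : the source is newer than the .o file
--   usedLastTime  : (entity, version) pairs recorded at the last compilation
--   current       : versions of those entities in the interfaces as they are now
needsRecompile :: Bool -> [(Name, Version)] -> [(Name, Version)] -> Bool
needsRecompile sourceChanged usedLastTime current
  | sourceChanged = True                     -- no cleverness possible
  | otherwise     = any changed usedLastTime
  where
    -- An entity counts as changed if it vanished or its version differs.
    changed (name, version) = lookup name current /= Just version

main :: IO ()
main = do
  let usedLastTime = [("A.f", 3), ("B.g", 1)] :: [(Name, Version)]
  print (needsRecompile False usedLastTime [("A.f", 3), ("B.g", 1)])
  print (needsRecompile False usedLastTime [("A.f", 4), ("B.g", 1)])
  print (needsRecompile True  usedLastTime [("A.f", 3), ("B.g", 1)])
```

Note that a change to A is caught even when the module only imports B, because the usage list names the entities actually used rather than the modules imported.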

Patrick Sansom has a workshop paper about how all this is done. Ask him (email: sansom@dcs.gla.ac.uk) if you want a copy.

### Using make

It is reasonably straightforward to set up a Makefile to use with GHC, assuming you name your source files the same as your modules. Thus:

    HC      = ghc
    HC_OPTS = -cpp $(EXTRA_HC_OPTS)

    SRCS = Main.lhs Foo.lhs Bar.lhs
    OBJS = Main.o   Foo.o   Bar.o

    # .hs is listed so that the .hs.o rule below is recognised
    .SUFFIXES : .o .hi .lhs .hs .hc .s

    # NB: recipe lines must start with a tab character
    cool_pgm : $(OBJS)
    	rm -f $@
    	$(HC) -o $@ $(HC_OPTS) $(OBJS)

    # Standard suffix rules
    .o.hi:
    	@:

    .lhs.o:
    	$(HC) -c $< $(HC_OPTS)

    .hs.o:
    	$(HC) -c $< $(HC_OPTS)

    # Inter-module dependencies
    Foo.o Foo.hc Foo.s    : Baz.hi          # Foo imports Baz
    Main.o Main.hc Main.s : Foo.hi Baz.hi   # Main imports Foo and Baz

(Sophisticated make variants may achieve some of the above more elegantly. Notably, gmake's pattern rules let you write the more comprehensible:

    %.o : %.lhs
    	$(HC) -c $< $(HC_OPTS)

What we've shown should work with any make.)

Note the cheesy .o.hi rule: It records the dependency of the interface (.hi) file on the source. The rule says a .hi file can be made from a .o file by doing... nothing. Which is true.

Note the inter-module dependencies at the end of the Makefile, which take the form

 Foo.o Foo.hc Foo.s : Baz.hi # Foo imports Baz 

They tell make that if any of Foo.o, Foo.hc or Foo.s have an earlier modification date than Baz.hi, then the out-of-date file must be brought up to date. To bring it up to date, make looks for a rule to do so; one of the preceding suffix rules does the job nicely.

Putting inter-dependencies of the form Foo.o : Bar.hi into your Makefile by hand is rather error-prone. ghc offers you a helping hand with its -M option. To automatically generate inter-dependencies, add the following to your Makefile:

### Dumping out compiler intermediate structures

-noC:

Don't bother generating C output or an interface file. Usually used in conjunction with one or more of the -ddump-* options; for example: ghc -noC -ddump-simpl Foo.hs

-hi:

Do generate an interface file (on stdout.) This would normally be used in conjunction with -noC, which turns off interface generation; thus: -noC -hi.

-hi-with-<section>:

Generate just the specified section of an interface file. In case you're only interested in a subset of what -hi outputs, -hi-with-<section> is just the ticket. For instance

 -noC -hi-with-declarations -hi-with-exports 

will output the sections containing the exports and the declarations. Legal sections are: declarations, exports, instances, instance_modules, usages, fixities, and interface.

-dshow-passes:

Prints a message to stderr as each pass starts. Gives a warm but undoubtedly misleading feeling that GHC is telling you what's happening.

-ddump-<pass>:

Make a debugging dump after pass <pass> (may be common enough to need a short form...). Some of the most useful ones are:

| Flag | Output |
|---|---|
| -ddump-rdr | reader output (earliest stuff in the compiler) |
| -ddump-rn | renamer output |
| -ddump-tc | typechecker output |
| -ddump-deriv | derived instances |
| -ddump-ds | desugarer output |
| -ddump-simpl | simplifier output (Core-to-Core passes) |
| -ddump-stranal | strictness analyser output |
| -ddump-occur-anal | occurrence analysis output |
| -ddump-spec | dump specialisation info |
| -ddump-stg | output of STG-to-STG passes |
| -ddump-absC | unflattened Abstract C |
| -ddump-flatC | flattened Abstract C |
| -ddump-realC | same as what goes to the C compiler |
| -ddump-asm | assembly language from the native-code generator |

-dverbose-simpl and -dverbose-stg:

Show the output of the intermediate Core-to-Core and STG-to-STG passes, respectively. (Lots of output!) So: when we're really desperate:

 % ghc -noC -O -ddump-simpl -dverbose-simpl -dcore-lint Foo.hs 

-dppr-{user,debug,all}:

Debugging output is in one of several "styles." Take the printing of types, for example. In the "user" style, the compiler's internal ideas about types are presented in Haskell source-level syntax, insofar as possible. In the "debug" style (which is the default for debugging output), the types are printed in the most-often-desired form, with explicit foralls, etc. In the "show all" style, very verbose information about the types (e.g., the Uniques on the individual type variables) is displayed.

-ddump-raw-asm:

Dump out the assembly-language stuff, before the "mangler" gets it.

-ddump-rn-trace:

Make the renamer be *real* chatty about what it is up to.

-dshow-rn-stats:

Print out a summary of what kind of information the renamer had to bring in.

-dshow-unused-imports:

Have the renamer report which imports do not contribute.

### How to read Core syntax (from some -ddump-* flags)

Let's do this by commenting an example. It's from doing -ddump-ds on this code:

 skip2 m = m : skip2 (m+2) 

Before we jump in, a word about names of things. Within GHC, variables, type constructors, etc., are identified by their "Uniques." These are of the form 'letter' plus 'number' (both loosely interpreted). The 'letter' gives some idea of where the Unique came from; e.g., _ means "built-in type variable"; t means "from the typechecker"; s means "from the simplifier"; and so on. The 'number' is printed fairly compactly in a 'base-62' format, which everyone hates except me (WDP).
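For readers who have not met base-62 before, here is a sketch of how a number might be printed that way. The digit alphabet (0-9, then a-z, then A-Z) is an assumption; the text does not say which ordering GHC uses, and toBase62 and showUnique are hypothetical names, so do not expect the output to match real Uniques digit for digit.

```haskell
module Main where

-- Assumed digit alphabet: 10 digits + 26 lower-case + 26 upper-case = 62.
digits62 :: String
digits62 = ['0' .. '9'] ++ ['a' .. 'z'] ++ ['A' .. 'Z']

-- Render a non-negative number in base 62, most significant digit first.
toBase62 :: Int -> String
toBase62 n
  | n < 0     = error "toBase62: negative input"
  | n < 62    = [digits62 !! n]
  | otherwise = toBase62 (n `div` 62) ++ [digits62 !! (n `mod` 62)]

-- A Unique printed as its tag letter followed by the base-62 number.
showUnique :: Char -> Int -> String
showUnique tag n = tag : toBase62 n

main :: IO ()
main = do
  putStrLn (toBase62 61)
  putStrLn (toBase62 62)
  putStrLn (showUnique 't' 62)
```

The point is only that three or four characters cover a large range of numbers, which is why Uniques such as r1L6 stay short.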

Remember, everything has a "Unique" and it is usually printed out when debugging, in some form or another. So here we go...

    Desugared:
    Main.skip2{-r1L6-} :: _forall_ a$_4 =>{{Num a$_4}} -> a$_4 -> [a$_4]

    --# 'r1L6' is the Unique for Main.skip2;
    --# '_4' is the Unique for the type-variable (template) 'a'
    --# '{{Num a$_4}}' is a dictionary argument

    _NI_

    --# '_NI_' means "no (pragmatic) information" yet; it will later
    --# evolve into the GHC_PRAGMA info that goes into interface files.

    Main.skip2{-r1L6-} =
        /\ _4 -> \ d.Num.t4Gt ->
            let {
              {- CoRec -}
              +.t4Hg :: _4 -> _4 -> _4
              _NI_
              +.t4Hg = (+{-r3JH-} _4) d.Num.t4Gt

              fromInt.t4GS :: Int{-2i-} -> _4
              _NI_
              fromInt.t4GS = (fromInt{-r3JX-} _4) d.Num.t4Gt

    --# The '+' class method (Unique: r3JH) selects the addition code
    --# from a 'Num' dictionary (now an explicit lambda'd argument).
    --# Because Core is 2nd-order lambda-calculus, type applications
    --# and lambdas (/\) are explicit. So '+' is first applied to a
    --# type ('_4'), then to a dictionary, yielding the actual addition
    --# function that we will use subsequently...

    --# We play the exact same game with the (non-standard) class method
    --# 'fromInt'. Unsurprisingly, the type 'Int' is wired into the
    --# compiler.

              lit.t4Hb :: _4
              _NI_
              lit.t4Hb =
                  let {
                    ds.d4Qz :: Int{-2i-}
                    _NI_
                    ds.d4Qz = I#! 2#
                  } in  fromInt.t4GS ds.d4Qz

    --# 'I# 2#' is just the literal Int '2'; it reflects the fact that
    --# GHC defines 'data Int = I# Int#', where Int# is the primitive
    --# unboxed type. (see relevant info about unboxed types elsewhere...)

    --# The '!' after 'I#' indicates that this is a *saturated*
    --# application of the 'I#' data constructor (i.e., not partially
    --# applied).

              skip2.t3Ja :: _4 -> [_4]
              _NI_
              skip2.t3Ja =
                  \ m.r1H4 ->
                      let { ds.d4QQ :: [_4]
                            _NI_
                            ds.d4QQ =
                        let {
                          ds.d4QY :: _4
                          _NI_
                          ds.d4QY = +.t4Hg m.r1H4 lit.t4Hb
                        } in  skip2.t3Ja ds.d4QY
                      } in
                      :! _4 m.r1H4 ds.d4QQ

              {- end CoRec -}
            } in  skip2.t3Ja

("It's just a simple functional language" is an unregisterised trademark of Peyton Jones Enterprises, plc.)

### Command line options in source files

Sometimes it is useful to make the connection between a source file and the command-line options it requires quite tight. For instance, if a (Glasgow) Haskell source file uses casms, the C back-end often needs to be told about which header files to include. Rather than maintaining the list of files the source depends on in a Makefile (using the -#include command-line option), it is possible to do this directly in the source file using the OPTIONS pragma:

    {-# OPTIONS -#include "foo.h" #-}
    module X where

    ...

OPTIONS pragmas are only looked for at the top of your source files, up to the first (non-literate, non-empty) line not containing OPTIONS. Multiple OPTIONS pragmas are recognised. Note that your command shell does not get to process the source file options; they are just included literally in the array of command-line arguments the compiler driver maintains internally, so you'll be desperately disappointed if you try to glob etc. inside OPTIONS.

NOTE: the contents of OPTIONS are prepended to the command-line options, so you *do* have the ability to override OPTIONS settings via the command line.
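The scanning and prepending rules can be put together in a short sketch. It is a rough model rather than the driver's real logic: the function names are hypothetical, literate (.lhs) input is not handled, and each pragma is assumed to sit on one line in exactly the form {-# OPTIONS ... #-}.

```haskell
module Main where

import Data.Char (isSpace)
import Data.List (isInfixOf)

-- Arguments from one "{-# OPTIONS ... #-}" line, taken literally
-- (no globbing, no quote removal); anything else yields nothing.
pragmaArgs :: String -> [String]
pragmaArgs line = case words line of
  ("{-#" : "OPTIONS" : rest)
    | not (null rest), last rest == "#-}" -> init rest
  _ -> []

-- Options found at the top of a source file: scanning stops at the
-- first non-empty line that does not contain OPTIONS.
sourceOptions :: String -> [String]
sourceOptions src = concatMap pragmaArgs (takeWhile isHeaderLine (lines src))
  where
    isHeaderLine l = all isSpace l || "OPTIONS" `isInfixOf` l

-- Source-file options go in front, so later command-line flags can override them.
effectiveArgs :: String -> [String] -> [String]
effectiveArgs src cmdLine = sourceOptions src ++ cmdLine

main :: IO ()
main = do
  let src = unlines
        [ "{-# OPTIONS -cpp #-}"
        , ""
        , "{-# OPTIONS -O #-}"
        , "module X where"
        , "{-# OPTIONS -v #-}"   -- too late: not at the top of the file
        ]
  print (sourceOptions src)
  print (effectiveArgs src ["-O2", "-c", "X.hs"])
```

In the second result -O comes before -O2, so under the usual "last one applies" behaviour the command line has the final say.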

It is not recommended to move all the contents of your Makefiles into your source files, but in some circumstances, the OPTIONS pragma is the Right Thing. (If you use -keep-hc-file-too and have OPTIONS flags in your module, the OPTIONS will get put into the generated .hc file.)
