parallelizing ghc

Neil Mitchell ndmitchell at
Sun Jan 29 12:56:43 CET 2012

Hi Simon,

I have found that a factor of 2 parallelism is required on Linux to
break even with ghc --make. In particular:

GHC --make = 7.688
Shake -j1 = 11.828 (of which 11.702 is spent running system commands)
Shake full -j4 = 7.414 (of which 12.906 is spent running system commands)

This is for a Haskell program which has several bottlenecks; you can
see a graph of spawned processes here:
- everything above the 1 mark is more than one process in parallel, so
it gets to 4 processes, but not all the time - roughly 2x parallelism
on average.
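
As a cross-check on that estimate, the ratios can be worked out from the
timings above. This is only a sketch: it assumes the figures are in
seconds, and the names (averageParallelism and friends) are mine, not
anything from Shake. Command time divided by wall-clock time gives about
1.74 for the -j4 run, and the -j4 run spends about 1.10x the command time
of the -j1 run.

```haskell
-- Assumption: all figures are seconds, copied from the timings above.

-- | Average number of commands running at once:
-- total time spent in commands divided by wall-clock time.
averageParallelism :: Double -> Double -> Double
averageParallelism commandTime wallClock = commandTime / wallClock

main :: IO ()
main = do
  let makeWall   = 7.688   -- ghc --make, wall clock
      j1Commands = 11.702  -- Shake -j1, time in system commands
      j4Wall     = 7.414   -- Shake -j4, wall clock
      j4Commands = 12.906  -- Shake -j4, time in system commands
  putStrLn ("average parallelism at -j4:      "
            ++ show (averageParallelism j4Commands j4Wall))
  putStrLn ("command time, -j4 relative to -j1: "
            ++ show (j4Commands / j1Commands))
  putStrLn ("command time, -j4 relative to ghc --make wall clock: "
            ++ show (j4Commands / makeWall))
```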

On Windows the story is much worse. If you use -j4 then the time spent
executing system commands shoots up from ~15s to ~25s, since
even on a 4 core machine the contention between the processes is high. I
tried investigating this, checking for things like a locked file (none
that I can find), or disk/CPU/memory contention (it's basically taking no
system resources), but couldn't find anything.
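
For anyone wanting to reproduce the "time spent executing system
commands" figure, here is a minimal sketch of one way to accumulate it.
It is not how Shake's profiling is implemented; the helper name timed is
hypothetical, and threadDelay stands in for a real command such as a ghc
invocation. Summing per-command durations like this is why the total can
exceed the wall-clock time once commands run in parallel.

```haskell
import Control.Concurrent (threadDelay)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import GHC.Clock (getMonotonicTime)

-- | Run an action and add its wall-clock duration (in seconds)
-- to a shared running total.
timed :: IORef Double -> IO a -> IO a
timed total act = do
  start  <- getMonotonicTime
  result <- act
  end    <- getMonotonicTime
  atomicModifyIORef' total (\t -> (t + (end - start), ()))
  return result

main :: IO ()
main = do
  total <- newIORef 0
  -- threadDelay is a placeholder for running a system command
  mapM_ (timed total) [threadDelay 20000, threadDelay 30000]
  t <- readIORef total
  putStrLn ("seconds spent in commands: " ++ show t)
```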

If you specify -O2 then the parallel performance also goes down - I
suspect because each ghc process needs to read inline information for
packages that are imported multiple times, whereas ghc --make gets away
with doing that once?

> This looks a bit suspicious.  The Shake build is doing nearly twice as much
> work as the --make build, in terms of CPU time, but because it is getting
> nearly 2x parallelism it comes in a close second.  How many processes is the
> Shake build using?

Shake uses at most the number of processes you specify; it never
exceeds the -j flag, so in the above example it caps out at 4. It is
very good at getting parallelism (I believe it to be perfect, but the
code is 150 lines of IORef twiddling, so I wouldn't guarantee it), and
very safe about never exceeding the cap you specify (I think I can
even prove that, for some value of proof). The profiling makes it easy
to verify these claims after the fact.
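
To illustrate what "never exceeding the cap" means, here is a small
sketch that bounds concurrency with a semaphore and records the peak
number of jobs in flight. To be clear, this is not Shake's code (which is
the IORef-based pool mentioned above); it swaps in QSem from base because
that is shorter to show, and runCapped, spawn and tracked are hypothetical
names. threadDelay again stands in for a system command.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.QSem (QSem, newQSem, signalQSem, waitQSem)
import Control.Exception (bracket_, finally)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)

-- | Run every action with at most @cap@ in flight at once, wait for all
-- of them, and return the highest number observed running simultaneously.
runCapped :: Int -> [IO ()] -> IO Int
runCapped cap actions = do
  sem     <- newQSem cap
  running <- newIORef 0
  peak    <- newIORef 0
  dones   <- mapM (spawn sem running peak) actions
  mapM_ takeMVar dones
  readIORef peak

-- | Fork one job; it holds a semaphore slot for as long as it runs.
spawn :: QSem -> IORef Int -> IORef Int -> IO () -> IO (MVar ())
spawn sem running peak act = do
  done <- newEmptyMVar
  _ <- forkIO $
    bracket_ (waitQSem sem) (signalQSem sem) (tracked running peak act)
      `finally` putMVar done ()
  return done

-- | Count the job as running while it executes and update the peak.
tracked :: IORef Int -> IORef Int -> IO () -> IO ()
tracked running peak act = do
  n <- atomicModifyIORef' running (\r -> (r + 1, r + 1))
  atomicModifyIORef' peak (\p -> (max p n, ()))
  act `finally` atomicModifyIORef' running (\r -> (r - 1, ()))

main :: IO ()
main = do
  -- 20 placeholder jobs with a cap of 4, analogous to -j4
  peak <- runCapped 4 (replicate 20 (threadDelay 10000))
  putStrLn ("peak jobs in flight: " ++ show peak)
```

Because the running counter is only incremented after a slot is acquired
and decremented before it is released, the reported peak cannot exceed
the cap, which is the same kind of after-the-fact check the profiling
gives you.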

Thanks, Neil

More information about the Glasgow-haskell-users mailing list