documentation in GHC.Conc, Control.Parallel.Strategies; querying number of CPUs

Sat Jul 14 18:31:55 EDT 2007

Hello,

I am interested in implementing some multi-threaded algorithms in
Haskell.

I have run into some documentation dead-ends. The documentation for
GHC.Conc is the first thing I find when I search for "ghc pseq" on
Google, but it doesn't document pseq or several other functions:

pseq
par
forkOnIO
childHandler
ensureIOManagerIsRunning

In particular, I wonder: How is pseq different from seq? Under what
circumstances is it used? I have looked at the source code so I see
that it is implemented in terms of 'seq' and 'lazy':

> -- "pseq" is defined a bit weirdly (see below)
> --
> -- The reason for the strange "lazy" call is that
> -- it fools the compiler into thinking that pseq  and par are non-strict in
> -- their second argument (even if it inlines pseq at the call site).
> -- If it thinks pseq is strict in "y", then it often evaluates
> -- "y" before "x", which is totally wrong.  
> 
> pseq :: a -> b -> b
> pseq  x y = x `seq` lazy y

Does this mean that pseq should be used instead of 'seq' whenever I
want the first argument to be evaluated first? I am also curious about
the others, although par seems to be documented in Control.Parallel.
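
To make the question concrete, here is a minimal sketch of the kind of
use I have in mind. It assumes that par and pseq can be imported from
GHC.Conc and that the comment quoted above describes their behaviour
correctly; the name parSum is made up for illustration, and I have not
confirmed that this is the intended idiom.

```haskell
import GHC.Conc (par, pseq)

-- Hypothetical example: sum the two halves of a list, sparking the
-- left half for parallel evaluation while this thread evaluates the
-- right half.
parSum :: [Int] -> Int
parSum xs = left `par` (right `pseq` (left + right))
  where
    (as, bs) = splitAt (length xs `div` 2) xs
    left     = sum as
    right    = sum bs

main :: IO ()
main = print (parSum [1 .. 1000])
```

If 'seq' were used in place of pseq here, my understanding of the
quoted comment is that the compiler would be free to evaluate
(left + right) before right, which would demand left immediately and
leave nothing for the spark to do.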

Also, the following functions in Control.Parallel.Strategies are not
documented, at least in Haddock:

(>|) :: Done -> Done -> Done
(>||) :: Done -> Done -> Done
using :: a -> Strategy a -> a
demanding :: a -> Done -> a
sparking :: a -> Done -> a
sPar :: a -> Strategy b
sSeq :: a -> Strategy b
r0 :: Strategy a
rwhnf :: Strategy a
rnf :: Strategy a
($|) :: (a -> b) -> Strategy a -> a -> b
($||) :: (a -> b) -> Strategy a -> a -> b
(.|) :: (b -> c) -> Strategy b -> (a -> b) -> a -> c
(.||) :: (b -> c) -> Strategy b -> (a -> b) -> a -> c
(-|) :: (a -> b) -> Strategy b -> (b -> c) -> a -> c
(-||) :: (a -> b) -> Strategy b -> (b -> c) -> a -> c
seqPair :: Strategy a -> Strategy b -> Strategy (a, b)
parPair :: Strategy a -> Strategy b -> Strategy (a, b)
seqTriple :: Strategy a -> Strategy b -> Strategy c -> Strategy (a, b, c)
parTriple :: Strategy a -> Strategy b -> Strategy c -> Strategy (a, b, c)
fstPairFstList :: NFData a => Strategy [(a, b)]
force :: NFData a => a -> a
sforce :: NFData a => a -> b -> b

The types Done and Strategy, the class NFData, and the related classes
in this module are also not documented in Haddock. If there is a paper
which defines all of these, then it would be nice to have a link to
the paper in the module's documentation, for people to use until the
module's documentation itself can be updated.

As an aside, if you've read this far then you may know the answer to a
related question: is there a way to query how many processors the
current machine has? I am implementing a parallel sort. In cases such
as sorting, where an algorithm can be decomposed into an arbitrarily
large number of threads, I would like to know the maximum useful
number of threads (usually this will be some increasing function of
the number of CPUs), so that I can avoid the overhead of spawning a
thread when it is not needed. (I'm about to read "Lightweight
concurrency primitives for GHC" by Li et al., if that's the right
place to look.)
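
The shape I am aiming for is sketched below. It assumes a base library
recent enough for GHC.Conc to export getNumProcessors and
getNumCapabilities (I am not sure which releases provide them). The
names merge, forceList, parSort and sparkDepth are made up for
illustration, and choosing ceil(log2 n) as the spark depth is only a
guess at a reasonable "increasing function of the number of CPUs".

```haskell
import GHC.Conc (getNumCapabilities, getNumProcessors, par, pseq)

-- Hypothetical helper: merge two sorted lists.
merge :: Ord a => [a] -> [a] -> [a]
merge [] ys = ys
merge xs [] = xs
merge (x:xs) (y:ys)
  | x <= y    = x : merge xs (y:ys)
  | otherwise = y : merge (x:xs) ys

-- Hypothetical helper: force the spine of a list and each element to
-- weak head normal form, so that a spark does the actual sorting work.
forceList :: [a] -> ()
forceList = foldr seq ()

-- Hypothetical merge sort that only sparks while depth > 0 and falls
-- back to purely sequential evaluation below that.
parSort :: Ord a => Int -> [a] -> [a]
parSort _ []  = []
parSort _ [x] = [x]
parSort depth xs
  | depth <= 0 = merge l r
  | otherwise  = forceList l `par` (forceList r `pseq` merge l r)
  where
    (as, bs) = splitAt (length xs `div` 2) xs
    l        = parSort (depth - 1) as
    r        = parSort (depth - 1) bs

-- Smallest d such that 2^d >= n, i.e. enough levels of splitting to
-- give every capability at least one piece (assumed heuristic).
sparkDepth :: Int -> Int
sparkDepth n = length (takeWhile (< n) (iterate (* 2) 1))

main :: IO ()
main = do
  cpus <- getNumProcessors    -- processors reported by the OS
  caps <- getNumCapabilities  -- Haskell threads that can run at once
  putStrLn ("processors: " ++ show cpus ++ ", capabilities: " ++ show caps)
  print (parSort (sparkDepth caps) [5, 3, 8, 1, 9, 2, 7 :: Int])
```

As far as I can tell, the capability count only exceeds one when the
program is built with -threaded and run with +RTS -N, so it may be the
more useful of the two numbers for deciding how deep to spark.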

Thanks,

Frederik

-- 
http://ofb.net/~frederik/