FFI: number of worker threads?

Li, Peng ringer9cs+ghc at gmail.com
Wed Jun 21 12:31:41 EDT 2006


On 6/21/06, Simon Peyton-Jones <simonpj at microsoft.com> wrote:
> New worker threads are spawned as needed.  You'll need as many of
> them as you have simultaneously-blocked foreign calls. If you have 2000
> simultaneously-blocked foreign calls, you'll need 2000 OS threads to
> support them, which probably won't work.

2000 OS threads definitely sounds scary, but it can work. Linux NPTL
threads scale well up to 10K threads, and the stack address space
would be sufficient on 64-bit systems.
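
To make the scenario concrete, here is a minimal sketch of many
simultaneously-blocked foreign calls. It assumes the program is built
with -threaded; the names c_sleep and blockMany are made up for
illustration, and sleep(3) just stands in for any blocking C call.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM, forM_)
import Foreign.C.Types (CUInt(..))

-- A "safe" foreign import: with the threaded RTS, other Haskell threads
-- keep running while this call is blocked, so every call that is blocked
-- at the same time occupies its own OS thread.
foreign import ccall safe "unistd.h sleep"
  c_sleep :: CUInt -> IO CUInt

-- Hypothetical helper: start n Haskell threads that each block in C,
-- wait for all of them, and return how many finished.
blockMany :: Int -> IO Int
blockMany n = do
  dones <- forM [1 .. n] $ \_ -> do
    done <- newEmptyMVar
    -- sleep(3) may return early if interrupted; that is fine here.
    _ <- forkIO (c_sleep 1 >> putMVar done ())
    return done
  forM_ dones takeMVar
  return (length dones)

main :: IO ()
main = do
  n <- blockMany 2000
  putStrLn (show n ++ " blocking foreign calls completed")
```

Watching the process's thread count while this runs (for example with
"ps -L" on Linux) should show roughly how many worker threads the RTS
created for the blocked calls.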

I am thinking about some p2p applications where each peer maintains
a huge number of TCP connections to other peers, but most of these
connections are idle. Unfortunately, the default GHC RTS multiplexes
I/O using "select", which is O(n) in the number of file descriptors
and seems to be limited to FD_SETSIZE (1024) descriptors.

That makes me wonder if the current design of the GHC RTS is optimal
in the long run. As software and hardware evolve, we will have
efficient OS threads (like NPTL) and huge (64-bit) address spaces.
My guess is:

(1) It is always a good idea to multiplex GHC user-level threads on OS
threads, because it improves performance.
(2) It may not be optimal to multiplex nonblocking I/O inside the GHC
RTS, because it is unrealistic to have an event-driven I/O interface
that is both efficient (like AIO/epoll) and portable (like
select/poll). What is worse, nonblocking I/O still blocks on disk
accesses. On the other hand, POSIX threads are portable and can be
implemented efficiently on many systems. At least on Linux, NPTL
easily beats "select"!

My wish is to have a future GHC implementation that (a) uses blocking
I/O directly provided by the OS, and (b) provides more control over OS
threads and the internal worker thread pool.  Using blocking I/O will
simplify the current design and allow the programmer to take advantage
of high-performance OS threads. If non-blocking I/O is really needed,
the programmer can use customized, Claessen-style threads wrapped in
modular libraries---some of my preliminary tests show that
Claessen-style threads do a much better job of multiplexing
asynchronous I/O.
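
For readers unfamiliar with the idea, here is a minimal sketch of
Claessen-style threads (after Koen Claessen's "A Poor Man's
Concurrency Monad"). It is not the code used in my tests. The names
C, atom, fork, schedule, worker and demo are chosen for illustration,
and the scheduler is a plain round-robin queue; a real library would
replace it with an event loop that parks threads on epoll/AIO events.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- A thread is a trace of atomic steps.
data Action
  = Atom (IO Action)    -- run one IO step, yielding the rest of the thread
  | Fork Action Action  -- split into two threads
  | Stop                -- thread has finished

-- Continuation-passing monad that builds traces.
newtype C a = C { runC :: (a -> Action) -> Action }

instance Functor C where
  fmap f (C m) = C $ \k -> m (k . f)

instance Applicative C where
  pure x = C ($ x)
  C mf <*> C mx = C $ \k -> mf (\f -> mx (k . f))

instance Monad C where
  C m >>= f = C $ \k -> m (\a -> runC (f a) k)

-- Lift one IO action into a single atomic step.
atom :: IO a -> C a
atom io = C $ \k -> Atom (fmap k io)

-- Start a new user-level thread.
fork :: C () -> C ()
fork t = C $ \k -> Fork (runC t (const Stop)) (k ())

-- Round-robin scheduler: run one step of the first thread, then
-- move its continuation to the back of the queue.
schedule :: [Action] -> IO ()
schedule [] = return ()
schedule (a : as) = case a of
  Atom io    -> io >>= \next -> schedule (as ++ [next])
  Fork t1 t2 -> schedule (as ++ [t1, t2])
  Stop       -> schedule as

run :: C () -> IO ()
run t = schedule [runC t (const Stop)]

-- Each worker appends n labelled entries to a shared log.
worker :: IORef [String] -> String -> Int -> C ()
worker logRef name n = mapM_ step [1 .. n]
  where step i = atom (modifyIORef logRef ((name ++ show i) :))

demo :: IO [String]
demo = do
  logRef <- newIORef []
  run $ do
    fork (worker logRef "a" 2)
    worker logRef "b" 2
  reverse <$> readIORef logRef

main :: IO ()
main = demo >>= mapM_ putStrLn  -- prints a1, b1, a2, b2 on separate lines
```

Because the scheduler is ordinary library code, the policy for
multiplexing I/O lives outside the RTS and can be swapped per
application.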

>
> If you think you have only a handful of simultaneously-blocked foreign
> calls, but you still get "runaway worker threads", please do make a
> reproducible test case and file a bug report.

Yes, I will try to make a reproducible test case soon.

> Once you get answers, can I ask either or both of you to type in what
> you learned to the GHC user-documentation Wiki?  That way things
> improve!   The place to start is here
>         http://haskell.org/haskellwiki/GHC
> under "Collaborative documentation".  There's already a page for
> "Concurrency" and for "FFI", so you can add to those.  Thanks

Certainly!

