Weird impossible segfault in select_loop on OpenBSD
Simon Marlow
marlowsd at gmail.com
Mon Jun 28 04:52:22 EDT 2010
On 27/06/2010 20:00, Matthias Kilian wrote:
> [sorry for constantly replying to myself...]
>
> On Sat, Jun 26, 2010 at 10:42:02AM +0200, Matthias Kilian wrote:
>> However, the munmaps happen after what looks like GHC cleanup. For
>> example this comes before the munmaps:
> [...]
>> then, after some more stuff (like getrusage, gettimeofday, some
>> mprotects, a little bit IO) come the munmaps, and finally the
>> SIGSEGV.
>
> After some more poor man's debugging (sprinkling some puts(3) into
> rts/posix/Signals.c), the problem seems to be that ioManagerDie()
> assumes that writing that `kill byte' (IO_MANAGER_DIE) wakes the
> service loop immediately, and thus closes the io_manager_pipe and
> returns (which lets the shutdown proceed further and free the heap).
>
> But that assumption is wrong. The select loop may be anywhere, even
> right before the call to c_select. Depending on the system's thread
> scheduling this may cause the select loop to resume only after the
> heap is gone.
It's not making any assumptions about when the IO manager will die; it
just asks it to stop. Later on during the shutdown sequence, after we
have ensured that there are no more Haskell threads running, we free the
heap memory.
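To illustrate the "ask, don't wait" behaviour, here is a minimal sketch using a plain POSIX pipe and select(). It is not the RTS code: request_stop, poll_once and DEMO_DIE_BYTE are hypothetical names, and the byte value is an assumption standing in for IO_MANAGER_DIE. The point is only that the write returns as soon as the byte is queued; the loop notices it whenever it next polls.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical value standing in for IO_MANAGER_DIE. */
#define DEMO_DIE_BYTE 0xfe

/* Ask the service loop to stop. Returns once the byte is queued in the
 * pipe; it does not wait for the loop to see it. 0 on success, -1 on error. */
static int request_stop(int write_fd)
{
    unsigned char b = DEMO_DIE_BYTE;
    return write(write_fd, &b, 1) == 1 ? 0 : -1;
}

/* One non-blocking iteration of a service loop.
 * Returns 1 if a stop request was read, 0 if nothing relevant was
 * pending, -1 on error. */
static int poll_once(int read_fd)
{
    fd_set rfds;
    struct timeval tv = {0, 0};   /* zero timeout: poll, do not block */
    unsigned char b;
    int n;

    assert(read_fd >= 0 && read_fd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(read_fd, &rfds);

    n = select(read_fd + 1, &rfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    if (n == 0)
        return 0;                 /* pipe is empty */
    if (read(read_fd, &b, 1) != 1)
        return -1;
    return b == DEMO_DIE_BYTE ? 1 : 0;
}
```

Between request_stop returning and the next poll_once, the loop is still alive and may be anywhere in its body, which is why the request alone says nothing about when the loop has actually finished.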
Now, I think I do understand what has gone wrong. The shutdown sequence
waits until there are no more Haskell threads running, but doesn't wait
for foreign calls to finish: this is usually the right thing, because
there might be a foreign call that is blocked indefinitely, and would
never finish, thus preventing termination of the program. If one of
these foreign calls does happen to finish during the shutdown sequence,
it will be blocked from re-entering Haskell and nothing goes wrong.
However, a foreign call in progress may well need to reference the heap
memory, which we're about to free. Freeing the heap memory is new since
6.12, which is why this bug only just appeared:
    Thu Sep 10 09:46:30 BST 2009  Austin Seipp <mad.one at gmail.com>
      * FIX #711 implement osFreeAllMBlocks for unix
So in fact, what I believe we should do is change the shutdown sequence
so that it only frees the heap memory if it also waits for foreign calls
to complete first. In the normal standalone-program shutdown sequence,
we should do neither. It's a one-line patch (plus 20 lines of comment);
I'll test and commit.
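Roughly, the policy I have in mind looks like the sketch below. This is only an illustration under stated assumptions, not the actual patch: Runtime, wait_for_foreign_calls and shutdown_runtime are hypothetical names, and the "wait" is a stub that just resets a counter where a real RTS would block until the calls return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for RTS state; not the real GHC structures. */
typedef struct {
    void *heap;           /* memory that foreign calls may still reference */
    int   foreign_calls;  /* number of foreign calls in progress */
} Runtime;

/* Hypothetical stub: a real RTS would block here until every
 * in-progress foreign call has returned. */
static void wait_for_foreign_calls(Runtime *rt)
{
    rt->foreign_calls = 0;
}

/* Free the heap only if we also waited for foreign calls to complete.
 * Otherwise do neither and leave the heap in place, since a foreign
 * call may still be referencing it. */
static void shutdown_runtime(Runtime *rt, bool wait_foreign)
{
    if (wait_foreign) {
        wait_for_foreign_calls(rt);
        assert(rt->foreign_calls == 0);
        free(rt->heap);
        rt->heap = NULL;
    }
}
```

With wait_foreign false (the standalone-program case), the heap is left alone and the OS reclaims it at process exit, so a foreign call that is still running never sees freed memory.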
Cheers,
Simon