Weird impossible segfault in select_loop on OpenBSD

Simon Marlow marlowsd at gmail.com
Mon Jun 28 04:52:22 EDT 2010


On 27/06/2010 20:00, Matthias Kilian wrote:
> [sorry for constantly replying to myself...]
>
> On Sat, Jun 26, 2010 at 10:42:02AM +0200, Matthias Kilian wrote:
>> However, the munmaps happen after what looks like GHC cleanup. For
>> example this comes before the munmaps:
> [...]
>> then, after some more stuff (like getrusage, gettimeofday, some
>> mprotects, a little bit of IO) come the munmaps, and finally the
>> SIGSEGV.
>
> After some more poor man's debugging (sprinkling some puts(3) into
> rts/posix/Signals.c), the problem seems to be that ioManagerDie()
> assumes that writing that `kill byte' (IO_MANAGER_DIE) wakes the
> service loop immediately, and thus closes the io_manager_pipe and
> returns (which lets the shutdown proceed further and free the heap).
>
> But that assumption is wrong. The select loop may be anywhere, even
> right before the call to c_select. Depending on the system's thread
> scheduling, this may cause the select loop to resume only after the
> heap is gone.

It's not making any assumptions about when the IO manager will die; it
just asks it to stop.  Later on during the shutdown sequence, after we
have ensured that there are no more Haskell threads running, we free the
heap memory.
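
To make the "just asks it to stop" point concrete, here is a minimal
sketch in plain C.  It is not the RTS code: request_stop, service_once
and DIE_BYTE are hypothetical names standing in for ioManagerDie(), one
iteration of the select loop, and IO_MANAGER_DIE.  It only shows that
writing the byte returns at once, and that the loop notices the byte
whenever it next reaches select().

```c
#include <sys/select.h>
#include <unistd.h>

/* Placeholder value for this sketch only; not the RTS's IO_MANAGER_DIE. */
#define DIE_BYTE 'D'

/* Ask the service loop to stop.  Returns as soon as the byte is queued
 * in the pipe; it does not wait for the reader to see it. */
int request_stop(int wfd)
{
    unsigned char b = DIE_BYTE;
    return write(wfd, &b, 1) == 1 ? 0 : -1;
}

/* One non-blocking iteration of a service loop.
 * Returns 1 if the die byte was read, 0 if nothing was pending or some
 * other byte was read, and -1 on error. */
int service_once(int rfd)
{
    fd_set rfds;
    struct timeval tv = {0, 0};   /* poll, do not block */
    unsigned char b;
    int n;

    FD_ZERO(&rfds);
    FD_SET(rfd, &rfds);
    n = select(rfd + 1, &rfds, NULL, NULL, &tv);
    if (n < 0) return -1;
    if (n == 0) return 0;         /* nothing to read yet */
    if (read(rfd, &b, 1) != 1) return -1;
    return b == DIE_BYTE ? 1 : 0;
}
```

Between request_stop() returning and the next service_once() call an
arbitrary amount of time can pass, which is the window in which the
rest of the shutdown sequence runs.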

Now, I think I do understand what has gone wrong.  The shutdown sequence 
waits until there are no more Haskell threads running, but doesn't wait 
for foreign calls to finish: this is usually the right thing, because 
there might be a foreign call that is blocked indefinitely, and would 
never finish, thus preventing termination of the program.  If one of 
these foreign calls does happen to finish during the shutdown sequence, 
it will be blocked from re-entering Haskell and nothing goes wrong.

However, a foreign call in progress may well need to reference the heap 
memory, which we're about to free.  Freeing the heap memory is new since 
6.12, which is why this bug only just appeared:

Thu Sep 10 09:46:30 BST 2009  Austin Seipp <mad.one at gmail.com>
   * FIX #711 implement osFreeAllMBlocks for unix

So in fact, what I believe we should do is change the shutdown sequence 
so that it only frees the heap memory if it also waits for foreign calls 
to complete first.  In the normal standalone-program shutdown sequence, 
we should do neither.  It's a one-line patch (+20 lines of comment), 
I'll test and commit.
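
Roughly, the intended policy looks like the sketch below.  This is an
illustration under assumptions, not the actual patch: the Rts struct,
wait_for_foreign_calls and rts_shutdown_sketch are made-up names, and
the "wait" is simulated by clearing a counter.  The point is only that
freeing the heap is tied to having waited for foreign calls.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for RTS state; not the real RTS data structures. */
typedef struct {
    void *heap;           /* memory an in-progress foreign call may still touch */
    int   foreign_calls;  /* foreign calls still in progress */
    bool  heap_freed;
} Rts;

/* Simulated wait: a real implementation would block until every
 * outstanding foreign call has returned. */
static void wait_for_foreign_calls(Rts *rts)
{
    rts->foreign_calls = 0;
}

/* Shut down; free the heap only when we also waited for foreign calls. */
void rts_shutdown_sketch(Rts *rts, bool wait_foreign)
{
    /* ... Haskell threads are assumed to have been stopped already ... */
    if (wait_foreign) {
        wait_for_foreign_calls(rts);
        free(rts->heap);
        rts->heap = NULL;
        rts->heap_freed = true;
    }
    /* Otherwise do neither: a foreign call may still be using the heap,
     * so leave it alone and let the OS reclaim it at process exit. */
}
```

With wait_foreign false (the standalone-program case) the heap stays
mapped, so a foreign call that is still running cannot fault on it.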

Cheers,
	Simon


