SMP crash
Simon Marlow
simonmarhaskell at gmail.com
Mon Feb 27 06:31:41 EST 2006
Krasimir Angelov wrote:
> While trying to build VSHaskell with the recent GHC I found the
> following problem. In Stable.c the stable_mutex is used for
> synchronization but it is initialized only from initStablePtrTable.
> The initStablePtrTable function is called only from hs_init but
> according to the following comment:
>
> // Nothing to do:
> // the table will be allocated the first time makeStablePtr is
> // called, and we want the table to persist through multiple inits.
> //
> // Also, getStablePtr is now called from __attribute__((constructor))
> // functions, so initialising things here wouldn't work anyway.
>
> it might be too late. In this case the CRITICAL_SECTION will not be
> initialized at the right time. The consequence is that the whole
> program crashes. In order to fix that I have added a call to
> initStablePtrTable to each function that requires locking. The actions
> in initStablePtrTable are executed only the first time when it is
> called. Since I am using SPT_size as a flag, it isn't safe to call
> it for the first time from two concurrent threads. As long as it is
> executed from hs_init or from any __attribute__((constructor))
> function, I think it is safe.
Thanks Krasimir, I've committed your patch.
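As a rough illustration of the approach described in the quoted message, here is a minimal sketch in C. It is not the code from Stable.c: the names table, table_size, table_mutex, init_table and table_insert are hypothetical stand-ins, and POSIX mutexes are assumed in place of the Windows CRITICAL_SECTION. The size field doubles as the "already initialised" flag, and every function that takes the lock calls the initialiser first.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the RTS names; this is not the real Stable.c. */
static pthread_mutex_t table_mutex;
static void **table = NULL;
static unsigned int table_size = 0;  /* 0 means "not initialised yet" */
static unsigned int table_used = 0;
static int init_runs = 0;            /* counts real initialisations, for illustration */

/* Idempotent initialiser: only the first call does any work.
 * NOT safe if the first call can race between two threads, because the
 * check of table_size is itself unsynchronised. It relies on the first
 * call coming from single-threaded startup code. */
static void init_table(void)
{
    if (table_size > 0) {
        return;
    }
    table = calloc(64, sizeof *table);
    assert(table != NULL);
    pthread_mutex_init(&table_mutex, NULL);
    table_size = 64;
    init_runs++;
}

/* Every function that needs the lock makes sure it exists first. */
static int table_insert(void *p)
{
    int slot = -1;

    init_table();
    pthread_mutex_lock(&table_mutex);
    if (table_used < table_size) {
        slot = (int)table_used;
        table[table_used++] = p;
    }
    pthread_mutex_unlock(&table_mutex);
    return slot;
}
```

Calling table_insert before any explicit init_table call still works, because the mutex is created on demand. A different technique, pthread_once (or InitOnceExecuteOnce on Windows), would also make the first call safe from concurrent threads; that is not what the patch described above does.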
> This fixes the problem, but after that the RTS blocks in waitCondition
> at line 401 in Capability.c. The trace messages with +RTS -Ds are:
>
> ACQUIRE_LOCK(0x64FDC130) Stable.c 248
> RELEASE_LOCK(0x64FDC130) Stable.c 251
> ACQUIRE_LOCK(0x64FDCE50) Schedule.c 2689
> sched (task 00000EF8): allocated 1 capabilities
> RELEASE_LOCK(0x64FDCE50) Schedule.c 2722
> ACQUIRE_LOCK(0x64FDCDF0) Storage.c 132
> RELEASE_LOCK(0x64FDCDF0) Storage.c 256
> ACQUIRE_LOCK(0x64FDCE50) RtsAPI.c 560
> sched (task 00000EF8): new task (taskCount: 1)
> RELEASE_LOCK(0x64FDCE50) RtsAPI.c 562
> ACQUIRE_LOCK(0x64FDD128) Capability.c 387
> sched (task 00000EF8): returning; I want capability 0
> RELEASE_LOCK(0x64FDD128) Capability.c 395
> sched (task 00000EF8): returning; got capability 0
> ACQUIRE_LOCK(0x64FDCE50) Schedule.c 2425
> RELEASE_LOCK(0x64FDCE50) Schedule.c 2429
> sched (task 00000EF8): created thread 1, stack size = f2 words
> sched (task 00000EF8): new bound thread (1)
> sched (task 00000EF8): ### NEW SCHEDULER LOOP (task: 01D8FD50, cap: 64FDD040)
> sched (task 00000EF8): ### Running thread 1 in bound thread
> sched (task 00000EF8): -->> running thread 1 ThreadRunGHC ...
> sched (task 00000EF8): thread 1 did a safe foreign call
> ACQUIRE_LOCK(0x64FDD128) Schedule.c 2212
> sched (task 00000EF8): starting new worker on capability 0
> ACQUIRE_LOCK(0x01D8FE10) Task.c 245
> sched (task 00000EF8): new worker task (taskCount: 2)
> RELEASE_LOCK(0x01D8FE10) Task.c 265
> RELEASE_LOCK(0x64FDD128) Schedule.c 2218
> sched (task 00000EF8): thread 1: leaving RTS
> ACQUIRE_LOCK(0x64FDD128) Capability.c 387
> sched (task 00000EF8): returning; I want capability 0
> RELEASE_LOCK(0x64FDD128) Capability.c 398
> ACQUIRE_LOCK(0x01D8FD70) Capability.c 401
> RELEASE_LOCK(0x01D8FD70) win32/OSThreads.c 75
>
> I have also added optional debug messages to ACQUIRE_LOCK/RELEASE_LOCK
> for Windows, like those already added for Linux. The applied patch is attached.
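For illustration only, lock tracing of the kind shown in the trace above could be sketched like this in C. These are not the RTS macros: TRACED_ACQUIRE_LOCK, TRACED_RELEASE_LOCK, trace_lock_event, lock_trace_enabled and lock_trace_count are hypothetical names, and POSIX mutexes are assumed. The output format simply imitates the "ACQUIRE_LOCK(address) file line" lines quoted above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical switch and counter; the real RTS enables tracing via +RTS -Ds. */
static int lock_trace_enabled = 1;
static unsigned long lock_trace_count = 0;  /* demo only, not thread-safe across mutexes */

/* Print one trace line in the style "ACQUIRE_LOCK(0x...) File.c 123". */
static void trace_lock_event(const char *op, void *mutex,
                             const char *file, int line)
{
    if (!lock_trace_enabled) {
        return;
    }
    lock_trace_count++;
    fprintf(stderr, "%s(%p) %s %d\n", op, mutex, file, line);
}

/* Take the lock, then report where it was taken. */
#define TRACED_ACQUIRE_LOCK(m)                                              \
    do {                                                                    \
        int rc_ = pthread_mutex_lock(m);                                    \
        assert(rc_ == 0);                                                   \
        (void)rc_;                                                          \
        trace_lock_event("ACQUIRE_LOCK", (void *)(m), __FILE__, __LINE__);  \
    } while (0)

/* Report the release while the lock is still held, then release it. */
#define TRACED_RELEASE_LOCK(m)                                              \
    do {                                                                    \
        trace_lock_event("RELEASE_LOCK", (void *)(m), __FILE__, __LINE__);  \
        int rc_ = pthread_mutex_unlock(m);                                  \
        assert(rc_ == 0);                                                   \
        (void)rc_;                                                          \
    } while (0)
```

Because the macros expand __FILE__ and __LINE__ at the call site, each trace line identifies the source location that took or released the lock, which is what makes an unmatched ACQUIRE_LOCK easy to spot in a trace.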
That's odd - there should be two threads attempting to grab capability
0. Can you see what each thread is doing? In gdb, something like this:
> thread 0
> where
> thread 1
> where
I'll try to get my Windows build up today and see if any of the threaded
tests are failing.
Cheers,
Simon