assert in rts/Schedule.c
Simon Marlow
marlowsd at gmail.com
Mon Jan 30 15:47:53 CET 2012
On 30/01/2012 13:22, Karel Gardas wrote:
> On 01/30/12 12:07 PM, Simon Marlow wrote:
>>>> Up to 7.4.1 RC2 this was solely a HEAD issue; now with 7.4.1 RC2 it also
>>>> shows on 7.4. The bug shows as:
>>>>
>>>> internal error: ASSERTION FAILED: file rts/Schedule.c, line 506
>>>>
>>>> (GHC version 7.4.0.20120126 for arm_unknown_linux)
>>>> Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug
>>>>
>>>> The test which shows this always fails, so this is probably not a race
>>>> condition. I've been able to get this for you:
>>>>
>>>> $ ./GEq1 +RTS -Ds
>>>> 40068000: created capset 0 of type 2
>>>> 40068000: created capset 1 of type 3
>>>> 40068000: assigned cap 0 to capset 0
>>>> 40068000: assigned cap 0 to capset 1
>>>> 40068000: allocated 1 more capabilities
>>>> 40068000: new task (taskCount: 1)
>>>> 40068000: returning; I want capability 0
>>>> 40068000: resuming capability 0
>>>> 40068000: cap 0: created thread 1
>>>> 40068000: cap 0: thread 1 appended to run queue
>>>> 40068000: new bound thread (1)
>>>> 40068000: cap 0: schedule()
>>>> 40068000: cap 0: running thread 1 (ThreadRunGHC)
>>>> 40068000: cap 0: thread 1 stopped (suspended while making a foreign
>>>> call)
>>>> 40068000: starting new worker on capability 0
>>>> 40068000: new worker task (taskCount: 2)
>>>> 40068000: returning; I want capability 0
>>>> 40eff470: cap 0: schedule()
>>>> 40eff470: giving up capability 0
>>>> 40eff470: passing capability 0 to bound task 0x40068000
>>>> 40068000: resuming capability 0
>>>> 40068000: cap 0: running thread 1 (ThreadRunGHC)
>>>> 40068000: cap 0: thread 1 stopped (suspended while making a foreign
>>>> call)
>>>> 40068000: freeing capability 0
>>>> 40068000: returning; I want capability 0
>>>> 40068000: resuming capability 0
>>>> 40068000: cap 0: running thread 1 (ThreadRunGHC)
>>>> 40068000: cap 0: created thread 2
>>>> 40068000: cap 0: thread 2 appended to run queue
>>>> 40068000: cap 0: thread 2 has label IOManager
>>>> 40068000: cap -1094678984: thread 0 stopped ((null))
>>>
>>> This looks very suspicious - clearly the Capability is wrong (pointing
>>> to the wrong place or the memory has been corrupted). If it reliably
>>> fails in the same place, it should be easy to track down where it is
>>> going wrong.
>>>
>>> I recommend just debugging this directly, rather than trying to bisect.
>>> Bisecting is difficult with GHC because we have multiple repositories
>>> that have to be in sync, and also because each build takes a long time.
>>> Even if you find the offending patch, it won't necessarily tell you what
>>> the bug is.
>>
>> The symptoms fit this bug, I bet it's the same problem:
>>
>> http://hackage.haskell.org/trac/ghc/ticket/5824
>
> I've patched my RC2 copy, rebuilt it, and am running the testsuite right now.
> So far (a few previously failing tests have just passed) it looks like the
> patch really solves the issue.
> However, may we discuss this a little bit? I'm curious why we need to
> put registers that are saved at the beginning of the function and restored
> at its end into the clobber list, i.e.
>
> StgRegTable *
> StgRun(StgFunPtr f, StgRegTable *basereg) {
> StgRegTable * r;
> __asm__ volatile (
> /*
> * save callee-saves registers on behalf of the STG code.
> */
> "stmfd sp!, {r4-r10, fp, ip, lr}\n\t"
> #if !defined(arm_HOST_ARCH_PRE_ARMv6)
> "vstmdb sp!, {d8-d11}\n\t"
> #endif
> [...]
> /*
> * restore callee-saves registers.
> */
> #if !defined(arm_HOST_ARCH_PRE_ARMv6)
> "vldmia sp!, {d8-d11}\n\t"
> #endif
> "ldmfd sp!, {r4-r10, fp, ip, lr}\n\t"
> : "=r" (r)
> : "r" (f), "r" (basereg), "i" (RESERVED_C_STACK_BYTES)
> : "%r4", "%r5", "%r6", "%r8", "%r9", "%r10", "%fp", "%ip", "%lr"
> );
>
>
> Now, r4-r10, fp, ip and lr are saved and restored, and yet we still need to
> mark them as clobbered in the clobber list? If this is really required, then
> why do we not also add r7 and d8-d11 (for ARMv7) to the clobber list?

Bearing in mind that I don't know ARM assembly, I think the problem is
that gcc is assigning basereg to one of these registers that is saved
and restored, because there is nothing telling it not to. If basereg is
assigned to %r4, say, then the instruction that restores %r4 at the end
of the inline fragment will clobber the basereg value that we are trying
to return from STG-land.

Maybe there's a nicer way to fix this, but my gcc inline asm is a bit
rusty...
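
To make the clobber-list point concrete, here is a minimal sketch. It is not the StgRun code: the function name add_one_preserving_reg and the choice of r4 (ARM) and rbx/r11 (x86-64) are assumptions made purely for illustration, and other targets fall back to plain C. The fragment saves and restores a register by hand, and still names it as clobbered so that gcc cannot place the output operand in it.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical illustration, not taken from the GHC RTS.
 * The asm saves and restores one register itself, yet still lists it in
 * the clobber list. Without the clobber, the compiler would be free to
 * allocate the output operand %0 to that same register, and the final
 * restore would overwrite the result.
 */
intptr_t add_one_preserving_reg(intptr_t in)
{
    intptr_t out;
#if defined(__GNUC__) && defined(__arm__) && \
    (!defined(__thumb__) || defined(__thumb2__))
    __asm__ volatile (
        "push {r4}\n\t"          /* save r4 by hand */
        "add  r4, %1, #1\n\t"    /* compute in + 1 in r4 */
        "mov  %0, r4\n\t"        /* copy the result to the output operand */
        "pop  {r4}\n\t"          /* restore r4; would destroy %0 if %0 were r4 */
        : "=r" (out)
        : "r" (in)
        : "r4", "memory"
    );
#elif defined(__GNUC__) && defined(__x86_64__) && !defined(__ILP32__)
    __asm__ volatile (
        "mov %%rbx, %%r11\n\t"   /* save rbx in a scratch register */
        "mov %1, %%rbx\n\t"      /* rbx = in */
        "add $1, %%rbx\n\t"      /* rbx = in + 1 */
        "mov %%rbx, %0\n\t"      /* copy the result to the output operand */
        "mov %%r11, %%rbx\n\t"   /* restore rbx; would destroy %0 if %0 were rbx */
        : "=r" (out)
        : "r" (in)
        : "rbx", "r11", "cc"
    );
#else
    /* Fallback for other compilers and targets: no inline asm. */
    out = in + 1;
#endif
    return out;
}
```

With the clobbers in place, add_one_preserving_reg(41) yields 42, which can be checked with assert. Dropping "r4" (or "rbx") from the clobber list makes the result depend on the register allocator's choice, which matches the kind of failure described above.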
Cheers,
Simon
> I'm referring here to
> http://ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html#ss5.3 --
> which explains GCC's clobber list. From this description I don't
> see why we need to put those regs into the clobber list, as we correctly
> restore them before returning to the C world. But if there really is some
> need to do so, then my engineering guess is that we also need to do
> this for d8-d11 and r7 too... I'm still quite lost in GCC's inline
> assembly support...
>
> Ben, what do you think about it?
>
> Thanks!
> Karel