How to use C-land variable from Cmm-land?

Mon Dec 10 11:58:05 CET 2012

On 08/12/12 23:12, Yuras Shumovich wrote:
> Hi,
>
> I'm working on that issue as an exercise/playground while studying the
> GHC internals: http://hackage.haskell.org/trac/ghc/ticket/693

It's not at all clear that we want to do this.  Perhaps you'll be able 
to put the question to rest and close the ticket!

> First I tried just to replace "ccall lockClosure(mvar "ptr")" with
> GET_INFO(mvar) in stg_takeMVarzh and stg_putMVarzh and got 60% speedup
> (see the test case at the end.)
>
> Then I changed lockClosure to read the header info directly when
> enabled_capabilities == 1. The speedup was significantly lower: under 20%.
>
> I tried to hack stg_putMVarzh directly:
>
>      if (enabled_capabilities == 1) {
>          info = GET_INFO(mvar);
>      } else {
>          ("ptr" info) = ccall lockClosure(mvar "ptr");
>      }

You should use n_capabilities, not enabled_capabilities.  The latter 
might be 1, even when there are multiple capabilities actually in use, 
while the RTS is in the process of migrating threads.

> But got no speedup at all.
> The generated asm (amd64):
>
>          movl $enabled_capabilities,%eax
>          cmpq $1,%rax
>          je .Lcgq
> .Lcgp:
>          movq %rbx,%rdi
>          subq $8,%rsp
>          movl $0,%eax
>          call lockClosure
>          addq $8,%rsp
> .Lcgr:
>          cmpq $stg_MVAR_CLEAN_info,%rax
>          jne .Lcgu
> {...}
> .Lcgq:
>          movq (%rbx),%rax
>          jmp .Lcgr
>
>
> It moves enabled_capabilities into %eax and then compares 1 with %rax.
> That looks wrong to me: the upper half of %rax remains uninitialized.

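As Axel noted, this is correct: on amd64 a 32-bit move into %eax 
zero-extends into %rax, so the upper half is not left uninitialized.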
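
For reference, a minimal C sketch of what the quoted instruction pair does. It models the register behaviour with integer types rather than executing the instructions, and every name in it (capabilities, write_low32, address_is_one, value_is_one, check_models) is hypothetical. Two things are modelled: a 32-bit write clears the upper half of the 64-bit register, and in AT&T syntax an operand written as $symbol is the symbol's address, not the value stored there.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an RTS global such as enabled_capabilities. */
static unsigned int capabilities = 1;

/* Models "movl $imm32,%eax": the low 32 bits are replaced and the upper
   32 bits of the 64-bit register are cleared, whatever they held before. */
static uint64_t write_low32(uint64_t old_rax, uint32_t value)
{
    (void)old_rax;              /* previous contents do not survive */
    return (uint64_t)value;     /* unsigned widening zero-extends */
}

/* Models "movl $capabilities,%eax; cmpq $1,%rax": the operand is the
   symbol's address, so the variable's contents are never read. */
static int address_is_one(void)
{
    return (uintptr_t)&capabilities == (uintptr_t)1;
}

/* Models a load from the symbol followed by the same comparison. */
static int value_is_one(void)
{
    return capabilities == 1;
}

/* Small self-check tying the three helpers together. */
static void check_models(void)
{
    assert(write_low32(0xdeadbeef00000000ULL, 1u) == 1);
    assert(!address_is_one());
    assert(value_is_one());
}
```

Read this way, the branch to .Lcgq in the quoted listing is never taken, which would be consistent with the reported lack of speedup, though that inference is not stated in the message itself.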

Cheers,
	Simon