Stack manipulation in LLVM backend

Max Bolingbroke batterseapower at hotmail.com
Fri Apr 2 11:02:49 EDT 2010


Hi,

I've narrowed down one of the problems with the LLVM backend that
David descried in his thesis. LLVM sometimes generates gratuitous
extra stack manipulations.

For example, consider this input program:

{{{
module Toy where

toy :: (Int -> Maybe Int) -> Int
toy f = case f 10 of Just x -> x; Nothing -> 20
}}}

Compiled using ghc -O (which runs llvl's "opt -O2"), we get the
following LLVM IR:

{{{
define cc10 void @Toy_toy_entry(i32 %stg_terei_baseArg, i32
%stg_terei_spArg, i32 %stg_terei_hpArg, i32 %stg_terei_r1Arg) nounwind
{
cdx:
  %ndz = add i32 %stg_terei_spArg, -4             ; <i32> [#uses=3]
  %ndB = add i32 %stg_terei_baseArg, 84           ; <i32> [#uses=1]
  %ndC = inttoptr i32 %ndB to i32*                ; <i32*> [#uses=1]
  %ndD = load i32* %ndC                           ; <i32> [#uses=1]
  %ndE = icmp ult i32 %ndz, %ndD                  ; <i1> [#uses=1]
  br i1 %ndE, label %cdG, label %ndH

ndH:                                              ; preds = %cdx
  %ndJ = inttoptr i32 %stg_terei_spArg to i32*    ; <i32*> [#uses=2]
  %ndK = load i32* %ndJ                           ; <i32> [#uses=1]
  %ndP = inttoptr i32 %ndz to i32*                ; <i32*> [#uses=1]
  store i32 add (i32 ptrtoint (%Toy_toy1_closure_struct*
@Toy_toy2_closure to i32), i32 1), i32* %ndP
  store i32 ptrtoint (%sbo_info_struct* @sbo_info to i32), i32* %ndJ
  tail call cc10 void @stg_ap_p_fast(i32 %stg_terei_baseArg, i32 %ndz,
i32 %stg_terei_hpArg, i32 %ndK) nounwind
  ret void

cdG:                                              ; preds = %cdx
  %ne1 = add i32 %stg_terei_baseArg, -4           ; <i32> [#uses=1]
  %ne2 = inttoptr i32 %ne1 to i32*                ; <i32*> [#uses=1]
  %ne3 = load i32* %ne2                           ; <i32> [#uses=1]
  %ne4 = inttoptr i32 %ne3 to void (i32, i32, i32, i32)* ; <void (i32,
i32, i32, i32)*> [#uses=1]
  tail call cc10 void %ne4(i32 %stg_terei_baseArg, i32
%stg_terei_spArg, i32 %stg_terei_hpArg, i32 ptrtoint
(%Toy_toy_closure_struct* @Toy_toy_closure to i32)) nounwind
  ret void
}
}}}

So far so good. However, when we compile this with "llc -tailcallopt
-O2 -march=x86", we get:

{{{
_Toy_toy_entry:                         ## @Toy_toy_entry
## BB#0:                                ## %cdx
	subl	$12, %esp
	leal	-4(%ebp), %eax
	cmpl	84(%ebx), %eax
	jb	LBB2_2
## BB#1:                                ## %ndH
	movl	(%ebp), %esi
	movl	$_Toy_toy2_closure+1, -4(%ebp)
	movl	$_sbo_info, (%ebp)
	movl	%eax, %ebp
	addl	$12, %esp
	jmp	_stg_ap_p_fast  # TAILCALL
LBB2_2:                                 ## %cdG
	movl	-4(%ebx), %eax
	movl	$_Toy_toy_closure, %esi
	addl	$12, %esp
	jmpl	*%eax  # TAILCALL

	.globl	___stginit_Toy_
	.align	4, 0x90
}}}

Note the "subl	$12, %esp" and "addl	$12, %esp" pairs. %esp is never
actually used, so this fiddling is entirely pointless! This seems to
happen in every single function LLVM compiles, so I imagine fixing the
problem would be quite a big win.

Without -tailcallopt, we get nice code:

{{{
_Toy_toy_entry:                         ## @Toy_toy_entry
## BB#0:                                ## %cdx
	leal	-4(%ebp), %eax
	cmpl	84(%ebx), %eax
	jb	LBB2_2
## BB#1:                                ## %ndH
	movl	(%ebp), %esi
	movl	$_Toy_toy2_closure+1, -4(%ebp)
	movl	$_sbo_info, (%ebp)
	movl	%eax, %ebp
	jmp	_stg_ap_p_fast  # TAILCALL
LBB2_2:                                 ## %cdG
	movl	-4(%ebx), %eax
	movl	$_Toy_toy_closure, %esi
	jmpl	*%eax  # TAILCALL
}}}

Unfortunately -tailcallopt is apparently the only way to get
*guaranteed* tail calls. The LLVM developers appear to know that
tailcallopt causes this sort of rubbish in the output
(http://old.nabble.com/Removing--tailcallopt--td27475523.html) but
obviously haven't fixed it (my results are with the very recent LLVM
r100183).

I couldn't get the stack manipulations to go away with any amount of
"noreturn" and "unreachable" annotations around "tail call"s and
function definitions. It was worth a try though :-)

In summary, it looks like anyone wanting to fix the excess stack
manipulations issue is going to have to get their hands messy and
delve into LLVM's LLC!

Cheers,
Max



More information about the Cvs-ghc mailing list