<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body text="#000000" bgcolor="#ffffff">
Hi,<br>
<br>
While trying to work out why some of my code is very slow, I found
that the slowdown is related to integer division.<br>
Digging a bit deeper, I came up with a small example that shows some
unexpected magic, and also a lack of the magic I did expect.<br>
Before raising any tickets in Trac I would like to check with you
what I am seeing. Maybe I misunderstand the way GHC is
supposed to work.<br>
<br>
-------------------<br>
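<blockquote>{-# LANGUAGE BangPatterns, MagicHash #-}<br>
module Test where<br>
<br>
import Data.Int<br>
import GHC.Exts<br>
import GHC.Prim<br>
<br>
-- Ordinary version on boxed Int.<br>
foo :: Int -> Int<br>
foo a =<br>
&nbsp;&nbsp;let<br>
&nbsp;&nbsp;&nbsp;&nbsp;b = a `quot` 1111<br>
&nbsp;&nbsp;&nbsp;&nbsp;c = b `quot` 1113<br>
&nbsp;&nbsp;&nbsp;&nbsp;d = c `quot` 1117<br>
&nbsp;&nbsp;in d<br>
<br>
-- Same computation on unboxed Int# values.<br>
bar :: Int -> Int<br>
bar !a' =<br>
&nbsp;&nbsp;let<br>
&nbsp;&nbsp;&nbsp;&nbsp;!(I# a) = a'<br>
&nbsp;&nbsp;&nbsp;&nbsp;!(b) = quotInt# a 1111#<br>
&nbsp;&nbsp;&nbsp;&nbsp;!(c) = quotInt# b 1113#<br>
&nbsp;&nbsp;&nbsp;&nbsp;!(d) = quotInt# c 1117#<br>
&nbsp;&nbsp;in I# d<br>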
</blockquote>
-------------------<br>
<br>
Here 'foo' is a function written in the usual way, and 'bar' is
essentially the same function written in a low-level style with
unboxed primitives.<br>
* My understanding is that the two functions are equivalent in what
they compute. The only difference should be in the code that is generated.<br>
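<br>
A quick spot check of that understanding, as a minimal sketch: it
restates both functions compactly and compares them on a handful of
inputs. The list 'samples' is my own choice of values, and agreement
on it is of course not a proof of equivalence.<br>
-------------------<br>
<blockquote>{-# LANGUAGE MagicHash #-}<br>
import GHC.Exts (Int (I#), quotInt#)<br>
<br>
-- Compact restatement of 'foo'.<br>
foo :: Int -> Int<br>
foo a = ((a `quot` 1111) `quot` 1113) `quot` 1117<br>
<br>
-- Compact restatement of 'bar', using the unboxed primitive directly.<br>
bar :: Int -> Int<br>
bar (I# a) = I# (quotInt# (quotInt# (quotInt# a 1111#) 1113#) 1117#)<br>
<br>
-- Hypothetical sample inputs, including both extremes of Int.<br>
samples :: [Int]<br>
samples = [minBound, -1, 0, 1, 1111 * 1113 * 1117, maxBound]<br>
<br>
-- Prints True when both functions agree on every sample.<br>
main :: IO ()<br>
main = print (all (\x -> foo x == bar x) samples)<br>
</blockquote>
-------------------<br>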
<br>
The unexpected magic is in the Core dump:<br>
-------------------<br>
<blockquote>Test.$wfoo =<br>
\ (ww_sxw :: GHC.Prim.Int#) -><br>
case ww_sxw of wild1_ax0 {<br>
__DEFAULT -><br>
case GHC.Prim.quotInt# wild1_ax0 1111 of wild2_Xxc {<br>
__DEFAULT -><br>
case GHC.Prim.quotInt# wild2_Xxc 1113 of wild3_Xxt {<br>
__DEFAULT -> GHC.Prim.quotInt# wild3_Xxt 1117;<br>
(-9223372036854775808) -> (-8257271295304186)<br>
};<br>
(-9223372036854775808) -> (-7418931981405)<br>
};<br>
(-9223372036854775808) -> (-6677706553)<br>
}<br>
<br>
Test.bar =<br>
\ (a'_ah5 :: GHC.Types.Int) -><br>
case a'_ah5 of _ { GHC.Types.I# ipv_ste -><br>
GHC.Types.I#<br>
(GHC.Prim.quotInt#<br>
(GHC.Prim.quotInt# (GHC.Prim.quotInt# ipv_ste 1111) 1113)
1117)<br>
}<br>
</blockquote>
-------------------<br>
Question 1: what is the meaning of the magic numbers
-9223372036854775808, -6677706553, -7418931981405 and
-8257271295304186?<br>
Question 2: under which circumstances will those strange branches of
execution be taken, and what would their results mean?<br>
Question 3: why is the Core for 'foo' so different from the Core for 'bar'?<br>
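<br>
One observation that may narrow down Questions 1 and 2, assuming Int
is 64 bits wide (as the assembler output suggests): the first number is
minBound :: Int, and the other three match what is left of the
division chain when it is applied to minBound. The sketch below
recomputes them; the names 'minInt' and 'constants' are only for
illustration.<br>
-------------------<br>
<blockquote>minInt :: Int<br>
minInt = minBound<br>
<br>
-- The remaining divisions at each stage, starting from minBound.<br>
constants :: [Int]<br>
constants =<br>
&nbsp;&nbsp;[ minInt<br>
&nbsp;&nbsp;, minInt `quot` 1117<br>
&nbsp;&nbsp;, (minInt `quot` 1113) `quot` 1117<br>
&nbsp;&nbsp;, ((minInt `quot` 1111) `quot` 1113) `quot` 1117<br>
&nbsp;&nbsp;]<br>
<br>
main :: IO ()<br>
main = mapM_ print constants<br>
</blockquote>
-------------------<br>
With a 64-bit Int this prints -9223372036854775808,
-8257271295304186, -7418931981405 and -6677706553, which are exactly
the literals in the Core above. It does not explain why the branches
are generated in the first place.<br>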
<br>
The lack of expected magic is in the assembler code:<br>
-------------------<br>
<blockquote> addq $16,%r12<br>
cmpq 144(%r13),%r12<br>
ja .Lcz1<br>
movl $1117,%ecx<br>
movl $1113,%r10d<br>
movl $1111,%r11d<br>
movq 7(%rbx),%rax<br>
cqto<br>
idivq %r11<br>
cqto<br>
idivq %r10<br>
cqto<br>
idivq %rcx<br>
movq $ghczmprim_GHCziTypes_Izh_con_info,-8(%r12)<br>
movq %rax,0(%r12)<br>
leaq -7(%r12),%rbx<br>
addq $8,%rbp<br>
jmp *0(%rbp)<br>
</blockquote>
-------------------<br>
Question: since the divisors are compile-time constants, can't GHC
use a cheap multiplication and shift instead of an expensive division
here? I know that such an optimisation is implemented, at least to
some extent, for C--. I suppose it also won't do anything smart for
expressions like a*4 or a/4, for the same reason.<br>
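<br>
To illustrate what I mean by multiplication and shift, here is a
minimal sketch for the divisor 1111. It is only valid for inputs
between 0 and 2^31 - 1, so that the product fits into a Word64; the
names 'magic' and 'quot1111', the shift amount 42 and the sample list
are my own assumptions for the example. A real implementation for the
full signed 64-bit range would need a wider multiplication and a sign
fix-up.<br>
-------------------<br>
<blockquote>import Data.Bits (shiftR)<br>
import Data.Word (Word64)<br>
<br>
-- ceiling (2^42 / 1111), computed with integer arithmetic.<br>
magic :: Word64<br>
magic = (2 ^ (42 :: Int) + 1110) `div` 1111<br>
<br>
-- Division by 1111 as one multiplication and one shift.<br>
-- Only valid for n between 0 and 2^31 - 1.<br>
quot1111 :: Word64 -> Word64<br>
quot1111 n = (n * magic) `shiftR` 42<br>
<br>
-- Hypothetical sample inputs: a small range plus the largest allowed value.<br>
samples :: [Word64]<br>
samples = [0 .. 100000] ++ [2 ^ (31 :: Int) - 1]<br>
<br>
-- Prints True when the trick agrees with `quot` on every sample.<br>
main :: IO ()<br>
main = print (all (\n -> quot1111 n == n `quot` 1111) samples)<br>
</blockquote>
-------------------<br>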
<br>
<br>
With kind regards,<br>
Denys Rtveliashvili
</body>
</html>