difficult profiling example

Serge D. Mechveliani mechvel at botik.ru
Sat Dec 9 08:19:26 EST 2006


This is again on the  time profiling in ghc-6.6.
Who could, please, guess what is happening?

------------------------------------------------------------------------ 
main = putStr (showsList newProveds "\n")
  where
  ...
  ((_, newProveds), _) =
                     {-# SCC "main.searchLemmataInInitial" #-}
                     searchLemmataInInitial opts rc rcPerLemma initState

searchLemmataInInitial proverOpts rc rcPerLemma searchState =

  (let ((state', newProveds), rcRem) =
                 ----------        {-# SCC "searchLemmata" #-}
                                   searchLemmata ((searchState, []), rc)

       cost = lsisCost searchState    -- data field selection
   in
   ((state' {lsisCost = cost + rc - rcRem},  newProveds),  rcRem)
                                             ----------

   -- only  newProveds  is used in the test in `main'
  )
  where
  ...
  searchLemmata  extState@ ((searchState, newProveds), rc) =

    {-# SCC "searchLemmata 2" #-}
    (case splitCs
     of
     ... <complex definitions>
    )
    where
    ... <complex definitions>
--------------------------------------------------------------------

ghc-6.6  is built with  -prof -auto-all,
so that the ghc library functions are shown as the cost centers too
(I doubt of whether this can help me).

Making:    ghc $compilationOptions -prof --make Main         
Running    ./Main +RTS -p -M200m -RTS

(this is for  -Onot,  
because -O takes too long, and commonly produces a similar report for 
this kind of example).

In this test,          newProveds  occurs  [],  

and it is printed out as the only result.
My user's SCC centers are set only at the  three points  shown above.

The profiling shows 

----------------------------------------------------------------
 CAF                     Main                                                 2\60          89   0.0    0.0    99.9   99.9
...
  main.searchLemmataInInitial Main                                             \    347           1  24.1   26.8    99.8   99.8
...
   searchLemmata 2       Main                                                 3\49         114   1.3    1.3     4.2    3.6
...
   searchLemmata         Main                                                 3\48           1   0.0    0.0     0.0    0.0
...
------------------------------------------------------------

What I expected
---------------
1. "main.searchLemmataInInitial"  inherits 99% of time.
2. searchLemmata  is a loop inside  searchLemmataInInitial  
   which does everything, so it inherits 99% of time   
   (it has SCC too).

What is reported
----------------

6 lines for the SCC "main.searchLemmataInInitial"
-- while in reality it is called only once.
I do not know why they are so many, but let it be.
Exactly one of these lines is "non-zero":
  
  main.searchLemmataInInitial Main
     347           1  24.1   26.8    99.8   99.8

Its  individual time % is  24.1,  inherited  99.8.

A single line for  "searchLemmata"  (main loop).

   searchLemmata         Main 
348          1   0.0    0.0     0.0    0.0

-- entered 1 time from `main', 
(but the function  searchLemmata  calls itself many times -- see 
"searchLemmata 2").

A single line for  "searchLemmata 2"  
(its SCC is in the beginning of the definition of  searchLemmata):

   searchLemmata 2       Main
349        114   1.3    1.3     4.2    3.6
 
It is probable that it is entered  114  times.
But its inherited time  4.2 %  is un-probable, it must be near 100%.
Because, who has taken a great cost of
             time(searchLemmataInInitial) - time(searchLemmata)
?
This difference must be close to zero, because it is defined

  let ((..., newProveds), ...) = searchLemmata ((searchState, []), rc)
      ...
  in  ((..., newProveds), ...),

and only  newProveds  is used in the test in `main'.
What computation remains after  searchLemmata ?
Only extracting  snd . fst  from a pair of pairs.
The result is  [],  so `shows' and the output cost almost zero.
So, the profiling says actually that this  snd . fst  costs 99%. 
I do not understand this profiling, so far.

It also shows many GHC and Haskell library centers, like this:

----------------------------------------------------------
        total time  =      228.40 secs   (4568 ticks @ 50 ms)
        total alloc = 4,272,700,464 bytes  (excludes profiling overheads)

COST CENTRE                    MODULE               %time %alloc

main.searchLemmataInInitial    Main                  24.1   26.8
.                              GHC.Base               7.6    4.9
insert                         Data.Set               6.2    1.4
foldlStrict                    Data.Set               5.6    0.0
hedgeUnion                     Data.Set               5.4    6.3
map                            GHC.Base               5.2    6.5
trim                           Data.Set               4.6    0.0
...

and also about 6000 of expanded lines.
----------------------------------------------------------

And this all does not explain why the inherited time % for 
"searchLemmata 2"  is  4.2  instead of  99.

Probably, compiling GHC under  -auto-all  does not help my aim,
but neither it seems creating extra difficulties. 

Another curious point is, for example, that   Base.map   is said to 
take a considerable time when entered from  (.)  (composition).
In my user program,  this is applied as   compose $ map shows xs

for the need of showing a list -- probably, the only usage.
And it costs zero in this example, because nothing is shown except 
[]. 
Maybe, the GHC library uses it.

My program contains many places with  foldl,  which I am going, maybe,
to replace with  foldr.  But I do not think that this will change 
essentially the profiling for this particular example.

Who can, please, tell what I may be missing?

Thank you in advance,

-----------------
Serge Mechveliani
mechvel at botik.ru



More information about the Glasgow-haskell-users mailing list