[Haskell-cafe] Odd profiling results

Erik de Castro Lopo mle+hs at mega-nerd.com
Thu Jan 6 07:31:01 CET 2011


Simon Marlow wrote:

> Or the C compiler, perhaps?

Thanks for the suggesion Simon. This one was actually easier
to test that Malcolm's suggestion.

To test it, I ran the same test under Oprofile [0], the Linux
kernel based profiler. OProfile is really cool because it
profiles the whole system, from user space all the way down to
the most insignificant kernel driver (eg allows you to see if
a kernel driver is causing your user space code to do something
weird).

My first run showed the GHC RTS profiling code being one of the
biggest runtime costs. After recompiling without profiling the
biggest CPU hogs were as follows (editied slightly for clarity
and severely truncated for brevity):

  samples  %      image name    app name      symbol name
  ---------------------------------------------------------------------
  128328  7.7305  ddc           ddc           evacuate
  97663   5.8832  ddc           ddc           s1PI_info
  91177   5.4925  ddc           ddc           stg_ap_pp_fast
  72926   4.3931  ddc           ddc           s1PV_info
  70650   4.2559  ddc           ddc           s1PJ_info
  62956   3.7925  ddc           ddc           s19S_info
  51240   3.0867  ddc           ddc           stg_ap_p_fast
  50197   3.0239  ddc           ddc           stg_upd_frame_info
  47797   2.8793  ddc           ddc           stg_PAP_apply
  39764   2.3954  ddc           ddc           sNr_info
  39351   2.3705  ddc           ddc           stg_ap_p_info
  38339   2.3095  ddc           ddc           stg_ap_pp_info
  30283   1.8242  cc1           cc1           /usr/lib/gcc/x86_64-linux-gnu/4.4/cc1
  29733   1.7911  ddc           ddc           scavenge_block
  27062   1.6302  ddc           ddc           base_GHCziBase_zdfOrdChar3_info
  23620   1.4229  ddc           ddc           s1PK_info
  21671   1.3055  ddc           ddc           s19R_info
  19043   1.1471  ddc           ddc           base_GHCziClasses_DZCOrd_con_info
  18090   1.0897  ddc           ddc           sAD_info
  16500   0.9940  ddc           ddc           sAE_info
  14978   0.9023  ddc           ddc           sxb_info
  13862   0.8350  ddc           ddc           stg_ap_0_fast
  12975   0.7816  ddc           ddc           stg_IND_STATIC_info
  12553   0.7562  ddc           ddc           base_GHCziClasses_DZCOrd_static_info

The C compiler is down there at %1.82 of total run time and definitely
not the culprit.

I've been discussing this with Ben as well and on the basis of this
plot that Ben captured:

    http://deluge.ouroborus.net/~benl/tmp/ddc-comp.pdf

we ought to be looking at scrapeRecursive.

Cheers,
Erik

[0] http://oprofile.sourceforge.net/
-- 
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/



More information about the Haskell-Cafe mailing list