ThreadScope Tour/Spark

From HaskellWiki

Latest revision as of 11:55, 9 December 2011

Sparking too fast

1 Objectives

Use spark events to diagnose unthrottled sparking.

Note: you will need GHC 7.4 or higher to generate eventlogs used by the spark events viewer. You can use our sample logs if you just want to play with the UI a bit.

2 Concepts

  • spark pool - each core is associated with a finite pool of sparks; if there is not enough space in the pool for new sparks, the overflow is just dropped
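The spark pool mechanics can be illustrated with a tiny program (a sketch of our own, not code from the tutorial): `par` puts a spark for its first argument into the current HEC's pool, where an idle HEC may pick it up and convert it.

```haskell
import GHC.Conc (par, pseq)

-- A minimal sketch (our own, not the tutorial's code): `a `par` b`
-- puts a spark for `a` into the current HEC's spark pool and goes on
-- to evaluate `b`. If the pool were full, the spark would simply be
-- dropped (overflow) and `a` would be evaluated by whichever thread
-- eventually demands it.
fib :: Int -> Int
fib n | n < 2     = n
      | otherwise = fib (n - 1) + fib (n - 2)

main :: IO ()
main =
  let a = fib 20
      b = fib 21
  in a `par` (b `pseq` print (a + b))
```

Run with `+RTS -N2` to give a second HEC a chance to convert the spark.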

3 Steps

  1. Download and build the parlist examples

     cd ThreadScope-Tour-examples
     make

    Note the need for GHC 7.4

  2. Run the parlist0 example, enabling collection of full spark events:

     ./parlist0 +RTS -N2 -lf
     threadscope parlist0.eventlog

    Notice here that we pass the f flag to the -l. This means that in addition to the events collected by default – (s)cheduler, (g)arbage collector, and s(p)ark events – we also want to collect the (f)ull details for par spark events. See the list of event flags for details.

    HINT: -lf not recognised? Your GHC is not recent enough

  3. Have a look at the activity profile. What problems do you see?

    parlist0 activity

    Some issues you may notice

    • We are only getting half the activity possible from our CPUs
    • Most of the work is done by HEC 1 after a while
    • There is also some 'stutter' and interruption for garbage collection (orange)

    We now know what some of the problems are, e.g. only one CPU doing any work. Now how do we figure out why we have these problems?

  4. Enable the spark creation and conversion traces

    tick the spark creation/conversion boxes

  5. Scroll down to view these traces (shrink the bottom drawer if you need more room). What do you see?

    HINT: Remember that the spark creation/conversion graphs represent rates (sparks per ms) and not quantity

    spark creation/conversion

    The creation charts show that a large proportion of attempted sparks overflow. They are not created because the spark pools are full. (It may be useful to have a look at the spark lifecycle if you have not done so already).

    Spark-lifecycle.png
  6. Scroll down a bit further to the spark pool. Notice how the spark pool usage for HEC 1 shoots up until full and gradually declines as the pool is depleted.

    spark pool

    Compare this with the spark creation chart. The red parts (overflow) roughly coincide with where the spark pool is full. The pool stays full as any spark which is converted away is quickly replaced by new ones coming in (the bits of green). Only when we stop trying to create sparks does the pool begin to drain.

    HINT: Remember that whereas the creation/conversion graphs represent rates (sparks per ms), the spark pool graph tracks quantities
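The relationship between the two kinds of graph is just a running sum. A sketch with a hypothetical helper (our own, not part of ThreadScope) makes it concrete:

```haskell
-- Hypothetical helper (not part of ThreadScope): the pool *quantity* is
-- the running sum of per-interval creation rate minus conversion rate,
-- clamped at zero because the pool cannot hold a negative number of sparks.
poolLevel :: [Int] -> [Int] -> [Int]
poolLevel created converted =
  drop 1 (scanl step 0 (zipWith (-) created converted))
  where
    step pool delta = max 0 (pool + delta)

main :: IO ()
main = print (poolLevel [5, 5, 0, 0] [1, 2, 3, 4])
```

Creation outpacing conversion fills the pool; once creation stops, the level drains back toward zero, just as in the charts.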

  7. We can see that we're getting overflow: we are generating sparks far faster than we can convert them, or, put another way, we aren't converting sparks fast enough.

    Have a look at the source code. Do you see anything that suggests excessive sparking?

    main =
      let xs = [ fib (5 + n `mod` 10) | n <- [0..100000] ]
                 `using` parListWHNF
       in print (sum xs)
    
    -- HINT: look here!
    parListWHNF :: Strategy [a] 
    parListWHNF xs = go xs `pseq` return xs
      where
        go []     = []
        go (y:ys) = y `par` go ys

    Study the parListWHNF function; it sparks the entire list off in parallel, all 100 thousand Fibonacci computations. Maybe we need to generate sparks a little less aggressively.
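One way to spark less aggressively (a sketch under names of our own; the parallel package's parListChunk strategy implements this idea properly) is to spark whole chunks rather than individual elements:

```haskell
import GHC.Conc (par)

-- Sketch only (our names, not the tutorial's code): spark one sum per
-- chunk, so a 100000-element list creates 100000/n sparks instead of
-- 100000, putting far less pressure on the spark pools.
parChunkSum :: Int -> [Int] -> Int
parChunkSum n = sum . go . chunks
  where
    go []     = []
    go (c:cs) = let s = sum c in s `par` (s : go cs)
    chunks [] = []
    chunks ys = let (a, b) = splitAt n ys in a : chunks b

main :: IO ()
main = print (parChunkSum 1000 [1 .. 10000 :: Int])
```

With a chunk size of 1000, the 10000-element list above creates only 10 sparks.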

  8. Move on to the next section to see what happens when we throttle spark creation with the rolling buffer trick.

4 Hints and Tips

  1. Not enough vertical space? Try shrinking the bottom drawer or just disable 'Views > Information Pane'
  2. Remember that the spark creation/conversion graphs represent rates (sparks per ms) and not quantity
  3. If you want to study just one of the cores, it may help to disable the traces for the other cores so that graphs are brought closer together
    ThreadScope-focus-on-HEC.png