<div dir="ltr"><div class="gmail_extra">This is great. I have wanted this for a long time.</div><div class="gmail_extra"><br></div><div class="gmail_extra">Joachim, could you write a wiki page with step-by-step instructions for how to set this up, detailed enough that e.g. one of our infrastructure volunteers could set it up on another machine?</div>


<div class="gmail_extra"><br></div><div class="gmail_extra">Haskell infrastructure people, do we have a (e.g. Hetzner) machine that we can run this on?</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">


On Wed, Jul 16, 2014 at 10:02 AM, Joachim Breitner <span dir="ltr"><<a href="mailto:mail@joachim-breitner.de" target="_blank">mail@joachim-breitner.de</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


Hi,<br>

<br>

I guess it’s time to talk about this, especially as Richard just brought<br>

it up again...<br>

<br>

I felt that we were seriously lacking in our grip on performance issues.<br>

We don’t even know whether 7.8.3 was better or worse than 6.8.3 or 7.6.4<br>

in terms of nofib, not to speak of the effect of each single commit.<br>

<br>

I want to change that, so I set up a benchmark monitoring dashboard. You<br>

can currently reach it at:<br>

<br>

                  <a href="http://ghcspeed-nomeata.rhcloud.com/" target="_blank">http://ghcspeed-nomeata.rhcloud.com/</a><br>

<br>

What does it do?<br>

~~~~~~~~~~~~~~~~<br>

<br>

It monitors the repository (master branch only) and builds each commit,<br>

complete with the test suite and nofib. The log is saved and analyzed,<br>

and some numbers are extracted:<br>

 * The build time<br>

 * The test suite summary numbers<br>

 * Runtime (if >1s), allocations and binary sizes of the nofib<br>

   benchmarks<br>

<br>

These are uploaded to the website above, which is powered by codespeed,<br>

a general performance dashboard, implemented in Python using Django.<br>

<br>

Under _Changes_, it provides a report for each commit (changes with respect to<br>

the previous version, and with respect to 10 revisions earlier, the so-called<br>

“trend”). A summary of these reports is visible on the front-page.<br>
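<br>
To make these two numbers concrete, here is a minimal Python sketch of how<br>
such a report could be computed from a list of measurements. This is not<br>
codespeed’s actual code: the function names are made up for illustration,<br>
and comparing against the single value 10 revisions back is an assumption<br>
(codespeed may well compute its trend differently).<br>

```python
from typing import List, Optional, Tuple


def relative_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent (old must be non-zero)."""
    return (new - old) * 100.0 / old


def change_and_trend(
    values: List[float], depth: int = 10
) -> Tuple[Optional[float], Optional[float]]:
    """Return (change, trend) for the newest entry of `values`.

    `values` holds one benchmark's measurements, ordered oldest to newest.
    change: relative to the previous revision.
    trend:  relative to the revision `depth` steps earlier (assumed meaning).
    Either is None when there is not enough history.
    """
    current = values[-1]
    change = relative_change(values[-2], current) if len(values) >= 2 else None
    trend = (
        relative_change(values[-1 - depth], current)
        if len(values) > depth
        else None
    )
    return change, trend
```

A slow drift of 1% per commit hardly shows up in the change column, but it<br>
adds up in the trend column, which is why having both is useful.<br>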

<br>

The _Timeline_ is a graph for each individual performance number. If<br>

there are bumps, you can hopefully find them there! You can also compare<br>

to 7.8.3, which is available as a “baseline”.<br>

<br>

_Comparison_ will be more useful if we have more tagged revisions, or if<br>

we were benchmarking various options (e.g. -fllvm): Here you can do<br>

bar-chart comparisons.<br>

<br>

Why codespeed?<br>

~~~~~~~~~~~~~~<br>

<br>

For a long time I searched for a suitable software product. My criteria<br>

were that it should be open source, rather simple to set up and<br>

mostly decoupled from other tools, i.e. something that I throw numbers<br>

at and which then displays them nicely. While I don’t think codespeed is<br>

the best performance dashboard out there (I find<br>

<a href="http://goperfd.appspot.com/perf" target="_blank">http://goperfd.appspot.com/perf</a> a bit better; I wonder how well<br>

codespeed scales to even larger numbers of benchmarks and I wish it were<br>

more git-aware), it was the easiest to get started with. And thanks to<br>

the loose coupling of (1) running the tests to acquire a log, (2)<br>

parsing the log to get numbers and (3) putting them on a server, we can<br>

hopefully replace it when we come across something better. I was hoping<br>

for the Phabricator guys to have something in their tool suite, but it<br>

doesn’t look like it.<br>

<br>

How does it work (currently)?<br>

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>

<br>

My office PC is underused (I work on my laptop), so it’s currently<br>

dedicated to it. I have a simple shell script that monitors the repo for<br>

new versions. It builds the newest revision and works itself back to the<br>

commit where everything was turned into submodules:<br>

<a href="https://github.com/nomeata/codespeed/blob/ghc/tools/ghc/watch.sh" target="_blank">https://github.com/nomeata/codespeed/blob/ghc/tools/ghc/watch.sh</a><br>
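<br>
The scheduling idea (newest first, then backwards through history) can be<br>
sketched in a few lines of Python. This is only an illustration of the<br>
ordering, not a transcription of watch.sh; the function name and its<br>
arguments are hypothetical.<br>

```python
from typing import List, Optional, Set


def next_commit_to_build(
    history: List[str], built: Set[str], oldest: str
) -> Optional[str]:
    """Pick the next commit to benchmark.

    history: commit ids on master, newest first (the order in which
             `git rev-list master` prints them).
    built:   commit ids that already have a log.
    oldest:  the commit at which to stop walking backwards.
    """
    for commit in history:
        if commit not in built:
            return commit  # newest commit that has not been built yet
        if commit == oldest:
            break  # everything down to the cut-off has been built
    return None
```

With this ordering a freshly pushed commit always takes priority, and the<br>
backlog of older commits is only worked on while nothing new has arrived.<br>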

<br>

It calls a script that does the actual building:<br>

<a href="https://github.com/nomeata/codespeed/blob/ghc/tools/ghc/run-speed.sh" target="_blank">https://github.com/nomeata/codespeed/blob/ghc/tools/ghc/run-speed.sh</a><br>

This produces a log file which should contain all the required numbers<br>

somewhere.<br>

<br>

A second script extracts these numbers (with help of nofib-analyze) and<br>

converts them into codespeed-compatible JSON files:<br>

<a href="https://github.com/nomeata/codespeed/blob/ghc/tools/ghc/log2json.pl" target="_blank">https://github.com/nomeata/codespeed/blob/ghc/tools/ghc/log2json.pl</a><br>
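<br>
For readers who have not seen codespeed’s input format, here is a rough<br>
Python sketch of the conversion step. The field names are the ones I<br>
understand codespeed’s result API to expect (please check them against the<br>
codespeed documentation); the project, executable and environment values<br>
and the benchmark names in the usage note are placeholders, not the ones<br>
log2json.pl uses.<br>

```python
import json
from typing import Dict, List


def to_codespeed_results(
    commit_id: str,
    numbers: Dict[str, float],
    environment: str = "benchmark-box",  # placeholder environment name
) -> List[dict]:
    """Turn {benchmark name: value} into a list of codespeed result records."""
    results = []
    for benchmark, value in sorted(numbers.items()):
        results.append({
            "commitid": commit_id,
            "project": "GHC",        # placeholder project name
            "branch": "master",
            "executable": "ghc",     # placeholder executable name
            "environment": environment,
            "benchmark": benchmark,
            "result_value": value,
        })
    return results


def to_json(results: List[dict]) -> str:
    """Serialize the records the way they would be written to a JSON file."""
    return json.dumps(results, indent=2)
```

Calling to_codespeed_results("abc123", {"build/time": 3600.0}) yields one<br>
record per number, so adding a benchmark only means adding a key.<br>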

<br>

Finally, a simple invocation to curl uploads them to codespeed:<br>

<a href="https://github.com/nomeata/codespeed/blob/ghc/tools/ghc/upload.sh" target="_blank">https://github.com/nomeata/codespeed/blob/ghc/tools/ghc/upload.sh</a><br>
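<br>
The same upload can be expressed in Python. This sketch assumes that the<br>
codespeed instance accepts a form-encoded POST with a single field named<br>
json at the path result/add/json/, which is how I understand codespeed’s<br>
sample scripts to work; it is not a translation of upload.sh, and the<br>
function name is made up.<br>

```python
import json
import urllib.parse
import urllib.request
from typing import List


def build_upload_request(base_url: str, results: List[dict]) -> urllib.request.Request:
    """Build (but do not send) the POST request that uploads `results`."""
    # codespeed is assumed to read the records from a form field called "json".
    body = urllib.parse.urlencode({"json": json.dumps(results)}).encode("utf-8")
    url = base_url.rstrip("/") + "/result/add/json/"
    # Supplying `data` makes urllib issue a POST instead of a GET.
    return urllib.request.Request(url, data=body)
```

Passing the returned request to urllib.request.urlopen would perform the<br>
actual upload; keeping the two steps apart makes the request easy to<br>
inspect without a running server.<br>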

<br>

So if you want additional benchmarks to be tracked, make sure they are<br>

present in the logs and adjust log2json.pl. codespeed will automatically<br>

pick up new benchmarks in these logs. Reimplementations in Haskell are<br>

also welcome :-)<br>

<br>

The testsuite is run with VERBOSE=4, so the performance numbers are also<br>

shown for failing test cases. So once a test case goes over the limit,<br>

you can grep through previous logs to try to find the real culprit. I<br>

uploaded the logs (so far) to <a href="https://github.com/nomeata/ghc-speed-logs" target="_blank">https://github.com/nomeata/ghc-speed-logs</a><br>

(but this is not automated yet, ping me if you need an update on this).<br>

<br>

What next?<br>

~~~~~~~~~~<br>

<br>

Clearly, the current setup is only good enough to evaluate the system.<br>

Eventually, I might want to use my office PC again, and the free hosting<br>

on openshift is not very powerful.<br>

<br>

So if we want to keep this setup and make it “official”, we need to find a<br>

permanent solution.¹ This involves:<br>

<br>

 * A dedicated machine to run the benchmarks. This probably shouldn’t be<br>

   a VM, if we want to keep the noise in the runtime down.<br>

 * A machine to run the codespeed server. Can be a VM, or even run on<br>

   any of the systems that we have right now. Just needs a database<br>

   (postgresql preferably) and a webserver supporting WSGI (i.e. any<br>

   of them).<br>

 * Maybe a better place to store the logs for public consumption.<br>

<br>

Also, there are ways to improve the system:<br>

<br>

 * As I said, I don’t think codespeed is the best. If we find something<br>

   better, we can replace it. Since we have all the logs, we can easily<br>

   fill the new system with the data, or even run both at the same time.<br>

 * We might want to have more numbers. I am already putting<br>

   lines-of-code and disk space usage numbers into the logs, but do not<br>

   parse them yet.<br>

 * In particular, we might want to put in each performance test case as<br>

   a benchmark of its own, to more easily find commits that degrade (or<br>

   improve!) performance. I’m not sure how well the web page will handle<br>

   that.<br>

 * We might want to replace my rather simple watch.sh-script by<br>

   something more serious. In particular, I imagine that our builder<br>

   setup could manage this, with a dedicated builder doing the<br>

   benchmark runs and the builder server scheduling a build for each<br>

   commit.<br>

<br>

<br>

That’s it for now. Enjoy clicking around!<br>

<br>

Greetings,<br>

Joachim<br>

<br>

¹ I guess that could be considered beta-reduction :-)<br>

<span class="HOEnZb"><font color="#888888"><br>

<br>

<br>

--<br>

Joachim Breitner<br>

  e-Mail: <a href="mailto:mail@joachim-breitner.de">mail@joachim-breitner.de</a><br>

  Homepage: <a href="http://www.joachim-breitner.de" target="_blank">http://www.joachim-breitner.de</a><br>

  Jabber-ID: <a href="mailto:nomeata@joachim-breitner.de">nomeata@joachim-breitner.de</a><br>

<br>

</font></span><br>_______________________________________________<br>

ghc-devs mailing list<br>

<a href="mailto:ghc-devs@haskell.org">ghc-devs@haskell.org</a><br>

<a href="http://www.haskell.org/mailman/listinfo/ghc-devs" target="_blank">http://www.haskell.org/mailman/listinfo/ghc-devs</a><br>

<br></blockquote></div><br></div>