<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Sat, Apr 12, 2014 at 11:22 PM, Miro Karpis <span dir="ltr"><<a href="mailto:miroslav.karpis@gmail.com" target="_blank">miroslav.karpis@gmail.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi,<div>I'm trying to put together a small benchmark for warp and scotty (later with a JSON/non-JSON text performance test). My client is a Qt C++ application. I wrote minimal code in both Haskell and C++. The problem is the numbers I'm getting.</div>

</div></blockquote><div><br></div><div>If you're not running your Haskell program with "+RTS -A4M", please do so. The "4M" should correspond to the size of your L3 cache, so on a newer chip it can be even larger. The default allocation area of 512k is too small for most processors in use and forces the runtime into garbage collection before the L3 cache is even consumed. In my benchmarks this flag alone can give you a remarkable improvement.</div>
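<div><br></div><div>A minimal sketch of how to see the flag's effect, assuming a reasonably recent GHC (GHC.Stats.getRTSStats needs base 4.10 or later) and the containers package that ships with it. The workload (buildMap) and the file name AllocDemo.hs are made up for illustration and are not the benchmark from this thread. The program prints how many collections the runtime performed, so the same binary can be run with and without -A4M and the counts compared.</div>

```haskell
-- Build and run (standard GHC and RTS options):
--   ghc -O2 -rtsopts AllocDemo.hs
--   ./AllocDemo +RTS -T          (default allocation area)
--   ./AllocDemo +RTS -T -A4M     (4 MB allocation area)
module Main (main) where

import Data.List (foldl')
import qualified Data.Map.Strict as Map
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- Hypothetical workload: build a map one insert at a time so that the
-- program produces plenty of short-lived heap allocation.
buildMap :: Int -> Map.Map Int Int
buildMap n = foldl' (\m k -> Map.insert k (k * 2) m) Map.empty [1 .. n]

main :: IO ()
main = do
  let m = buildMap 200000
  putStrLn ("entries: " ++ show (Map.size m))
  -- Statistics are only collected when the program runs with +RTS -T.
  enabled <- getRTSStatsEnabled
  if enabled
    then do
      stats <- getRTSStats
      putStrLn ("collections: " ++ show (gcs stats))
      putStrLn ("bytes allocated: " ++ show (allocated_bytes stats))
    else putStrLn "re-run with +RTS -T to see GC statistics"
```

<div>A larger allocation area should normally lower the collection count, but how much that helps wall-clock time depends on the machine and the workload, so it is worth measuring rather than assuming.</div>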

<div><br></div><div>Also, a more fundamental issue: the other tests you mentioned measure something different from what your test measures. They use a large number of simultaneous client connections to simulate a busy server, i.e. they measure throughput. Your test makes 10,000 connections serially, one after another, so you're measuring the server's latency.</div>
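<div><br></div><div>To illustrate the difference between the two measurements, here is a small sketch that uses only base (GHC.Clock needs base 4.11 or later). fakeRequest is a stand-in for one HTTP round trip, and its 2 ms delay is an invented figure, not a measurement of warp or scotty. The serial run takes roughly n times the per-request latency, while the concurrent run finishes in about the time of a single request, which is why the two kinds of benchmark are not comparable.</div>

```haskell
module Main (main) where

import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (replicateM, replicateM_)
import GHC.Clock (getMonotonicTime)

-- Stand-in for one HTTP round trip; the 2 ms delay is an assumption.
fakeRequest :: IO ()
fakeRequest = threadDelay 2000

-- Wall-clock seconds taken by an action.
timed :: IO a -> IO Double
timed act = do
  t0 <- getMonotonicTime
  _ <- act
  t1 <- getMonotonicTime
  pure (t1 - t0)

-- One request after another: total time is about n * latency.
serial :: Int -> IO ()
serial n = replicateM_ n fakeRequest

-- All requests in flight at once: total time approaches one latency.
concurrent :: Int -> IO ()
concurrent n = do
  dones <- replicateM n $ do
    done <- newEmptyMVar
    _ <- forkIO (fakeRequest >> putMVar done ())
    pure done
  mapM_ takeMVar dones

main :: IO ()
main = do
  let n = 200
  s <- timed (serial n)
  c <- timed (concurrent n)
  putStrLn ("serial:     " ++ show s ++ " s")
  putStrLn ("concurrent: " ++ show c ++ " s")
```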

<div><br></div><div>G</div><div>-- <br></div></div>Gregory Collins <<a href="mailto:greg@gregorycollins.net" target="_blank">greg@gregorycollins.net</a>>

</div></div>