<div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Feb 8, 2013 at 12:30 AM, Edward Z. Yang <span dir="ltr"><<a href="mailto:ezyang@mit.edu" target="_blank">ezyang@mit.edu</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">OK. I think it is high priority for us to get some latency benchmarks<br>
into nofib so that GHC devs (including me) can start measuring changes<br>
off them. I know Edsko has some benchmarks here:<br>
<a href="http://www.edsko.net/2013/02/06/performance-problems-with-threaded/" target="_blank">http://www.edsko.net/2013/02/06/performance-problems-with-threaded/</a><br>
but they depend on network which makes it a little difficult to move into nofib.<br>
I'm working on other scheduler changes that may help you guys out; we<br>
should keep each other updated.<br></blockquote><div><br></div><div style>That would be great :)</div><div style> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

> I noticed your patch also incorporates the "make yield actually work" patch;
> do you think the improvement in 7.4.1 was due to that specific change?

Actually, I believe that patch is irrelevant to the scheduler change and, strictly speaking, probably should not be in there. I needed that patch for the IO manager revisions to work properly.

> (Have you instrumented the run queues and checked how your patch changes
> the distribution of jobs over your runtime?)

I didn't do this very rigorously: I added some print statements in the scheduler and looked at some eventlogs in ThreadScope, and saw that work pushing slows down after a while. I had planned to write a script to analyze an eventlog file and extract these statistics, but I never got around to it.
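
Something along these lines is what I had in mind (just a sketch against the ghc-events package; constructor and field names may differ between versions):

    -- Tally MigrateThread events per destination capability from a
    -- GHC eventlog, as a rough view of how work moves between cores.
    -- Sketch only: assumes the ghc-events package's current API.
    import GHC.RTS.Events
    import qualified Data.Map.Strict as M
    import System.Environment (getArgs)

    main :: IO ()
    main = do
      [file] <- getArgs
      result <- readEventLogFromFile file
      case result of
        Left err -> putStrLn ("parse error: " ++ err)
        Right (EventLog _ (Data evs)) ->
          print (foldr tally M.empty evs)

    -- Count each migration against the capability it lands on.
    tally :: Event -> M.Map Int Int -> M.Map Int Int
    tally Event{evSpec = MigrateThread _ cap} m = M.insertWith (+) cap 1 m
    tally _ m = m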

-Andi

> Somewhat unrelatedly, if you have some good latency tests already,
> it may be worth a try compiling your copy of GHC with -fno-omit-yields, so that
> forced context switches get serviced more predictably.
>
> Cheers,
> Edward
>
> Excerpts from Andreas Voellmy's message of Thu Feb 07 21:20:25 -0800 2013:
<div class=""><div class="h5">> Hi Edward,<br>
><br>
> I did two things to improve latency for my application: (1) rework the IO<br>
> manager and (2) stabilize the work pushing. (1) seems like a big win and we<br>
> are almost done with the work on that part. It is less clear whether (2)<br>
> will generally help much. It helped me when I developed it against 7.4.1,<br>
> but it doesn't seem to have much impact on HEAD on the few measurements I<br>
> did. The idea of (2) was to keep running averages of the run queue length<br>
> of each capability, then push work when these running averages get too<br>
> out-of-balance. The desired effect (which seems to work on my particular<br>
> application) is to avoid cases in which threads are pushed back and forth<br>
> among cores, which may make cache usage worse. You can see my patch here:<br>
> <a href="https://github.com/AndreasVoellmy/ghc-arv/commits/push-work-exchange-squashed" target="_blank">https://github.com/AndreasVoellmy/ghc-arv/commits/push-work-exchange-squashed</a><br>
> .<br>
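> >
> > Roughly, the heuristic looks like this (a simplified Haskell model only;
> > the real change lives in the C scheduler, and the decay factor and
> > threshold below are illustrative, not the values in the patch):
> >
> >     -- Exponentially weighted running average of a capability's
> >     -- run queue length, updated each time the scheduler samples it.
> >     updateAvg :: Double -> Int -> Double
> >     updateAvg avg len = 0.9 * avg + 0.1 * fromIntegral len
> >
> >     -- Push work only when our average queue is noticeably longer
> >     -- than the least-loaded capability's, rather than on every pass.
> >     shouldPush :: Double -> [Double] -> Bool
> >     shouldPush mine others =
> >       not (null others) && mine > minimum others + 2  -- 2 = tuning knob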
> >
> > -Andi
> >
> > On Fri, Feb 8, 2013 at 12:10 AM, Edward Z. Yang <ezyang@mit.edu> wrote:
> >
> > > Hey folks,
> > >
> > > The latency changes sound relevant to some work on the scheduler I'm doing;
> > > is there a place I can see the changes?
> > >
> > > Thanks,
> > > Edward
> > >
> > > Excerpts from Simon Peyton-Jones's message of Wed Feb 06 10:10:10 -0800 2013:
> > > > I (with help from Kazu and helpful comments from Bryan and Johan) have
> > > > nearly completed an overhaul of the IO manager based on my observations,
> > > > and we are in the final stages of getting it into GHC.
> > > >
> > > > This is really helpful. Thank you very much, Andreas, Kazu, Bryan, and Johan.
> > > >
> > > > Simon
> > > >
> > > > From: parallel-haskell@googlegroups.com [mailto:parallel-haskell@googlegroups.com] On Behalf Of Andreas Voellmy
> > > > Sent: 06 February 2013 14:28
> > > > To: watson.timothy@gmail.com
> > > > Cc: kostirya@gmail.com; parallel-haskell; glasgow-haskell-users@haskell.org
> > > > Subject: Re: Cloud Haskell and network latency issues with -threaded
> > > >
> > > > Hi all,
> > > >
> > > > I haven't followed the conversations around Cloud Haskell closely, but I
> > > > noticed the discussion around latency under the threaded runtime system,
> > > > and I thought I'd jump in here.
> > > >
> > > > I've been developing a server in Haskell that serves hundreds to
> > > > thousands of clients over very long-lived TCP sockets. I also had latency
> > > > problems with GHC. For example, with 100 clients I had a 10 ms
> > > > (millisecond) latency, and with 500 clients I had a 29 ms latency. I
> > > > looked into the problem and found that some bottlenecks in the threaded
> > > > IO manager were the cause. I made some hacks there and got the latency
> > > > for 100 and 500 clients down to under 0.2 ms. I (with help from Kazu and
> > > > helpful comments from Bryan and Johan) have nearly completed an overhaul
> > > > of the IO manager based on my observations, and we are in the final
> > > > stages of getting it into GHC. Hopefully our work will also fix the
> > > > latency issues in Cloud Haskell programs :)
> > > >
> > > > It would be very helpful if someone has some benchmark Cloud Haskell
> > > > applications and workloads to test with. Does anyone have these handy?
> > > >
> > > > Cheers,
> > > > Andi
> > > >
> > > > On Wed, Feb 6, 2013 at 9:09 AM, Tim Watson <watson.timothy@gmail.com> wrote:
> > > >
> > > > Hi Kostirya,
> > > >
> > > > I'm putting the parallel-haskell and ghc-users lists on cc, just in case
> > > > other (better informed) folks want to chip in here.
> > > >
> > > > ----
> > > >
> > > > First of all, I'm assuming you're talking about network latency when
> > > > compiling with -threaded - if not, I apologise for misunderstanding!
> > > >
> > > > There is apparently an outstanding network latency issue when compiling
> > > > with -threaded, but according to a conversation I had with the other
> > > > developers on #haskell-distributed, this is not something that's specific
> > > > to Cloud Haskell. It has to do with the threaded runtime system, so it
> > > > would need to be solved for GHC (or is it just the network package!?) in
> > > > general. Writing a simple C program and an equivalent socket client in
> > > > Haskell and comparing their latencies under -threaded will show this up.
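> > > >
> > > > For the Haskell side, a minimal round-trip probe might look like the
> > > > sketch below (assuming an echo server is already listening; the host,
> > > > port, and iteration count are placeholders). Compile it once with
> > > > -threaded and once without, and compare the reported mean:
> > > >
> > > >     {-# LANGUAGE OverloadedStrings #-}
> > > >     -- Times N one-byte ping/pongs against a local echo server.
> > > >     import Network.Socket
> > > >     import Network.Socket.ByteString (recv, sendAll)
> > > >     import Data.Time.Clock (getCurrentTime, diffUTCTime)
> > > >     import Control.Monad (replicateM_)
> > > >
> > > >     main :: IO ()
> > > >     main = withSocketsDo $ do
> > > >       addr:_ <- getAddrInfo Nothing (Just "127.0.0.1") (Just "7")
> > > >       sock <- socket (addrFamily addr) Stream defaultProtocol
> > > >       connect sock (addrAddress addr)
> > > >       let n = 10000 :: Int
> > > >       start <- getCurrentTime
> > > >       replicateM_ n (sendAll sock "x" >> recv sock 1)
> > > >       end <- getCurrentTime
> > > >       close sock
> > > >       putStrLn ("mean round trip: "
> > > >                 ++ show (diffUTCTime end start / fromIntegral n))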
> > > >
> > > > See the latency section at
> > > > http://haskell-distributed.github.com/wiki/networktransport.html for some
> > > > more details. According to that, there *are* some things we might be able
> > > > to do, but the roughly 20% latency overhead isn't going to change
> > > > significantly on the face of things.
> > > >
> > > > We have an open ticket to look into this
> > > > (https://cloud-haskell.atlassian.net/browse/NTTCP-4), and at some point
> > > > we'll try to put together the sample programs in a GitHub repository (if
> > > > that's not already done - I might've missed previous spikes done by Edsko
> > > > or others) and investigate further.
> > > >
> > > > One of the other (more experienced!) devs might be able to chip in and
> > > > proffer a better explanation.
> > > >
> > > > Cheers,
> > > > Tim
> > > >
> > > > On 6 Feb 2013, at 13:27, kostirya@gmail.com wrote:
> > > >
> > > > > Have you ever needed to run Haskell in non-threaded mode during
> > > > > intense network data exchange? I am getting a 2x performance penalty
> > > > > in threaded mode, but I must use threaded mode because epoll and
> > > > > kevent are only available in threaded mode.
> > > >
> > > > [snip]
> > > >
> > > > > On Wednesday, February 6, 2013 at 12:33:36 UTC+2, Tim Watson wrote:
> > > > > Hello all,
> > > > >
> > > > > It's been a busy week for Cloud Haskell, and I wanted to share a few of
> > > > > our news items with you all.
> > > > >
> > > > > Firstly, we have a new home page at http://haskell-distributed.github.com,
> > > > > into which most of the documentation and wiki pages have been merged. Making
> > > > > sassy-looking websites is not really my bag, so I'm very grateful to the
> > > > > various authors whose Creative Commons licensed designs and layouts made
> > > > > it easy to put together. We've already had some pull requests to fix minor
> > > > > problems on the site, so thanks very much to those who've contributed already!
> > > > >
> > > > > As well as the new site, you will find a few of us hanging out in the
> > > > > #haskell-distributed channel on freenode. Please do come along and join in
> > > > > the conversation.
> > > > >
> > > > > We also recently split the distributed-process project into separate
> > > > > git repositories, one for each component that makes up Cloud Haskell. This
> > > > > was done partly for administrative purposes and partly because we're in the
> > > > > process of setting up CI builds for all the projects.
> > > > >
> > > > > Finally, we've moved from GitHub's issue tracker to a hosted Jira/Bamboo
> > > > > setup at https://cloud-haskell.atlassian.net - pull requests are naturally
> > > > > still welcome via GitHub! Although you can browse issues freely without
> > > > > logging in, you will need to provide an email address and get an account
> > > > > in order to submit new ones. If you have any difficulty logging in, please
> > > > > don't hesitate to contact me directly, via this forum, or via the
> > > > > cloud-haskell-developers mailing list (on Google Groups).
> > > > >
> > > > > As always, we'd be delighted to hear any feedback!
> > > > >
> > > > > Cheers,
> > > > > Tim
</div></div><div class=""><div class="h5">--<br>
You received this message because you are subscribed to the Google Groups "parallel-haskell" group.<br>
To unsubscribe from this group and stop receiving emails from it, send an email to <a href="mailto:parallel-haskell%2Bunsubscribe@googlegroups.com">parallel-haskell+unsubscribe@googlegroups.com</a>.<br>
For more options, visit <a href="https://groups.google.com/groups/opt_out" target="_blank">https://groups.google.com/groups/opt_out</a>.<br>
<br>
<br>
</div></div></blockquote></div><br></div></div>