<font face="arial, helvetica, sans-serif">I posted this issue on StackOverflow today. A brief recap:</font><div><p style="margin-top:0px;margin-right:0px;margin-bottom:1em;margin-left:0px;padding-top:0px;padding-right:0px;padding-bottom:0px;padding-left:0px;border-top-width:0px;border-right-width:0px;border-bottom-width:0px;border-left-width:0px;border-style:initial;border-color:initial;vertical-align:baseline;background-image:initial;background-color:rgb(255,255,255);clear:both;word-wrap:break-word;line-height:18px;text-align:left">
<font face="arial, helvetica, sans-serif">When C code calls back into a Haskell function through the FFI, I have observed a sharp increase in total time when multi-threading is enabled in the C code, even though the total number of calls into Haskell remains the same. In my test, I called a Haskell function 5M times in two scenarios (GHC 7.0.4, RHEL5, 12-core box):</font></p>
<p style="margin-top:0px;margin-right:0px;margin-bottom:1em;margin-left:0px;padding-top:0px;padding-right:0px;padding-bottom:0px;padding-left:0px;border-top-width:0px;border-right-width:0px;border-bottom-width:0px;border-left-width:0px;border-style:initial;border-color:initial;vertical-align:baseline;background-image:initial;background-color:rgb(255,255,255);clear:both;word-wrap:break-word;text-align:left">
</p><ul style="line-height:18px"><li><span style="background-color:transparent;font-family:arial,helvetica,sans-serif">Single-threaded C function: calls back the Haskell function 5M times. Total time: 1.32s.</span></li>
<li><span style="background-color:transparent;font-family:arial,helvetica,sans-serif">5 threads in the C function: each thread calls back the Haskell function 1M times, so the total is still 5M. Total time: 7.79s. I verified that pthreads did not contribute much to the overhead by having the same code call a C function instead and comparing it with the single-threaded version. So almost all of the increase in overhead seems to come from the GHC runtime.</span></li>
</ul><div style="line-height:18px"><font face="arial, helvetica, sans-serif">What I want to ask is whether this is a known issue in the GHC runtime. If not, I will file a bug report with the GHC team, including code to reproduce it. I don&#39;t want to file a duplicate if this is already a known issue. I searched GHC Trac using some keywords but didn&#39;t see any bugs related to it.</font></div>
<div style="line-height:18px"><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif" style="line-height:18px">StackOverflow post link (has code and details on how to reproduce the issue): </font><font face="arial, helvetica, sans-serif"><span style="line-height:18px"><a href="http://stackoverflow.com/questions/8902568/runtime-performance-degradation-for-c-ffi-callback-when-pthreads-are-enabled">http://stackoverflow.com/questions/8902568/runtime-performance-degradation-for-c-ffi-callback-when-pthreads-are-enabled</a></span></font></div>
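<div style="line-height:18px"><font face="arial, helvetica, sans-serif"><br>For readers who do not want to follow the link, here is a minimal sketch of the C-only control run mentioned in the second bullet: several pthreads each invoking a callback through a function pointer. It is not the code from the StackOverflow post. The names callback_t, c_callback, worker_arg and run_callbacks are hypothetical, and the callback is a plain C function standing in for the Haskell function that the real test receives through the FFI.</font></div>

```c
#include <pthread.h>
#include <stdlib.h>

/* Callback type. In the real test this would point at a Haskell function
   passed through the FFI; here it is a plain C stand-in (assumption). */
typedef void (*callback_t)(long *);

/* Hypothetical C callback: bumps a per-thread counter. */
void c_callback(long *counter) { ++*counter; }

typedef struct {
    callback_t cb;   /* function to call back */
    long calls;      /* number of callbacks this thread should make */
    long counter;    /* number of callbacks actually made */
} worker_arg;

/* Thread body: invoke the callback `calls` times. */
static void *worker(void *p) {
    worker_arg *w = p;
    for (long i = 0; i < w->calls; ++i)
        w->cb(&w->counter);
    return NULL;
}

/* Start `nthreads` threads, each making `calls_per_thread` callbacks.
   Returns the total number of callbacks made, or -1 on failure. */
long run_callbacks(int nthreads, long calls_per_thread, callback_t cb) {
    if (nthreads <= 0 || calls_per_thread < 0 || cb == NULL)
        return -1;

    pthread_t *tids = malloc((size_t)nthreads * sizeof *tids);
    worker_arg *args = malloc((size_t)nthreads * sizeof *args);
    long total = -1;

    if (tids != NULL && args != NULL) {
        int started = 0;
        for (; started < nthreads; ++started) {
            args[started] = (worker_arg){cb, calls_per_thread, 0};
            if (pthread_create(&tids[started], NULL, worker, &args[started]) != 0)
                break;
        }
        /* Join every thread that was started, even after a failure. */
        long sum = 0;
        for (int i = 0; i < started; ++i) {
            pthread_join(tids[i], NULL);
            sum += args[i].counter;
        }
        if (started == nthreads)
            total = sum;
    }

    free(tids);
    free(args);
    return total;
}
```

<div style="line-height:18px"><font face="arial, helvetica, sans-serif">Under these assumptions, run_callbacks(5, 1000000, c_callback) corresponds to the 5-thread scenario and run_callbacks(1, 5000000, c_callback) approximates the single-threaded one (it still spawns one worker thread, which the original single-threaded run may not have done). Timing is left to an external tool, and the program needs to be linked with pthread support.</font></div>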
<p></p></div>