<div dir="ltr">Glad I could help. <div><a href="https://github.com/wellposed/hblas/blob/master/src/Numerical/HBLAS/BLAS/Internal.hs#L146">https://github.com/wellposed/hblas/blob/master/src/Numerical/HBLAS/BLAS/Internal.hs#L146</a> is
</div><div>an example of the "choose between the safe and the unsafe FFI call" trick.</div><div>For BLAS / LAPACK routines, I can always estimate how long a compute job will take as a function of its inputs, and I use that estimate to decide which FFI strategy to use (i.e. I use the unsafe FFI for computations estimated at under 10 microseconds, so that the overhead of a safe call doesn't dominate the compute time on tiny inputs).</div>
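<div>To make the trick concrete, here is a minimal sketch of the idea. It is not the hblas code: it imports libc's memcpy twice (once unsafe, once safe) as a stand-in for a BLAS routine, and the names estimatedMicros, unsafeThresholdMicros, chooseStrategy and copyBytes, as well as the assumed throughput of about 10^4 bytes per microsecond, are all made up for illustration.</div>
<pre>
```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Data.Word (Word8)
import Foreign.C.Types (CSize (..))
import Foreign.Marshal.Array (allocaArray, peekArray, pokeArray)
import Foreign.Ptr (Ptr)

-- The same C function imported twice, once per FFI calling strategy.
-- memcpy is only a stand-in for a real BLAS / LAPACK routine.
foreign import ccall unsafe "string.h memcpy"
  memcpyUnsafe :: Ptr Word8 -> Ptr Word8 -> CSize -> IO (Ptr Word8)

foreign import ccall safe "string.h memcpy"
  memcpySafe :: Ptr Word8 -> Ptr Word8 -> CSize -> IO (Ptr Word8)

-- Hypothetical cost model: assume roughly 10^4 bytes copied per microsecond.
-- A real wrapper would estimate cost from the routine's operation count.
estimatedMicros :: Int -> Double
estimatedMicros nBytes = fromIntegral nBytes / 1.0e4

-- Assumed cutoff: jobs estimated below this use the unsafe call.
unsafeThresholdMicros :: Double
unsafeThresholdMicros = 10

data Strategy = UseUnsafe | UseSafe
  deriving (Eq, Show)

-- Pick the FFI strategy from the estimated run time of the job.
chooseStrategy :: Int -> Strategy
chooseStrategy nBytes
  | estimatedMicros nBytes >= unsafeThresholdMicros = UseSafe
  | otherwise = UseUnsafe

-- The "smart wrapper": callers never see which import was used.
copyBytes :: Ptr Word8 -> Ptr Word8 -> Int -> IO ()
copyBytes dst src nBytes = do
  _ <- case chooseStrategy nBytes of
         UseUnsafe -> memcpyUnsafe dst src (fromIntegral nBytes)
         UseSafe   -> memcpySafe dst src (fromIntegral nBytes)
  return ()

main :: IO ()
main =
  allocaArray 4 $ \src ->
    allocaArray 4 $ \dst -> do
      pokeArray src [1, 2, 3, 4]
      copyBytes dst src 4
      out <- peekArray 4 dst
      print (chooseStrategy 4, chooseStrategy 1000000, out)
```
</pre>
<div>The point of the design is that tiny jobs skip the bookkeeping of a safe call, while long jobs go through the safe call so that other Haskell threads and the GC are not held up while the C code runs.</div>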


</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Aug 14, 2014 at 5:38 PM, Christian Höner zu Siederdissen <span dir="ltr"><<a href="mailto:choener@tbi.univie.ac.at" target="_blank">choener@tbi.univie.ac.at</a>></span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">That's actually a great idea, especially since the safe variants of the<br>

calls are already in place.<br>

<br>

* Carter Schonwald <<a href="mailto:carter.schonwald@gmail.com">carter.schonwald@gmail.com</a>> [14.08.2014 23:10]:<br>

<div class="">>    have a smart wrapper around your ffi call, and when you think the ffi<br>

>    call will take more than 1 microsecond, ALWAYS use the safe ffi call,<br>

>    i do something like this in an FFI i wrote, it works great<br>

><br>

</div>>    On Thu, Aug 14, 2014 at 1:20 PM, Christian Höner zu Siederdissen<br>

<div class="HOEnZb"><div class="h5">>    <<a href="mailto:choener@tbi.univie.ac.at">choener@tbi.univie.ac.at</a>> wrote:<br>

><br>

>      Thanks,<br>

><br>

>      I've played around some more and finally more than one capability is<br>

>      active. And indeed, unsafe calls don't block everything. I /had/<br>

>      actually read that but when I saw the system spending basically only<br>

>      100% cpu time, I'd thought to ask.<br>

><br>

>      One problem with this program seems to be that the different tasks are<br>

>      of vastly different sizes. Inputs range from ~ 7x10^1 to ~ 3x10^7<br>

>      elements inducing waits with the larger problem sizes.<br>

><br>

>      We'll keep the program single-threaded for now as this also keeps memory<br>

>      consumption at only 25 gbyte instead of the more impressive 70 gbyte in<br>

>      multi-threaded mode ;-)<br>

><br>

>      Best regards,<br>

>      Christian<br>

><br>

>      _______________________________________________<br>

>      Glasgow-haskell-users mailing list<br>

>      <a href="mailto:Glasgow-haskell-users@haskell.org">Glasgow-haskell-users@haskell.org</a><br>

>      <a href="http://www.haskell.org/mailman/listinfo/glasgow-haskell-users" target="_blank">http://www.haskell.org/mailman/listinfo/glasgow-haskell-users</a><br>

</div></div></blockquote></div><br></div>