[Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

Thu May 12 17:45:02 CEST 2011

On 12/05/2011 16:04, David Mazieres expires 2011-08-10 PDT wrote:
> At Thu, 12 May 2011 09:57:13 +0100,
> Simon Marlow wrote:
>>
>>> So to answer my own question from earlier, I did a bit of
>>> benchmarking, and it seems that on my machine (a 2.4 GHz Intel Xeon
>>> 3060, running linux 2.6.38), I get the following costs:
>>>
>>>        9 ns - return () :: IO ()       -- baseline (meaningless in itself)
>>>       13 ns - unsafeUnmask $ return () -- with interrupts enabled
>>>       18 ns - unsafeUnmask $ return () -- inside a mask_
>>>
>>>       13 ns - ffi                      -- a null FFI call (getpid cached by libc)
>>>       18 ns - unsafeUnmask ffi         -- with interrupts enabled
>>>       22 ns - unsafeUnmask ffi         -- inside a mask_
>>
>> Those are lower than I was expecting, but look plausible.  There's room
>> for improvement too (by inlining some or all of unsafeUnmask#).
>
> Do you mean inline unsafeUnmask, or unmaskAsyncExceptions#?  I tried
> inlining unsafeUnmask by writing my own version and giving it the
> INLINE pragma, and it didn't affect performance at all.

Right, I meant inlining unmaskAsyncExceptions#, which would require 
compiler support.

>> However, the general case of unsafeUnmask E, where E is something more
>> complex than return (), will be more expensive because a new closure for
>> E has to be created.  e.g. try "return x" instead of "return ()", and
>> try to make sure that the closure has to be created once per
>> unsafeUnmask, not lifted out and shared.
>
> Okay.  I'm surprised my getpid example wasn't already stressing this,
> but, indeed, I see a tiny difference with the following code:
>
>         ffi >>= return . (1 +) -- where ffi calls getpid
>
>         13 ns - no unmasking
>         20 ns - unsafeUnmask when not inside mask_
>         25 ns - unsafeUnmask when benchmark loop is inside one big mask_
>
> So now we're talking about 28 cycles or something instead of 22.
> Still not a huge deal.

Ok, sounds reasonable.
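
For anyone who wants to repeat this kind of measurement, here is a minimal
sketch.  It is not the benchmark David ran: the names timePerIter and step
are made up for illustration, and the IORef is only there in the hope that
the argument to unsafeUnmask depends on a run-time value, so that its closure
has to be built on every iteration rather than lifted out and shared.  GHC
may still optimise more than intended, so treat the numbers as rough.

```haskell
import Control.Exception (mask_)
import Control.Monad (replicateM_)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)
import GHC.Clock (getMonotonicTimeNSec)
import GHC.IO (unsafeUnmask)

-- Run an action n times and return the mean cost per iteration in ns.
-- (Hypothetical helper, not part of any library.)
timePerIter :: Int -> IO () -> IO Double
timePerIter n act = do
  t0 <- getMonotonicTimeNSec
  replicateM_ n act
  t1 <- getMonotonicTimeNSec
  return (fromIntegral (t1 - t0) / fromIntegral n)

-- One iteration: the action passed to unsafeUnmask mentions x, which is
-- only known at run time.
step :: IORef Int -> IO ()
step ref = do
  x <- readIORef ref
  y <- unsafeUnmask (return $! x + 1)
  writeIORef ref y

main :: IO ()
main = do
  ref <- newIORef 0
  let n = 1000000
  plain    <- timePerIter n (modifyIORef' ref (+ 1))  -- no unmasking
  unmasked <- timePerIter n (step ref)                -- not inside mask_
  masked   <- mask_ (timePerIter n (step ref))        -- loop inside one mask_
  putStrLn ("no unmasking:       " ++ show plain ++ " ns")
  putStrLn ("unsafeUnmask:       " ++ show unmasked ++ " ns")
  putStrLn ("unsafeUnmask/mask_: " ++ show masked ++ " ns")
```

Compile with -O2 and compare the three figures against each other rather
than reading them as absolute costs; a proper benchmarking library would
give more trustworthy results than this hand-rolled loop.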

>> There are no locks here, thanks to the message-passing implementation we
>> use for throwTo between processors.
>
> Okay, that sounds good.  So then there is no guarantee about ordering
> of throwTo exceptions?  That seems like a good thing since there are
> other mechanisms for synchronization.

What kind of ordering guarantee did you have in mind?  We do guarantee 
that in

    throwTo t e1
    throwTo t e2

Thread t will receive e1 before e2 (obviously, because throwTo is 
synchronous and only returns when the exception has been raised).
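
A small sketch of that guarantee, under my own assumptions: the Ping
exception type and the orderedDelivery function are invented for this
example, and Ping 0 is used as an arbitrary "stop" signal.  The target
thread runs masked (via forkIOWithUnmask) and only opens a window for
asynchronous exceptions inside unmask, so no Ping can arrive while the
previous one is still being logged.

```haskell
import Control.Concurrent (forkIOWithUnmask, threadDelay, throwTo)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (Exception, try)
import Control.Monad (forever)
import Data.IORef (atomicModifyIORef', newIORef, readIORef)

-- Hypothetical exception type used only for this demonstration.
data Ping = Ping Int deriving Show
instance Exception Ping

-- Throw Ping 1, then Ping 2, from one thread and return the order in
-- which the target thread saw them.
orderedDelivery :: IO [Int]
orderedDelivery = do
  logRef <- newIORef []
  done   <- newEmptyMVar
  t <- forkIOWithUnmask $ \unmask ->
    let loop = do
          -- Exceptions can only be delivered inside unmask.
          r <- try (unmask (forever (threadDelay 100000)))
          case r of
            Left (Ping 0) -> putMVar done ()   -- stop signal
            Left (Ping n) -> do
              atomicModifyIORef' logRef (\xs -> (n : xs, ()))
              loop
            Right () -> return ()              -- unreachable: forever never returns
    in loop
  throwTo t (Ping 1)   -- returns only once Ping 1 has been raised in t
  throwTo t (Ping 2)   -- so Ping 2 cannot overtake it
  throwTo t (Ping 0)
  takeMVar done
  reverse <$> readIORef logRef

main :: IO ()
main = orderedDelivery >>= print   -- prints [1,2]
```

Note that this only shows ordering for throwTo calls made by a single
thread; it says nothing about the order in which exceptions from several
different throwing threads are delivered.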

Pending exceptions are processed in LIFO order (for no good reason other
than performance), so there's no kind of fairness guarantee of the kind
you get with MVars.  One thread doing throwTo can be starved by others.
I don't think that's a serious problem.

Cheers,
	Simon