[Haskell-cafe] Faster timeout but is it correct?

Tue Feb 22 23:50:40 CET 2011

On 22 February 2011 19:59, Bertram Felgenhauer
<bertram.felgenhauer at googlemail.com> wrote:
> Bas van Dijk wrote:
>> On 19 February 2011 00:04, Bas van Dijk <v.dijk.bas at gmail.com> wrote:
>> > So, since the new implementation is not really faster in a
>> > representative benchmark and above all is buggy, I'm planning to ditch
>> > it in favour of the event-manager based timeout.
>>
>> The patch is ready for review:
>>
>> http://hackage.haskell.org/trac/ghc/attachment/ticket/4963/faster_timeout.dpatch
>
> (For reference, this is the proposed timeout code:)
>
> | timeout :: Int -> IO a -> IO (Maybe a)
> | timeout usecs f
> |     | usecs <  0 = fmap Just f
> |     | usecs == 0 = return Nothing
> |     | otherwise  = do
> |         myTid <- myThreadId
> |         Just mgr <- readIORef eventManager
> |         mask $ \restore -> do
> |           key <- registerTimeout mgr usecs $ \key ->
> |                    throwTo myTid $ Timeout key
> |           let unregTimeout = M.unregisterTimeout mgr key
> |           (restore (fmap Just f) >>= \mb -> unregTimeout >> return mb)
> |             `catch` \e ->
> |                 case fromException e of
> |                   Just (Timeout key') | key' == key -> return Nothing
> |                   _ -> unregTimeout >> throwIO e
>
> What happens if the timeout triggers while the exception handler is
> running? I.e., we have the following sequence of events:
>
> 1. registerTimeout
> 2. (fmap Just f) raises an exception, or the thread gets killed otherwise.
> 3. We enter the `catch` handler, with the corresponding exception.
> 4. The timeout expires, and the event Manager runs the IO action, i.e.
>   throwTo myTid $ Timeout key
> 5. And now we have a pending Timeout exception which escapes the 'timeout'.
>   The unregTimeout will come too late.

Bummer! You're right.

But maybe we can catch and ignore a potential pending Timeout
exception: (code not tested and profiled yet)

timeout :: Int -> IO a -> IO (Maybe a)
timeout usecs f
    | usecs <  0 = fmap Just f
    | usecs == 0 = return Nothing
    | otherwise  = do
        myTid <- myThreadId
        Just mgr <- readIORef eventManager
        mask $ \restore -> do
          key <- registerTimeout mgr usecs $ \key ->
                   throwTo myTid $ Timeout key
          let unregTimeout = M.unregisterTimeout mgr key
          (restore (fmap Just f) >>= \mb -> unregTimeout >> return mb)
            `catch` \e ->
                case fromException e of
                  Just (Timeout key') | key' == key -> return Nothing
                  _ -> do (unregTimeout >> allowInterrupt)
                            `catch` \(Timeout _) -> return ()
                          throwIO e

Note I use the newly proposed[1] allowInterrupt:

-- | When invoked inside 'mask', this function allows a blocked
-- asynchronous exception to be raised, if one exists.  It is
-- equivalent to performing an interruptible operation (see
-- #interruptible#), but does not involve any actual blocking.
--
-- When called outside 'mask', or inside 'uninterruptibleMask', this
-- function has no effect.
allowInterrupt :: IO ()
allowInterrupt = unsafeUnmask $ return ()

> I've stumbled on another problem with the timeout function. Is this
> already known? Namely, the current implementation has trouble protecting
> against asynchronous exceptions, which can cause Timeout exceptions to
> escape from the corresponding 'timeout' call. The following program
> demonstrates this issue. (tested on ghc 7.0.1 using the threaded runtime)
>
> {-# LANGUAGE ScopedTypeVariables #-}
> import System.Timeout
> import Control.Exception
> import Control.Concurrent
> import Control.Monad
> import Prelude hiding (catch)
>
> delay = threadDelay 1000
>
> test = do
>    let act = timeout 1 (threadDelay 1) >> delay
>        act' = (act `catch` \ThreadKilled -> return ()) >> delay
>    tid <- forkIO $
>         act' `catch` \(e :: SomeException) ->
>             putStr $ "gotcha: " ++ show e ++ "!\n"
>    forkIO $ (threadDelay 10) >> killThread tid
>    return ()
>
> main = do
>    replicateM_ 1000 test
>    threadDelay 100000
>
> (Will print  gotcha: <<timeout>>!  for every escaping Timout exception.)
> What I believe happens is that the 'killThread' in the timeout function
> is interrupted by the 'killThread' from the test program; as a result,
> the forked timeout thread continues to run after the timeout function
> itself has finished.
>
> Protecting against this seems hard, if not impossible. Even if we
> introduce a lock
>    lock <- newMVar ()
> and let the timeout thread take the lock before throwing the exception
>    ...
>    forkIO (threadDelay n >> takeMVar lock >> throwTo pid ex)
>    ...
> when handling the exception we still face a problem: We can use
> tryTakeMVar lock  to stop the timeout thread from killing us, and to
> detect whether it's already too late for that. However, in that latter
> case, we will have to wait for the Timeout exception to arrive, in
> order to filter it; that means we will have to catch and remember all
> other pending async exception first, filter the Timeout exception, and
> then re-raise all the exceptions again.
>
> I suspect that the event manager based implementation will face the
> same problem.

Actually the event manager based implementation totally crashes on
your example, so again: bummer! I get the following error:

"gotcha: user error (Pattern match failure in do expression at
libraries/base/System/Event/Thread.hs:208:9-16)!"

Line 208:

Just mgr <- readIORef eventManager

I assumed that pattern match was safe because it's also used like that
in other places in the event manager (threadDelay, registerDelay,
closeFdWith and threadWait). I guess I was wrong...

All in all I have to seriously study this some more.

Thanks,

Bas

[1] http://hackage.haskell.org/trac/ghc/ticket/4857