FFI, signals and exceptions

Thu Aug 26 03:10:57 EDT 2010

Here is a possible implementation:

    Task *task = NULL;
    blockedThrowTo(cap,target,msg);
    if (target->bound) {
        // maybe not supposed to kill bound threads, but it
        // seems to work ok (as long as they don't want to try
        // to recover!)
        task = target->bound->task;
    } else {
        // walk all_tasks to find the correct worker thread
        for (task = all_tasks; task != NULL; task = task->all_link) {
            if (task->incall->suspended_tso == target) {
                break;
            }
        }
    }
    if (task != NULL) {
        pthread_cancel(task->id);
        // cargo cult cargo cult...
        task->cap = NULL;
        task->stopped = rtsTrue;
    }

This is quite good at causing the C computation to terminate,
but not so good at letting the Task that requested the FFI call
that it can wake up now.  In particular, consider the following
code (using the interruptible function defined earlier):

    foreign import ccall "foo.h" foo :: CInt -> IO ()

    fooHs n = do
        putStrLn $ "Arf " ++ show n
        threadDelay 1000000
        fooHs n

    main = main' 2

    main' 0 = putStrLn "Quitting"
    main' n = do
        tid <- newEmptyMVar
        interruptible () $ do
            putMVar tid =<< myThreadId
            (r :: Either E.AsyncException ()) <- E.try $ foo n
            putStrLn "Thread was able to catch exception"
        print =<< readMVar tid
        print =<< threadStatus =<< readMVar tid
        putStrLn "----"
        main' (pred n)

with foo.h/foo.c something like:

    void foo(int d) {
        while (1) {
            printf("Arf %d\n", d);
            sleep(1);
        }
    }

Without the RTS patch, the first foo(2) loop continues even after
interrupting (and resuming the primary execution of the program.
With the RTS patch, the first foo(2) loop terminates upon the
signal, but the thread 'tid' continues to be 'BlockedOnOther',
and "Thread was able to catch exception" is never printed.
If we use fooHs instead of foo, we see the expected behavior where
the loop is terminated, the exception caught, and the message
printed (eventually).

Tomorrow, I plan on looking more closely at how we might resume
the thread corresponding to 'tid'; however, it does seem like
something of a dangerous proposition given that the worker thread
was unceremoniously terminated, so none of the thunks actually got
evaluated.

Cheers,
Edward

P.S. I can post real diffs if other people are interested in replicating.