[Haskell-cafe] How to ensure code executes in the context of a specific OS thread?

Wed Jul 6 17:51:38 CEST 2011

On 06/07/2011 16:24, Jason Dagit wrote:
> On Wed, Jul 6, 2011 at 8:09 AM, Simon Marlow<marlowsd at gmail.com>  wrote:
>> On 06/07/2011 15:42, Jason Dagit wrote:
>>>
>>> On Wed, Jul 6, 2011 at 2:23 AM, Simon Marlow<marlowsd at gmail.com>    wrote:
>>>>
>>>> On 06/07/2011 07:37, Jason Dagit wrote:
>>>>>
>>>>> On Jul 5, 2011 1:04 PM, "Jason Dagit"<dagitj at gmail.com>    wrote:
>>>>>   >
>>>>>   >    On Tue, Jul 5, 2011 at 12:33 PM, Ian Lynagh<igloo at earth.li>    wrote:
>>>>>   >    >    On Tue, Jul 05, 2011 at 08:11:21PM +0100, Simon Marlow wrote:
>>>>>   >    >>
>>>>>   >    >>    In GHCi it's a different matter, because the main thread is
>>>>>   >    >>    running GHCi itself, and all the expressions/statements typed
>>>>>   >    >>    at the prompt are run in forkIO'd threads (a new one for each
>>>>>   >    >>    statement, in fact).  If you want a way to run command-line
>>>>>   >    >>    operations in the main thread, please submit a feature
>>>>>   >    >>    request.  I'm not sure it can be done, but I'll look into it.
>>>>>   >    >
>>>>>   >    >    We already have a way: -fno-ghci-sandbox
>>>>>   >
>>>>>   >    I've removed all my explicit attempts to forkIO/forkOS and passed
>>>>>   >    the command line flag you mention.  I just tried this but it
>>>>>   >    doesn't change the behavior in my example.
>>>>>
>>>>> I tried it again and discovered that, because of an argument-parsing
>>>>> bug in cabal-dev, the flag was not passed correctly. I explicitly passed
>>>>> it and verified that it works. Thanks for the workaround. By the way, I
>>>>> did look at the user guide for options like this and didn't see it.
>>>>> Which part of the manual is it in?
>>>>>
>>>>> Can I still make a feature request for a function to make code run on
>>>>> the original thread? My reasoning is that the code which needs to run on
>>>>> the main thread may appear in a library in which case the developer has
>>>>> no control over how ghc is invoked.
>>>>
>>>> I'm not sure how that would work.  The programmer is in control of what
>>>> the main thread does, not GHC.  So in order to implement some mechanism
>>>> to run code on the main thread, we would need some cooperation from the
>>>> main thread itself.  For example, in gtk2hs the main thread runs an event
>>>> handler loop which occasionally checks a queue for requests from other
>>>> threads (at least, I think that's how it works).
>>>
>>> What I'm wrestling with is the following.  Say I make a GUI library.
>>> As author of the GUI library I discover issues like this where the
>>> library code needs to execute on the "main" thread.  Users of the
>>> library expect the typical Haskell environment where you can't tell
>>> the difference between threads, and you fork at will.  How can I make
>>> sure my library works from GHC (with arbitrary user threads) and from
>>> GHCI?
>>>
>>> As John Lato points out in his email lots of people bump into this
>>> without realizing it and don't understand what the problem is.  We can
>>> try our best to educate everyone, but I have this sense that we could
>>> also do a better job of providing primitives to make it so that code
>>> will run on the main thread regardless of how people invoke the
>>> library.
>>>
>>> In my specific case (Cocoa on OSX), it is possible for me to use some
>>> Cocoa functions to force things to run on the main thread.  From what
>>> I've read Cocoa uses pthreads to implement this. I was hoping we could
>>> expose something from the RTS code in Control.Concurrent so that it's
>>> part of an "official" Haskell API that library writers can assume.
>>>
>>> Judging by this SO question, it's easier to implement this in Haskell
>>> on top of pthreads than to implement it in C (here I'm assuming GHC's
>>> RTS uses pthreads, but I've never checked):
>>>
>>> http://stackoverflow.com/questions/6130823/pthreads-perform-function-on-main-thread
>>>
>>> In fact, it sounds like what Gtk2hs is doing with the postGUI
>>> functions.
>>
>> Right, but usually the way this is implemented is with some cooperation from
>> the main thread.  That SO answer explains it - the main thread runs some
>> kind of loop that periodically checks for requests from other threads and
>> services them.  I expect that's how it works on Cocoa.
>> So you can't just do this from a library - the main thread has to be in on
>> the game.
>
> Yes.  From my perspective (that of a library writer) that's what makes
> this tricky in GHCi.  I need GHCi's cooperation.  From GHCi's
> perspective it's tricky too.
>
>> I suppose you might wonder whether the GHC RTS could implement
>> runInMainThread by preempting the main thread and running some different
>> code on it.
>
> Yes, that's roughly what I was wondering about.
>
>>   In theory that's possible, but whether it's a good idea or not
>> is a different matter!  I think it amounts to the same thing as the gtk2hs
>> folks have been asking for - multiple Haskell threads bound to the same OS
>> thread.
>
> I'm starting to realize that I don't understand the GHC threading
> model very well :)  I thought that was already possible.  I may be
> mixing GHC's thread model up with other language implementations, but
> I thought that it had a pool of OS threads and that Haskell threads
> ran on them as needed.  I think what you're saying is that the RTS has
> bound threads and it has thread pooling, but what it doesn't have is
> "bound thread pooling" (that is, the combination of being bound and
> pooled).
>
>>   runInMainThread then becomes the same as forking a temporary new
>> thread bound to the main OS thread, or temporarily binding the current
>> thread to the main OS thread.  If the main OS thread is off making a foreign
>> call (e.g. in the GUI library's main loop) then it can't run any other
>> Haskell threads anyway, and then I have to figure out what to do with all
>> these Haskell threads waiting for their bound OS thread to come back from
>> the foreign call.  My guess is that all this would be pretty complex to
>> implement.
>
> Yes, it does sound complex.  I'd really like to help as much as possible.
> I know very little about GHC internals but perhaps I could take a look
> at some of the RTS code.  Is there some background reading I could do?
>   Perhaps a specific reference to a paper or wiki page?

This is the paper that explains the design:

   http://community.haskell.org/~simonmar/papers/conc-ffi.pdf

And there's some documentation on the implementation here:

   http://hackage.haskell.org/trac/ghc/wiki/Commentary/Rts/Scheduler

though I think the latter is very incomplete.  I'd really like to flesh 
it out sometime.

Cheers,
	Simon
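
The cooperative scheme described in the thread (the main thread runs a
loop that checks a queue for requests from other threads and services
them) can be sketched in a few lines of Haskell. This is only an
illustration under stated assumptions, not how gtk2hs or Cocoa actually
implement it: the names Request, runOnServiceThread and serviceLoop are
hypothetical, and the sketch assumes a compiled program whose main
function is free to run the loop, which is exactly the cooperation that
a library cannot obtain on its own.

```haskell
import Control.Concurrent (forkIO, myThreadId)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Hypothetical request type for this sketch: either a job to run on the
-- servicing thread, or an instruction to leave the loop.
data Request = Run (IO ()) | Quit

-- Hypothetical helper: hand an action to the servicing thread and block
-- until its result comes back. No exception handling: if the action
-- throws, the exception surfaces in the servicing thread and the caller
-- is left waiting.
runOnServiceThread :: Chan Request -> IO a -> IO a
runOnServiceThread queue action = do
  result <- newEmptyMVar
  writeChan queue (Run (action >>= putMVar result))
  takeMVar result

-- The loop the main thread has to agree to run: take the next request
-- from the queue and service it, until told to quit.
serviceLoop :: Chan Request -> IO ()
serviceLoop queue = do
  request <- readChan queue
  case request of
    Run job -> job >> serviceLoop queue
    Quit    -> return ()

main :: IO ()
main = do
  queue   <- newChan
  mainTid <- myThreadId
  -- A worker thread that wants something done on the main thread.
  _ <- forkIO $ do
    tid <- runOnServiceThread queue myThreadId
    putStrLn ("job ran on main thread: " ++ show (tid == mainTid))
    writeChan queue Quit
  -- The main thread cooperates by servicing the queue.
  serviceLoop queue
```

In a real GUI binding the loop would be the toolkit's own event loop
rather than a blocking readChan, so the queue would have to be polled
from an idle or timer callback. The sketch only shows the handshake
between the requesting thread and the thread that owns the loop.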