[Haskell] thread-local variables

Sat Aug 5 09:32:34 EDT 2006

> > Maybe I'm misunderstanding your position - maybe you think that I
> > should use lots of different processes to segregate global state into
> > separate contexts? Well, that's nice, but I'd rather not. For
> > instance, I'm writing a server - and it's just not efficient to use a
> > separate process for each request. And there are some things such as
> > database connections, current user id, log files, various profiling
> > data, etc., that I would like to be thread-global but not
> > process-global.
> 
> I have done many servers in Haskell. Usually I have threads allocated
> to specific tasks rather than specific requests.
> 
> What guarantees does your code have that all the relevant parameters
> are already initialized - and how can a user of the code know
> which TLS variables need to be initialized? 

You could ask the same questions about process-global state, couldn't
you?

> If it is documented maybe it could be done at the level of an
> implicit parameter?

Do you think implicit parameters are better than TLS?

> > Or maybe you think that certain types of global state should be
> > privileged - for instance, that all of the things which are arguments
> > to 'newMain' above are OK to have as global state, but that anything
> > else should be passed as function arguments, thus making
> > thread-localization moot. I disagree with this - I am a proponent of
> > extensibility, and think that the language should make as few things
> > as possible "built-in". I want to define my own application-specific
> > global state, and, additionally, I want to have it thread-global, not
> > process-global.
> 
> This can cause much fun with the FFI. If we change e.g. stdout to
> thread specific what should we do before each foreign call? Same
> with the other things that are related to the OS process in question.
> 
> A thread is a context of execution while a process is a context for
> resources. Would you like to have multiple Haskell processes inside
> one OS process?

If you want to think of it that way, then sure.

> I don't consider these very different:
> 1) use one thread from a pre-allocated pool to do a task
> 2) fork a new thread to do the task
> 
> With TLS they are vastly different.

If you don't consider them different, then you can start using (2)
instead of (1).

> > You asked for an example, but, because of the nature of this topic, it
> > would have to be a very large example to prove my point. Thread-local
> > variables are things that only become really useful in large programs. 
> > Instead, I've asked you to put yourself in my shoes - what if the bits
> > of context that you already take for granted in your programs had to
> > be thread-local? How would you cope, without thread-local variables,
> > in such a situation?
> 
> I have been using an application specific monad (newtyped transformer) and
> a clean set of functions so that the implementation is not hardcoded
> and can be changed easily. Thus I haven't had the same difficulties
> as you.
> 
> I don't think many of the process global resources would make sense
> on a per-thread basis and I am not against all global state.

You say "many", but the question is "are there any".

> > > But I would say that I think I would find having to know what thread
> > > a particular bit of code was running in in order to "grok it" very
> > > strange,
> > 
> > I agree that it is important to have code which is easy to understand.
> > 
> > Usually, functions run in the same thread as their caller, unless they
> > are passed to something with the word 'fork' in the name. That's a
> > good rule of thumb that is in fact sufficient to let you understand
> > the code I write. Also, if that's too much to remember, then since I'm
> > only proposing and using non-mutable thread-local state (i.e. it
> > behaves like a MonadReader), and since I'm not passing actions between
> > threads as Einar is, then you can forget about the 'fork' caveat.
> 
> The only problem appears when someone uses two libraries, one written
> by me and another written by you, and wonders "why is my program
> failing in mysterious ways".

Can you give the API for your library? I have a hard time imagining
how it could not be obvious that a thread pool is being used.

> > I think the code would in fact be more difficult to "grok", if all of
> > the things which I want to be thread-local were instead passed around
> > as parameters, a la 'newMain'. This is simply because, in that
> > scenario, there would be much more code to read, and it would be very
> > repetitive. If I used special monads for my state, then the situation
> > would be only slightly better - a single monad would not suffice, and
> > I'd be faced with a plethora of 'lift' functions and redefinitions of
> > 'catch', as well as long type signatures and a crowded namespace.
> 
> As said before the monadic approach can be quite clean. I haven't used
> implicit parameters that much, so I won't comment on them.

Perhaps you can give an example? As I said, a single monad won't
suffice for me, because different libraries only know about different
parts of the state. With TLS, one can delimit the scope of parameters
by making the references to them module-internal, for instance.

With monads, I imagine that I'll need for each parameter

(1) a MonadX class, with a liftX member
(2) a catchX function
(3) a MonadY instance, for each wrapped monad Y (thus the number of
such instances will be O(n^2) where n is the number of parameters)
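
To make the three items above concrete, here is a rough sketch for a
single parameter, using only 'base'. The names XT, MonadX, liftX, askX
and catchX are hypothetical and chosen to match the list; this is an
illustration of the shape of the boilerplate, not code from any
existing library. Item (3) is not shown: every further wrapper YT would
need its own MonadX instance, and XT would need an instance of every
other MonadY class.

```haskell
{-# LANGUAGE FlexibleInstances #-}
import Control.Exception (ErrorCall (..), Exception, catch, throwIO)

-- Hypothetical parameter X: a String carried by a reader-style wrapper.
newtype XT m a = XT { runXT :: String -> m a }

instance Functor m => Functor (XT m) where
  fmap f (XT g) = XT (fmap f . g)

instance Applicative m => Applicative (XT m) where
  pure = XT . const . pure
  XT f <*> XT g = XT (\x -> f x <*> g x)

instance Monad m => Monad (XT m) where
  XT g >>= k = XT (\x -> g x >>= \a -> runXT (k a) x)

-- (1) a MonadX class with a liftX member
class Monad m => MonadX m where
  liftX :: XT IO a -> m a

instance MonadX (XT IO) where
  liftX = id

-- Read the parameter from any monad that supports X.
askX :: MonadX m => m String
askX = liftX (XT return)

-- (2) a catchX function: 'catch' has to be redefined for the wrapper,
-- threading the parameter into both the body and the handler.
catchX :: Exception e => XT IO a -> (e -> XT IO a) -> XT IO a
catchX (XT body) handler =
  XT (\x -> body x `catch` \e -> runXT (handler e) x)

-- Small demonstration: the handler can still see the parameter.
demoX :: IO String
demoX = runXT (failing `catchX` recover) "req-1"
  where
    failing = XT (\_ -> throwIO (ErrorCall "boom")) :: XT IO String
    recover (ErrorCall msg) = do
      x <- askX
      return (x ++ ": recovered from " ++ msg)
```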

With TLS, I need

(1) a declaration "x = unsafePerformIO $ newIOParam ..."
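
For comparison, a sketch of what such a declaration could look like.
newIOParam is not part of any GHC library, so the IOParam type and the
functions newIOParam, getIOParam and withIOParam below are hypothetical
stand-ins with assumed signatures, emulated with a table keyed by
ThreadId. The emulation is only an approximation of real thread-local
variables: for instance, a forked child does not inherit its parent's
value, which a facility built into the runtime could provide.

```haskell
import Control.Concurrent (ThreadId, forkIO, myThreadId)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (bracket_)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import Data.Maybe (fromMaybe)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical parameter type: a default plus per-thread overrides.
data IOParam a = IOParam a (IORef [(ThreadId, a)])

newIOParam :: a -> IO (IOParam a)
newIOParam def = IOParam def <$> newIORef []

-- Look up the calling thread's value, falling back to the default.
getIOParam :: IOParam a -> IO a
getIOParam (IOParam def table) = do
  tid <- myThreadId
  fromMaybe def . lookup tid <$> readIORef table

-- Run an action with the parameter set for the calling thread only.
-- The binding is removed afterwards, even if the action throws.
withIOParam :: IOParam a -> a -> IO b -> IO b
withIOParam (IOParam _ table) val act = do
  tid <- myThreadId
  let push = atomicModifyIORef' table (\t -> ((tid, val) : t, ()))
      pop  = atomicModifyIORef' table (\t -> (dropFirst tid t, ()))
  bracket_ push pop act

-- Remove the first entry for a key, so nested bindings act as a stack.
dropFirst :: Eq k => k -> [(k, v)] -> [(k, v)]
dropFirst _ [] = []
dropFirst k (e@(k', _) : rest)
  | k == k'   = rest
  | otherwise = e : dropFirst k rest

-- The single declaration; keeping it unexported would make the
-- parameter module-internal. NOINLINE ensures it is created only once.
{-# NOINLINE currentUser #-}
currentUser :: IOParam String
currentUser = unsafePerformIO (newIOParam "nobody")

-- The child thread sees its own binding; the main thread does not.
demoParam :: IO (String, String)
demoParam = do
  done <- newEmptyMVar
  _ <- forkIO (withIOParam currentUser "alice" (getIOParam currentUser)
                 >>= putMVar done)
  inChild <- takeMVar done
  inMain <- getIOParam currentUser
  return (inChild, inMain)
```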

> > > unless there was some obvious technical reason why the
> > > thread local state needed to be thread local (can't think of any
> > > such reason right now).
> > 
> > Some things are not immediately obvious. If you don't like to think of
> > reasons, then just take my word for it that it would help me. A
> > facility for thread-local variables would be just another of many
> > facilities that programmers could choose from when designing their
> > code. I'm not asking you to change the way you program - I don't care
> > how other people program. I trust them to know what is best for their
> > particular application. It's none of my business, anyway.
> 
> I think we can agree to disagree on whether they are a good idea :-)
> 
> Mainly I am concerned with the ability to share and reuse code
> between different Haskell projects. We really don't want to make
> it hard to combine libraries because one uses a lot of threading
> and the other one TLS. I think this is the most important
> issue.

Note that even if I changed the definition of 'forkIO' in my code to
use a thread pool, the program semantics would stay the same, because
all of the per-request configuration is done in the request thread
itself.
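
A minimal sketch of that claim, assuming a hypothetical handler
(handleRequest) that establishes all of its per-request context itself.
The names Fork, forkFresh, newPool and serve are made up for the
illustration; the point is only that swapping one forking strategy for
the other does not change the results.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (newChan, readChan, writeChan)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM, forever, join, replicateM_)

-- Hypothetical handler: per-request configuration is done inside the
-- request itself, not by whoever created the thread.
handleRequest :: Int -> IO String
handleRequest n = do
  let user = "user" ++ show n
  return ("handled request " ++ show n ++ " for " ++ user)

-- A way of running a task concurrently.
type Fork = IO () -> IO ()

-- Fork a new thread for each task.
forkFresh :: Fork
forkFresh act = () <$ forkIO act

-- Hand each task to one of a fixed number of pre-allocated workers.
newPool :: Int -> IO Fork
newPool size = do
  jobs <- newChan
  replicateM_ size (forkIO (forever (join (readChan jobs))))
  return (writeChan jobs)

-- Serve a batch of requests with the given strategy and collect the
-- replies in request order.
serve :: Fork -> [Int] -> IO [String]
serve fork reqs = do
  boxes <- forM reqs $ \n -> do
    box <- newEmptyMVar
    fork (handleRequest n >>= putMVar box)
    return box
  mapM takeMVar boxes
```

Calling serve forkFresh and serve with a pool from newPool on the same
list of requests should give the same replies.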

> > Since Simon Marlow said that he had been considering a thread-local
> > variable facility, I merely wanted to voice my support:
> > 
> > http://www.mail-archive.com/haskell@haskell.org/msg18398.html
> > 
> > It seems that there are enough resources to implement one. The
> > discussion should not be about "do we allow this" but rather "what
> > should the API be".
> 
> There is nothing stopping you from implementing them.

Hmm? Can you give me an implementation based on the current set of
standard libraries in GHC? Or were you implying that there's nothing
stopping me from modifying the compiler to implement them?

Frederik

-- 
http://ofb.net/~frederik/