Finalizers etcetera

Malcolm Wallace Malcolm.Wallace at cs.york.ac.uk
Wed Oct 9 06:15:20 EDT 2002


Alastair Reid <alastair at reid-consulting-uk.ltd.uk> writes:

> > (A mutex lock is required to ensure that the pending queue is not
> > traversed more than once, but that is all I think.)
> 
> I don't understand how this would work in single-threaded systems (like
> I understand NHC to be).

Sorry, I am probably abusing the terminology.  What I mean is that in
the following single-threaded situation:
  * GC has just completed
  * the pending finaliser queue is being run
  * a finaliser allocates enough to trigger another GC
  * after that GC, the pending finaliser queue is run again
the final step is excluded: the second time the RTS enters the pending
queue, it realises that processing is already in progress and returns
immediately without doing any work.  No blocking is involved.
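
To make that concrete, here is roughly what I mean, sketched in
Haskell terms (the real mechanism lives in C inside the RTS, and the
names pendingFinalizers, drainingQueue and runPendingFinalizers are
invented for illustration):

    import Control.Monad (unless)
    import Data.IORef
    import System.IO.Unsafe (unsafePerformIO)

    -- Hypothetical queue of finalisers made pending by the last GC,
    -- plus a flag recording whether the queue is already being drained.
    {-# NOINLINE pendingFinalizers #-}
    pendingFinalizers :: IORef [IO ()]
    pendingFinalizers = unsafePerformIO (newIORef [])

    {-# NOINLINE drainingQueue #-}
    drainingQueue :: IORef Bool
    drainingQueue = unsafePerformIO (newIORef False)

    -- Entered after every GC.  If a finaliser allocates and triggers a
    -- nested GC, the nested call sees the flag already set and returns
    -- at once, so the queue is never traversed twice.
    runPendingFinalizers :: IO ()
    runPendingFinalizers = do
      busy <- readIORef drainingQueue
      unless busy $ do
        writeIORef drainingQueue True
        let drain = do
              fs <- readIORef pendingFinalizers
              case fs of
                []       -> return ()
                (f:rest) -> do
                  writeIORef pendingFinalizers rest
                  f          -- may itself allocate and trigger a GC
                  drain
        drain
        writeIORef drainingQueue False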

> Not having a preemptive scheduler means:
> 
> 1) That the code was not written under the additional constraints of
>    worrying about atomicity of data structure modifications.
> 
>    Here, the code referred to is mostly the C code used to implement
>    the Hugs runtime system and large chunks of the standard libraries.

So are you saying that if a GC were to occur in the middle of a
C-implemented Hugs primitive, it could be bad news?  Surely there must
be some mechanism for ensuring that allocation during a primitive
call is safe?  For instance, any primitives in nhc98's RTS that
allocate on the heap exclude the possibility of GC by checking there
is enough free space before doing anything.  (By contrast, FFI calls
(a) cannot heap-allocate, and (b) do not copy any heap pointers,
so a GC would be harmless.)

> 2) That there is no blocking mechanism handy with which to implement
>    the mutexes needed to make data structure modifications atomic.

Let's see if I grasp this right.  With a non-preemptive scheduler,
data structure modifications are guaranteed to be atomic, because all
context switches are explicit (e.g. at certain IO calls).  If anything
breaks this assumption (e.g. if the GC runs a finaliser pre-empting
the main thread), then atomicity is lost.  Ok, fair enough, I think
I see the problem.
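
A tiny illustration of the hazard (the counter is hypothetical; think
of it as a count of live foreign resources shared between the program
proper and its finalisers):

    import Data.IORef

    -- Under purely co-operative scheduling this read-modify-write is
    -- atomic: nothing else can run between the read and the write.
    increment :: IORef Int -> IO ()
    increment c = do
      n <- readIORef c
      writeIORef c (n + 1)

    -- But if a GC-triggered finaliser runs between those two steps and
    -- performs, say, modifyIORef c (subtract 1), its update is silently
    -- overwritten when the interrupted thread resumes and writes back
    -- (n + 1).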

> I find it a bit misleading because it seems to suggest that NHC will
> not suffer from the same problem because NHC doesn't have a
> cooperative scheduler.

nhc98 doesn't have a scheduler at all.  However, I think the point
you are getting at is atomicity.  I'm afraid I have been rather slow
to see that we are talking about /global variables/!  Protection of
data structure access is guaranteed in a single-threaded system by
simple referential transparency.  But you are suggesting that the
main use for finalisers is to fiddle "behind the scenes" with data
structures that are shared with other parts of the program.  Eeek.
Suddenly I'm not surprised that there are difficulties.

> That is, you have to provide mutexes (because the only useful
> finalizers are those that access shared mutable data structures).

I'm afraid I don't agree that the /only/ useful finalisers manipulate
global variables.  I can think of plenty of more tractable situations,
e.g. releasing the various pieces of storage for a foreign value built
of several components, where a finaliser written in Haskell would be
preferable to one in C.
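
For instance (a sketch only: the CImage record and its fields are
invented, and I am assuming a newForeignPtr that accepts an arbitrary
Haskell IO action as the finaliser, as in Foreign.Concurrent):

    import Foreign.Ptr (Ptr)
    import Foreign.ForeignPtr (ForeignPtr)
    import qualified Foreign.Concurrent as FC
    import Foreign.Marshal.Alloc (free)

    -- A hypothetical foreign value built from several separately
    -- malloc'd pieces.
    data CImage = CImage
      { imgHeader  :: Ptr ()
      , imgPixels  :: Ptr ()
      , imgPalette :: Ptr ()
      }

    -- The finaliser just releases each component in turn.  It touches
    -- no shared Haskell state, so no atomicity question arises; writing
    -- it in Haskell merely saves a fresh C stub for every such type.
    wrapImage :: Ptr () -> CImage -> IO (ForeignPtr ())
    wrapImage handle img =
      FC.newForeignPtr handle $ do
        free (imgPalette img)
        free (imgPixels img)
        free (imgHeader img)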

However, given that you do want to manipulate shared global variables,
I still have a question.  Why not build a queue of pending finalisers,
make each one into a co-operative thread, and add those threads to the
pool of threads being managed by Hugs' normal scheduler?  It seems to
me that this guarantees the atomicity you need.
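
In Concurrent Haskell terms the idea is something like the sketch
below (the names are invented, and in Hugs the forkIO would really be
a call into its own internal thread machinery):

    import Control.Concurrent (forkIO)
    import Data.IORef

    -- Hypothetical queue that the GC appends to, instead of running
    -- the finalisers itself.
    type FinalizerQueue = IORef [IO ()]

    enqueueFinalizer :: FinalizerQueue -> IO () -> IO ()
    enqueueFinalizer q f = modifyIORef q (f :)

    -- Called from an ordinary scheduling point, never from inside the
    -- GC: each pending finaliser becomes a co-operative thread, so it
    -- runs only when some thread yields, and atomicity between context
    -- switches is preserved.
    scheduleFinalizers :: FinalizerQueue -> IO ()
    scheduleFinalizers q = do
      fs <- readIORef q
      writeIORef q []
      mapM_ forkIO fs

The only downside is: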

> (because waiting until the main thread terminated or
> took another lock would be unacceptable).

So why is this unacceptable?  You need to delay the finaliser for
correctness.  By the assumptions of the co-operative scheduling
model, the only safe moment to run the finaliser is when the current
thread releases control.  But now you seem to be saying that this is
bad, because you want timeliness, and timeliness means you need
pre-emption.  Make your mind up!  Either you accept the co-operative
scheduling model and put up with lack of timeliness, or you decide
that timeliness is more important and move to pre-emption.

Regards,
    Malcolm


