[Haskell-cafe] Re: Lazy IO and closing of file handles

Claus Reinke claus.reinke at talk21.com
Wed Mar 21 08:09:54 EDT 2007


[trigger garbage collection when open runs out of free file descriptors, then try again]
>> so, instead of documenting limitations and workarounds, this issue should be
>> fixed in GHC as well.
>
> This may help in some cases but it cannot be relied upon. Finalizers are
> always run in a separate thread (must be, see
> http://www.hpl.hp.com/techreports/2002/HPL-2002-335.html). Thus, even if
> you force a GC when handles are exhausted, as hugs seems to do, there is no
> guarantee that by the time the GC is done the finalizers have freed any
> handles (assuming that the GC run really detects any handles to be
> garbage).

useful reference to collect!-) but even that mentions giving back os resources such
as file descriptors as one of the simpler cases. running the GC/finalizers sequence
repeatedly until nothing more changes might be worth thinking about, as are possible
race conditions. here is the thread the paper is referring to as one of its origins:

    http://gcc.gnu.org/ml/java/2001-12/msg00113.html
    http://gcc.gnu.org/ml/java/2001-12/msg00390.html

i also like the idea mentioned as one of the alternatives in section 3.1 of the paper,
where the finalizer does not notify the object that is to become garbage, but a
different manager object. in this
case, one might notify the i/o handler, and that could take care of avoiding trouble.

in my opinion, if my code or my finalizers hold on to resources i'd like to see freed,
then i'm responsible, even if i might need language help to remedy the situation.
but if i take care to avoid such references, and the system still runs out of resources
just because it can't be bothered to check right now whether it has some left to free,
there is nothing i can do about it (apart from complaining, that is!-).

of course, this isn't new. see, for instance, this thread view:
http://groups.google.com/group/fa.haskell/browse_thread/thread/2f1f855c8ba33a5/74d32070dbcc92fc?lnk=st&q=hugs+openFile+file+descriptor+garbage+collection&rnum=1#74d32070dbcc92fc

where Remi Turk points out System.Mem.performGC, and Simon Marlow
agrees that GHC should do more to free file descriptors, but also mentions that
performGC doesn't run finalizers.
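
a minimal sketch of the "collect, then try again" idea from the top of this message,
assuming GHC's base library. retryAfterGC, tryIO and openFileRetry are hypothetical
names, not existing library functions. it is best effort only: as noted above,
performGC does not itself run finalizers, and the yield merely gives finalizer
threads a chance to run, so a retry can still fail.

```haskell
import Control.Concurrent (yield)
import Control.Exception (IOException, try)
import System.IO (Handle, IOMode, openFile)
import System.Mem (performGC)

-- | 'try' specialised to IO errors (hypothetical helper).
tryIO :: IO a -> IO (Either IOException a)
tryIO = try

-- | Hypothetical helper: run an action; if it fails with an IO error,
-- force a major GC, yield so finalizer threads may get to run, and retry
-- up to @n@ more times before re-throwing the last error.
-- Simplification: this retries on *any* IO error, not only on
-- "too many open files".
retryAfterGC :: Int -> IO a -> IO a
retryAfterGC n act = do
  r <- tryIO act
  case r of
    Right x -> return x
    Left e
      | n <= 0    -> ioError e
      | otherwise -> do
          performGC
          yield
          retryAfterGC (n - 1) act

-- | openFile wrapped in the retry loop above (hypothetical).
openFileRetry :: Int -> FilePath -> IOMode -> IO Handle
openFileRetry n path mode = retryAfterGC n (openFile path mode)
```

something like openFileRetry 3 f ReadMode could then stand in for openFile f ReadMode;
whether the retries ever succeed depends on the finalizers having actually run.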

actually, if i have readFile-based code that immediately processes the file contents
before the next readFile, as in Matthew's test code, my ghci (on windows) doesn't
seem to run out of file descriptors easily, but if i force a descriptor leak by leaving
unreferenced contents unprocessed, then performGC does seem to help (not that
this is ideal in general, as discussed in the thread above):

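    import System.Environment
    import System.Mem
    import System.IO

    -- usage: prog <count> <file>
    -- note: `catch` is the Haskell 98 Prelude one; with newer GHCs (where it is
    -- no longer in the Prelude), System.IO.Error.catchIOError is the equivalent.
    main = do
      n:f:_ <- getArgs
      -- force a descriptor leak: open f until openFile fails, dropping every handle
      (sequence (repeat (openFile f ReadMode)) >> return ()) `catch` (\_->return ())
      test1 (take (read n) $ repeat f)

    -- read f once per list entry and process the first 10 lines each time
    test1 files = mapM_ doStuff files where
      -- uncomment performGC to collect the leaked handles before each readFile
      doStuff f = {- performGC >> -} readFile f >>= print.map length.take 10.lines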

interestingly, if i do that, even Hugs seems to need the performGC?
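
for contrast, a sketch of the kind of workaround this thread argues one should not
need: reading the file strictly, so that the handle is closed before the function
returns and nothing is left for the GC. readFileStrict is a hypothetical name; the
sketch assumes the whole file fits in memory.

```haskell
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- | Hypothetical helper: read a whole file and close its handle before
-- returning, instead of leaving the close to lazy IO or the GC.
readFileStrict :: FilePath -> IO String
readFileStrict path =
  withFile path ReadMode $ \h -> do
    s <- hGetContents h
    -- force the entire contents while the handle is still open;
    -- withFile closes the handle as soon as this block returns
    length s `seq` return s
```

with this in place of readFile, only the first 10 lines are still used by doStuff,
but the descriptor is released whether or not the rest is ever looked at.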

claus

ps. one could even try to go further, and have virtual file descriptors, like virtual
    memory. but that is something for the os, i guess.
 


