POpen, opening lots of processes

Glynn Clements glynn.clements at virgin.net
Fri Jan 9 01:33:19 EST 2004


Hal Daume III wrote:

> > What does the output from "ps" indicate?
> 
> It lists all the processes as defunct:
> 
> 19981 pts/5    Z      0:00 [suffixtree <defunct>]
> 19982 pts/5    Z      0:00 [suffixtree <defunct>]
> 19983 pts/5    Z      0:00 [suffixtree <defunct>]
> 19984 pts/5    Z      0:00 [suffixtree <defunct>]
> 19985 pts/5    Z      0:00 [suffixtree <defunct>]
> ...
> 
> > If you have any "live" processes (S or R state), it's probably because
> > the process' output hasn't been consumed, so the program hasn't
> > exit()ed yet. OTOH, if you have zombies (Z state), the program has
> > terminated but the parent (your program) hasn't called wait/waitpid
> > (the Haskell interface is getProcessStatus, getProcessGroupStatus or
> > getAnyProcessStatus).
> 
> I don't mind evaluating the contents returned strictly, but I can't figure 
> out how to force the process into a dead state...I don't see how any of 
> these three functions accomplishes that...what am I missing?

A "zombie" process (such as the above) is a process which has
terminated but which can't actually be removed from the system's
process table until the parent has retrieved its exit status.

That's where getProcessStatus etc. (wait/waitpid at the C level) come
in: when called with their first ("block") argument set to True, these
functions block until a suitable[1] process has terminated, and return
its exit status. Once the status has been collected, the process can
finally be removed from the process table (this is termed "reaping").

[1] getProcessStatus waits for a specific process,
getProcessGroupStatus waits for any process in a specific process
group, and getAnyProcessStatus waits for any child process.
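
To make the difference between blocking and non-blocking reaping
concrete, here is a minimal sketch built on getAnyProcessStatus from
System.Posix.Process. The names reapChildren and spawnTrivial are
hypothetical helpers made up for illustration, and the sketch assumes
that the only error getAnyProcessStatus raises in practice is "no
child processes" (ECHILD):

```haskell
import Control.Exception (IOException, try)
import System.Posix.Process (ProcessStatus, forkProcess, getAnyProcessStatus)
import System.Posix.Types (ProcessID)

-- | Reap children via getAnyProcessStatus (waitpid(-1) at the C level).
--   block = False: collect only children that have already terminated.
--   block = True:  keep waiting until no children remain.
-- reapChildren is a hypothetical helper, not part of any library.
reapChildren :: Bool -> IO [(ProcessID, ProcessStatus)]
reapChildren block = do
  r <- try (getAnyProcessStatus block False)
         :: IO (Either IOException (Maybe (ProcessID, ProcessStatus)))
  case r of
    -- Assumption: an IOException here means there are no children left.
    Left _ -> return []
    -- Non-blocking mode only: children exist, but none has terminated yet.
    Right Nothing -> return []
    -- One child reaped; go round again for the next one.
    Right (Just child) -> fmap (child :) (reapChildren block)

-- | Hypothetical demo helper: fork n children that exit immediately.
spawnTrivial :: Int -> IO [ProcessID]
spawnTrivial n = mapM (const (forkProcess (return ()))) [1 .. n]
```

Calling reapChildren False periodically clears out any zombies that
have accumulated without stalling the program. Note that
getAnyProcessStatus collects *any* child, so it will also take the
exit status of children that some other part of the program intended
to wait for itself.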

So, you probably want something like:

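do
	-- popen returns the child's stdout and stderr as lazy Strings
	(out, err, pid) <- popen cmd args (Just input)
	-- consume both streams completely, e.g.:
	writeFile "/dev/null" out
	writeFile "/dev/null" err
	-- block (True) until pid has terminated; ignore stopped children (False)
	getProcessStatus True False pid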

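You need to ensure that the output is consumed before calling
getProcessStatus. Otherwise, a child which produces more output than
the pipe can buffer will block in write() and never exit, so
getProcessStatus will block indefinitely (i.e. deadlock).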
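
As a rough sketch of "consume first, then wait", here is the same idea
written against the System.Process module from the "process" package
rather than POpen (that substitution is mine, as are the hypothetical
names drain and runAndReap). It reads stdout and stderr in separate
threads, so a child which fills one pipe while we are reading the
other can't stall us, and forces both strings before waiting:

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Exception (evaluate)
import System.Exit (ExitCode (..))
import System.IO (Handle, hClose, hGetContents, hPutStr)
import System.Process
  ( CreateProcess (..), StdStream (CreatePipe)
  , createProcess, proc, waitForProcess )

-- | Read a handle to EOF in a separate thread. The MVar is filled only
-- once the whole contents have been forced into memory.
-- (Hypothetical helper; no error handling.)
drain :: Handle -> IO (MVar String)
drain h = do
  var <- newEmptyMVar
  _ <- forkIO $ do
    s <- hGetContents h
    _ <- evaluate (length s)   -- defeat lazy I/O: read everything now
    putMVar var s
  return var

-- | Run a command, feed it input, consume all of its output, then reap it.
-- (Hypothetical helper, not a library function.)
runAndReap :: FilePath -> [String] -> String -> IO (String, String, ExitCode)
runAndReap cmd args input = do
  (Just hin, Just hout, Just herr, ph) <-
    createProcess (proc cmd args)
      { std_in = CreatePipe, std_out = CreatePipe, std_err = CreatePipe }
  outVar <- drain hout          -- start both readers before writing input
  errVar <- drain herr
  hPutStr hin input
  hClose hin                    -- the child sees EOF on its stdin
  out <- takeMVar outVar        -- both pipes are now fully consumed
  err <- takeMVar errVar
  code <- waitForProcess ph     -- returns promptly and reaps the child
  return (out, err, code)
```

For example, runAndReap "sh" ["-c", "cat; echo oops >&2"] "hello\n"
should hand back the child's stdout, its stderr and its exit code, and
leave no zombie behind. The "process" package's own
readProcessWithExitCode does much the same job and handles errors
properly, so prefer that outside of a demonstration.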

IMNSHO, this is one area where lazy I/O sucks even more than usual. 
It's bad enough having it tie up descriptors, let alone processes.

It probably works fine for "simple, stupid programs", which spawn a
handful of child processes (which they don't bother to reap) and then
terminate shortly thereafter (a process whose parent has terminated is
"adopted" by the init process, which can be relied upon to reap it
when it terminates).
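
For a program which spawns lots of children over a long run, the
simplest fix is to reap each child before starting the next. A minimal
sketch, assuming the "process" package is available and using a
hypothetical helper name runAll; it relies on readProcessWithExitCode
reading the output strictly and waiting for the child before it
returns:

```haskell
import System.Exit (ExitCode (..))
import System.Process (readProcessWithExitCode)

-- | Run the same command once per argument list, one child at a time.
-- Each child has been waited for (reaped) before the next is spawned,
-- so zombies never accumulate. (Hypothetical helper name.)
runAll :: FilePath -> [[String]] -> IO [(ExitCode, String)]
runAll cmd = mapM $ \args -> do
  -- Empty string as stdin; stderr is read but discarded here.
  (code, out, _err) <- readProcessWithExitCode cmd args ""
  return (code, out)
```

For example, runAll "echo" [["a"], ["b"]] runs two children in
sequence and never has more than one of them alive (or defunct) at a
time.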

-- 
Glynn Clements <glynn.clements at virgin.net>

