[Haskell-cafe] Re: sendfile leaking descriptors on Linux?

Jeremy Shaw jeremy at n-heptane.com
Sun Feb 21 20:17:06 EST 2010


On Sun, Feb 21, 2010 at 6:39 PM, Donn Cave <donn at avvanta.com> wrote:

> Quoth Jeremy Shaw <jeremy at n-heptane.com>,
> ...
> > What happens is the PS3 has closed the connection, and if you attempt
> > to send any more packets the PS3 will tell you it has closed the
> > connection and the write() / sendfile() call will raise SIGPIPE.
> ...
> > So far there is:
> >
> >   - no way for anyone besides Bardur to reproduce the problem
> >   - no sound explanation for why the PS3 client causes the error,
> >     but nothing else does
>
> I think in fact this invalidates your premise.  If the PS3 really
> closed its connection in the standard fashion, then it would be trivial
> to reproduce this problem with any other peer.  Evidently it doesn't,
> at least in this particular case, and that's why people are talking
> about TCP keep-alives, which address the defunct peer problem (within
> two hours, normally.)


The PS3 does do something, though. If we were doing a write *and* read select
on the socket, the read select would wake up. So it is trying to notify us
that something has happened, but we are not seeing it because we are only
looking at the write select().
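
To make "write *and* read select" concrete, here is a minimal sketch that
waits for whichever readiness event arrives first. The names Ready,
waitReadOrWrite and demo are hypothetical helpers made up for illustration;
only threadWaitRead / threadWaitWrite and the System.Posix.IO calls are real
library functions. The demo uses a pipe rather than a socket, so it says
nothing about what the PS3 actually sends.

```haskell
import Control.Concurrent (forkIO, killThread, threadWaitRead, threadWaitWrite)
import Control.Concurrent.MVar (newEmptyMVar, takeMVar, tryPutMVar)
import Control.Exception (finally)
import Control.Monad (void)
import System.Posix.IO (closeFd, createPipe, fdWrite)
import System.Posix.Types (Fd)

-- | Which kind of readiness was reported first (hypothetical type).
data Ready = ReadReady | WriteReady
  deriving (Eq, Show)

-- | Block until the descriptor is readable or writable, whichever the
-- I/O manager reports first. Hypothetical helper, not a library function.
waitReadOrWrite :: Fd -> IO Ready
waitReadOrWrite fd = do
  result <- newEmptyMVar
  -- One waiter per direction; the first to wake up fills the MVar.
  r <- forkIO (threadWaitRead fd >> void (tryPutMVar result ReadReady))
  w <- forkIO (threadWaitWrite fd >> void (tryPutMVar result WriteReady))
  -- Always clean up both waiters, including the one that is still blocked.
  takeMVar result `finally` (killThread r >> killThread w)

-- | Usage sketch on a pipe: the read end has a byte pending, and the
-- write end still has buffer space.
demo :: IO (Ready, Ready)
demo = do
  (readEnd, writeEnd) <- createPipe
  _ <- fdWrite writeEnd "x"
  a <- waitReadOrWrite readEnd
  b <- waitReadOrWrite writeEnd
  closeFd readEnd
  closeFd writeEnd
  return (a, b)
```

A caller that gets ReadReady while it only wanted to write still has to
decide what that means: it may be an EOF or reset from the peer, or it may
be ordinary incoming data.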

But I cannot explain what the PS3 client is doing differently from the
other clients such that it does not cause the threadWaitWrite to wake up.

Additionally, it is not clear that setting SO_KEEPALIVE will actually fix
anything. The documentation that I have read indicates that it may only
cause the read select() to wake up, not the write select(). Well, that is no
good, because that is supposedly what is happening with the PS3 client
already.

Anyway, part of the annoyance here is that in this particular case we
shouldn't need any timeouts to 'guess' that the client is 'probably dead'.
The client seems to be telling us that it has disconnected, but we are not
looking in the right place. And if we did try to write, we would get a
SIGPIPE error.
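
For reference, a minimal sketch of how that failure looks from Haskell,
assuming (as far as I know) that the GHC runtime ignores SIGPIPE by default,
so the failed write() surfaces as an IOError carrying EPIPE, which base
reports as ResourceVanished. The name writeToClosedPeer is a hypothetical
helper, and a pipe with its read end closed stands in for the disconnected
socket.

```haskell
import Control.Exception (try)
import GHC.IO.Exception (IOErrorType (ResourceVanished))
import System.IO.Error (ioeGetErrorType)
import System.Posix.IO (closeFd, createPipe, fdWrite)

-- | Write to a pipe whose read end is already closed, and report whether
-- the write failed with a "resource vanished" (EPIPE) error.
-- Hypothetical helper for illustration only.
writeToClosedPeer :: IO Bool
writeToClosedPeer = do
  (readEnd, writeEnd) <- createPipe
  closeFd readEnd                        -- the "peer" goes away
  result <- try (fdWrite writeEnd "hello")
  closeFd writeEnd
  return $ case result of
    Left e  -> ioeGetErrorType e == ResourceVanished
    Right _ -> False                     -- the write unexpectedly succeeded
```

The point is that the error only shows up once we actually attempt the
write; a thread parked in threadWaitWrite never gets that far.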

It is not the case that the client is unresponsive -- it is quite responsive.
The problem is that we are not looking in the right place for that response.

But 'looking in the right place' is tricky. How do you tell hPut that it
should wake up from threadWaitWrite if the Handle happens to be backed by a
socket and threadWaitRead has data available? That does not even always
indicate an error condition; it can be a perfectly valid situation.

Well, before I think about that, I want to know what the PS3 client is doing
differently such that it is the only client that seems to exhibit this
behavior at the moment. If we do not understand the real difference between
what the PS3 and the C client are doing, then I don't think we can expect to
arrive at an appropriate fix.

- jeremy