simpler I/O buffering [was: RE: An IO Question from a Newbie]

Glynn Clements glynn.clements at virgin.net
Tue Sep 16 03:04:01 EDT 2003


Dean Herington wrote:

> I've long thought that I/O buffering behavior--not just in Haskell, but 
> in most places I've encountered it--was unnecessarily complicated.  
> Perhaps it could be simplified dramatically by treating it as strictly a 
> performance optimization.

This isn't entirely possible; there will always be situations where
it matters exactly when and how the data gets passed to the OS. In my
experience, the simplest solution in such situations is not to use
ANSI stdio buffering at all.
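
The Haskell-level analogue (a minimal sketch, not anything from the
original question) is to either switch the handle to NoBuffering, or
keep block buffering for throughput and flush explicitly at the
points where the data actually has to reach the OS:

import System.IO

main :: IO ()
main = do
  -- When the timing of delivery matters (say, the reader is another
  -- process on the end of a pipe), either give up on user-space
  -- buffering entirely...
  hSetBuffering stdout NoBuffering
  putStrLn "request"
  -- ...or keep it for throughput, but flush at the points that matter.
  hSetBuffering stdout (BlockBuffering Nothing)
  putStr "another request\n"
  hFlush stdout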

> Here's a sketch of the approach.
> 
> Writing a sequence of characters across the interface I'm proposing is a 
> request by the writing program for those characters to appear at their 
> destination "soon".  Ideally, "soon" would be "immediately"; however, the 
> characters' appearance may deliberately be delayed ("buffered"), for 
> efficiency, as long as such delay is "unobtrusive" to a human user of the 
> program.  Buffering timeouts would depend on the device; for a terminal, 
> perhaps 50-100 ms would be appropriate.  Such an interval would tend not 
> to be noticeable to a human user but would be long enough to effectively 
> collect, say, an entire line of output for output "in one piece".  The use 
> of a reasonable timeout would avoid the confusing behavior where a 
> newline-less prompt doesn't appear until the prompted data is entered.
> 
> With this scheme, I/O buffering no longer has any real semantic content.  
> (In particular, the interface never guarantees indefinite delay in 
> outputting written characters.  Line buffering, if semantically important, 
> needs to be done above the level of this interface.)

That's already true, at least in C: if you output a line which is
longer than the buffer, the buffer will be flushed before it contains
a newline (i.e. the line won't be written atomically).
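
The same effect is easy to reproduce from GHC with an artificially
small buffer (a sketch; how exactly such a tiny buffer size is
honoured is implementation-dependent):

import Control.Concurrent (threadDelay)
import System.IO

main :: IO ()
main = do
  hSetBuffering stdout (BlockBuffering (Just 8))
  -- No newline yet, but the string overflows the 8-byte buffer, so
  -- most of it reaches the OS straight away...
  putStr "a fairly long line with no newline in it"
  threadDelay 2000000
  -- ...as watching the terminal during the pause shows: the line is
  -- not written atomically.
  putStrLn " <end>"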

> Hence, buffering 
> control could be completely eliminated from the interface.  However, I 
> would retain it to provide (non-semantic) control over buffering.  The 
> optional buffer size currently has such an effect.  A timeout value could 
> be added for fine tuning.  (Note that a timeout of zero would have an 
> effect similar to Haskell's current NoBuffering.)  Lastly, the "flush" 
> operation would remain, as a hint that it's not worth waiting even the 
> limited timeout period before endeavoring to make the characters appear.
> 
> Is such an approach feasible?

Possibly.

As things stand, anyone who writes code which relies upon output
being held back until a flush is asking for trouble. So, your
approach wouldn't make it any harder to write correct code, although
it might make it significantly more obvious when code is incorrect.
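
Code that does care about output appearing in complete units has to
arrange that itself, above the buffering layer, roughly along these
lines (writeRecord is a made-up helper, not part of any library):

import System.IO

-- If a record has to appear in one piece, build the whole thing
-- above the buffering layer and hand it over in a single call,
-- rather than writing fragments and relying on the buffer to hold
-- them back until a flush.
writeRecord :: Handle -> [String] -> IO ()
writeRecord h fields = do
  hPutStr h (concat fields ++ "\n")
  hFlush h

main :: IO ()
main = writeRecord stdout ["field1", "\t", "field2"]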

AFAICT, the biggest problem would be providing an upper bound on the
delay, as that implies some form of preemptive concurrency.
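
With GHC's lightweight threads, something along these lines would
approximate the bound (a sketch; flushEvery is a made-up name, and a
real implementation would only arm the timer while the buffer is
dirty):

import Control.Concurrent (forkIO, threadDelay)
import Control.Monad (forever, void)
import System.IO

-- A background thread that flushes the handle every so often,
-- putting an upper bound on how long buffered output can sit around.
flushEvery :: Int -> Handle -> IO ()
flushEvery micros h =
  void (forkIO (forever (threadDelay micros >> hFlush h)))

main :: IO ()
main = do
  hSetBuffering stdout (BlockBuffering Nothing)
  flushEvery 100000 stdout      -- at most ~100 ms of delay
  putStr "Name: "               -- no newline, no explicit flush...
  name <- getLine               -- ...but the prompt still shows up
  putStrLn ("Hello, " ++ name)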

> Has it been implemented anywhere?

Not that I know of.

> Would such behavior best be implemented by the operating system?

No. The OS (i.e. kernel) doesn't know anything about user-space
buffering. Furthermore, one of the main functions of user-space
buffering is to minimise the number of system calls, so putting it
into the OS would be pointless.
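
For a rough feel of what the user-space buffer buys (a sketch; count
the write() calls with strace or similar):

import System.IO

main :: IO ()
main = do
  -- Unbuffered: each of these ends up as (roughly) its own write()...
  hSetBuffering stdout NoBuffering
  mapM_ (hPutStr stdout) (replicate 1000 "x")
  -- ...block buffered: the same output reaches the kernel in a
  -- handful of large writes.
  hSetBuffering stdout (BlockBuffering Nothing)
  mapM_ (hPutStr stdout) (replicate 1000 "x")
  hFlush stdout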

> Could it be implemented by the runtime system?

It depends on what you mean by "the runtime system"; either way, it
would have to be implemented in user space.

-- 
Glynn Clements <glynn.clements at virgin.net>

