Raw I/O library proposal, second (more pragmatic) draft

John Meacham john@repetae.net
Tue, 5 Aug 2003 14:00:17 -0700


On Tue, Aug 05, 2003 at 12:34:03AM -0700, Seth Kurtzberg wrote:
> On Tuesday, August 5, 2003, at 12:30  AM, Ben Rudiak-Gould wrote:
> >On Tue, 5 Aug 2003, Seth Kurtzberg wrote:
> >>For my purposes (transaction logging for my database server) I need to
> >>be able to guarantee that data is written to disk.  That is, it isn't
> >>enough to disable buffering in the compiler libraries (all libraries,
> >>more accurately), I need to also force the O/S to flush the data to
> >>disk.
> >>
> >>This is difficult to do in a portable manner, obviously, but if a
> >>practical way can be found it would have many uses in systems using
> >>transactional semantics.  It would also get rid of an FFI dependency
> >>for my code.
> >
> >My intended semantics for the osFlush function was always that it
> >would do its best to ensure that the data was "pushed as far as
> >possible" toward its final destination.
> >
> >If you need a guarantee, the function could be made to return a Bool,
> >with True indicating that it was absolutely sure that the data had
> >made it all the way. But I don't think that it could ever return True.
> >It might be running in a VMware sandbox without realizing it, for
> >example. So you'll probably have to run tests on your particular setup
> >to see how well it works.
> 
> That is certainly true, but to get even that far the semantics have to 
> exist.  You've answered my question; osFlush means (assuming that the 
> O/S can provide the functionality) flush to permanent storage.

There are three useful levels of flush that I can think of: flush all
userspace buffers to the OS, flush all data to disk, and flush all data
and metadata to disk. The OS interfaces would be fflush (well, the
internal Haskell equivalent), fdatasync, and fsync.
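
To make the three levels concrete, here is a rough C sketch against the
POSIX calls named above. It is only an illustration, not the proposed
Haskell interface: the names flush_level and append_record are made up
for this example, and it assumes a POSIX system that provides fdatasync
(Linux does; some other systems do not).

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical names for the three flush levels discussed above. */
enum flush_level {
    FLUSH_TO_OS,             /* fflush only: userspace buffer -> OS   */
    FLUSH_DATA,              /* fflush + fdatasync: file data -> disk */
    FLUSH_DATA_AND_METADATA  /* fflush + fsync: data + metadata       */
};

/*
 * Append `record` to the file at `path`, then flush to the requested
 * level. Returns 0 on success and -1 on any failure.
 */
int append_record(const char *path, const char *record,
                  enum flush_level level)
{
    FILE *fp = fopen(path, "a");
    if (fp == NULL)
        return -1;

    int rc = 0;

    /* Level 1 is always needed: the OS cannot sync data that is
     * still sitting in the stdio buffer. */
    if (fputs(record, fp) == EOF || fflush(fp) != 0)
        rc = -1;

    if (rc == 0 && level != FLUSH_TO_OS) {
        int fd = fileno(fp);
        /* Levels 2 and 3: ask the OS to push the data (and, for
         * fsync, the file metadata too) to the storage device. */
        int synced = (level == FLUSH_DATA) ? fdatasync(fd) : fsync(fd);
        if (synced != 0)
            rc = -1;
    }

    if (fclose(fp) != 0)
        rc = -1;
    return rc;
}
```

A transaction log would call something like
append_record("txn.log", "commit 42\n", FLUSH_DATA) and treat a return
of -1 as "not durable". As noted earlier in the thread, even a
successful return only means the OS reported success, not that the data
is guaranteed to be on the platter.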

I think there is a use for all of them. In particular, being able to
flush to the OS without doing an fsync is good for network traffic,
because always fsyncing can interfere with the normal TCP packet
consolidation logic (on some OSes). The fdatasync vs. fsync distinction
is useful too: fdatasync can skip metadata updates (such as timestamps)
that are not needed to read the data back, so it can be much faster on
many types of filesystems, and it is usually what people want.
        John


-- 
---------------------------------------------------------------------------
John Meacham - California Institute of Technology, Alum. - john@foo.net
---------------------------------------------------------------------------