[Haskell-cafe] File I/O benchmark help (conduit, io-streams and Handle)

Michael Snoyman michael at snoyman.com
Fri Mar 8 07:50:50 CET 2013


One clarification: sourceFile and sourceFileNoHandle appear to perform
virtually identically. The gap comes exclusively from sinkFile vs
sinkFileNoHandle. This makes me suspect that a buffer copy on the write
side is causing the slowdown, in which case the benchmark may in fact be
accurate.
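
To make the write-side question concrete, here is a minimal sketch (not the conduit code) of a chunked copy that goes through Handles on both ends while reusing one caller-owned buffer. It uses only base; the name copyViaHandle, the chunk size, and the temporary file names are illustrative assumptions. Whether the Handle layer adds an extra copy for a given chunk size is exactly what a timing run would have to show; the sketch only isolates that path.

```haskell
module Main (main) where

import Control.Monad (unless)
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr)
import System.IO

-- | Copy src to dst through Handles, reusing one caller-owned buffer.
-- chunkSize is an illustrative parameter, not a value from the benchmark.
copyViaHandle :: Int -> FilePath -> FilePath -> IO ()
copyViaHandle chunkSize src dst =
  withBinaryFile src ReadMode $ \hIn ->
    withBinaryFile dst WriteMode $ \hOut ->
      allocaBytes chunkSize $ \buf -> loop hIn hOut (buf :: Ptr Word8)
  where
    loop hIn hOut buf = do
      -- hGetBuf returns 0 only when end of file is reached.
      n <- hGetBuf hIn buf chunkSize
      unless (n == 0) $ do
        -- The bytes pass through the Handle's write path here.
        hPutBuf hOut buf n
        loop hIn hOut buf

main :: IO ()
main = do
  -- Hypothetical scratch files, used only to exercise the copy.
  writeFile "handle-copy-input.tmp" (replicate 100000 'x')
  copyViaHandle 4096 "handle-copy-input.tmp" "handle-copy-output.tmp"
  out <- readFile "handle-copy-output.tmp"
  putStrLn (if length out == 100000 then "copy ok" else "copy mismatch")
```

Timing this loop against a raw file-descriptor loop with the same chunk size would keep the read side and the buffer constant, so any remaining gap would point at the Handle write path.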
On Mar 8, 2013 8:30 AM, "Michael Snoyman" <michael at snoyman.com> wrote:

> Hi all,
>
> I'm turning to the community for some help understanding some benchmark
> results[1]. I was curious to see how the new io-streams would work with
> conduit, as it looks like a far saner low-level approach than Handles. In
> fact, the API is so simple that the entire wrapper is just a few lines of
> code[2].
>
> I then added in some basic file copy benchmarks, comparing conduit+Handle
> (with ResourceT or bracket), conduit+io-streams, straight io-streams, and
> lazy I/O. All approaches fell into the same ballpark, with conduit+bracket
> and conduit+io-streams taking a slight lead. (I haven't analyzed that
> enough to know if it means anything, however.)
>
> Then I decided to pull up the NoHandle code I wrote a while ago for
> conduit. This code was written initially for Windows only, to work around
> the fact that System.IO.openFile does some file locking. To avoid using
> Handles, I wrote a simple FFI wrapper exposing open, read, and close system
> calls, ported it to POSIX, and hid it behind a Cabal flag. Out of
> curiosity, I decided to expose it and include it in the benchmark.
>
> The results are extreme. I've confirmed multiple times that the copy
> algorithm is in fact copying the file, so I don't think the test itself is
> cheating somehow. But I don't know how to explain the massive gap. I've run
> this on two different systems. The results you see linked are from my local
> machine. On an EC2 instance, the gap was a bit smaller, but the NoHandle
> code was still 75% faster than the others.
>
> My initial guess is that I'm not properly tying into the IO manager, but I
> wanted to see if the community had any thoughts. The relevant pieces of
> code are [3][4][5].
>
> Michael
>
> [1] http://static.snoyman.com/streams.html
> [2] https://github.com/snoyberg/conduit/blob/streams/io-streams-conduit/Data/Conduit/Streams.hs
> [3] https://github.com/snoyberg/conduit/blob/streams/conduit/System/PosixFile.hsc
> [4] https://github.com/snoyberg/conduit/blob/streams/conduit/Data/Conduit/Binary.hs#L54
> [5] https://github.com/snoyberg/conduit/blob/streams/conduit/Data/Conduit/Binary.hs#L167
>
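
For readers who have not seen the NoHandle style described in the quoted message (a thin FFI wrapper over the open, read, and close system calls), the following is a rough, POSIX-only sketch of the idea. It is not the code in PosixFile.hsc: it uses the CApiFFI extension instead of hsc2hs, adds write so that it can copy a file, and every Haskell name in it (openFd, closeFd, writeAll, copyViaFd) is hypothetical. The calls are blocking and do not register with the IO manager, which is the concern raised in the quoted message.

```haskell
{-# LANGUAGE CApiFFI #-}
module Main (main) where

import Control.Exception (bracket)
import Control.Monad (unless)
import Data.Bits ((.|.))
import Data.Word (Word8)
import Foreign.C.Error (throwErrnoIfMinus1, throwErrnoIfMinus1Retry, throwErrnoIfMinus1_)
import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CInt (..), CSize (..))
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr, plusPtr)
import System.Posix.Types (CMode (..), CSsize (..))

-- Open flags are read from the C header rather than hard-coded,
-- because their numeric values differ between platforms.
foreign import capi "fcntl.h value O_RDONLY" oRdOnly :: CInt
foreign import capi "fcntl.h value O_WRONLY" oWrOnly :: CInt
foreign import capi "fcntl.h value O_CREAT"  oCreat  :: CInt
foreign import capi "fcntl.h value O_TRUNC"  oTrunc  :: CInt

foreign import capi unsafe "fcntl.h open"
  c_open :: CString -> CInt -> CMode -> IO CInt
foreign import capi unsafe "unistd.h read"
  c_read :: CInt -> Ptr Word8 -> CSize -> IO CSsize
foreign import capi unsafe "unistd.h write"
  c_write :: CInt -> Ptr Word8 -> CSize -> IO CSsize
foreign import capi unsafe "unistd.h close"
  c_close :: CInt -> IO CInt

-- | Open a file descriptor; mode 0o644 only matters when O_CREAT is set.
openFd :: FilePath -> CInt -> IO CInt
openFd path flags = withCString path $ \p ->
  throwErrnoIfMinus1 "open" (c_open p flags 0o644)

closeFd :: CInt -> IO ()
closeFd fd = throwErrnoIfMinus1_ "close" (c_close fd)

-- | write(2) may accept fewer bytes than requested, so loop until done.
writeAll :: CInt -> Ptr Word8 -> Int -> IO ()
writeAll fd buf len = unless (len <= 0) $ do
  n <- throwErrnoIfMinus1Retry "write" (c_write fd buf (fromIntegral len))
  writeAll fd (buf `plusPtr` fromIntegral n) (len - fromIntegral n)

-- | Copy src to dst using raw descriptors and one reusable buffer.
copyViaFd :: Int -> FilePath -> FilePath -> IO ()
copyViaFd chunkSize src dst =
  bracket (openFd src oRdOnly) closeFd $ \fdIn ->
    bracket (openFd dst (oWrOnly .|. oCreat .|. oTrunc)) closeFd $ \fdOut ->
      allocaBytes chunkSize $ \buf ->
        let loop = do
              n <- throwErrnoIfMinus1Retry "read"
                     (c_read fdIn buf (fromIntegral chunkSize))
              -- read(2) returns 0 at end of file.
              unless (n == 0) $ do
                writeAll fdOut buf (fromIntegral n)
                loop
        in loop

main :: IO ()
main = do
  -- Hypothetical scratch files, used only to exercise the copy.
  writeFile "fd-copy-input.tmp" (replicate 100000 'x')
  copyViaFd 4096 "fd-copy-input.tmp" "fd-copy-output.tmp"
  out <- readFile "fd-copy-output.tmp"
  putStrLn (if length out == 100000 then "copy ok" else "copy mismatch")
```

The unsafe imports keep call overhead low but block the calling capability for the duration of each system call, so this sketch is suited to a single-threaded timing comparison rather than to a concurrent server.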