<div dir="ltr">I would have expected sourceFileNoHandle to make the most difference, since that's one location (write) where you've obviously removed a copy. Does sourceFileNoHandle allocate less?<div><br></div><div style>
Incidentally, I've recently been making similar changes to IO code (removing buffer copies) and getting similar speedups, although the results tend to be less pronounced in code that isn't strictly IO-bound.</div>
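<div><br></div><div>To make the "removed a copy" point concrete, here is a minimal sketch of a Handle-free write. It is not the code from conduit: the name writeAllFd is hypothetical, and it assumes a POSIX system and the bytestring package. The ByteString's own buffer is passed straight to write(2), so nothing is first copied into a Handle's internal buffer.</div>
<pre>
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as S8
import Data.ByteString.Unsafe (unsafeUseAsCStringLen)
import Foreign.C.Error (throwErrnoIfMinus1Retry)
import Foreign.C.String (CString)
import Foreign.C.Types (CInt (..), CSize (..))
import Foreign.Ptr (plusPtr)
import System.Posix.Types (CSsize (..))

-- POSIX write(2), called directly rather than through a Handle.
foreign import ccall "write"
    c_write :: CInt -&gt; CString -&gt; CSize -&gt; IO CSsize

-- Hypothetical helper (not part of conduit): pass the ByteString's own
-- buffer to write(2), looping on short writes and retrying on EINTR.
writeAllFd :: CInt -&gt; ByteString -&gt; IO ()
writeAllFd fd bs = unsafeUseAsCStringLen bs (uncurry go)
  where
    go _ 0 = return ()
    go ptr len = do
        n &lt;- throwErrnoIfMinus1Retry "writeAllFd" $
                 c_write fd ptr (fromIntegral len)
        let written = fromIntegral n
        go (ptr `plusPtr` written) (len - written)

main :: IO ()
main = writeAllFd 1 (S8.pack "written without a Handle\n")  -- fd 1 is stdout
</pre>
<div>By contrast, a Handle-based write may first copy the chunk into the Handle's buffer before flushing it, which is the kind of extra copy I have in mind.</div>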
</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Mar 8, 2013 at 2:50 PM, Michael Snoyman <span dir="ltr">&lt;<a href="mailto:michael@snoyman.com" target="_blank">michael@snoyman.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><p>One clarification: it seems that sourceFile and sourceFileNoHandle have virtually no difference in speed. The gap comes exclusively from sinkFile vs sinkFileNoHandle. This makes me think that it might be a buffer copy that's causing the slowdown, in which case the benchmark may in fact be accurate.</p>
<div class="HOEnZb"><div class="h5">
<div class="gmail_quote">On Mar 8, 2013 8:30 AM, "Michael Snoyman" &lt;<a href="mailto:michael@snoyman.com" target="_blank">michael@snoyman.com</a>&gt; wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr"><div><span style="line-height:18px;font-size:13px;font-family:Arial,sans-serif">Hi all,</span></div><span style="line-height:18px;font-size:13px;font-family:Arial,sans-serif"><div>
<span style="line-height:18px;font-size:13px;font-family:Arial,sans-serif"><br></span></div>I'm turning to the community for some help understanding some benchmark results[1]. I was curious to see how the new io-streams would work with conduit, as it looks like a far saner low-level approach than Handles. In fact, the API is so simple that the entire wrapper is just a few lines of code[2].</span><br style="line-height:18px;font-size:13px;font-family:Arial,sans-serif">
<br style="line-height:18px;font-size:13px;font-family:Arial,sans-serif"><span style="line-height:18px;font-size:13px;font-family:Arial,sans-serif">I then added in some basic file copy benchmarks, comparing conduit+Handle (with ResourceT or bracket), conduit+io-streams, straight io-streams, and lazy I/O. All approaches fell into the same ballpark, with conduit+bracket and conduit+io-streams taking a slight lead. (I haven't analyzed that enough to know if it means anything, however.)</span><br style="line-height:18px;font-size:13px;font-family:Arial,sans-serif">
<br style="line-height:18px;font-size:13px;font-family:Arial,sans-serif"><span style="line-height:18px;font-size:13px;font-family:Arial,sans-serif">Then I decided to pull up the NoHandle code I wrote a while ago for conduit. This code was written initially for Windows only, to work around the fact that System.IO.openFile does some file locking. To avoid using Handles, I wrote a simple FFI wrapper exposing open, read, and close system calls, ported it to POSIX, and hid it behind a Cabal flag. Out of curiosity, I decided to expose it and include it in the benchmark.</span><br style="line-height:18px;font-size:13px;font-family:Arial,sans-serif">
<br style="line-height:18px;font-size:13px;font-family:Arial,sans-serif"><span style="line-height:18px;font-size:13px;font-family:Arial,sans-serif">The results are extreme. I've confirmed multiple times that the copy algorithm is in fact copying the file, so I don't think the test itself is cheating somehow. But I don't know how to explain the massive gap. I've run this on two different systems. The results you see linked are from my local machine. On an EC2 instance, the gap was a bit smaller, but the NoHandle code was still 75% faster than the others.</span><br style="line-height:18px;font-size:13px;font-family:Arial,sans-serif">
<br style="line-height:18px;font-size:13px;font-family:Arial,sans-serif"><span style="line-height:18px;font-size:13px;font-family:Arial,sans-serif">My initial guess is that I'm not properly tying into the IO manager, but I wanted to see if the community had any thoughts. The relevant pieces of code are [3][4][5].</span><div>
<br></div><div>Michael<br style="line-height:18px;font-size:13px;font-family:Arial,sans-serif"><br><div><span style="line-height:18px;font-size:13px;font-family:Arial,sans-serif">[1] </span><a href="http://static.snoyman.com/streams.html" target="_blank">http://static.snoyman.com/streams.html</a></div>
<div>[2] <span style="line-height:18px;font-size:13px;font-family:Arial,sans-serif"><a href="https://github.com/snoyberg/conduit/blob/streams/io-streams-conduit/Data/Conduit/Streams.hs" target="_blank">https://github.com/snoyberg/conduit/blob/streams/io-streams-conduit/Data/Conduit/Streams.hs</a></span></div>
<div><span style="line-height:18px;font-size:13px;font-family:Arial,sans-serif">[3] <a href="https://github.com/snoyberg/conduit/blob/streams/conduit/System/PosixFile.hsc" target="_blank">https://github.com/snoyberg/conduit/blob/streams/conduit/System/PosixFile.hsc</a></span><br style="line-height:18px;font-size:13px;font-family:Arial,sans-serif">
<span style="line-height:18px;font-size:13px;font-family:Arial,sans-serif">[4] <a href="https://github.com/snoyberg/conduit/blob/streams/conduit/Data/Conduit/Binary.hs#L54" target="_blank">https://github.com/snoyberg/conduit/blob/streams/conduit/Data/Conduit/Binary.hs#L54</a></span><br style="line-height:18px;font-size:13px;font-family:Arial,sans-serif">
<span style="line-height:18px;font-size:13px;font-family:Arial,sans-serif">[5] <a href="https://github.com/snoyberg/conduit/blob/streams/conduit/Data/Conduit/Binary.hs#L167" target="_blank">https://github.com/snoyberg/conduit/blob/streams/conduit/Data/Conduit/Binary.hs#L167</a></span></div>
</div></div>
</blockquote></div>
</div></div><br>_______________________________________________<br>
Haskell-Cafe mailing list<br>
<a href="mailto:Haskell-Cafe@haskell.org">Haskell-Cafe@haskell.org</a><br>
<a href="http://www.haskell.org/mailman/listinfo/haskell-cafe" target="_blank">http://www.haskell.org/mailman/listinfo/haskell-cafe</a><br>
<br></blockquote></div><br></div>