<html>
<head>
<style><!--
.hmmessage P
{
margin:0px;
padding:0px
}
body.hmmessage
{
font-size: 10pt;
font-family:Tahoma
}
--></style></head>
<body class='hmmessage'><div dir='ltr'>
<div>Using ByteStrings together with the C calls does indeed speed things up, but not by much:</div><div><br></div><div>real<span class="Apple-tab-span" style="white-space:pre">        </span>0m6.053s</div><div>user<span class="Apple-tab-span" style="white-space:pre">        </span>0m1.480s</div><div>sys<span class="Apple-tab-span" style="white-space:pre">        </span>0m4.550s</div><div><br></div><div>For reference, here is the code:</div><div>The original version (with Strings and openFile): <a href="http://hpaste.org/73803" style="font-size: 10pt; ">http://hpaste.org/73803</a></div><div>Faster (with Strings and c_open): <a href="http://hpaste.org/73802">http://hpaste.org/73802</a></div><div>Even faster (with ByteStrings and c_open): <a href="http://hpaste.org/73801" style="font-size: 10pt; ">http://hpaste.org/73801</a></div><div><br></div><div>The problem may be that even with ByteStrings, we are still stuck using show, and thus going through Strings, at some point.</div><div><br></div><div>Any ideas?</div><div><br></div><br><div><div id="SkyDrivePlaceholder"></div>> From: johan.tibell@gmail.com<br>> Date: Mon, 27 Aug 2012 13:48:27 -0700<br>> Subject: Re: I/O overhead in opening and writing files<br>> To: arc38813@hotmail.com<br>> CC: glasgow-haskell-users@haskell.org<br>> <br>> On Mon, Aug 27, 2012 at 1:43 PM, J Baptist &lt;arc38813@hotmail.com&gt; wrote:<br>> > I'm looking into high-performance I/O, particularly on a tmpfs (in-memory)<br>> > filesystem. This involves creating lots of little files. Unfortunately, it<br>> > seems that Haskell's performance in this area is not comparable to that of<br>> > C. I assume that this is because of the overhead involved in opening and<br>> > closing files. 
Some cursory profiling confirmed this: most of the runtime of<br>> > the program is taken by openFile, hPutStr, and hClose.<br>> ><br>> > I thought that it might be faster to call the C library functions exposed as<br>> > foreign imports in System.Posix.Internals, and thereby cut out some of<br>> > Haskell's overhead. This indeed improved performance, but the program is<br>> > still nearly twice as slow as the corresponding C program.<br>> ><br>> > I took some benchmarks. I wrote a program to create 500,000 files on a tmpfs<br>> > filesystem, and write an integer into each of them. I did this once in C, using<br>> > open, and twice in Haskell, using openFile and c_open. Here are the<br>> > results:<br>> ><br>> > C program, using open and friends (gcc 4.4.3)<br>> > real 0m4.614s<br>> > user 0m0.380s<br>> > sys 0m4.200s<br>> ><br>> > Haskell, using System.IO.openFile and friends (ghc 7.4.2)<br>> > real 0m14.892s<br>> > user 0m7.700s<br>> > sys 0m6.890s<br>> ><br>> > Haskell, using System.Posix.Internals.c_open and friends (ghc 7.4.2)<br>> > real 0m7.372s<br>> > user 0m2.390s<br>> > sys 0m4.570s<br>> ><br>> > My question is: why is this so slow? Could the culprit be the marshaling<br>> > necessary to pass the parameters to the foreign functions? If I'm calling<br>> > the low-level function c_open anyway, shouldn't performance be closer to C?<br>> > Does anyone have suggestions for how to improve this?<br>> ><br>> > If anyone is interested, I can provide the code I used for these benchmarks.<br>> <br>> Please do. You can paste them at http://hpaste.org/<br>> <br>> Could you try using the Data.ByteString API? I don't have the code in<br>> front of me so I don't know if the System.Posix API uses Strings. If<br>> it does, that's most likely the issue.<br>> <br>> -- Johan<br></div>                                            </div></body>
</html>