Working character by character in Haskell

Albert Lai trebla@vex.net
18 Oct 2001 20:42:55 -0400


"Andre W B Furtado" <awfurtado@uol.com.br> writes:

[...]

> For example, suppose function doSomeStuffWith returns its own parameter.
> Using a 1.5MB file in this case, the Haskell program ends in almost 5
> seconds, while the C program ends in less than 0.5 seconds... Is my Haskell
> program too bad implemented? (I'm using GHC 4.08-1 under a Windows 98
> platform.)

I do think that your Haskell program is inefficient, and it is not a
faithful translation of your C program.  It reverses a huge string,
which costs not only execution time but also garbage collection time
and memory; the extra memory in turn costs more time in OS overhead and
in swapping other things out of physical memory.
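
To make the cost of the reversal concrete, here is a minimal sketch of
the accumulate-then-reverse pattern next to a plain map.  The names
copyAcc and copyMap are hypothetical, and copyAcc is only a guess at
the general shape of such a loop, not a reconstruction of the original
program.

```haskell
import Data.Char (toUpper)

-- Hypothetical accumulator loop: every result character is consed onto
-- an accumulator, so the output comes out backwards and must be
-- reversed at the end.  Nothing can be emitted until the whole input
-- has been consumed, so the whole result sits in memory at once.
copyAcc :: (Char -> Char) -> String -> String
copyAcc f = go []
  where
    go acc []     = reverse acc
    go acc (c:cs) = go (f c : acc) cs

-- The same transformation with map: each output cell is produced on
-- demand, so a consumer can use and discard it immediately.
copyMap :: (Char -> Char) -> String -> String
copyMap = map

main :: IO ()
main = do
  putStrLn (copyAcc toUpper "hello")
  -- copyMap can even work on an infinite input, because it never needs
  -- to see the end; copyAcc would not terminate here.
  putStrLn (take 6 (copyMap toUpper (cycle "ab")))
```

Both functions give the same result on finite input; the difference is
only in when the output becomes available and how much has to be kept
alive in the meantime.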

(Many programmers complain "my linear-time algorithm takes just 1
second on a 100M list, so why is it taking 5 seconds on a 200M list?"
They forget that because of issues such as cache locality and virtual
memory, their computers do not scale.)

If doSomeStuffWith is purely functional, I wonder why you are writing
and using copyFile instead of just using map.  I mean:

main :: IO ()
main = do
 bmFile <- openFileEx "in.txt" (BinaryMode ReadMode)
 bmString <- hGetContents bmFile
 writeFile "out.txt" (map doSomeStuffWith bmString)
 hClose bmFile

Because both hGetContents and map are lazy and use O(1) memory, this
program behaves exactly like your C program, plus occasional garbage
collections that don't hurt too much.  In particular, the OS will see
no difference (although the cache will).
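
For readers on a current GHC, here is a hedged sketch of the same idea
using only System.IO from base (withBinaryFile in place of openFileEx).
The file names, the helper transformFile, and the toUpper stand-in for
doSomeStuffWith are all assumptions made for illustration.

```haskell
import System.IO
import Data.Char (toUpper)

-- Hypothetical stand-in for doSomeStuffWith; any pure Char -> Char
-- function fits here.  The demo input is plain ASCII.
doSomeStuffWith :: Char -> Char
doSomeStuffWith = toUpper

-- Stream src to dst through map.  hGetContents reads lazily, and
-- hPutStr consumes the whole string before either handle is closed,
-- so only a small window of the file is live at any time.
transformFile :: FilePath -> FilePath -> IO ()
transformFile src dst =
  withBinaryFile src ReadMode $ \hIn ->
    withBinaryFile dst WriteMode $ \hOut -> do
      contents <- hGetContents hIn
      hPutStr hOut (map doSomeStuffWith contents)

main :: IO ()
main = do
  -- Create a small demo input (hypothetical file names).
  writeFile "demo-in.txt" "hello, world\n"
  transformFile "demo-in.txt" "demo-out.txt"
  readFile "demo-out.txt" >>= putStr
```

The important detail is that the write happens inside the scope in
which the input handle is still open; closing the handle before the
lazy string has been consumed would truncate the input.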