[Haskell-cafe] Re: Processing of large files

Scott Turner p.turner at computer.org
Mon Nov 1 23:11:12 EST 2004


On 2004 November 01 Monday 16:48, Alexander N. Kogan wrote:
> Sorry, I don't understand. I thought the problem is in laziness - 
You're correct. The problem is laziness rather than I/O.
> my list 
> of tuples becomes ("qqq", 1+1+1+.....) etc and my program reads whole file
> before it starts processing. Am I right or not? If I'm right, how can I
> inform compiler that  my list of tuples should be strict?

The program does not read the whole file before processing the list. You might 
expect that it would, given that most Haskell I/O takes place in exactly the 
sequence specified.  But readFile is different: it sets things up to read the 
file on demand, analogous to lazy evaluation.

The list of tuples _does_ need to be strict. Beyond that, as Ketil Malde said, 
you should not use foldl -- instead, foldl' is the best version to use when 
you are recalculating the result every time a new list item is processed.
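
To illustrate the difference between the two folds, here is a minimal sketch.
The names sumLazy and sumStrict are made up for this example and are not from
your program; foldl' is the one exported by Data.List.

```haskell
import Data.List (foldl')

-- Lazy left fold: the accumulator is not forced along the way, so this
-- can build a chain of (+) thunks as long as the list before any
-- addition actually happens.
sumLazy :: [Int] -> Int
sumLazy = foldl (+) 0

-- Strict left fold: the accumulator is forced to weak head normal form
-- at every step, so no chain of pending additions accumulates.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0
```

Both give the same answer; they differ only in how much unevaluated work
piles up on the way. An optimizing compiler may sometimes rescue the foldl
version through strictness analysis, but that is not something to rely on.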

To deal with the list of tuples, you can use 'seq' to ensure that its parts 
are evaluated.

For example, change 
     (a,b+1):xs
to
     let b' = b+1 in b' `seq` ((a,b'):xs)
'seq' means evaluate the first operand (to weak head normal form) prior to 
delivering the second operand as a result.  Similarly, the expression 
    merge xs x
needs to be forced explicitly with 'seq'; otherwise it too is left as an 
unevaluated thunk.
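
Putting the pieces together, a sketch might look like the following. I am
guessing at the shape of your code here: the definition of merge, its type,
and the countWords wrapper are all assumptions for illustration. Only the two
'seq' applications and the use of foldl' are the actual suggestion.

```haskell
import Data.List (foldl')

-- Hypothetical reconstruction: increment the count stored for key x,
-- or append (x, 1) if x has not been seen yet.
merge :: [(String, Int)] -> String -> [(String, Int)]
merge [] x = [(x, 1)]
merge ((a, b) : xs) x
  -- Force the new count before building the cons cell, so the tuple
  -- never holds a growing 1+1+1+... thunk.
  | a == x    = let b' = b + 1 in b' `seq` ((a, b') : xs)
  -- Force the recursive result (to weak head normal form) as well, so
  -- the search down the list is not left behind as a thunk.
  | otherwise = let rest = merge xs x in rest `seq` ((a, b) : rest)

-- Assumed driver: count word occurrences with a strict left fold.
countWords :: String -> [(String, Int)]
countWords = foldl' merge [] . words
```

With these definitions, countWords "a b a c b a" evaluates to
[("a",3),("b",2),("c",1)], and each count is a plain number as soon as the
corresponding word has been merged in.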



