[Haskell-cafe] [newbie] processing large logs

Eugene Crosser crosser at average.org
Sun May 14 08:16:07 EDT 2006


Udo Stenzel wrote:
> Eugene Crosser wrote:
>> Having read "Yet another Haskell tutorial" (note on p.20), doesn't foldl
>> have to read the complete list before it can start processing it
>> (beginning from the last element)?  As opposed to foldr that can fetch
>> elements one by one as they are needed?
> 
> Both foldl and foldr start from the left of the list; the structure of
> the list datatype dictates this, and nothing else is possible.  The
> actual difference is that foldl passes an accumulator along and returns
> the final value of the accumulator.  This also means that foldl is tail
> recursive and foldr isn't.

I think I get it now.  foldl yields a result only when it hits the end
of the list, while foldr can give you a partial result (if a partial
result makes any sense, that is) after each step.  And to get any
advantage from the latter, you need to be able to consume that
"partial result" element by element.  Right?

Anyway, I understand that you used 'seq' in your example as a way to
"strictify" the function that updates the accumulator.  Could you (or
anyone) explain (in plain English, preferably:) why 'seq' is the way it
is?  In the first place, why does it take a first argument at all, and
what should you put there?
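
For reference, a sketch of how 'seq' is commonly used to keep an
accumulator evaluated.  This is not necessarily the code from the
earlier message; sumLazy, sumStrict and sumLib are hypothetical names
chosen for this example.  The expression a `seq` b evaluates a (to weak
head normal form) and then returns b, so the first argument is the
thing you want forced and the second is the value you actually want
back:

```haskell
import Data.List (foldl')

-- Lazy accumulator: foldl builds a chain of unevaluated (+) thunks that
-- is only collapsed at the very end.
sumLazy :: [Integer] -> Integer
sumLazy = foldl (+) 0

-- Hand-written strict left fold: the new accumulator acc' is the first
-- argument of seq, so it is evaluated before the recursive call (the
-- second argument) is returned.
sumStrict :: [Integer] -> Integer
sumStrict = go 0
  where
    go acc []       = acc
    go acc (x : xs) =
      let acc' = acc + x
      in  acc' `seq` go acc' xs

-- The library's foldl' forces the accumulator at each step in the same
-- spirit.
sumLib :: [Integer] -> Integer
sumLib = foldl' (+) 0
```

All three compute the same sum; the difference is only in how much
unevaluated work piles up in memory while the list is being traversed.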

Eugene

P.S. Just FYI: after the changes, the memory use of my benchmark program
no longer grows with the size of the data set, and in compiled form it
has the same RAM footprint as the equivalent (interpreted) Perl script.
Still, it consumes 20 times more CPU...

P.P.S. Thanks people, you are really helpful!
