[Haskell-cafe] Patterns for processing large but finite streams

Ketil Malde ketil at malde.org
Fri Jul 1 13:18:45 CEST 2011


Eugene Kirpichov <ekirpichov at gmail.com> writes:
> 2011/7/1 Heinrich Apfelmus <apfelmus at quantentunnel.de>:
>> Eugene Kirpichov wrote:

>>> I'm rewriting timeplot to avoid holding the whole input in memory, and
>>> naturally a problem arises:

>> Plain old lazy lists?

Heretic! :-)

I have written a bunch of programs that work that way, and I think it
works pretty well, with a couple of caveats:

 1. Make sure you collect data into strict data structures.  Dangerous
 operations are addition and anything involving Data.Map, since a lazy
 accumulator builds up a chain of unevaluated thunks.  And use foldl'
 rather than foldl.

 2. If you plan on working on multiple files, extra care might be needed
 to close them, or you'll run out of file descriptors.
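
A minimal sketch of the first caveat, assuming a containers version that
provides Data.Map.Strict; the name countKeys is made up for illustration.
It tallies keys from a lazy list while keeping the accumulator evaluated:

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as M

-- Count occurrences of each key in a (possibly very long) lazy list.
-- foldl' forces the map at every step, and Data.Map.Strict.insertWith
-- forces each updated count, so no chain of (+) thunks builds up
-- while the input is consumed.
countKeys :: Ord k => [k] -> M.Map k Int
countKeys = foldl' step M.empty
  where
    step m k = M.insertWith (+) k 1 m
```

With the lazy Data.Map API and plain foldl, the same code would typically
keep unevaluated additions alive until the map is finally inspected.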

As long as you avoid these pitfalls, the advantage is very clean and
simple code.
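
For the second caveat, one possible approach (a sketch, not the only way)
is to scope each handle with withFile and force the result before the
handle is closed.  countLines and totalLines are hypothetical names:

```haskell
import Control.Exception (evaluate)
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- Count the lines of one file.  evaluate forces the count while the
-- handle is still open; withFile then closes the handle, so only one
-- file descriptor is in use at a time.
countLines :: FilePath -> IO Int
countLines path = withFile path ReadMode $ \h -> do
  contents <- hGetContents h
  evaluate (length (lines contents))

-- Process the files one after another and add up the counts.
totalLines :: [FilePath] -> IO Int
totalLines paths = sum <$> mapM countLines paths
```

Returning the lazy contents out of withFile instead of a forced result
would not work, because the handle is closed before the data is read.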

> Plain old lazy lists do not allow me to combine multiple concurrent
> computations, e.g. I cannot define average from sum and length.

Yes, this is clunky.  I'm not aware of any good solution.
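
To make the clunkiness concrete: a common manual workaround is to fuse
the two folds by hand into a single pass.  This is only a sketch (SumLen
and average are names chosen for illustration), and it does not compose
from separately written sum and length, which is the actual complaint:

```haskell
import Data.List (foldl')

-- Strict fields, so both running totals are evaluated at every step.
data SumLen = SumLen !Double !Int

-- Mean of a list in a single traversal; Nothing for empty input.
-- The list can be garbage collected as it is consumed, because it is
-- traversed only once.
average :: [Double] -> Maybe Double
average xs = case foldl' step (SumLen 0 0) xs of
  SumLen _ 0 -> Nothing
  SumLen s n -> Just (s / fromIntegral n)
  where
    step (SumLen s n) x = SumLen (s + x) (n + 1)
```

Writing sum xs / fromIntegral (length xs) instead traverses xs twice and
therefore keeps the whole list in memory between the two passes.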

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants


