[Haskell-cafe] Patterns for processing large but finite streams

Fri Jul 1 12:08:51 CEST 2011

After the list discussion, I'm surprised nobody mentioned Max Rabkin/Conal
Elliott's blog posts on folds and zips.

http://squing.blogspot.com/2008/11/beautiful-folding.html
http://conal.net/blog/posts/enhancing-a-zip/

They develop a formalism for zipping functions on lists.

Iteratee's `zip` set of functions are somewhat similar, but not quite the
same.  Specifically they still require multiple traversals of the data, but
only over a bounded portion of it, so they're much more efficient.  Of
course you could combine the above patterns with iteratees by creating
functions as above, then just running them with a 'fold'.

John L.

> From: Eugene Kirpichov <ekirpichov at gmail.com>
>
> Thanks but I'm afraid that's still not quite what I'm looking for;
> guess I'll have to define my desire by my implementation - so once
> it's ready I'll show the result to cafe :)
>
> 2011/7/1 Alexey Khudyakov <alexey.skladnoy at gmail.com>:
> > On Fri, Jul 1, 2011 at 12:54 PM, Eugene Kirpichov <ekirpichov at gmail.com>
> wrote:
> >> Alexey, your definition of "mean" does not look like "liftS2 (/) sum
> >> length" - you have to manually "fuse" these computations.
> >>
> > Well it was fused for numerical stability
> >
> >> I'm asking for a formalism that does this fusion automatically (and
> >> guaranteedly).
> >>
> > Joining accumulators is quite straightforward. So is joining of initial
> > state. Just creating a
> >> joinAcc :: (acc1 -> x -> acc1) -> (acc2 -> x -> acc2) -> (acc1,acc2) ->
> x -> (acc1,acc2)
> >> joinAcc f1 f2 (s1,s2) x = (f1 s1 x, f2 s2 x)
> >
> > Still you have to handle them separately.
> >> sum' = foldl (+) 0
> >> len ?= foldl (\n _ -> n+1) 0
> >> sumLen = foldl (joinAcc (+) (\n _ -> n+1)) (0,0)
> >
> > There is more regular approach but it only works with statistics.
> > (function which do not depend on order of elements in the sample)
> > For every statistics monoid for its evaluation could be constructed.
> > For example sum:
> >> newtype Sum a = Sum a
> >> instance Num a => Monoid (Sum a) where
> >> ? mempty = Sum 0
> >> ? mappend (Sum a) (Sum b) = Sum (a+b)
> >
> > Composition of these monoids becomes trivial. Just use
> >
> >
> > I pursued this approach in monoid-statistics[1] package.
> > It's reasonably well documented
> >
> > ?[1] http://hackage.haskell.org/package/monoid-statistics
> >
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20110701/3bac36e6/attachment.htm>