[Haskell-beginners] Problem "grouping" a list

João Paulo Pizani Flor joaopizani at gmail.com
Wed Nov 24 06:08:26 EST 2010


Thank you all for the helpful responses! The solution from Daniel Fischer
worked like a charm, the solution from Aai kinda worked, but I can't use
because I don't want to modify the order of the elements in the list...

And yes, my "criterion" was a bit ill-defined, so the very clever function
from Ozgur satisfied it :P The real goal is to maximize the number of groups
though.

I'll test the solutions more hardly today and try to make the functions from
Daniel even more clearer and elegant. I post the results back here!


Thanks a lot!

João Paulo Pizani Flor
joaopizani at gmail.com
Federal University of Santa Catarina


2010/11/23 Daniel Fischer <daniel.is.fischer at web.de>

> On Tuesday 23 November 2010 18:23:28, João Paulo Pizani Flor wrote:
> > Hello dear Haskellers!
> >
> > I've been a user and big fan of Haskell for a while, but only now I'm
> > writing my first "big" project in Haskell (some thousands of lines of
> > code perhaps). It's an interpreter for a programming language, the
> > source code is music! Yes, sheet music! :D
> >
> > OK, so my specific problem goes like this: I have a list of tuples
> >
> > :t  myList
> >
> > [ (Float, Integer) ]
> >
> > And I want to "group" this list into a nested list
> > groupAtoms :: [ (Float,Integer) ]  ->  [ [(Float,Integer)] ]
> >
> > Of course, I want that the concatenation of the groups yield me the
> > original list, i.e,  (foldl (++) [] . groupAtoms == id),
>
> Here, foldr is the better choice. (++) can start delivering its result
> before inspecting its second argument, hence
>
> foldr (++) [] (list1 : list2 : list3 : ...)
>  = list1 ++ (foldr (++) [] (list2 : list3 : ...))
>  = list1 ++ (list2 ++ (foldr (++) [] (list3 : ...)))
>
> can be consumed as it is constructed, so it can run in constant space, and
> it can stop early if not the entire result is needed. And it can deal with
> infinite lists.
>
> The prelude function concat does this.
>
> foldl on the other hand can't deliver anything before the whole input list
> has been traversed, in particular it doesn't terminate on infinite input,
> it needs at least O(length result) space and it can't stop early.
> Additionally, foldl (++) [] constructs left-nested (++)-applications, which
> are bad for performance.
>
> > and the criterion that defines a group is that:
> > "The sum of the first elements in the tuples comprising the list must be
> > greater than or equal to 1.0". That is, given a list of tuples, the
> > boolean predicate deciding whether this list is a PROPER group (True) or
> > TOO SMALL (False) is:
> > \g -> sum (map fst g)  >=  1.0
> >
> >
> > Although the criterion is very clear, I've tried hard until now and
> > couldn't come up with a function for producing the nested list based on
> > this grouping criterion.
>
> Split the task into two parts.
>
> 1. collect the first group and the remainder of the list as a pair of lists
> 2. cons the first group to what recursion on the remainder delivers
>
> Very much like the definition of `lines'.
>
> For your use-case, the condition when a group is complete doesn't only
> depend on the list element but also on the previous, concretely the running
> sum of the first components.
> For a general such function, you need
> - the stopping condition
> - an accumulation parameter
> - an update function for the accumulating parameter
> as arguments.
> For example (stupid names, sorry)
>
>
> breakAcc :: (b -> Bool) -> (a -> b -> b) -> b -> [a] -> ([a],[a])
> breakAcc _    _    _   []       = ([],[])
> breakAcc cond comb acc xxs@(x:xs)
>    | cond acc                  = ([],xxs)
>    | otherwise                 = (x:ys,zs)
>      where
>        (ys, zs) = breakAcc cond comb (comb x acc) xs
>
> solves part 1, then
>
>
> groupsAcc :: (b -> Bool) -> (a -> b -> b) -> b -> [a] -> [[a]]
> groupsAcc _    _    _   []  = []
> groupsAcc cond comb acc xs  = let (g,tl) = breakAcc cond comb acc xs
>                              in g : groupsAcc cond comb acc tl
>
> gives you part 2, and your particular function is
>
>
> groupAtoms :: [(Float,Integer)] -> [[(Float,Integer)]]
> groupAtoms = groupsAcc (>= 1.0) ((+) . fst) 0
>
>
> Of course the last group need not be proper.
> If there are long groups, the above can cause a space leak (as lines had in
> GHC < 7), that can be avoided by sacrificing a little non-strictness and
> replacing the let (g,tl) = ... in with a
>
> case ... of
>  (g,tl) -> g : groupsAcc ...
>
> or, if you need the non-strictness, with a slightly convoluted method as
> used for lines in GHC 7.
>
> > I am sure that the Haskell Hierarchical
> > Libraries have something to help me, but only now I see I'm still a big
> > noob :P
> >
> > Could someone please help me writing this function?
> >
> >
> > My best regards from Brazil,
> >
> > João Paulo Pizani Flor
> > joaopizani at gmail.com
> > Federal University of Santa Catarina
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/beginners/attachments/20101124/04d42414/attachment.html


More information about the Beginners mailing list