[Haskell-cafe] Re: Implementing "unionAll"

Leon Smith leon.p.smith at gmail.com
Wed Feb 17 20:40:16 EST 2010


On Wed, Feb 17, 2010 at 6:58 AM, Heinrich Apfelmus
<apfelmus at quantentunnel.de> wrote:

> Ah, I meant to use the  union'  from your previous message, but I think
> that doesn't work because it doesn't have the crucial property that the case
>
>    union (VIP x xs) ys = ...
>
> does not pattern match on the second argument.

Ahh yes,  the funny thing is that I tested the code in my previous
message,  and it worked in the infinite case.   Then I replaced the
union' to pattern match on the second argument as well,  and tested it
on only finite cases,  and then released it.    Thus,  unionAll in
data-ordlist-0.4.1 doesn't work on an infinite number of lists.

So my original unionAll in data-ordlist-0.4  appears to work ok,   my
revised and simplified unionAll doesn't work at all.

> The easiest solution is simply to define
>
>    unionAll = nub . mergeAll
>        where
>        -- specialized definition of  nub
>        nub = map head . groupBy (==)

Incidentally,  data-ordlist has a (slightly different) version of nub
that does exactly what you want in this particular case.    Check out
the documentation for "nub" and "nubBy"

> But you're probably concerned that filtering for duplicates afterwards
> will be less efficient. After all, the (implicit) tree built by
> mergeAll  might needlessly compare a lot of equal elements.

Well,  yes and no.   Efficiency is good,  but this implementation does
not match my intention.    For example:

    unionAll [[1,1,2,2,2],[1,1,1,2]] == foldr union [] [...] == [1,1,1,2,2,2]

The "union" function preserves strictly ascending lists,  but it also
works on multisets as well,  returning an element as many times as the
maximum number of times in either list.    Thus, on an infinite number
of lists,   unionAll should return a particular element as many times
as the maximum number of times it appears in any single list.

On Wed, Feb 17, 2010 at 1:18 PM, Daniel Fischer
<daniel.is.fischer at web.de> wrote:
> Am Mittwoch 17 Februar 2010 18:59:42 schrieb Ozgur Akgun:
>> Ooops I thought the inner lists are possibly of infinite size.
>>
>
> Both, I think.

Yes,  both the inner and outer lists of an input to unionAll might be
infinite.    It's just that

    foldr union []

works fine if the inner lists are infinite,  but gets stuck in an
infinite non-productive list if the outer list is infinite.

Best,
Leon


More information about the Haskell-Cafe mailing list