[Haskell-cafe] Re: Grouping - Map / Reduce

Martin Huschenbett huschi at gmx.org
Wed Mar 25 04:58:14 EDT 2009


Dear Günther,

the map can't be consumed while it is constructed. At any point during 
its construction you don't know for any key in the map if it will appear 
in the not cosumed rest of the list again. This means you can't process 
any entry of the map because it may change later. The only point when 
nothing will change anymore is when the map is completely constructed.

Regards,

Martin.


GüŸnther Schmidt schrieb:
> Hi Ketil,
> 
> Ketil Malde schrieb:
>> "Gü?nther Schmidt" <gue.schmidt at web.de> writes:
>>
>>> let say I got an unordered lazy list of key/value pairs like
>>>
>>> [('a', 99), ('x', 42), ('a', 33) ... ]
>>>
>>> and I need to sum up all the values with the same keys.
>>>
>>> So far I wrote a naive implementation, using Data.Map, foldl and 
>>> insertWith.
>>
>> Data.Map.fromListWith (+)
>>
>>> The building of this map is of course a bottleneck, the successive
>>> processing needs to wait until the entire list is eventually consumed
>>> the Map is built and flattened again.
>>
>> Sure this is not an artifact of the laziness of foldl?
> 
> well I can't really see how the map could be consumed *while* it's still 
> being built, I just don't see it. (I'm using foldl' and insertWith', sry 
> for not saying so initially).
> 
>>
>>> Is there another way of doing this, something more "streaming
>>> architecture" like?
>>
>> I don't see how you can do this much better - for a small, fixed set
>> of keys, you could use an (STU) array for the sums, but it depends if
>> the added complexity is worth it.  You're already doing a single pass
>> over the data.
>>
>> -k
> 
> 
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe


More information about the Haskell-Cafe mailing list