Proposal: more general unionWith for Data.Map

Johan Tibell johan.tibell at gmail.com
Wed Jan 25 00:27:32 CET 2012


On Tue, Jan 24, 2012 at 3:20 PM, Milan Straka <fox at ucw.cz> wrote:
> Hi,
>
>> merge :: (a -> b -> Maybe c) -> (a -> Maybe c) -> (b -> Maybe c) ->
>> Map k a -> Map k b -> Map k c
>> mergeWithKey :: (k -> a -> b -> Maybe c) -> (k -> a -> Maybe c) ->
>> (k-> b -> Maybe c) -> Map k a -> Map k b -> Map k c
>>
>> where the function arguments are
>> what to use when the key appears in both maps
>> what to use when the key appears in just the left map
>> what to use when the key appears in just the right map.
>>
>> It seems like it would be the ultimate fully general fast map merging routine.
>>
>> intersection = merge (Just . const) (const Nothing) (const Nothing)
>> union = merge (Just . const) Just Just
>> xor = merge ( const Nothing . const) Just Just
>>
>> and so forth...
>
> I agree with the ultimate generality, but the problem is efficiency. For
> example running
>  intersection (fromList [1..1000]) (fromList [1001..2000])
> will end very soon with empty result, but
>  merge (Just . const) (const Nothing) (const Nothing) (fromList [1..1000]) (fromList [1001..2000])
> will call the (const Nothing) for 2000 times.

I have managed to use very general functions inside
unordered-container with maintained performance by using this pattern:

    -- Internal only. Always inlined.
    merge' :: (a -> b -> Maybe c) -> (a -> Maybe c) -> (b -> Maybe c)
-> Map k a -> Map k b -> Map k c
    {-# INLINE merge' #-}

    merge :: (a -> b -> Maybe c) -> (a -> Maybe c) -> (b -> Maybe c)
-> Map k a -> Map k b -> Map k c
    merge = merge'
    {-# INLINABLE merge #-}

    intersection = merge' (Just . const) (const Nothing) (const Nothing)
    {-# INLINABLE intersection #-}

    union = merge' (Just . const) Just Just
    {-# INLINABLE union #-}

    xor = merge' (const Nothing . const) Just Just
    {-# INLINABLE xor #-}

By inlining merge the case-of-case, case-of-known-constructor, and
dead-code-elimination optimizations should be able to remove all
unused code.

-- Johan



More information about the Libraries mailing list