Proposal: more general unionWith for Data.Map

Christian Sattler sattler.christian at gmail.com
Tue Jan 24 18:35:11 CET 2012


Hi all,

There are some high-level operations on maps which take two tree 
traversals only because the interface fails to expose sufficiently 
general functions. This proposal is concerned with an analogue of 
unionWithKey of type Ord k => (k -> a -> a -> Maybe a) -> Map k a -> Map 
k a -> Map k a, with the intended semantics that if a key is present in 
both maps and the operation applied to the key and corresponding values 
returns Nothing, the key is deleted from the result.

Example use: sparse vectors are represented as maps from indices to 
non-zero coordinates. Often, a highly sparse vector u needs to be a 
added to a rather dense sparse vector v, i.e. we have m << n where m and 
n are the counts of non-zero elements in u and v, respectively. I need 
this operation to be O(m + n) as well as O(m * log(n)), and it should be 
as efficient as reasonable possible. This seems like a straightforward 
use case for unionWith since its hedge-union algorithm guarantees O(m * 
log(n)). The only problem occurs when u[i] = x and v[i] = -x for some 
index i and non-zero x, resulting in a zero entry I would rather have 
deleted from the result (this is a common case when dealing with 
matrices with small integer entries). The same problems occurs for 
coordinate-wise vector multiplication over non-integral domains, this 
time with intersectionWith.

Remarks:
- It would align the types of unionWith and intersectionWith with 
differenceWith.
- The concept of allowing operations with return type Maybe a to avoid 
an additional high-level operation is already expressed in functions 
like alter or mapMaybeWithKey.
- It would enable an efficient implementation of symmetricDifference = 
unionMaybeWithKey (const Nothing).
- This concerns Data.Map, Data.IntMap, Data.Set, Data.IntSet and the 
With[Key] versions of union/intersection/difference
- At least for Data.Map.unionWithKey, modifying the existing 
implementation seems trivial, requiring a merge instead of a join in the 
new case.

I apologize if some of these points have been brought up before.

Christian



More information about the Libraries mailing list