Proposal: add Int indexing functions to Data.Set

Wolfram Kahl kahl at cas.mcmaster.ca
Fri Apr 29 18:56:27 CEST 2011


On Fri, Apr 29, 2011 at 09:30:24AM -0700, Luis Casillas wrote:

 > I'm working on a project where the most natural implementation is to
 > use the members of runtime-constructed sets to index the elements of
 > an array.  The use of sets to represent the index sets is
 > independently motivated, because I need to operate on these sets using
 > operations like union, intersection, difference, mapping and
 > filtering, while preserving the member uniqueness property.
 >  
 > The lookupIndex, findIndex and elemAt functions would give me this set
 > member/array index translation "for free."  There are other ways to
 > construct such a mapping, but I think they suffer in comparison:

My use case, interfacing with a BDD-based library for fast
implementations of relation-algebraic operations, has the same
characteristics and motivations.


Similar to what has been mentioned previously by others,
I will choose the data structure for the API it exports
together with the API's specification and performance characteristics.
Even with the same specification for a common core API,
there is space for different implementations that offer
different API extensions and/or different performance characteristics.

Module APIs are not a very useful abstraction element in Haskell;
if we want the Data.Map (or Data.Set) API to be open for other implementations,
we should make it a class instead.

Since Data.Set wraps an implementation that is known to support
these index operations more efficiently than is possible to implement
by using that API, and these operations do have uses,
they should be exported.

For more discussion of design considerations of
implementations versus class interfaces,
see for example Edison:

@Article{Okasaki-2001,
  author =       {Chris Okasaki},
  title =        {An Overview of {Edison}},
  journal =      ENTCS,
  year =         2001,
  volume =    41,
  number =    1,
  pages = {60--73},
  DOIURL = {http://dx.doi.org/10.1016/S1571-0661(05)80546-8},
  DOI = {10.1016/S1571-0661(05)80546-8},
  note =      {(Original version in Haskell Workshop, September 2000,
                pp.\null{} 34--45)},
  abstract =    {Edison is a library of functional data structures
     implemented in Haskell. It supports three main families of abstractions:
      sequences, collections (e.g., sets and priority queues),
      and associative collections (e.g., finite maps).
      This paper summarizes the design of Edison,
      with particular attention to how that design is influenced
      by details of Haskell.}
}


<ShamelessPlug>
Somewhat related:

  http://www.cas.mcmaster.ca/~kahl/Haskell/ModuleTools/

@InProceedings{Kahl-2009_TFP,
  author =       {Wolfram Kahl},
  title =        {Haskell Module Tools for Liberating Type Class Design},
  crossref =  {TFP2009},
  pages =     {129--144},
  chapter = {9},
  WKsource =    {svn/WK/doc/wk/HASKELL/Restricted.lhs},
  abstract =    {Design of Haskell type class hierarchies for complex purposes,
                 including for standard containers, is a non-trivial exercise,
                 and evolution of such designs
                 is additionally hampered by the large overhead
                 of connecting to existing implementations.

                 We systematically discuss this overhead,
                 and propose a tool solution, implemented using the GHC API,
                 to automate its generation.}
}

@Book{TFP2009,
  title =     {Trends in Functional Programming, {TFP 2009}},
  booktitle = {Trends in Functional Programming, {TFP 2009}},
  year =      2010,
  editor =    {Zolt\'an Horv{\'a}th and Vikt\'oia Zs{\'o}k and Peter Achten and Pieter Koopman},
  address =   {UK},
  publisher = {Intellect},
  URL = {http://www.intellectbooks.co.uk/books/view-Book,id=4740/}
}

</ShamelessPlug>



Wolfram



More information about the Libraries mailing list