[Haskell] Can we do better than duplicate APIs? [was: Data.CompactString 0.3]

Jean-Philippe Bernardy jeanphilippe.bernardy at gmail.com
Sat Mar 24 02:21:41 EDT 2007


Hi,

Please look at http://darcs.haskell.org/packages/collections/doc/html/Data-Collections.html
for an effort to fit the most common operations on bulk types into a
single framework.

Also, we expect indexed types to solve, or at least alleviate, some
problems you mention in your "rant".
http://haskell.org/haskellwiki/GHC/Indexed_types

Cheers,
JP.

On 3/23/07, Benjamin Franksen <benjamin.franksen at bessy.de> wrote:
> [sorry for the somewhat longer rant, you may want to skip to the more
> technical questions at the end of the post]
>
> Twan van Laarhoven wrote:
> > I would like to announce version 0.3 of my Data.CompactString library.
> > Data.CompactString is a wrapper around Data.ByteString that represents a
> > Unicode string. This new version supports different encodings, as can be
> > seen from the data type:
> >
> > [...]
> >
> > Homepage:  http://twan.home.fmf.nl/compact-string/
> > Haddock:   http://twan.home.fmf.nl/compact-string/doc/html/
> > Source:    darcs get http://twan.home.fmf.nl/repos/compact-string
>
> After taking a look at the Haddock docs, I was impressed by the amount of
> repetition in the APIs. Not only does Data.CompactString duplicate the whole
> Data.ByteString interface (~100 functions, adding some more for encoding
> and decoding), the whole interface is again repeated another four times,
> once for each supported encoding.
>
> Now, this is /not/ meant as a criticism of the compact-string package in
> particular. To the contrary, duplicating a fat interface for almost
> identical functionality is apparently state-of-the-art in Haskell library
> design, viz. the celebrated Data.ByteString, whose API is similarly
> repetitive (see Data.List, Data.ByteString.Lazy, etc.), as well as
> Map/IntMap/Set/IntSet etc. I greatly appreciate the effort that went into
> these libraries, and admire the elegance of the implementation as well as
> the stunning results wrt. efficiency gains etc. However, I fear that
> duplicating interfaces in this way will prove to be problematic in the long
> run.
>
> The problems I (for-)see are for maintenance and usability, both of which
> are of course two sides of the same coin. For the library implementer,
> maintenance will become more difficult, as ever more of such 'almost equal'
> interfaces must be maintained and kept in sync. One could use code
> generation or macro expansion to alleviate this, but IMO the necessity to
> use extra-language pre-processors points to a weakness in the language; it
> would be much less complicated and more satisfying to use a language feature
> that avoids the repetition instead of generating code to facilitate it. On the
> other side of the coin, usability suffers as one has to look up the (almost)
> same function in more and more different (but 'almost equal') module
> interfaces, depending on whether the string in question is Char vs. Byte,
> strict vs. lazy, packed vs. unpacked, encoded in X or Y or Z..., especially
> since there is no guarantee that the function is /really/ spelled the same
> everywhere and also really does what the user expects.(*)
>
> I am certain that most, if not all, people involved with these new libraries
> are well aware of these infelicities. Of course, type classes come to mind
> as a possible solution. However, something seems to prevent developers from
> using them to capture e.g. a common String or ListLike interface. Whatever
> this 'something' is, I think it should be discussed and addressed, before
> the number of 'almost equal' APIs becomes unmanageable for users and
> maintainers.
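
A minimal sketch of the kind of shared interface discussed above, using only base. Every name here (StringLike, Chunked, mapChars, and so on) is hypothetical and chosen for illustration; none of it is taken from Data.CompactString, Data.ByteString, or Data.Collections. Chunked is a toy stand-in for a chunked (lazy-style) string. The point is only that a generic function such as mapChars is written and documented once and then works for every instance.

```haskell
{-# LANGUAGE FlexibleInstances #-}

import qualified Data.Char as Char

-- Hypothetical class capturing a tiny string-like interface.
class StringLike s where
  empty  :: s
  cons   :: Char -> s -> s
  uncons :: s -> Maybe (Char, s)

-- Ordinary Haskell strings.
instance StringLike [Char] where
  empty = []
  cons = (:)
  uncons []       = Nothing
  uncons (c : cs) = Just (c, cs)

-- Toy chunked representation (assumption: a list of chunks, possibly empty ones).
newtype Chunked = Chunked [String]

instance StringLike Chunked where
  empty = Chunked []
  cons c (Chunked [])       = Chunked [[c]]
  cons c (Chunked (x : xs)) = Chunked ((c : x) : xs)
  uncons (Chunked [])              = Nothing
  uncons (Chunked ([] : xs))       = uncons (Chunked xs)  -- skip empty chunks
  uncons (Chunked ((c : cs) : xs)) = Just (c, Chunked (cs : xs))

-- Generic functions, defined once for all instances.
fromList :: StringLike s => String -> s
fromList = foldr cons empty

toList :: StringLike s => s -> String
toList s = case uncons s of
  Nothing      -> []
  Just (c, s') -> c : toList s'

mapChars :: StringLike s => (Char -> Char) -> s -> s
mapChars f = fromList . map f . toList

upper :: StringLike s => s -> s
upper = mapChars Char.toUpper
```

For example, `toList (upper (Chunked ["ab", "", "c"]))` and `upper "abc"` go through the same definition of upper. The generic mapChars round-trips through a list, which is exactly the efficiency concern that makes the choice of class methods hard.
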
>
> Here are some raw ideas:
>
> One reason why I think type classes have not (yet) been used to reduce the
> amount of API repetition is that Haskell doesn't (directly) support
> abstraction over type constraints nor over the number of type parameters
> (polykinded types?). Often such 'almost equal' module APIs differ in
> exactly these aspects, i.e. one has an additional type parameter, while yet
> another one needs slightly different or additional constraints on certain
> types. Oleg K. has shown that some of these limitations can be overcome w/o
> changing or adding features to the language, however these tricks are not
> easy to learn and apply.
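
One way this limitation might be approached is with associated types, as provided by GHC's TypeFamilies extension (the "indexed types" linked at the top of this message). The sketch below is an assumption-laden illustration, not the Data.Collections API: Collection, Bytes, SortedSet and all method names are hypothetical. It shows a single class covering a type with one parameter, a type with no parameter, and a type that needs an extra Ord constraint.

```haskell
{-# LANGUAGE TypeFamilies #-}

import Data.Word (Word8)

-- Hypothetical class: the element type is an associated type,
-- so the container itself may have any number of type parameters.
class Collection c where
  type Elem c
  emptyC  :: c
  insertC :: Elem c -> c -> c
  elemsC  :: c -> [Elem c]

-- One type parameter, no constraint on the element.
instance Collection [a] where
  type Elem [a] = a
  emptyC  = []
  insertC = (:)
  elemsC  = id

-- No type parameter: toy stand-in for a packed byte string.
newtype Bytes = Bytes [Word8]

instance Collection Bytes where
  type Elem Bytes = Word8
  emptyC = Bytes []
  insertC w (Bytes ws) = Bytes (w : ws)
  elemsC (Bytes ws) = ws

-- One type parameter plus an extra constraint (Ord) on the element.
newtype SortedSet a = SortedSet [a]

instance Ord a => Collection (SortedSet a) where
  type Elem (SortedSet a) = a
  emptyC = SortedSet []
  insertC x (SortedSet xs) =
    let (smaller, rest) = span (< x) xs
    in SortedSet (smaller ++ x : dropWhile (== x) rest)  -- keep sorted, no duplicates
  elemsC (SortedSet xs) = xs

-- Written once, usable at all three shapes.
fromListC :: Collection c => [Elem c] -> c
fromListC = foldr insertC emptyC
```

Here `elemsC (fromListC [3, 1, 3, 2] :: SortedSet Int)` yields a sorted, duplicate-free list, while `fromListC [1, 2] :: Bytes` builds the monomorphic container with the same function.
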
>
> Another problem is the engineering question of how much to put into the
> class proper: there is a tension between keeping the class as simple as
> possible (few methods, many parametric functions) for maximum usability vs.
> making it large (many methods, fewer parametric functions) for maximum
> efficiency via specialized implementations. It is often hard to decide this
> question up front, i.e. before enough instances are available. (This has
> been stated as a cause for deferring the decision for a common interface to
> list-like values or strings). Since the type of a function doesn't reveal
> whether it is a normal function with a class constraint or a real class
> method, I imagine a language feature that (somehow) enables me to
> specialize such a function for a particular instance even if it is not a
> proper class member.
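
The tension described above can be illustrated with default methods, which are standard Haskell. This is only a sketch with hypothetical names (Container, Plain, Sized): size lives inside the class with a default definition, so an instance may override it with something faster, whereas isEmpty lives outside the class and cannot be overridden per instance. It does not provide the imagined language feature; GHC's RULES and SPECIALIZE pragmas are related mechanisms, but they are not shown here.

```haskell
-- Hypothetical class with one required method and one optional one.
class Container c where
  toChars :: c -> [Char]

  -- Inside the class: has a default, but instances may override it.
  size :: c -> Int
  size = length . toChars

-- Uses the default, list-walking size.
newtype Plain = Plain [Char]

instance Container Plain where
  toChars (Plain cs) = cs

-- Caches its length (assumption: the Int always matches the list length).
data Sized = Sized Int [Char]

instance Container Sized where
  toChars (Sized _ cs) = cs
  size (Sized n _) = n  -- constant-time override

-- Outside the class: an ordinary constrained function.
-- Its type looks just like a method's, but no instance can replace it.
isEmpty :: Container c => c -> Bool
isEmpty = (== 0) . size
```

Both `size (Plain "abc")` and `size (Sized 3 "abc")` have the same type-level interface; only the second avoids traversing the list. Moving isEmpty into the class later would be needed before any instance could specialize it.
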
>
> Or maybe we have come to the point where Haskell's lack of a 'real' module
> system, like e.g. in SML, actually starts to hurt? Can associated types
> come to the rescue?
>
> Cheers
> Ben
> --
> (*) I know that strictly speaking a class doesn't guarantee any semantic
> conformance either, but at least there is a common place to document the
> expected laws that all implementations should obey. With duplicated module
> APIs there is no such single place.

