Applications and libraries/Generic programming/Lightweight

From HaskellWiki
< Applications and libraries‎ | Generic programming
Revision as of 12:51, 11 May 2007 by Johanj (talk | contribs)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

Approach: A Lightweight Implementation of Generics and Dynamics

Required features/Portability

  • Requires Haskell 98 + existential types

  • Should run in GHC, Hugs, others?

Expressibility

  • Can do both producer and consumer functions.

  • Multiple representations needed for generic functions of different arities.

  • No local redefinitions.

Subset of data types covered

  • The family of types supported is based on sums of products (together with extensions). This means that mutually recursive types and nested are supported.

  • The extensions can include Haskell primitive types, function spaces and Dynamic types (which have special attention in the paper)

Usage

  • Library Writer: Responsible for the sums-of-products machinery, the representation(s) data type(s) to be defined in the library and defining the class and standard instances for "Representable". The library writer needs to be have a deep knowledge about generic programming.

  • Power User: Defines the boilerplate for new data types. Needs to be fairly familiar with the implementation.

  • User: defines generic functions by pattern matching on the structure of "Rep". This requires that he is comfortable with sums of products and isomorphisms. For convenience, the user should also provide a definition that uses the type class "Representable", which avoids the representations to be passed explicitly.

  • End User: If "Representable" is used, then the end can just use generic functions in the same way as any other ad-hoc (that is, type class overloaded) functions.

Error Messages

  • Some messages may involve errors with existential types, which may be hard to grasp for programmers.

Amount of work per data type (Boilerplate)

  • Isomorphisms between sums of products and data types.

  • Instances of Representable.

  • Smart constructors for representation of data types.

Extensibility

  • The Rep type (like all Haskell data types) is non-extensible. This means that the approach is not (obviously) extensible. However we can try to combine the approach with type classes to get some form of extensibility (RepLib proposes one solution like this). Alternatively, Ralf Hinze and Andres Loeh proposed "Open datatypes and functions" for tackling this problem.

Reasoning

TODO

Performance considerations

  • Sums of products based and therefore involving considerable performance penalties. However, we could use partial evaluators to remove these intermediate layers.

  • Representations of types are passed at run-time, and interpreted. Passing and interpreting types at run-time is hard to avoid.

Helpful extra features

  • GADTs (already implemented in GHC) allow more direct definitions of generic functions, since there is no need to apply the type coercions explicitly

  • A boilerplate generation mechanism. This would effectively mean that power users would not be necessary.

  • Open data types and functions in order for extensibility.

  • Kind polymorphism, for eliminating the need for multiple representations.

Discussion

(Bruno Oliveira)

An interesting approach which is quite lightweight and that is fairly easy to use (specially if we would not need to handle any of the boilerplate). It is a bit outdated because with GADTs, the use can be slightly simplified. However, this also takes us further away from Haskell 98 or even Haskell' (since GADTs will not be there). I think that LIGD is still a reference for a generic programming library that uses a data type for type representations and sums of products as a family of data types.

A drawback may be performance, since we need to convert a data type into its isomorphic sum of products representation.

(James Cheney)

From what I've seen, Ralf's "Fun with phantom types", "generics for the masses" and from what I've heard Stephanie's RepLib (caveat: haven't read that paper carefully) are in a similar spirit and subsume much or all of what's in the LIGD paper. Also, much more now appears to be known about complementary things like GADTs, open/extensible types and functions which could help fix some of the limitations of the original LIGD approach.

Assuming that GADTs do eventually become a standard feature, I'd definitely be in favor of using them instead of explicit to/from conversions in an implementation of LIGD, for efficiency if nothing else. In addition, as the FCPT paper notes, there are some natural seeming things that you can only do if equations are known to the type system. This was one motivation for our subsequent tech report on "first class phantom types", which essentially proposed extending Haskell 98 with equation guards on datatypes, i.e. GADTs. But, once you have GADTs a lot of things appear to get simpler, and simulating first-class generics/dynamics via type representations seem to be just one example.

(Johan Jeuring)

I think LIGD is quite a nice library, that needs almost no Haskell extensions. But it has its limitations.