HaskellWiki

Haskell | Wiki community | Recent changes
Random page | Special pages

 

Not logged in
Log in | Help

Note: new account creation has been disabled as an anti-spam measure.

Applications and libraries/Generic programming/Smash

< Applications and libraries | Generic programming

Contents

1 Approach: Smash your boilerplate

2 Required features/Portability


3 Expressibility

4 Subset of data types covered

5 Usage

6 Error Messages

7 Amount of work per data type (Boilerplate)

8 Extensibility

9 Reasoning

TODO ??? what does that mean?

10 Performance considerations

11 Helpful extra features

12 Discussion

Our generic functions are composed of specific and generic parts. A specific part tells what to do if the input term happens to be of a particular type. The generic part tells us what to do with all other terms. The specific part is just an HList of ordinary functions, with different argument types. If the type of the input term matches the argument type of one of the functions in the list, we apply that function. If no specific function applies, we do the generic action, implicit in the traversal strategy (for example, apply the generic function to the subterms and reduce the results).

SYB1, when traversing terms and invoking user functions for subterms of a particular type, relies on a run-time typecase. The latter requires run-time type representation, which is provided by the typeclass Typeable. The typeclass implements the method `cast' for a safe cast from the value of `generic type' to the value of the specific type.

We observe that the typecase, at the type level, has always existed in Haskell (although that fact was not perhaps realized until HList). That typecase, which does not need any `cast' operation, is the type equality predicate TypeEq. The type-level typecase is the essence of the present approach. Since the handler for each subterm is chosen statically, there is no need for run-time type representation.

The set of traversal strategies is extensible. To the pre-defined gmap, reduction, and two-terms-in-lockstep strategies, the programmer may at any time add a new class of traversals. The message `Smash-along' shows one such extension, lazy gmap, and its application to mapping of terms that do not exist.

We should note the duality of `stapply' with respect to typeclasses. Let us consider

class C a where fn :: a -> Int
instance C Bool where fn x = if x then 10 else 20
instance C Char where fn x = fromEnum x

The typeclass C declares the function fn overloaded over the argument types. When typechecking an application "fn x", the compiler selects the instance of "fn" by matching the type of "x" against the types of C's instances. The same functionality can be implemented with `stapply':

data ResFailure -- an abstract data type with no non-bottom values
                -- and no defined operations. An attempt to actually
                -- use it will trigger a type error
 
fn' x = 
    stapply ((\ (x::Bool) -> if x then (10::Int) else 20) :+:
             (\ (x::Char) -> fromEnum x) :+:
	     HNil) 
            x
	    (undefined::ResFailure)
 
testtyc_fn1 = fn' True
testtyc_fn2 = fn' 'x'
-- testtyc_fn3 = show $ fn' () -- type error, can't show ResFailure 

(The function `stapply' has the `default' clause, for all other data types. One can do the same with the typeclasses, with the help of overlapping instances). With `stapply', the set of concrete processors is given explicitly as a list rather than implicitly as a set of all typeclass instances in effect. The implementation of `stapply' reifies typechecker's instance selection algorithm. Since the concrete processors are given to stapply in a list in a certain order, stapply easily deals with `overlapping' instances: the first matching processing function is chosen.

The stapply approach is comparable to vlookup in LIGD/Dynamics.lhs. However, in our case the `overriding', the dispatch, is decided statically.

The overloaded function fn differs from fn' in that the set of fn instances is open (and can be extended at any time by defining a new instance of the class C). In contrast, all instances of fn' are explicitly enumerated and their set is closed. This is however an artifact of the particular definition fn'. We could just as well write

fn1_insts = (SCons (\ (x::Bool) -> if x then 10 else 20)
                (SCons (\ (x::Char) -> fromEnum x)
		  SNil))
fn1' x = stapply fn1_insts x (undefined::ResFailure)
-- in another module, importing fn1_insts
fn1' x = stapply (SCons (\ () -> 100) fn1_insts) x (undefined::ResFailure)

Since fn1_insts is a list, we can inspect the instances, rearrange and remove them. These operations are not available with the typeclass instances. Thus stapply approach is more flexible and powerful. Finally, the set of stapply instances can be made open, as described in [http://www.haskell.org/pipermail/haskell/2006-October/018684.html Infinite, open, statically constrained HLists]


The present approach shares with SYB the requirement that specific processing functions (the ones that handle terms of particular types) must be monomorphic. That requirement can be relaxed, as described in [http://pobox.com/~oleg/ftp/Haskell/poly2.txt Type-class overloaded functions: second-order typeclass programming with backtracking]

Retrieved from "http://www.haskell.org/haskellwiki/Applications_and_libraries/Generic_programming/Smash"

This page has been accessed 1,000 times. This page was last modified 00:59, 30 July 2007. Recent content is available under a simple permissive license.