[Haskell-cafe] Typeclass vs. Prolog programming

Thu Sep 28 02:55:12 EDT 2006

This message is intended as a long answer to Michael Shulman's
question (Re: variadic functions and typeCast) and Jason Dagit's
question (Re: Duplicate Instance problem). Incidentally, the short
answer to Jason Dagit's question is `constraints are disregarded
during instance selection'. The answer to Michael Shulman is yes.

There is a great temptation to read the following declarations

	class Pred a b
	instance (Ord b, Eq b) => Pred [Int] b
	instance Pred Bool Bool

as a Prolog program:

	pred([int],B) <- ord(B),eq(B).
	pred(bool,bool).

(In Prolog, the names of variables are capitalized and the names of
constants begin with a lower-case letter. In Haskell type expressions,
it's the other way around).

There are significant differences between the evaluation of the two
programs. The Prolog evaluator, given a goal 'pred([X],bool)', scans
the rules in their order of appearance, trying to unify the goal with
rule's head.  If successful, the evaluator then tries to solve the
goals in the body of the rule, again strictly in their order of
appearance. On failure, it backtracks. In the above example, the goal
unifies with the head of the first rule, yielding the substitution
{X->int,B->bool}. So, the evaluator will try to solve the goal
ord(bool), and then, if successful, eq(bool). On failure, the
evaluator will try to unify the goal with the fact 'pred(bool,bool)'.
So, Prolog follows the top-to-bottom, left-to-right strategy with
backtracking -- so called SLD resolution.

The evaluation of the typeclass program has two significant
differences. First, there is no backtracking. That makes the
evaluation deterministic, and, furthermore, independent on the order
of the instances and the order of constraints. That is a very good
thing. For comparison, consider XSLT 1.0, Sec 5.5, and the whole mess
with import precedences and priorities. The NodeTest pattern is
assigned the priority -0.5. Naturals are not enough for them.

The second difference from the Prolog evaluation is the use of
matching rather than the full unification when selecting a rule. For
example, the goal "Pred [x] Bool" does _not_ match the head of the
first Pred instance. The type variable 'x' in the goal is not
considered variable during the instance selection.

The absence of backtracking also means that the Haskell typechecker
does not need to consider the constraints when selecting the
instance. If the head of the instance matches the goal, the
typechecker commits to that instance, and only then checks for the
constraints such as Ord b and Eq b in our example. The failure to
(eventually) satisfy these constraints is an unrecoverable type error.

Thus, the following instances in Jason Dagit's question:
	instance Num a       => StringValue a where
	instance RealFloat a => StringValue a where
are identical for the instance selection purposes, where only the
instance head matters.

The absence of backtracking and head-matching rather than the full
unification do not mean that the typeclass language is
deficient. The TypeCast generalizes head-matching to controlled
unification. Indeed, if we write the first instance as

	instance (TypeCast z Int, Ord b, Eq b) => Pred [z] b

then the goal "Pred [x] Bool" will match the head, giving the
substitution {z->x, b->Bool}. The typechecker commits to the instance
and adds to the current constraints
	TypeCast x Int, Ord Bool, Eq Bool
The latter two are obviously satisfied and so discharged. The former
leads to the substitution {x->Int}. So, the net effect is as if we
_unified_ "Pred [x] Bool" with "Pred [Int] b", rather than merely
matched. Compared with Prolog, we have more flexibility: each rule may
specify which of its arguments should be unified against the
corresponding arguments of the goal, and which should be merely
matched.

The backtracking can also be emulated -- but it has to be programmed
explicitly. We have to program the typechecker in the typechecker. That
is not so difficult: the `Smash-your-boilerplate' message essentially
implements the (conventional, backtracking-less) instance resolution
algorithm in the typechecker itself. Please compare gsize in SYB3 with
gsize in `Smash'. The special processing cases in the former are
specified as methods in typeclass instances. With `Smash', exactly the
same information is given in a heterogenous list.  Perhaps the former
is more convenient, but the latter is more flexible as we can add,
inspect, and remove `instances'. The class SApply at the end of the
`Smash' message implements the instance selection algorithm.