With the `-fglasgow-exts`

flag, GHC lets you declare
a data type with no constructors. For example:

data S -- S :: * data T a -- T :: * -> *

Syntactically, the declaration lacks the "= constrs" part. The
type can be parameterised over types of any kind, but if the kind is
not `*`

then an explicit kind annotation must be used
(see Section 7.4.7, “Explicitly-kinded quantification”).

Such data types have only one value, namely bottom. Nevertheless, they can be useful when defining "phantom types".

GHC allows type constructors, classes, and type variables to be operators, and to be written infix, very much like expressions. More specifically:

A type constructor or class can be an operator, beginning with a colon; e.g.

`:*:`

. The lexical syntax is the same as that for data constructors.Data type and type-synonym declarations can be written infix, parenthesised if you want further arguments. E.g.

data a :*: b = Foo a b type a :+: b = Either a b class a :=: b where ... data (a :**: b) x = Baz a b x type (a :++: b) y = Either (a,b) y

Types, and class constraints, can be written infix. For example

x :: Int :*: Bool f :: (a :=: b) => a -> b

A type variable can be an (unqualified) operator e.g.

`+`

. The lexical syntax is the same as that for variable operators, excluding "(.)", "(!)", and "(*)". In a binding position, the operator must be parenthesised. For example:type T (+) = Int + Int f :: T Either f = Left 3 liftA2 :: Arrow (~>) => (a -> b -> c) -> (e ~> a) -> (e ~> b) -> (e ~> c) liftA2 = ...

Back-quotes work as for expressions, both for type constructors and type variables; e.g.

`Int `Either` Bool`

, or`Int `a` Bool`

. Similarly, parentheses work the same; e.g.`(:*:) Int Bool`

.Fixities may be declared for type constructors, or classes, just as for data constructors. However, one cannot distinguish between the two in a fixity declaration; a fixity declaration sets the fixity for a data constructor and the corresponding type constructor. For example:

infixl 7 T, :*:

sets the fixity for both type constructor

`T`

and data constructor`T`

, and similarly for`:*:`

.`Int `a` Bool`

.Function arrow is

`infixr`

with fixity 0. (This might change; I'm not sure what it should be.)

Type synonyms are like macros at the type level, and
GHC does validity checking on types *only after expanding type synonyms*.
That means that GHC can be very much more liberal about type synonyms than Haskell 98:

You can write a

`forall`

(including overloading) in a type synonym, thus:type Discard a = forall b. Show b => a -> b -> (a, String) f :: Discard a f x y = (x, show y) g :: Discard Int -> (Int,String) -- A rank-2 type g f = f 3 True

You can write an unboxed tuple in a type synonym:

type Pr = (# Int, Int #) h :: Int -> Pr h x = (# x, x #)

You can apply a type synonym to a forall type:

type Foo a = a -> a -> Bool f :: Foo (forall b. b->b)

After expanding the synonym,

`f`

has the legal (in GHC) type:f :: (forall b. b->b) -> (forall b. b->b) -> Bool

You can apply a type synonym to a partially applied type synonym:

type Generic i o = forall x. i x -> o x type Id x = x foo :: Generic Id []

After expanding the synonym,

`foo`

has the legal (in GHC) type:foo :: forall x. x -> [x]

GHC currently does kind checking before expanding synonyms (though even that could be changed.)

After expanding type synonyms, GHC does validity checking on types, looking for the following mal-formedness which isn't detected simply by kind checking:

Type constructor applied to a type involving for-alls.

Unboxed tuple on left of an arrow.

Partially-applied type synonym.

So, for example, this will be rejected:

type Pr = (# Int, Int #) h :: Pr -> Int h x = ...

because GHC does not allow unboxed tuples on the left of a function arrow.

The idea of using existential quantification in data type declarations
was suggested by Perry, and implemented in Hope+ (Nigel Perry, *The Implementation
of Practical Functional Programming Languages*, PhD Thesis, University of
London, 1991). It was later formalised by Laufer and Odersky
(*Polymorphic type inference and abstract data types*,
TOPLAS, 16(5), pp1411-1430, 1994).
It's been in Lennart
Augustsson's **hbc** Haskell compiler for several years, and
proved very useful. Here's the idea. Consider the declaration:

data Foo = forall a. MkFoo a (a -> Bool) | Nil

The data type `Foo`

has two constructors with types:

MkFoo :: forall a. a -> (a -> Bool) -> Foo Nil :: Foo

Notice that the type variable `a`

in the type of `MkFoo`

does not appear in the data type itself, which is plain `Foo`

.
For example, the following expression is fine:

[MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]

Here, `(MkFoo 3 even)`

packages an integer with a function
`even`

that maps an integer to `Bool`

; and ```
MkFoo 'c'
isUpper
```

packages a character with a compatible function. These
two things are each of type `Foo`

and can be put in a list.

What can we do with a value of type `Foo`

?. In particular,
what happens when we pattern-match on `MkFoo`

?

f (MkFoo val fn) = ???

Since all we know about `val`

and `fn`

is that they
are compatible, the only (useful) thing we can do with them is to
apply `fn`

to `val`

to get a boolean. For example:

f :: Foo -> Bool f (MkFoo val fn) = fn val

What this allows us to do is to package heterogenous values together with a bunch of functions that manipulate them, and then treat that collection of packages in a uniform manner. You can express quite a bit of object-oriented-like programming this way.

What has this to do with *existential* quantification?
Simply that `MkFoo`

has the (nearly) isomorphic type

MkFoo :: (exists a . (a, a -> Bool)) -> Foo

But Haskell programmers can safely think of the ordinary
*universally* quantified type given above, thereby avoiding
adding a new existential quantification construct.

An easy extension is to allow arbitrary contexts before the constructor. For example:

data Baz = forall a. Eq a => Baz1 a a | forall b. Show b => Baz2 b (b -> b)

The two constructors have the types you'd expect:

Baz1 :: forall a. Eq a => a -> a -> Baz Baz2 :: forall b. Show b => b -> (b -> b) -> Baz

But when pattern matching on `Baz1`

the matched values can be compared
for equality, and when pattern matching on `Baz2`

the first matched
value can be converted to a string (as well as applying the function to it).
So this program is legal:

f :: Baz -> String f (Baz1 p q) | p == q = "Yes" | otherwise = "No" f (Baz2 v fn) = show (fn v)

Operationally, in a dictionary-passing implementation, the
constructors `Baz1`

and `Baz2`

must store the
dictionaries for `Eq`

and `Show`

respectively, and
extract it on pattern matching.

Notice the way that the syntax fits smoothly with that used for universal quantification earlier.

GHC allows existentials to be used with records syntax as well. For example:

data Counter a = forall self. NewCounter { _this :: self , _inc :: self -> self , _display :: self -> IO () , tag :: a }

Here `tag`

is a public field, with a well-typed selector
function `tag :: Counter a -> a`

. The `self`

type is hidden from the outside; any attempt to apply `_this`

,
`_inc`

or `_output`

as functions will raise a
compile-time error. In other words, *GHC defines a record selector function
only for fields whose type does not mention the existentially-quantified variables*.
(This example used an underscore in the fields for which record selectors
will not be defined, but that is only programming style; GHC ignores them.)

To make use of these hidden fields, we need to create some helper functions:

inc :: Counter a -> Counter a inc (NewCounter x i d t) = NewCounter { _this = i x, _inc = i, _display = d, tag = t } display :: Counter a -> IO () display NewCounter{ _this = x, _display = d } = d x

Now we can define counters with different underlying implementations:

counterA :: Counter String counterA = NewCounter { _this = 0, _inc = (1+), _display = print, tag = "A" } counterB :: Counter String counterB = NewCounter { _this = "", _inc = ('#':), _display = putStrLn, tag = "B" } main = do display (inc counterA) -- prints "1" display (inc (inc counterB)) -- prints "##"

In GADT declarations (see Section 7.5, “Generalised Algebraic Data Types (GADTs)”), the explicit
`forall`

may be omitted. For example, we can express
the same `Counter a`

using GADT:

data Counter a where NewCounter { _this :: self , _inc :: self -> self , _display :: self -> IO () , tag :: a } :: Counter a

At the moment, record update syntax is only supported for Haskell 98 data types,
so the following function does *not* work:

-- This is invalid; use explicit NewCounter instead for now setTag :: Counter a -> a -> Counter a setTag obj t = obj{ tag = t }

There are several restrictions on the ways in which existentially-quantified constructors can be use.

When pattern matching, each pattern match introduces a new, distinct, type for each existential type variable. These types cannot be unified with any other type, nor can they escape from the scope of the pattern match. For example, these fragments are incorrect:

f1 (MkFoo a f) = a

Here, the type bound by

`MkFoo`

"escapes", because`a`

is the result of`f1`

. One way to see why this is wrong is to ask what type`f1`

has:f1 :: Foo -> a -- Weird!

What is this "

`a`

" in the result type? Clearly we don't mean this:f1 :: forall a. Foo -> a -- Wrong!

The original program is just plain wrong. Here's another sort of error

f2 (Baz1 a b) (Baz1 p q) = a==q

It's ok to say

`a==b`

or`p==q`

, but`a==q`

is wrong because it equates the two distinct types arising from the two`Baz1`

constructors.You can't pattern-match on an existentially quantified constructor in a

`let`

or`where`

group of bindings. So this is illegal:f3 x = a==b where { Baz1 a b = x }

Instead, use a

`case`

expression:f3 x = case x of Baz1 a b -> a==b

In general, you can only pattern-match on an existentially-quantified constructor in a

`case`

expression or in the patterns of a function definition. The reason for this restriction is really an implementation one. Type-checking binding groups is already a nightmare without existentials complicating the picture. Also an existential pattern binding at the top level of a module doesn't make sense, because it's not clear how to prevent the existentially-quantified type "escaping". So for now, there's a simple-to-state restriction. We'll see how annoying it is.You can't use existential quantification for

`newtype`

declarations. So this is illegal:newtype T = forall a. Ord a => MkT a

Reason: a value of type

`T`

must be represented as a pair of a dictionary for`Ord t`

and a value of type`t`

. That contradicts the idea that`newtype`

should have no concrete representation. You can get just the same efficiency and effect by using`data`

instead of`newtype`

. If there is no overloading involved, then there is more of a case for allowing an existentially-quantified`newtype`

, because the`data`

version does carry an implementation cost, but single-field existentially quantified constructors aren't much use. So the simple restriction (no existential stuff on`newtype`

) stands, unless there are convincing reasons to change it.You can't use

`deriving`

to define instances of a data type with existentially quantified data constructors. Reason: in most cases it would not make sense. For example:#data T = forall a. MkT [a] deriving( Eq )

To derive

`Eq`

in the standard way we would need to have equality between the single component of two`MkT`

constructors:instance Eq T where (MkT a) == (MkT b) = ???

But

`a`

and`b`

have distinct types, and so can't be compared. It's just about possible to imagine examples in which the derived instance would make sense, but it seems altogether simpler simply to prohibit such declarations. Define your own instances!

This section, and the next one, documents GHC's type-class extensions. There's lots of background in the paper Type classes: exploring the design space (Simon Peyton Jones, Mark Jones, Erik Meijer).

All the extensions are enabled by the `-fglasgow-exts`

flag.

Multi-parameter type classes are permitted. For example:

class Collection c a where union :: c a -> c a -> c a ...etc.

There are no restrictions on the context in a class declaration (which introduces superclasses), except that the class hierarchy must be acyclic. So these class declarations are OK:

class Functor (m k) => FiniteMap m k where ... class (Monad m, Monad (t m)) => Transform t m where lift :: m a -> (t m) a

As in Haskell 98, The class hierarchy must be acyclic. However, the definition of "acyclic" involves only the superclass relationships. For example, this is OK:

class C a where { op :: D b => a -> b -> b } class C a => D a where { ... }

Here, `C`

is a superclass of `D`

, but it's OK for a
class operation `op`

of `C`

to mention `D`

. (It
would not be OK for `D`

to be a superclass of `C`

.)

Haskell 98 prohibits class method types to mention constraints on the class type variable, thus:

class Seq s a where fromList :: [a] -> s a elem :: Eq a => a -> s a -> Bool

The type of `elem`

is illegal in Haskell 98, because it
contains the constraint `Eq a`

, constrains only the
class type variable (in this case `a`

).
GHC lifts this restriction.

Functional dependencies are implemented as described by Mark Jones in “Type Classes with Functional Dependencies”, Mark P. Jones, In Proceedings of the 9th European Symposium on Programming, ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782, .

Functional dependencies are introduced by a vertical bar in the syntax of a class declaration; e.g.

class (Monad m) => MonadState s m | m -> s where ... class Foo a b c | a b -> c where ...

There should be more documentation, but there isn't (yet). Yell if you need it.

In a class declaration, all of the class type variables must be reachable (in the sense mentioned in Section 7.4.5, “Type signatures”) from the free variables of each method type. For example:

class Coll s a where empty :: s insert :: s -> a -> s

is not OK, because the type of `empty`

doesn't mention
`a`

. Functional dependencies can make the type variable
reachable:

class Coll s a | s -> a where empty :: s insert :: s -> a -> s

Alternatively `Coll`

might be rewritten

class Coll s a where empty :: s a insert :: s a -> a -> s a

which makes the connection between the type of a collection of
`a`

's (namely `(s a)`

) and the element type `a`

.
Occasionally this really doesn't work, in which case you can split the
class like this:

class CollE s where empty :: s class CollE s => Coll s a where insert :: s -> a -> s

The following description of the motivation and use of functional dependencies is taken from the Hugs user manual, reproduced here (with minor changes) by kind permission of Mark Jones.

Consider the following class, intended as part of a library for collection types:

class Collects e ce where empty :: ce insert :: e -> ce -> ce member :: e -> ce -> Bool

The type variable e used here represents the element type, while ce is the type of the container itself. Within this framework, we might want to define instances of this class for lists or characteristic functions (both of which can be used to represent collections of any equality type), bit sets (which can be used to represent collections of characters), or hash tables (which can be used to represent any collection whose elements have a hash function). Omitting standard implementation details, this would lead to the following declarations:

instance Eq e => Collects e [e] where ... instance Eq e => Collects e (e -> Bool) where ... instance Collects Char BitSet where ... instance (Hashable e, Collects a ce) => Collects e (Array Int ce) where ...

All this looks quite promising; we have a class and a range of interesting implementations. Unfortunately, there are some serious problems with the class declaration. First, the empty function has an ambiguous type:

empty :: Collects e ce => ce

By "ambiguous" we mean that there is a type variable e that appears on the left
of the `=>`

symbol, but not on the right. The problem with
this is that, according to the theoretical foundations of Haskell overloading,
we cannot guarantee a well-defined semantics for any term with an ambiguous
type.

We can sidestep this specific problem by removing the empty member from the class declaration. However, although the remaining members, insert and member, do not have ambiguous types, we still run into problems when we try to use them. For example, consider the following two functions:

f x y = insert x . insert y g = f True 'a'

for which GHC infers the following types:

f :: (Collects a c, Collects b c) => a -> b -> c -> c g :: (Collects Bool c, Collects Char c) => c -> c

Notice that the type for f allows the two parameters x and y to be assigned different types, even though it attempts to insert each of the two values, one after the other, into the same collection. If we're trying to model collections that contain only one type of value, then this is clearly an inaccurate type. Worse still, the definition for g is accepted, without causing a type error. As a result, the error in this code will not be flagged at the point where it appears. Instead, it will show up only when we try to use g, which might even be in a different module.

Faced with the problems described above, some Haskell programmers might be tempted to use something like the following version of the class declaration:

class Collects e c where empty :: c e insert :: e -> c e -> c e member :: e -> c e -> Bool

The key difference here is that we abstract over the type constructor c that is
used to form the collection type c e, and not over that collection type itself,
represented by ce in the original class declaration. This avoids the immediate
problems that we mentioned above: empty has type ```
Collects e c => c
e
```

, which is not ambiguous.

The function f from the previous section has a more accurate type:

f :: (Collects e c) => e -> e -> c e -> c e

The function g from the previous section is now rejected with a type error as
we would hope because the type of f does not allow the two arguments to have
different types.
This, then, is an example of a multiple parameter class that does actually work
quite well in practice, without ambiguity problems.
There is, however, a catch. This version of the Collects class is nowhere near
as general as the original class seemed to be: only one of the four instances
for `Collects`

given above can be used with this version of Collects because only one of
them---the instance for lists---has a collection type that can be written in
the form c e, for some type constructor c, and element type e.

To get a more useful version of the Collects class, Hugs provides a mechanism that allows programmers to specify dependencies between the parameters of a multiple parameter class (For readers with an interest in theoretical foundations and previous work: The use of dependency information can be seen both as a generalization of the proposal for `parametric type classes' that was put forward by Chen, Hudak, and Odersky, or as a special case of Mark Jones's later framework for "improvement" of qualified types. The underlying ideas are also discussed in a more theoretical and abstract setting in a manuscript [implparam], where they are identified as one point in a general design space for systems of implicit parameterization.). To start with an abstract example, consider a declaration such as:

class C a b where ...

which tells us simply that C can be thought of as a binary relation on types (or type constructors, depending on the kinds of a and b). Extra clauses can be included in the definition of classes to add information about dependencies between parameters, as in the following examples:

class D a b | a -> b where ... class E a b | a -> b, b -> a where ...

The notation `a -> b`

used here between the | and where
symbols --- not to be
confused with a function type --- indicates that the a parameter uniquely
determines the b parameter, and might be read as "a determines b." Thus D is
not just a relation, but actually a (partial) function. Similarly, from the two
dependencies that are included in the definition of E, we can see that E
represents a (partial) one-one mapping between types.

More generally, dependencies take the form `x1 ... xn -> y1 ... ym`

,
where x1, ..., xn, and y1, ..., yn are type variables with n>0 and
m>=0, meaning that the y parameters are uniquely determined by the x
parameters. Spaces can be used as separators if more than one variable appears
on any single side of a dependency, as in `t -> a b`

. Note that a class may be
annotated with multiple dependencies using commas as separators, as in the
definition of E above. Some dependencies that we can write in this notation are
redundant, and will be rejected because they don't serve any useful
purpose, and may instead indicate an error in the program. Examples of
dependencies like this include `a -> a `

,
`a -> a a `

,
`a -> `

, etc. There can also be
some redundancy if multiple dependencies are given, as in
`a->b`

,
`b->c `

, `a->c `

, and
in which some subset implies the remaining dependencies. Examples like this are
not treated as errors. Note that dependencies appear only in class
declarations, and not in any other part of the language. In particular, the
syntax for instance declarations, class constraints, and types is completely
unchanged.

By including dependencies in a class declaration, we provide a mechanism for the programmer to specify each multiple parameter class more precisely. The compiler, on the other hand, is responsible for ensuring that the set of instances that are in scope at any given point in the program is consistent with any declared dependencies. For example, the following pair of instance declarations cannot appear together in the same scope because they violate the dependency for D, even though either one on its own would be acceptable:

instance D Bool Int where ... instance D Bool Char where ...

Note also that the following declaration is not allowed, even by itself:

instance D [a] b where ...

The problem here is that this instance would allow one particular choice of [a] to be associated with more than one choice for b, which contradicts the dependency specified in the definition of D. More generally, this means that, in any instance of the form:

instance D t s where ...

for some particular types t and s, the only variables that can appear in s are the ones that appear in t, and hence, if the type t is known, then s will be uniquely determined.

The benefit of including dependency information is that it allows us to define
more general multiple parameter classes, without ambiguity problems, and with
the benefit of more accurate types. To illustrate this, we return to the
collection class example, and annotate the original definition of `Collects`

with a simple dependency:

class Collects e ce | ce -> e where empty :: ce insert :: e -> ce -> ce member :: e -> ce -> Bool

The dependency `ce -> e`

here specifies that the type e of elements is uniquely
determined by the type of the collection ce. Note that both parameters of
Collects are of kind *; there are no constructor classes here. Note too that
all of the instances of Collects that we gave earlier can be used
together with this new definition.

What about the ambiguity problems that we encountered with the original definition? The empty function still has type Collects e ce => ce, but it is no longer necessary to regard that as an ambiguous type: Although the variable e does not appear on the right of the => symbol, the dependency for class Collects tells us that it is uniquely determined by ce, which does appear on the right of the => symbol. Hence the context in which empty is used can still give enough information to determine types for both ce and e, without ambiguity. More generally, we need only regard a type as ambiguous if it contains a variable on the left of the => that is not uniquely determined (either directly or indirectly) by the variables on the right.

Dependencies also help to produce more accurate types for user defined functions, and hence to provide earlier detection of errors, and less cluttered types for programmers to work with. Recall the previous definition for a function f:

f x y = insert x y = insert x . insert y

for which we originally obtained a type:

f :: (Collects a c, Collects b c) => a -> b -> c -> c

Given the dependency information that we have for Collects, however, we can deduce that a and b must be equal because they both appear as the second parameter in a Collects constraint with the same first parameter c. Hence we can infer a shorter and more accurate type for f:

f :: (Collects a c) => a -> a -> c -> c

In a similar way, the earlier definition of g will now be flagged as a type error.

Although we have given only a few examples here, it should be clear that the addition of dependency information can help to make multiple parameter classes more useful in practice, avoiding ambiguity problems, and allowing more general sets of instance declarations.

An instance declaration has the form

instance (`assertion`

_{1}, ...,`assertion`

_{n}) =>`class`

`type`

_{1}...`type`

_{m}where ...

The part before the "`=>`

" is the
*context*, while the part after the
"`=>`

" is the *head* of the instance declaration.

In Haskell 98 the head of an instance declaration
must be of the form `C (T a1 ... an)`

, where
`C`

is the class, `T`

is a type constructor,
and the `a1 ... an`

are distinct type variables.
Furthermore, the assertions in the context of the instance declaration
must be of the form `C a`

where `a`

is a type variable that occurs in the head.

The `-fglasgow-exts`

flag loosens these restrictions
considerably. Firstly, multi-parameter type classes are permitted. Secondly,
the context and head of the instance declaration can each consist of arbitrary
(well-kinded) assertions `(C t1 ... tn)`

subject only to the
following rules:

For each assertion in the context:

No type variable has more occurrences in the assertion than in the head

The assertion has fewer constructors and variables (taken together and counting repetitions) than the head

The coverage condition. For each functional dependency,

`tvs`

_{left}`->`

`tvs`

_{right}, of the class, every type variable in S(`tvs`

_{right}) must appear in S(`tvs`

_{left}), where S is the substitution mapping each type variable in the class declaration to the corresponding type in the instance declaration.

These restrictions ensure that context reduction terminates: each reduction step makes the problem smaller by at least one constructor. For example, the following would make the type checker loop if it wasn't excluded:

instance C a => C a where ...

For example, these are OK:

instance C Int [a] -- Multiple parameters instance Eq (S [a]) -- Structured type in head -- Repeated type variable in head instance C4 a a => C4 [a] [a] instance Stateful (ST s) (MutVar s) -- Head can consist of type variables only instance C a instance (Eq a, Show b) => C2 a b -- Non-type variables in context instance Show (s a) => Show (Sized s a) instance C2 Int a => C3 Bool [a] instance C2 Int a => C3 [a] b

But these are not:

-- Context assertion no smaller than head instance C a => C a where ... -- (C b b) has more more occurrences of b than the head instance C b b => Foo [b] where ...

The same restrictions apply to instances generated by
`deriving`

clauses. Thus the following is accepted:

data MinHeap h a = H a (h a) deriving (Show)

because the derived instance

instance (Show a, Show (h a)) => Show (MinHeap h a)

conforms to the above rules.

A useful idiom permitted by the above rules is as follows. If one allows overlapping instance declarations then it's quite convenient to have a "default instance" declaration that applies if something more specific does not:

instance C a where op = ... -- Default

You can find lots of background material about the reason for these restrictions in the paper Understanding functional dependencies via Constraint Handling Rules.

Sometimes even the rules of Section 7.4.4.1, “Relaxed rules for instance declarations” are too onerous. For example, sometimes you might want to use the following to get the effect of a "class synonym":

class (C1 a, C2 a, C3 a) => C a where { } instance (C1 a, C2 a, C3 a) => C a where { }

This allows you to write shorter signatures:

f :: C a => ...

instead of

f :: (C1 a, C2 a, C3 a) => ...

The restrictions on functional dependencies (Section 7.4.3, “Functional dependencies ”) are particularly troublesome. It is tempting to introduce type variables in the context that do not appear in the head, something that is excluded by the normal rules. For example:

class HasConverter a b | a -> b where convert :: a -> b data Foo a = MkFoo a instance (HasConverter a b,Show b) => Show (Foo a) where show (MkFoo value) = show (convert value)

This is dangerous territory, however. Here, for example, is a program that would make the typechecker loop:

class D a class F a b | a->b instance F [a] [[a]] instance (D c, F a c) => D [a] -- 'c' is not mentioned in the head

Similarly, it can be tempting to lift the coverage condition:

class Mul a b c | a b -> c where (.*.) :: a -> b -> c instance Mul Int Int Int where (.*.) = (*) instance Mul Int Float Float where x .*. y = fromIntegral x * y instance Mul a b c => Mul a [b] [c] where x .*. v = map (x.*.) v

The third instance declaration does not obey the coverage condition; and indeed the (somewhat strange) definition:

f = \ b x y -> if b then x .*. [y] else y

makes instance inference go into a loop, because it requires the constraint
`(Mul a [b] b)`

.

Nevertheless, GHC allows you to experiment with more liberal rules. If you use
the experimental flag `-fallow-undecidable-instances`

, you can use arbitrary
types in both an instance context and instance head. Termination is ensured by having a
fixed-depth recursion stack. If you exceed the stack depth you get a
sort of backtrace, and the opportunity to increase the stack depth
with `-fcontext-stack=`

*N*.

In general, *GHC requires that that it be unambiguous which instance
declaration
should be used to resolve a type-class constraint*. This behaviour
can be modified by two flags: `-fallow-overlapping-instances`

and `-fallow-incoherent-instances`

, as this section discusses. Both these
flags are dynamic flags, and can be set on a per-module basis, using
an `OPTIONS_GHC`

pragma if desired (Section 4.1.2, “command line options in source files”).

When GHC tries to resolve, say, the constraint `C Int Bool`

,
it tries to match every instance declaration against the
constraint,
by instantiating the head of the instance declaration. For example, consider
these declarations:

instance context1 => C Int a where ... -- (A) instance context2 => C a Bool where ... -- (B) instance context3 => C Int [a] where ... -- (C) instance context4 => C Int [Int] where ... -- (D)

The instances (A) and (B) match the constraint `C Int Bool`

,
but (C) and (D) do not. When matching, GHC takes
no account of the context of the instance declaration
(`context1`

etc).
GHC's default behaviour is that *exactly one instance must match the
constraint it is trying to resolve*.
It is fine for there to be a *potential* of overlap (by
including both declarations (A) and (B), say); an error is only reported if a
particular constraint matches more than one.

The `-fallow-overlapping-instances`

flag instructs GHC to allow
more than one instance to match, provided there is a most specific one. For
example, the constraint `C Int [Int]`

matches instances (A),
(C) and (D), but the last is more specific, and hence is chosen. If there is no
most-specific match, the program is rejected.

However, GHC is conservative about committing to an overlapping instance. For example:

f :: [b] -> [b] f x = ...

Suppose that from the RHS of `f`

we get the constraint
`C Int [b]`

. But
GHC does not commit to instance (C), because in a particular
call of `f`

, `b`

might be instantiate
to `Int`

, in which case instance (D) would be more specific still.
So GHC rejects the program. If you add the flag `-fallow-incoherent-instances`

,
GHC will instead pick (C), without complaining about
the problem of subsequent instantiations.

The willingness to be overlapped or incoherent is a property of
the *instance declaration* itself, controlled by the
presence or otherwise of the `-fallow-overlapping-instances`

and `-fallow-incoherent-instances`

flags when that mdodule is
being defined. Neither flag is required in a module that imports and uses the
instance declaration. Specifically, during the lookup process:

An instance declaration is ignored during the lookup process if (a) a more specific match is found, and (b) the instance declaration was compiled with

`-fallow-overlapping-instances`

. The flag setting for the more-specific instance does not matter.Suppose an instance declaration does not matche the constraint being looked up, but does unify with it, so that it might match when the constraint is further instantiated. Usually GHC will regard this as a reason for not committing to some other constraint. But if the instance declaration was compiled with

`-fallow-incoherent-instances`

, GHC will skip the "does-it-unify?" check for that declaration.

These rules make it possible for a library author to design a library that relies on overlapping instances without the library client having to know.

If an instance declaration is compiled without
`-fallow-overlapping-instances`

,
then that instance can never be overlapped. This could perhaps be
inconvenient. Perhaps the rule should instead say that the
*overlapping* instance declaration should be compiled in
this way, rather than the *overlapped* one. Perhaps overlap
at a usage site should be permitted regardless of how the instance declarations
are compiled, if the `-fallow-overlapping-instances`

flag is
used at the usage site. (Mind you, the exact usage site can occasionally be
hard to pin down.) We are interested to receive feedback on these points.

The `-fallow-incoherent-instances`

flag implies the
`-fallow-overlapping-instances`

flag, but not vice versa.

*Unlike Haskell 98, instance heads may use type
synonyms*. (The instance "head" is the bit after the "=>" in an instance decl.)
As always, using a type synonym is just shorthand for
writing the RHS of the type synonym definition. For example:

type Point = (Int,Int) instance C Point where ... instance C [Point] where ...

is legal. However, if you added

instance C (Int,Int) where ...

as well, then the compiler will complain about the overlapping (actually, identical) instance declarations. As always, type synonyms must be fully applied. You cannot, for example, write:

type P a = [[a]] instance Monad P where ...

This design decision is independent of all the others, and easily reversed, but it makes sense to me.

Unlike Haskell 98, constraints in types do *not* have to be of
the form *(class type-variable)* or
*(class (type-variable type-variable ...))*. Thus,
these type signatures are perfectly OK

g :: Eq [a] => ... g :: Ord (T a ()) => ...

GHC imposes the following restrictions on the constraints in a type signature. Consider the type:

forall tv1..tvn (c1, ...,cn) => type

(Here, we write the "foralls" explicitly, although the Haskell source language omits them; in Haskell 98, all the free type variables of an explicit source-language type signature are universally quantified, except for the class type variables in a class declaration. However, in GHC, you can give the foralls if you want. See Section 7.4.8, “Arbitrary-rank polymorphism ”).

*Each universally quantified type variable*. A type variable`tvi`

must be reachable from`type`

`a`

is "reachable" if it it appears in the same constraint as either a type variable free in in`type`

, or another reachable type variable. A value with a type that does not obey this reachability restriction cannot be used without introducing ambiguity; that is why the type is rejected. Here, for example, is an illegal type:forall a. Eq a => Int

When a value with this type was used, the constraint

`Eq tv`

would be introduced where`tv`

is a fresh type variable, and (in the dictionary-translation implementation) the value would be applied to a dictionary for`Eq tv`

. The difficulty is that we can never know which instance of`Eq`

to use because we never get any more information about`tv`

.Note that the reachability condition is weaker than saying that

`a`

is functionally dependent on a type variable free in`type`

(see Section 7.4.3, “Functional dependencies ”). The reason for this is there might be a "hidden" dependency, in a superclass perhaps. So "reachable" is a conservative approximation to "functionally dependent". For example, consider:class C a b | a -> b where ... class C a b => D a b where ... f :: forall a b. D a b => a -> a

This is fine, because in fact

`a`

does functionally determine`b`

but that is not immediately apparent from`f`

's type.*Every constraint*. For example, this type is OK because`ci`

must mention at least one of the universally quantified type variables`tvi`

`C a b`

mentions the universally quantified type variable`b`

:forall a. C a b => burble

The next type is illegal because the constraint

`Eq b`

does not mention`a`

:forall a. Eq b => burble

The reason for this restriction is milder than the other one. The excluded types are never useful or necessary (because the offending context doesn't need to be witnessed at this point; it can be floated out). Furthermore, floating them out increases sharing. Lastly, excluding them is a conservative choice; it leaves a patch of territory free in case we need it later.

It is often convenient to use generalised type synonyms (see Section 7.4.1.3, “Liberalised type synonyms”) at the right hand end of an arrow, thus:

type Discard a = forall b. a -> b -> a g :: Int -> Discard Int g x y z = x+y

Simply expanding the type synonym would give

g :: Int -> (forall b. Int -> b -> Int)

but GHC "hoists" the `forall`

to give the isomorphic type

g :: forall b. Int -> Int -> b -> Int

In general, the rule is this: *to determine the type specified by any explicit
user-written type (e.g. in a type signature), GHC expands type synonyms and then repeatedly
performs the transformation:*

type1-> forall a1..an.context2=>type2==> forall a1..an.context2=>type1->type2

(In fact, GHC tries to retain as much synonym information as possible for use in
error messages, but that is a usability issue.) This rule applies, of course, whether
or not the `forall`

comes from a synonym. For example, here is another
valid way to write `g`

's type signature:

g :: Int -> Int -> forall b. b -> Int

When doing this hoisting operation, GHC eliminates duplicate constraints. For example:

type Foo a = (?x::Int) => Bool -> a g :: Foo (Foo Int)

means

g :: (?x::Int) => Bool -> Bool -> Int

Implicit parameters are implemented as described in "Implicit parameters: dynamic scoping with static types", J Lewis, MB Shields, E Meijer, J Launchbury, 27th ACM Symposium on Principles of Programming Languages (POPL'00), Boston, Jan 2000.

(Most of the following, stil rather incomplete, documentation is due to Jeff Lewis.)

Implicit parameter support is enabled with the option
`-fimplicit-params`

.

A variable is called *dynamically bound* when it is bound by the calling
context of a function and *statically bound* when bound by the callee's
context. In Haskell, all variables are statically bound. Dynamic
binding of variables is a notion that goes back to Lisp, but was later
discarded in more modern incarnations, such as Scheme. Dynamic binding
can be very confusing in an untyped language, and unfortunately, typed
languages, in particular Hindley-Milner typed languages like Haskell,
only support static scoping of variables.

However, by a simple extension to the type class system of Haskell, we
can support dynamic binding. Basically, we express the use of a
dynamically bound variable as a constraint on the type. These
constraints lead to types of the form `(?x::t') => t`

, which says "this
function uses a dynamically-bound variable `?x`

of type `t'`

". For
example, the following expresses the type of a sort function,
implicitly parameterized by a comparison function named `cmp`

.

sort :: (?cmp :: a -> a -> Bool) => [a] -> [a]

The dynamic binding constraints are just a new form of predicate in the type class system.

An implicit parameter occurs in an expression using the special form `?x`

,
where `x`

is
any valid identifier (e.g. `ord ?x`

is a valid expression).
Use of this construct also introduces a new
dynamic-binding constraint in the type of the expression.
For example, the following definition
shows how we can define an implicitly parameterized sort function in
terms of an explicitly parameterized `sortBy`

function:

sortBy :: (a -> a -> Bool) -> [a] -> [a] sort :: (?cmp :: a -> a -> Bool) => [a] -> [a] sort = sortBy ?cmp

Dynamic binding constraints behave just like other type class
constraints in that they are automatically propagated. Thus, when a
function is used, its implicit parameters are inherited by the
function that called it. For example, our `sort`

function might be used
to pick out the least value in a list:

least :: (?cmp :: a -> a -> Bool) => [a] -> a least xs = head (sort xs)

Without lifting a finger, the `?cmp`

parameter is
propagated to become a parameter of `least`

as well. With explicit
parameters, the default is that parameters must always be explicit
propagated. With implicit parameters, the default is to always
propagate them.

An implicit-parameter type constraint differs from other type class constraints in the
following way: All uses of a particular implicit parameter must have
the same type. This means that the type of `(?x, ?x)`

is `(?x::a) => (a,a)`

, and not
`(?x::a, ?x::b) => (a, b)`

, as would be the case for type
class constraints.

You can't have an implicit parameter in the context of a class or instance declaration. For example, both these declarations are illegal:

class (?x::Int) => C a where ... instance (?x::a) => Foo [a] where ...

Reason: exactly which implicit parameter you pick up depends on exactly where you invoke a function. But the ``invocation'' of instance declarations is done behind the scenes by the compiler, so it's hard to figure out exactly where it is done. Easiest thing is to outlaw the offending types.

Implicit-parameter constraints do not cause ambiguity. For example, consider:

f :: (?x :: [a]) => Int -> Int f n = n + length ?x g :: (Read a, Show a) => String -> String g s = show (read s)

Here, `g`

has an ambiguous type, and is rejected, but `f`

is fine. The binding for `?x`

at `f`

's call site is
quite unambiguous, and fixes the type `a`

.

An implicit parameter is *bound* using the standard
`let`

or `where`

binding forms.
For example, we define the `min`

function by binding
`cmp`

.

min :: [a] -> a min = let ?cmp = (<=) in least

A group of implicit-parameter bindings may occur anywhere a normal group of Haskell
bindings can occur, except at top level. That is, they can occur in a `let`

(including in a list comprehension, or do-notation, or pattern guards),
or a `where`

clause.
Note the following points:

An implicit-parameter binding group must be a collection of simple bindings to implicit-style variables (no function-style bindings, and no type signatures); these bindings are neither polymorphic or recursive.

You may not mix implicit-parameter bindings with ordinary bindings in a single

`let`

expression; use two nested`let`

s instead. (In the case of`where`

you are stuck, since you can't nest`where`

clauses.)You may put multiple implicit-parameter bindings in a single binding group; but they are

*not*treated as a mutually recursive group (as ordinary`let`

bindings are). Instead they are treated as a non-recursive group, simultaneously binding all the implicit parameter. The bindings are not nested, and may be re-ordered without changing the meaning of the program. For example, consider:f t = let { ?x = t; ?y = ?x+(1::Int) } in ?x + ?y

The use of

`?x`

in the binding for`?y`

does not "see" the binding for`?x`

, so the type of`f`

isf :: (?x::Int) => Int -> Int

Consider these two definitions:

len1 :: [a] -> Int len1 xs = let ?acc = 0 in len_acc1 xs len_acc1 [] = ?acc len_acc1 (x:xs) = let ?acc = ?acc + (1::Int) in len_acc1 xs ------------ len2 :: [a] -> Int len2 xs = let ?acc = 0 in len_acc2 xs len_acc2 :: (?acc :: Int) => [a] -> Int len_acc2 [] = ?acc len_acc2 (x:xs) = let ?acc = ?acc + (1::Int) in len_acc2 xs

The only difference between the two groups is that in the second group
`len_acc`

is given a type signature.
In the former case, `len_acc1`

is monomorphic in its own
right-hand side, so the implicit parameter `?acc`

is not
passed to the recursive call. In the latter case, because `len_acc2`

has a type signature, the recursive call is made to the
*polymoprhic* version, which takes `?acc`

as an implicit parameter. So we get the following results in GHCi:

Prog> len1 "hello" 0 Prog> len2 "hello" 5

Adding a type signature dramatically changes the result! This is a rather counter-intuitive phenomenon, worth watching out for.

GHC applies the dreaded Monomorphism Restriction (section 4.5.5 of the Haskell Report) to implicit parameters. For example, consider:

f :: Int -> Int f v = let ?x = 0 in let y = ?x + v in let ?x = 5 in y

Since the binding for `y`

falls under the Monomorphism
Restriction it is not generalised, so the type of `y`

is
simply `Int`

, not `(?x::Int) => Int`

.
Hence, `(f 9)`

returns result `9`

.
If you add a type signature for `y`

, then `y`

will get type `(?x::Int) => Int`

, so the occurrence of
`y`

in the body of the `let`

will see the
inner binding of `?x`

, so `(f 9)`

will return
`14`

.

Haskell infers the kind of each type variable. Sometimes it is nice to be able to give the kind explicitly as (machine-checked) documentation, just as it is nice to give a type signature for a function. On some occasions, it is essential to do so. For example, in his paper "Restricted Data Types in Haskell" (Haskell Workshop 1999) John Hughes had to define the data type:

data Set cxt a = Set [a] | Unused (cxt a -> ())

The only use for the `Unused`

constructor was to force the correct
kind for the type variable `cxt`

.

GHC now instead allows you to specify the kind of a type variable directly, wherever a type variable is explicitly bound. Namely:

`data`

declarations:data Set (cxt :: * -> *) a = Set [a]

`type`

declarations:type T (f :: * -> *) = f Int

`class`

declarations:class (Eq a) => C (f :: * -> *) a where ...

`forall`

's in type signatures:f :: forall (cxt :: * -> *). Set cxt Int

The parentheses are required. Some of the spaces are required too, to
separate the lexemes. If you write `(f::*->*)`

you
will get a parse error, because "`::*->*`

" is a
single lexeme in Haskell.

As part of the same extension, you can put kind annotations in types as well. Thus:

f :: (Int :: *) -> Int g :: forall a. a -> (a :: *)

The syntax is

atype ::= '(' ctype '::' kind ')

The parentheses are required.

Haskell type signatures are implicitly quantified. The new keyword `forall`

allows us to say exactly what this means. For example:

g :: b -> b

means this:

g :: forall b. (b -> b)

The two are treated identically.

However, GHC's type system supports *arbitrary-rank*
explicit universal quantification in
types.
For example, all the following types are legal:

f1 :: forall a b. a -> b -> a g1 :: forall a b. (Ord a, Eq b) => a -> b -> a f2 :: (forall a. a->a) -> Int -> Int g2 :: (forall a. Eq a => [a] -> a -> Bool) -> Int -> Int f3 :: ((forall a. a->a) -> Int) -> Bool -> Bool

Here, `f1`

and `g1`

are rank-1 types, and
can be written in standard Haskell (e.g. `f1 :: a->b->a`

).
The `forall`

makes explicit the universal quantification that
is implicitly added by Haskell.

The functions `f2`

and `g2`

have rank-2 types;
the `forall`

is on the left of a function arrow. As `g2`

shows, the polymorphic type on the left of the function arrow can be overloaded.

The function `f3`

has a rank-3 type;
it has rank-2 types on the left of a function arrow.

GHC allows types of arbitrary rank; you can nest `forall`

s
arbitrarily deep in function arrows. (GHC used to be restricted to rank 2, but
that restriction has now been lifted.)
In particular, a forall-type (also called a "type scheme"),
including an operational type class context, is legal:

On the left of a function arrow

On the right of a function arrow (see Section 7.4.5.2, “For-all hoisting”)

As the argument of a constructor, or type of a field, in a data type declaration. For example, any of the

`f1,f2,f3,g1,g2`

above would be valid field type signatures.As the type of an implicit parameter

In a pattern type signature (see Section 7.4.10, “Lexically scoped type variables ”)

There is one place you cannot put a `forall`

:
you cannot instantiate a type variable with a forall-type. So you cannot
make a forall-type the argument of a type constructor. So these types are illegal:

x1 :: [forall a. a->a] x2 :: (forall a. a->a, Int) x3 :: Maybe (forall a. a->a)

Of course `forall`

becomes a keyword; you can't use `forall`

as
a type variable any more!

In a `data`

or `newtype`

declaration one can quantify
the types of the constructor arguments. Here are several examples:

data T a = T1 (forall b. b -> b -> b) a data MonadT m = MkMonad { return :: forall a. a -> m a, bind :: forall a b. m a -> (a -> m b) -> m b } newtype Swizzle = MkSwizzle (Ord a => [a] -> [a])

The constructors have rank-2 types:

T1 :: forall a. (forall b. b -> b -> b) -> a -> T a MkMonad :: forall m. (forall a. a -> m a) -> (forall a b. m a -> (a -> m b) -> m b) -> MonadT m MkSwizzle :: (Ord a => [a] -> [a]) -> Swizzle

Notice that you don't need to use a `forall`

if there's an
explicit context. For example in the first argument of the
constructor `MkSwizzle`

, an implicit "`forall a.`

" is
prefixed to the argument type. The implicit `forall`

quantifies all type variables that are not already in scope, and are
mentioned in the type quantified over.

As for type signatures, implicit quantification happens for non-overloaded types too. So if you write this:

data T a = MkT (Either a b) (b -> b)

it's just as if you had written this:

data T a = MkT (forall b. Either a b) (forall b. b -> b)

That is, since the type variable `b`

isn't in scope, it's
implicitly universally quantified. (Arguably, it would be better
to *require* explicit quantification on constructor arguments
where that is what is wanted. Feedback welcomed.)

You construct values of types `T1, MonadT, Swizzle`

by applying
the constructor to suitable values, just as usual. For example,

a1 :: T Int a1 = T1 (\xy->x) 3 a2, a3 :: Swizzle a2 = MkSwizzle sort a3 = MkSwizzle reverse a4 :: MonadT Maybe a4 = let r x = Just x b m k = case m of Just y -> k y Nothing -> Nothing in MkMonad r b mkTs :: (forall b. b -> b -> b) -> a -> [T a] mkTs f x y = [T1 f x, T1 f y]

The type of the argument can, as usual, be more general than the type
required, as `(MkSwizzle reverse)`

shows. (`reverse`

does not need the `Ord`

constraint.)

When you use pattern matching, the bound variables may now have polymorphic types. For example:

f :: T a -> a -> (a, Char) f (T1 w k) x = (w k x, w 'c' 'd') g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b] g (MkSwizzle s) xs f = s (map f (s xs)) h :: MonadT m -> [m a] -> m [a] h m [] = return m [] h m (x:xs) = bind m x $ \y -> bind m (h m xs) $ \ys -> return m (y:ys)

In the function `h`

we use the record selectors `return`

and `bind`

to extract the polymorphic bind and return functions
from the `MonadT`

data structure, rather than using pattern
matching.

In general, type inference for arbitrary-rank types is undecidable. GHC uses an algorithm proposed by Odersky and Laufer ("Putting type annotations to work", POPL'96) to get a decidable algorithm by requiring some help from the programmer. We do not yet have a formal specification of "some help" but the rule is this:

*For a lambda-bound or case-bound variable, x, either the programmer
provides an explicit polymorphic type for x, or GHC's type inference will assume
that x's type has no foralls in it*.

What does it mean to "provide" an explicit type for x? You can do that by giving a type signature for x directly, using a pattern type signature (Section 7.4.10, “Lexically scoped type variables ”), thus:

\ f :: (forall a. a->a) -> (f True, f 'c')

Alternatively, you can give a type signature to the enclosing context, which GHC can "push down" to find the type for the variable:

(\ f -> (f True, f 'c')) :: (forall a. a->a) -> (Bool,Char)

Here the type signature on the expression can be pushed inwards to give a type signature for f. Similarly, and more commonly, one can give a type signature for the function itself:

h :: (forall a. a->a) -> (Bool,Char) h f = (f True, f 'c')

You don't need to give a type signature if the lambda bound variable is a constructor argument. Here is an example we saw earlier:

f :: T a -> a -> (a, Char) f (T1 w k) x = (w k x, w 'c' 'd')

Here we do not need to give a type signature to `w`

, because
it is an argument of constructor `T1`

and that tells GHC all
it needs to know.

GHC performs implicit quantification as follows. *At the top level (only) of
user-written types, if and only if there is no explicit forall,
GHC finds all the type variables mentioned in the type that are not already
in scope, and universally quantifies them.* For example, the following pairs are
equivalent:

f :: a -> a f :: forall a. a -> a g (x::a) = let h :: a -> b -> b h x y = y in ... g (x::a) = let h :: forall b. a -> b -> b h x y = y in ...

Notice that GHC does *not* find the innermost possible quantification
point. For example:

f :: (a -> a) -> Int -- MEANS f :: forall a. (a -> a) -> Int -- NOT f :: (forall a. a -> a) -> Int g :: (Ord a => a -> a) -> Int -- MEANS the illegal type g :: forall a. (Ord a => a -> a) -> Int -- NOT g :: (forall a. Ord a => a -> a) -> Int

The latter produces an illegal type, which you might think is silly, but at least the rule is simple. If you want the latter type, you can write your for-alls explicitly. Indeed, doing so is strongly advised for rank-2 types.

GHC supports *impredicative polymorphism*. This means
that you can call a polymorphic function at a polymorphic type, and
parameterise data structures over polymorphic types. For example:

f :: Maybe (forall a. [a] -> [a]) -> Maybe ([Int], [Char]) f (Just g) = Just (g [3], g "hello") f Nothing = Nothing

Notice here that the `Maybe`

type is parameterised by the
*polymorphic* type ```
(forall a. [a] ->
[a])
```

.

The technical details of this extension are described in the paper Boxy types: type inference for higher-rank types and impredicativity, which appeared at ICFP 2006.

GHC supports *lexically scoped type variables*, without
which some type signatures are simply impossible to write. For example:

f :: forall a. [a] -> [a] f xs = ys ++ ys where ys :: [a] ys = reverse xs

The type signature for `f`

brings the type variable `a`

into scope; it scopes over
the entire definition of `f`

.
In particular, it is in scope at the type signature for `ys`

.
In Haskell 98 it is not possible to declare
a type for `ys`

; a major benefit of scoped type variables is that
it becomes possible to do so.

Lexically-scoped type variables are enabled by
`-fglasgow-exts`

.

Note: GHC 6.6 contains substantial changes to the way that scoped type variables work, compared to earlier releases. Read this section carefully!

The design follows the following principles

A scoped type variable stands for a type

*variable*, and not for a*type*. (This is a change from GHC's earlier design.)Furthermore, distinct lexical type variables stand for distinct type variables. This means that every programmer-written type signature (includin one that contains free scoped type variables) denotes a

*rigid*type; that is, the type is fully known to the type checker, and no inference is involved.Lexical type variables may be alpha-renamed freely, without changing the program.

A *lexically scoped type variable* can be bound by:

A declaration type signature (Section 7.4.10.2, “Declaration type signatures”)

A pattern type signature (Section 7.4.10.3, “Pattern type signatures”)

Class and instance declarations (Section 7.4.10.4, “Class and instance declarations”)

In Haskell, a programmer-written type signature is implicitly quantifed over
its free type variables (Section
4.1.2
of the Haskel Report).
Lexically scoped type variables affect this implicit quantification rules
as follows: any type variable that is in scope is *not* universally
quantified. For example, if type variable `a`

is in scope,
then

(e :: a -> a) means (e :: a -> a) (e :: b -> b) means (e :: forall b. b->b) (e :: a -> b) means (e :: forall b. a->b)

A declaration type signature that has *explicit*
quantification (using `forall`

) brings into scope the
explicitly-quantified
type variables, in the definition of the named function(s). For example:

f :: forall a. [a] -> [a] f (x:xs) = xs ++ [ x :: a ]

The "`forall a`

" brings "`a`

" into scope in
the definition of "`f`

".

This only happens if the quantification in `f`

's type
signature is explicit. For example:

g :: [a] -> [a] g (x:xs) = xs ++ [ x :: a ]

This program will be rejected, because "`a`

" does not scope
over the definition of "`f`

", so "`x::a`

"
means "`x::forall a. a`

" by Haskell's usual implicit
quantification rules.

A type signature may occur in any pattern; this is a *pattern type
signature*.
For example:

-- f and g assume that 'a' is already in scope f = \(x::Int, y::a) -> x g (x::a) = x h ((x,y) :: (Int,Bool)) = (y,x)

In the case where all the type variables in the pattern type sigature are already in scope (i.e. bound by the enclosing context), matters are simple: the signature simply constrains the type of the pattern in the obvious way.

There is only one situation in which you can write a pattern type signature that mentions a type variable that is not already in scope, namely in pattern match of an existential data constructor. For example:

data T = forall a. MkT [a] k :: T -> T k (MkT [t::a]) = MkT t3 where t3::[a] = [t,t,t]

Here, the pattern type signature `(t::a)`

mentions a lexical type
variable that is not already in scope. Indeed, it cannot already be in scope,
because it is bound by the pattern match. GHC's rule is that in this situation
(and only then), a pattern type signature can mention a type variable that is
not already in scope; the effect is to bring it into scope, standing for the
existentially-bound type variable.

If this seems a little odd, we think so too. But we must have
*some* way to bring such type variables into scope, else we
could not name existentially-bound type variables in subequent type signatures.

This is (now) the *only* situation in which a pattern type
signature is allowed to mention a lexical variable that is not already in
scope.
For example, both `f`

and `g`

would be
illegal if `a`

was not already in scope.

Haskell 98 allows the programmer to add "`deriving( Eq, Ord )`

" to a data type
declaration, to generate a standard instance declaration for classes specified in the `deriving`

clause.
In Haskell 98, the only classes that may appear in the `deriving`

clause are the standard
classes `Eq`

, `Ord`

,
`Enum`

, `Ix`

, `Bounded`

, `Read`

, and `Show`

.

GHC extends this list with two more classes that may be automatically derived
(provided the `-fglasgow-exts`

flag is specified):
`Typeable`

, and `Data`

. These classes are defined in the library
modules `Data.Typeable`

and `Data.Generics`

respectively, and the
appropriate class must be in scope before it can be mentioned in the `deriving`

clause.

An instance of `Typeable`

can only be derived if the
data type has seven or fewer type parameters, all of kind `*`

.
The reason for this is that the `Typeable`

class is derived using the scheme
described in
Scrap More Boilerplate: Reflection, Zips, and Generalised Casts
.
(Section 7.4 of the paper describes the multiple `Typeable`

classes that
are used, and only `Typeable1`

up to
`Typeable7`

are provided in the library.)
In other cases, there is nothing to stop the programmer writing a `TypableX`

class, whose kind suits that of the data type constructor, and
then writing the data type instance by hand.

When you define an abstract type using `newtype`

, you may want
the new type to inherit some instances from its representation. In
Haskell 98, you can inherit instances of `Eq`

, `Ord`

,
`Enum`

and `Bounded`

by deriving them, but for any
other classes you have to write an explicit instance declaration. For
example, if you define

newtype Dollars = Dollars Int

and you want to use arithmetic on `Dollars`

, you have to
explicitly define an instance of `Num`

:

instance Num Dollars where Dollars a + Dollars b = Dollars (a+b) ...

All the instance does is apply and remove the `newtype`

constructor. It is particularly galling that, since the constructor
doesn't appear at run-time, this instance declaration defines a
dictionary which is *wholly equivalent* to the `Int`

dictionary, only slower!

GHC now permits such instances to be derived instead, so one can write

newtype Dollars = Dollars Int deriving (Eq,Show,Num)

and the implementation uses the *same* `Num`

dictionary
for `Dollars`

as for `Int`

. Notionally, the compiler
derives an instance declaration of the form

instance Num Int => Num Dollars

which just adds or removes the `newtype`

constructor according to the type.

We can also derive instances of constructor classes in a similar way. For example, suppose we have implemented state and failure monad transformers, such that

instance Monad m => Monad (State s m) instance Monad m => Monad (Failure m)

In Haskell 98, we can define a parsing monad by

type Parser tok m a = State [tok] (Failure m) a

which is automatically a monad thanks to the instance declarations
above. With the extension, we can make the parser type abstract,
without needing to write an instance of class `Monad`

, via

newtype Parser tok m a = Parser (State [tok] (Failure m) a) deriving Monad

In this case the derived instance declaration is of the form

instance Monad (State [tok] (Failure m)) => Monad (Parser tok m)

Notice that, since `Monad`

is a constructor class, the
instance is a *partial application* of the new type, not the
entire left hand side. We can imagine that the type declaration is
``eta-converted'' to generate the context of the instance
declaration.

We can even derive instances of multi-parameter classes, provided the
newtype is the last class parameter. In this case, a ``partial
application'' of the class appears in the `deriving`

clause. For example, given the class

class StateMonad s m | m -> s where ... instance Monad m => StateMonad s (State s m) where ...

then we can derive an instance of `StateMonad`

for `Parser`

s by

newtype Parser tok m a = Parser (State [tok] (Failure m) a) deriving (Monad, StateMonad [tok])

The derived instance is obtained by completing the application of the class to the new type:

instance StateMonad [tok] (State [tok] (Failure m)) => StateMonad [tok] (Parser tok m)

As a result of this extension, all derived instances in newtype
declarations are treated uniformly (and implemented just by reusing
the dictionary for the representation type), *except*
`Show`

and `Read`

, which really behave differently for
the newtype and its representation.

Derived instance declarations are constructed as follows. Consider the declaration (after expansion of any type synonyms)

newtype T v1...vn = T' (t vk+1...vn) deriving (c1...cm)

where

The type

`t`

is an arbitrary typeThe

`vk+1...vn`

are type variables which do not occur in`t`

, andThe

`ci`

are partial applications of classes of the form`C t1'...tj'`

, where the arity of`C`

is exactly`j+1`

. That is,`C`

lacks exactly one type argument.None of the

`ci`

is`Read`

,`Show`

,`Typeable`

, or`Data`

. These classes should not "look through" the type or its constructor. You can still derive these classes for a newtype, but it happens in the usual way, not via this new mechanism.

Then, for each `ci`

, the derived instance
declaration is:

instance ci (t vk+1...v) => ci (T v1...vp)

where `p`

is chosen so that `T v1...vp`

is of the
right *kind* for the last parameter of class `Ci`

.

As an example which does *not* work, consider

newtype NonMonad m s = NonMonad (State s m s) deriving Monad

Here we cannot derive the instance

instance Monad (State s m) => Monad (NonMonad m)

because the type variable `s`

occurs in `State s m`

,
and so cannot be "eta-converted" away. It is a good thing that this
`deriving`

clause is rejected, because `NonMonad m`

is
not, in fact, a monad --- for the same reason. Try defining
`>>=`

with the correct type: you won't be able to.

Notice also that the *order* of class parameters becomes
important, since we can only derive instances for the last one. If the
`StateMonad`

class above were instead defined as

class StateMonad m s | m -> s where ...

then we would not have been able to derive an instance for the
`Parser`

type above. We hypothesise that multi-parameter
classes usually have one "main" parameter for which deriving new
instances is most interesting.

Lastly, all of this applies only for classes other than
`Read`

, `Show`

, `Typeable`

,
and `Data`

, for which the built-in derivation applies (section
4.3.3. of the Haskell Report).
(For the standard classes `Eq`

, `Ord`

,
`Ix`

, and `Bounded`

it is immaterial whether
the standard method is used or the one described here.)

The Haskell Report specifies that a group of bindings (at top level, or in a
`let`

or `where`

) should be sorted into
strongly-connected components, and then type-checked in dependency order
(Haskell
Report, Section 4.5.1).
As each group is type-checked, any binders of the group that
have
an explicit type signature are put in the type environment with the specified
polymorphic type,
and all others are monomorphic until the group is generalised
(Haskell Report, Section 4.5.2).

Following a suggestion of Mark Jones, in his paper
Typing Haskell in
Haskell,
GHC implements a more general scheme. If `-fglasgow-exts`

is
specified:
*the dependency analysis ignores references to variables that have an explicit
type signature*.
As a result of this refined dependency analysis, the dependency groups are smaller, and more bindings will
typecheck. For example, consider:

f :: Eq a => a -> Bool f x = (x == x) || g True || g "Yes" g y = (y <= y) || f True

This is rejected by Haskell 98, but under Jones's scheme the definition for
`g`

is typechecked first, separately from that for
`f`

,
because the reference to `f`

in `g`

's right
hand side is ingored by the dependency analysis. Then `g`

's
type is generalised, to get

g :: Ord a => a -> Bool

Now, the defintion for `f`

is typechecked, with this type for
`g`

in the type environment.

The same refined dependency analysis also allows the type signatures of
mutually-recursive functions to have different contexts, something that is illegal in
Haskell 98 (Section 4.5.2, last sentence). With
`-fglasgow-exts`

GHC only insists that the type signatures of a *refined* group have identical
type signatures; in practice this means that only variables bound by the same
pattern binding must have the same context. For example, this is fine:

f :: Eq a => a -> Bool f x = (x == x) || g True g :: Ord a => a -> Bool g y = (y <= y) || f True