Difference between revisions of "GADTs for dummies"

From HaskellWiki
Jump to navigation Jump to search
 
Line 13: Line 13:
 
constructors. For example,
 
constructors. For example,
   
  +
<haskell>
 
data Either a b = Left a | Right b
 
data Either a b = Left a | Right b
  +
</haskell>
   
 
declares type constructor "Either" and two data constructors "Left"
 
declares type constructor "Either" and two data constructors "Left"
 
and "Right". Ordinary Haskell functions works with data constructors:
 
and "Right". Ordinary Haskell functions works with data constructors:
   
  +
<haskell>
 
isLeft (Left a) = True
 
isLeft (Left a) = True
 
isLeft (Right b) = False
 
isLeft (Right b) = False
  +
</haskell>
   
 
but there is also a way to work with type constructors!
 
but there is also a way to work with type constructors!
   
  +
<haskell>
 
type X a = Either a a
 
type X a = Either a a
  +
</haskell>
   
 
declares TYPE FUNCTION named "X". It's parameter "a" must be some type
 
declares TYPE FUNCTION named "X". It's parameter "a" must be some type
Line 45: Line 51:
 
pattern matching:
 
pattern matching:
   
  +
<haskell>
 
type F [a] = Set a
 
type F [a] = Set a
  +
</haskell>
   
 
multiple statements (that are meaningful only in presence of pattern matching):
 
multiple statements (that are meaningful only in presence of pattern matching):
   
  +
<haskell>
 
type F Bool = Char
 
type F Bool = Char
 
F String = Int
 
F String = Int
  +
</haskell>
   
 
and recursion (that is, indeed, needs two previous extensions):
 
and recursion (that is, indeed, needs two previous extensions):
   
  +
<haskell>
 
type F [a] = F a
 
type F [a] = F a
 
F (Map a b) = F b
 
F (Map a b) = F b
 
F (Set a) = F a
 
F (Set a) = F a
 
F a = a
 
F a = a
  +
</haskell>
   
 
Last definition, as you may already guess, calculates simple base type
 
Last definition, as you may already guess, calculates simple base type
Line 65: Line 77:
 
Let's don't forget about statement guards:
 
Let's don't forget about statement guards:
   
  +
<haskell>
 
type F a | IsSimple a == TrueType = a
 
type F a | IsSimple a == TrueType = a
  +
</haskell>
   
 
Here we define type function F only for simple datatypes by using in
 
Here we define type function F only for simple datatypes by using in
 
guard type function "IsSimple":
 
guard type function "IsSimple":
   
  +
<haskell>
 
type IsSimple Bool = TrueType
 
type IsSimple Bool = TrueType
 
IsSimple Int = TrueType
 
IsSimple Int = TrueType
Line 78: Line 93:
 
data TrueType = T
 
data TrueType = T
 
data FalseType = F
 
data FalseType = F
  +
</haskell>
   
   
Line 83: Line 99:
 
let's consider a way to write this shorter:
 
let's consider a way to write this shorter:
   
  +
<haskell>
 
type F a | IsSimple a = a
 
type F a | IsSimple a = a
   
Line 89: Line 106:
 
....
 
....
 
IsSimple Double
 
IsSimple Double
  +
</haskell>
   
 
Here, we just defined list of simple types, the implied result of all
 
Here, we just defined list of simple types, the implied result of all
Line 97: Line 115:
 
complex type trying to decide whether it contains "Int" anywhere?
 
complex type trying to decide whether it contains "Int" anywhere?
   
  +
<haskell>
 
type HasInt Int
 
type HasInt Int
 
HasInt [a] = HasInt a
 
HasInt [a] = HasInt a
Line 102: Line 121:
 
HasInt (Map a b) | HasInt a
 
HasInt (Map a b) | HasInt a
 
HasInt (Map a b) | HasInt b
 
HasInt (Map a b) | HasInt b
  +
</haskell>
   
 
or type function that substitutes one type with another inside
 
or type function that substitutes one type with another inside
 
arbitrary-deep types:
 
arbitrary-deep types:
   
  +
<haskell>
 
type Replace t a b | t==a = b
 
type Replace t a b | t==a = b
 
Replace [t] a b = [Replace t a b]
 
Replace [t] a b = [Replace t a b]
Line 111: Line 132:
 
Replace (Map t1 t2) a b = Map (Replace t1 a b) (Replace t2 a b)
 
Replace (Map t1 t2) a b = Map (Replace t1 a b) (Replace t2 a b)
 
Replace t a b = t
 
Replace t a b = t
  +
</haskell>
 
 
   
Line 119: Line 141:
 
extension - type functions that may have MULTIPLE VALUES. Say,
 
extension - type functions that may have MULTIPLE VALUES. Say,
   
  +
<haskell>
 
type Collection a = [a]
 
type Collection a = [a]
 
Collection a = Set a
 
Collection a = Set a
 
Collection a = Map b a
 
Collection a = Map b a
  +
</haskell>
   
 
So, "Collection Int" has "[Int]", "Set Int" and "Map String Int" as
 
So, "Collection Int" has "[Int]", "Set Int" and "Map String Int" as
Line 142: Line 166:
 
We can also represent multiple-value function as predicate:
 
We can also represent multiple-value function as predicate:
   
  +
<haskell>
 
type Collection a [a]
 
type Collection a [a]
 
Collection a (Set a)
 
Collection a (Set a)
 
Collection a (Map b a)
 
Collection a (Map b a)
  +
</haskell>
   
 
If you remember Prolog, you should guess that predicate, in contrast to
 
If you remember Prolog, you should guess that predicate, in contrast to
Line 150: Line 176:
 
parameter from other ones. For example, in this hypothetical definition:
 
parameter from other ones. For example, in this hypothetical definition:
   
  +
<haskell>
 
head | Collection Int a :: a -> Int
 
head | Collection Int a :: a -> Int
  +
</haskell>
   
 
we define 'head' function for any Collection containing Ints.
 
we define 'head' function for any Collection containing Ints.
Line 156: Line 184:
 
And in this, again, hypothetical definition:
 
And in this, again, hypothetical definition:
   
  +
<haskell>
 
data Safe c | Collection c a = Safe c a
 
data Safe c | Collection c a = Safe c a
  +
</haskell>
   
 
we deduced element type 'a' from collection type 'c' passed as the
 
we deduced element type 'a' from collection type 'c' passed as the
Line 171: Line 201:
 
TYPE CLASSES! Let's translate all our examples to their language:
 
TYPE CLASSES! Let's translate all our examples to their language:
   
  +
<haskell>
 
class IsSimple a
 
class IsSimple a
 
instance IsSimple Bool
 
instance IsSimple Bool
Line 176: Line 207:
 
....
 
....
 
instance IsSimple Double
 
instance IsSimple Double
  +
</haskell>
   
 
Haskell'98 standard supports type classes with only one parameter that
 
Haskell'98 standard supports type classes with only one parameter that
Line 182: Line 214:
 
arbitrarily-complex type functions
 
arbitrarily-complex type functions
   
  +
<haskell>
 
class Collection a c
 
class Collection a c
 
instance Collection a [a]
 
instance Collection a [a]
 
instance Collection a (Set a)
 
instance Collection a (Set a)
 
instance Collection a (Map b a)
 
instance Collection a (Map b a)
  +
</haskell>
   
 
All the "hypothetical" Haskell extensions we investigated earlier -
 
All the "hypothetical" Haskell extensions we investigated earlier -
Line 192: Line 226:
 
Pattern matching:
 
Pattern matching:
   
  +
<haskell>
 
instance Collection a [a]
 
instance Collection a [a]
  +
</haskell>
   
 
Multiple statements:
 
Multiple statements:
   
  +
<haskell>
 
instance Collection a [a]
 
instance Collection a [a]
 
instance Collection a (Set a)
 
instance Collection a (Set a)
  +
</haskell>
   
 
Recursion:
 
Recursion:
   
  +
<haskell>
 
instance (Collection a c) => Collection a [c]
 
instance (Collection a c) => Collection a [c]
  +
</haskell>
   
 
Pattern guards:
 
Pattern guards:
   
  +
<haskell>
 
instance (IsSimple a) => Collection a (UArray a)
 
instance (IsSimple a) => Collection a (UArray a)
  +
</haskell>
   
   
Line 212: Line 254:
 
its elements or indexes:
 
its elements or indexes:
   
  +
<haskell>
 
class HasInt a
 
class HasInt a
 
instance HasInt Int
 
instance HasInt Int
Line 217: Line 260:
 
instance (HasInt a) => HasInt (Map a b)
 
instance (HasInt a) => HasInt (Map a b)
 
instance (HasInt b) => HasInt (Map a b)
 
instance (HasInt b) => HasInt (Map a b)
  +
</haskell>
   
   
Line 222: Line 266:
 
'b' in type 't' and return result as 'res':
 
'b' in type 't' and return result as 'res':
   
  +
<haskell>
 
class Replace t a b res
 
class Replace t a b res
 
instance Replace t a a t
 
instance Replace t a a t
Line 230: Line 275:
 
=> Replace (Map t1 t2) a b (Map res1 res2)
 
=> Replace (Map t1 t2) a b (Map res1 res2)
 
instance Replace t a b t
 
instance Replace t a b t
  +
</haskell>
   
 
You can compare it to the hypothetical definition we given earlier.
 
You can compare it to the hypothetical definition we given earlier.
Line 245: Line 291:
 
which can be used to explicitly prohibit unwanted matches:
 
which can be used to explicitly prohibit unwanted matches:
   
  +
<haskell>
 
instance Replace t a a t
 
instance Replace t a a t
 
instance (a/=b) => Replace [t] a b [Replace t a b]
 
instance (a/=b) => Replace [t] a b [Replace t a b]
 
instance (a/=b, t/=[_]) => Replace t a b t
 
instance (a/=b, t/=[_]) => Replace t a b t
  +
</haskell>
   
 
You can check that these instances are no more overlaps.
 
You can check that these instances are no more overlaps.
Line 256: Line 304:
 
type classes provide - member functions. For example:
 
type classes provide - member functions. For example:
   
  +
<haskell>
 
class Collection a c where
 
class Collection a c where
 
foldr1 :: (a -> a -> a) -> c -> a
 
foldr1 :: (a -> a -> a) -> c -> a
Line 264: Line 313:
 
sum :: (Num a, Collection a c) => c -> a
 
sum :: (Num a, Collection a c) => c -> a
 
sum = foldr1 (+)
 
sum = foldr1 (+)
  +
</haskell>
   
   
Line 269: Line 319:
 
declarations like this:
 
declarations like this:
   
  +
<haskell>
 
data Safe c = (Collection c a) => Safe c a
 
data Safe c = (Collection c a) => Safe c a
  +
</haskell>
   
 
but afaik this is also not yet implemented
 
but afaik this is also not yet implemented
Line 275: Line 327:
   
 
UNIFICATION
 
UNIFICATION
  +
...
   
   
Line 291: Line 344:
 
following definition:
 
following definition:
   
  +
<haskell>
 
data Maybe a = Just a | Nothing
 
data Maybe a = Just a | Nothing
  +
</haskell>
   
 
defines type-to-data constructors function "Maybe" that has parameter
 
defines type-to-data constructors function "Maybe" that has parameter
Line 298: Line 353:
 
used above for multi-value type functions:
 
used above for multi-value type functions:
   
  +
<haskell>
 
data Maybe a = Just a
 
data Maybe a = Just a
 
Maybe a = Nothing
 
Maybe a = Nothing
  +
</haskell>
   
 
Or how about this:
 
Or how about this:
   
  +
<haskell>
 
data List a = Cons a (List a)
 
data List a = Cons a (List a)
 
List a = Nil
 
List a = Nil
  +
</haskell>
   
 
and this:
 
and this:
   
  +
<haskell>
 
data Either a b = Left a
 
data Either a b = Left a
 
Either a b = Right b
 
Either a b = Right b
  +
</haskell>
   
 
But how are flexible "data" definitions? As you should remember,
 
But how are flexible "data" definitions? As you should remember,
Line 328: Line 389:
 
and instead of
 
and instead of
   
  +
<haskell>
 
data Either a b = Left a
 
data Either a b = Left a
 
Either a b = Right b
 
Either a b = Right b
  +
</haskell>
   
 
we write just
 
we write just
   
  +
<haskell>
 
data Either a b = Left a
 
data Either a b = Left a
 
| Right b
 
| Right b
  +
</haskell>
   
   
Line 341: Line 406:
 
statements! How about this:
 
statements! How about this:
   
  +
<haskell>
 
data T String = D1 Int
 
data T String = D1 Int
 
T Bool = D2
 
T Bool = D2
 
T [a] = D3 (a,a)
 
T [a] = D3 (a,a)
  +
</haskell>
   
 
Amazed? After all, GADTs seems really very simple and obvious
 
Amazed? After all, GADTs seems really very simple and obvious
Line 357: Line 424:
 
be specified directly:
 
be specified directly:
   
  +
<haskell>
 
data Term a where
 
data Term a where
 
Lit :: Int -> Term Int
 
Lit :: Int -> Term Int
 
Pair :: Term a -> Term b -> Term (a,b)
 
Pair :: Term a -> Term b -> Term (a,b)
 
...
 
...
  +
</haskell>
   
 
In a function that performs pattern matching on Term, the pattern
 
In a function that performs pattern matching on Term, the pattern
Line 366: Line 435:
 
consider this function:
 
consider this function:
   
  +
<haskell>
 
eval :: Term a -> a
 
eval :: Term a -> a
 
eval (Lit i) = i
 
eval (Lit i) = i
 
eval (Pair a b) = (eval a, eval b)
 
eval (Pair a b) = (eval a, eval b)
 
...
 
...
  +
</haskell>
   
 
If the argument matches Lit, it must have been built with a
 
If the argument matches Lit, it must have been built with a
Line 429: Line 500:
   
   
  +
</haskell>
 
data family Map k :: * -> *
 
data family Map k :: * -> *
   
 
data instance Map () v = MapUnit (Maybe v)
 
data instance Map () v = MapUnit (Maybe v)
 
data instance Map (a, b) v = MapPair (Map a (Map b v))
 
data instance Map (a, b) v = MapPair (Map a (Map b v))
  +
</haskell>
   
   
Line 441: Line 514:
 
let's consider well-known 'data' declarations:
 
let's consider well-known 'data' declarations:
   
  +
</haskell>
 
data T a = D a a Int
 
data T a = D a a Int
  +
</haskell>
   
 
it can be seen as function 'T' from type 'a' to some data constructor.
 
it can be seen as function 'T' from type 'a' to some data constructor.
  +
 
'T Bool', for example, gives result 'D Bool Bool Int', while
 
'T Bool', for example, gives result 'D Bool Bool Int', while
  +
 
'T [Int]' gives result 'D [Int] [Int] Int'.
 
'T [Int]' gives result 'D [Int] [Int] Int'.
   
 
'data' declaration can also have several "results", say
 
'data' declaration can also have several "results", say
   
  +
</haskell>
 
data Either a b = Left a | Right b
 
data Either a b = Left a | Right b
  +
</haskell>
   
 
and "result" of 'Either Int String' can be either "Left Int" or "Right
 
and "result" of 'Either Int String' can be either "Left Int" or "Right
Line 464: Line 543:
 
form of hint:
 
form of hint:
   
  +
</haskell>
 
type Collection :: a c | c->a
 
type Collection :: a c | c->a
 
Collection a [a]
 
Collection a [a]
 
Collection a (Set a)
 
Collection a (Set a)
 
Collection a (Map b a)
 
Collection a (Map b a)
  +
</haskell>
   
 
The first line i added tell the compiler that Collection predicate has
 
The first line i added tell the compiler that Collection predicate has
Line 474: Line 555:
 
different element types for the same collection:
 
different element types for the same collection:
   
  +
</haskell>
 
type Collection :: a c | c->a
 
type Collection :: a c | c->a
 
Collection a (Map b a)
 
Collection a (Map b a)
 
Collection b (Map b a) -- error! prohibits functional dependency
 
Collection b (Map b a) -- error! prohibits functional dependency
  +
</haskell>
   
 
Of course, Collection is just a function from 'c' to 'a', but if we
 
Of course, Collection is just a function from 'c' to 'a', but if we
 
will define it directly as a function:
 
will define it directly as a function:
   
  +
</haskell>
 
type Collection [a] = a
 
type Collection [a] = a
 
Collection (Set a) = a
 
Collection (Set a) = a
 
Collection (Map b a) = a
 
Collection (Map b a) = a
  +
</haskell>
   
 
- it can't be used in 'head' definition above. Moreover, using
 
- it can't be used in 'head' definition above. Moreover, using
 
functional dependencies we can define bi-directional functions:
 
functional dependencies we can define bi-directional functions:
   
  +
</haskell>
 
type TwoTimesBigger :: a b | a->b, b->a
 
type TwoTimesBigger :: a b | a->b, b->a
 
TwoTimesBigger Int8 Int16
 
TwoTimesBigger Int8 Int16
 
TwoTimesBigger Int16 Int32
 
TwoTimesBigger Int16 Int32
 
TwoTimesBigger Int32 Int64
 
TwoTimesBigger Int32 Int64
  +
</haskell>
   
 
or predicates with 3, 4 or more parameters with any relations between
 
or predicates with 3, 4 or more parameters with any relations between

Revision as of 09:12, 15 November 2007

long time i don't understood what GADT is and how they can be used. it was sort of conspiracy of silence - people who understand GADTs think that this is obvious thing and don't need any explanation but i still can't understood.

now i got an idea and think that it was really obvious :) and i want to share my understanding - may be my way to realize GADTs can help someone else. so

Type functions

"data" declaration is a way to declare both type constructor and data constructors. For example,

data Either a b = Left a | Right b

declares type constructor "Either" and two data constructors "Left" and "Right". Ordinary Haskell functions works with data constructors:

isLeft (Left a) = True
isLeft (Right b) = False

but there is also a way to work with type constructors!

type X a = Either a a

declares TYPE FUNCTION named "X". It's parameter "a" must be some type and it returns some type as result. We can't use "X" on data values, but we can use it on type values. Type constructors declared with "data" statements and type functions declared with "type" statements used together to build arbitrarily complex types. In such "computations" type constructors serves as basic "values" and type functions as a way to process them.

Indeed, type functions in Haskell is very limited comparing to ordinary functions - they don't support pattern matching on left part, nor multiple statements, nor recursion


Hypothetical Haskell extension - Full-featured type functions

Let's build hypothetical Haskell extension, that mimics for type functions the well-known ways to define ordinary functions, including pattern matching:

type F [a] = Set a

multiple statements (that are meaningful only in presence of pattern matching):

type F Bool   = Char
     F String = Int

and recursion (that is, indeed, needs two previous extensions):

type F [a]       = F a
     F (Map a b) = F b
     F (Set a)   = F a
     F a         = a

Last definition, as you may already guess, calculates simple base type of arbitrarily-nested collection: F [[[Set Int]]] = Int


Let's don't forget about statement guards:

type F a | IsSimple a == TrueType  = a

Here we define type function F only for simple datatypes by using in guard type function "IsSimple":

type IsSimple Bool = TrueType
     IsSimple Int = TrueType
     ....
     IsSimple Double = TrueType
     IsSimple a = FalseType

data TrueType = T
data FalseType = F


These definitions seem a bit odd, and while we are in imaginary land, let's consider a way to write this shorter:

type F a | IsSimple a  = a

type IsSimple Bool
     IsSimple Int
     ....
     IsSimple Double

Here, we just defined list of simple types, the implied result of all written statements for "IsSimple" is True value, and False value for anything else. So, "IsSimple" essentially is no less than TYPE PREDICATE!

I really love it! :) How about constructing predicate that diggers complex type trying to decide whether it contains "Int" anywhere?

type HasInt Int
     HasInt [a] = HasInt a
     HasInt (Set a) = HasInt a
     HasInt (Map a b) | HasInt a
     HasInt (Map a b) | HasInt b

or type function that substitutes one type with another inside arbitrary-deep types:

type Replace t a b | t==a    =  b
     Replace [t] a b         =  [Replace t a b]
     Replace (Set t) a b     =  Set (Replace t a b)
     Replace (Map t1 t2) a b =  Map (Replace t1 a b) (Replace t2 a b)
     Replace t a b           =  t


One more hypothetical extension - multi-value type functions

Let's add more fun! We will introduce one more hypothetical Haskell extension - type functions that may have MULTIPLE VALUES. Say,

type Collection a = [a]
     Collection a = Set a
     Collection a = Map b a

So, "Collection Int" has "[Int]", "Set Int" and "Map String Int" as its values, i.e. different collection types with elements of type "Int".

Pay attention to that in last statement of "Collection" definition we've used type variable "b" that was not mentioned on the left side nor defined in any other way. It's perfectly possible - anyway "Collection" function has multiple values, so using on the right side some free variable that can be replaced with any type is not a problem at all - the "Map Bool Int", "Map [Int] Int" and "Map Int Int" all are possible values of "Collection Int" along with "[Int]" and "Set Int".

On the first look, it seems that multiple-value functions are meaningless - they can't be used to define datatypes, because we need concrete types here. But on the second look :) we can find them useful to define type constraints and type families.

We can also represent multiple-value function as predicate:

type Collection a [a]
     Collection a (Set a)
     Collection a (Map b a)

If you remember Prolog, you should guess that predicate, in contrast to function, is multi-purpose thing - it can be used to deduce any parameter from other ones. For example, in this hypothetical definition:

head | Collection Int a  ::  a -> Int

we define 'head' function for any Collection containing Ints.

And in this, again, hypothetical definition:

data Safe c  | Collection c a =  Safe c a

we deduced element type 'a' from collection type 'c' passed as the parameter to the type constructor.


Back to real Haskell - type classes

Reading all those glorious examples you may be wondering - why Haskell don't yet supports full-featured type functions? Hold your breath... Haskell already contains them and at least GHC implements all the mentioned abilities more than 10 years ago! They just was named... TYPE CLASSES! Let's translate all our examples to their language:

class IsSimple a
instance IsSimple Bool
instance IsSimple Int
     ....
instance IsSimple Double

Haskell'98 standard supports type classes with only one parameter that limits us to defining only type predicates like this one. But GHC and Hugs supports multi-parameter type classes that allows us to define arbitrarily-complex type functions

class Collection a c
instance Collection a [a]
instance Collection a (Set a)
instance Collection a (Map b a)

All the "hypothetical" Haskell extensions we investigated earlier - actually implemented at the type class level!

Pattern matching:

instance Collection a [a]

Multiple statements:

instance Collection a [a]
instance Collection a (Set a)

Recursion:

instance (Collection a c) => Collection a [c]

Pattern guards:

instance (IsSimple a) => Collection a (UArray a)


Let's define type class which contains any collection which uses Int as its elements or indexes:

class HasInt a
instance HasInt Int
instance (HasInt a) => HasInt [a]
instance (HasInt a) => HasInt (Map a b)
instance (HasInt b) => HasInt (Map a b)


Anther example is a class that replaces all occurrences of 'a' with 'b' in type 't' and return result as 'res':

class Replace t a b res
instance Replace t a a t
instance Replace [t] a b [Replace t a b]
instance (Replace t a b res)
          => Replace (Set t) a b (Set res)
instance (Replace t1 a b res1, Replace t2 a b res2)
          => Replace (Map t1 t2) a b (Map res1 res2)
instance Replace t a b t

You can compare it to the hypothetical definition we given earlier. It's important to note that type class instances, as opposite to function statements, are not checked in order. Instead, most _specific_ instance automatically selected. So, in Replace case, the last instance that is most general will be selected only if all other are failed to match and that is that we want.

In many other cases this automatic selection is not powerful enough and we are forced to use some artificial tricks or complain to the language developers. The two most well-known language extensions proposed to solve such problems are instance priorities, which allow to explicitly specify instance selection order, and '/=' constraints, which can be used to explicitly prohibit unwanted matches:

instance Replace t a a t
instance (a/=b)   => Replace [t] a b [Replace t a b]
instance (a/=b, t/=[_]) => Replace t a b t

You can check that these instances are no more overlaps.


At practice, type-level arithmetics by itself is not very useful. It becomes really strong weapon when combined with another feature that type classes provide - member functions. For example:

class Collection a c where
    foldr1 :: (a -> a -> a) -> c -> a
    
class Num a where
    (+) :: a -> a -> a

sum :: (Num a, Collection a c) =>  c -> a
sum = foldr1 (+)


I'll be also glad to see possibility to use type classes in data declarations like this:

data Safe c = (Collection c a) => Safe c a

but afaik this is also not yet implemented


UNIFICATION ...


Back to GADTs

If you are wonder how relates all these interesting type manipulations to GADTs, now is the time to give you answer. As you know, Haskell contains highly developed ways to express data-to-data functions. Now we also know that Haskell contains rich facilities to write type-to-type functions in form of "type" statements and type classes. But how "data" statements fits in this infrastructure?

My answer: they just defines type-to-data constructors translation. Moreover, this translation may give multiple results. Say, the following definition:

data Maybe a = Just a | Nothing

defines type-to-data constructors function "Maybe" that has parameter "a" and for each "a" has two possible results - "Just a" and "Nothing". We can rewrite it in the same hypothetical syntax that was used above for multi-value type functions:

data Maybe a = Just a
     Maybe a = Nothing

Or how about this:

data List a = Cons a (List a)
     List a = Nil

and this:

data Either a b = Left a
     Either a b = Right b

But how are flexible "data" definitions? As you should remember, "type" definitions was very limited in their features, while type classes, vice versa, much more developed than ordinary Haskell functions facilities. What about features of "data" definitions examined as sort of functions?

On the one side, they supports multiple statements and multiple results and can be recursive, like the "List" definition above. On the other side, that's all - no pattern matching or even type constants on the left side and no guards.

Lack of pattern matching means that left side can contain only free type variables, that in turn means that left sides of all "data" statements for one type will be essentially the same. Therefore, repeated left sides in multi-statement "data" definitions are omitted and instead of

data Either a b = Left a
     Either a b = Right b

we write just

data Either a b = Left a
                | Right b


And here finally comes the GADTs! It's just a way to define data types using pattern matching and constants on the left side of "data" statements! How about this:

data T String = D1 Int
     T Bool   = D2
     T [a]    = D3 (a,a)

Amazed? After all, GADTs seems really very simple and obvious extension to data type definition facilities.




The idea is to allow a data constructor's return type to be specified directly:

data Term a where
    Lit :: Int ->  Term Int
    Pair :: Term a -> Term b -> Term (a,b)
    ...

In a function that performs pattern matching on Term, the pattern match gives type as well as value information. For example, consider this function:

eval :: Term a -> a
eval (Lit i)     = i
eval (Pair a b)  = (eval a, eval b)
...

If the argument matches Lit, it must have been built with a Lit constructor, so type 'a' must be Int, and hence we can return 'i' (an Int) in the right hand side. The same objections applies to the Pair constructor.




Further reading

The best paper on type level arithmetic using type classes i've seen is "Faking it: simulating dependent types in Haskell" ( http://www.cs.nott.ac.uk/~ctm/faking.ps.gz ). Most part of my article is just duplicates his work.

The great demonstration of type-level arithmetic is TypeNats package which "defines type-level natural numbers and arithmetic operations on them including addition, subtraction, multiplication, division and GCD" ( darcs get --partial --tag '0.1' http://www.eecs.tufts.edu/~rdocki01/typenats/ )

I should also mention here Oleg Kiselyov page on type-level programming in Haskell: http://okmij.org/ftp/Haskell/types.html


There are plenty of GADT-related papers, but best for beginners remains the "Fun with phantom types" (http://www.informatik.uni-bonn.de/~ralf/publications/With.pdf). Phantom types is another name of GADT. You should also know that this paper uses old GADT syntax. This paper is must-read because it contains numerous examples of practical GADT usage - theme completely omitted from my article.


Other GADT-related papers i know:

"Dynamic Optimization for Functional Reactive Programming using Generalized Algebraic Data Types" http://www.cs.nott.ac.uk/~nhn/Publications/icfp2005.pdf

"Phantom types" (actually more scientific version of "Fun with phantom types") http://citeseer.ist.psu.edu/rd/0,606209,1,0.25,Download/http:qSqqSqwww.informatik.uni-bonn.deqSq~ralfqSqpublicationsqSqPhantom.ps.gz

"Phantom types and subtyping" http://arxiv.org/ps/cs.PL/0403034

"Existentially quantified type classes" by Stuckey, Sulzmann and Wazny (URL?)





Random rubbish from previous versions of article

</haskell> data family Map k :: * -> *

data instance Map () v = MapUnit (Maybe v) data instance Map (a, b) v = MapPair (Map a (Map b v)) </haskell>




let's consider well-known 'data' declarations:

</haskell> data T a = D a a Int </haskell>

it can be seen as function 'T' from type 'a' to some data constructor.

'T Bool', for example, gives result 'D Bool Bool Int', while

'T [Int]' gives result 'D [Int] [Int] Int'.

'data' declaration can also have several "results", say

</haskell> data Either a b = Left a | Right b </haskell>

and "result" of 'Either Int String' can be either "Left Int" or "Right String"




Well, to give compiler confidence that 'a' can be deduced in just one way from 'c', we can add some form of hint:

</haskell> type Collection :: a c | c->a

    Collection a [a]
    Collection a (Set a)
    Collection a (Map b a)

</haskell>

The first line i added tell the compiler that Collection predicate has two parameters and the second parameter determines the first. Based on this restriction, compiler can detect and prohibit attempts to define different element types for the same collection:

</haskell> type Collection :: a c | c->a

    Collection a (Map b a)
    Collection b (Map b a)  -- error! prohibits functional dependency

</haskell>

Of course, Collection is just a function from 'c' to 'a', but if we will define it directly as a function:

</haskell> type Collection [a] = a

    Collection (Set a)    = a
    Collection (Map b a)  = a

</haskell>

- it can't be used in 'head' definition above. Moreover, using functional dependencies we can define bi-directional functions:

</haskell> type TwoTimesBigger :: a b | a->b, b->a

    TwoTimesBigger Int8 Int16
    TwoTimesBigger Int16 Int32
    TwoTimesBigger Int32 Int64

</haskell>

or predicates with 3, 4 or more parameters with any relations between them. It's a great power!