GADT Type Checking GHC 6.10 versus older GHC

Jason Dagit dagit at
Mon Nov 24 08:30:49 EST 2008

On Mon, Nov 24, 2008 at 12:23 AM, Simon Peyton-Jones
<simonpj at>wrote:

>    In my case, we had rigid type signatures all over the place.  The wiki
> document says that the type must be rigid at the point of the match.  I
> guess that's what we were violating.  If the code I posted isn't supposed to
> type check then I would like to report, as user feedback, that GADTs have
> become unwieldy.
> I grant that it's less convenient than one would like.  The difficulty is
> that GADTs get you into territory where it's easy to write programs that
> have multiple **incomparable** types.   That is, there is no "best" type
> (unlike Hindley-Milner).  So we pretty much have to ask the programmer to
> express the type.
> Once we are in that territory, we need to give simple rules that say when a
> type signature is needed.   I know that I have not yet found a way to
> express these rules -- perhaps GHC's users can help.  My initial shot is
> I couldn't figure out how to fix that code by just adding a type signature.
> Did you try giving a type signature to the (entire) case expression, as I
> suggested?  That should do it.
Perhaps I don't understand the suggestion, but for me the only way I could
fix it was to put all the pattern matches into local functions with type
signatures.  I can show you the diffs if you would find it useful to see how
'programmers in the wild' react to this.

> I urge you to consider designing a modified or new syntactic form for
> working with GADT pattern matches.  The quasi-dependent typing that GADTs
> give developers is very powerful and it would seem that GHC Haskell with
> GADTs is as close to dependent typing that developers writing "real-world"
> software can get.  I know of no other production ready compilers that
> provide dependent or close to dependent typing.  Dependent typing seems to
> be a growing area of interest.  For these reasons I think it's important for
> GHC to focus on making them pleasanter to work with again; even if it means
> adding to the language again.
> If I knew how to do that, I'd love to.  Here's one idea you might not like:
> restrict GADT matching to function definitions only (not case expressions),
> and require a type signature for such pattern matches.  That wouldn't
> require adding new stuff.  But GHC's current story is a bit more flexible.
In practice that's what we are doing.  Any new code we write using our GADT
data types is written in exactly that style because removing
case-expressions on each GHC release is mildly annoying.  As a user of GHC I
can't predict when a case-expression on a GADT will be valid and when it
give an error.

Thankfully there ended up being less than 10 functions that we had to modify
to become compatible with ghc-6.10.  Some tasks look more daunting from the
outset than they are in reality.

> I also feel that the type errors given when working with existential types,
> especially GADTs with existentials, are confusing.  I think mostly because
> the types of the sub-expressions in the program are not visible to the user.
>  More introspection into the inferred types would help users.  I have some
> ideas on how to improve this, what the output should look like and I would
> like to implement it too, but I haven't had a chance to start a prototype
> yet.
> I agree they are confusing.  I always encourage people to suggest
> alternative wording for an error message that would make it less confusing
> (leaving aside how feasible or otherwise it'd be to generate such wording).
> So next time you are confused but figure it out, do suggest an alternative.
The first two stumbling blocks I ever had with the type errors was the
meaning of rigid type and the difference between inferred and expected type.

These days neither of those are an issue for me, but maybe its useful to
hear.  I take rigid type to mean a type that I specified.  It may not be
100% accurate but it's close enough for intuiton.  I think the problem with
inferred vs. expected is that the difference in their meanings is too
narrow.  My understanding is that the expected type is based on rigid types
and the inferred type is based on the type inference.  Using words that are
more polar in meaning, but still accurate, could potentially help here.
 Such as inferred vs. provided.  But this is very minor.

I think the real problem here is that the type checker does a very
non-trivial amount of reasoning about your code then it spits out some well
written error messages with a good deal of static only context.  What I mean
is, the errors do a good job of giving me back static information thus
saving me some time of looking up the context and also helping me zero in on
the trouble area, but it doesn't help me to understand the why of the error.

I think the information that is still lacking is knowledge about the process
that the type checker used to arrive at its conclusion.  Every time I've
been really stumped by the type checker I used the exact same trick to get
unstuck.  I mimicked a type checker on paper to the best of my ability until
I had annotated a paper version of the code with types at each
sub-expression.  Then I compared that against the output from GHC.  Only
once I had that annotated version of the code could I sit back and think
about what was wrong with the code to fix it.

Obviously this annotation could be done by a machine.  I think if the error
message could dump your program annotated with types to html that we could
achieve this introspection.  I think there is a lot of fine tuning that
would need to be done here, but that basic idea could be carried very far.
 For example, rendering all the eigenvariables that result from existential
types in a different alphabet, such as greek, may be a useful fine tuning.

I picture this as a -ddump-types flag or something of that nature and it
would need to be able to bump the inferred versus expected types as well.
 Eventually it may even be handy if the programmer could 'single step'
through the type check process.

I would like to work on the above feature, and I really probably should have
developed that before my work with GADTs, but it's hard to know ahead of
time what will be a hurdle.  I admit I know very little about how GHC is
implemented, so do you think that the above should be possible?  Do you have
advice on getting started over that which can be found on the ghc trac?

-------------- next part --------------
An HTML attachment was scrubbed...

More information about the Glasgow-haskell-users mailing list