Should exhaustiveness testing be on by default?

Mon May 18 07:06:51 EDT 2009

> I'm not sure I'd want -Wall on by default (though being -Wall clean is
> very good). But exhaustive pattern checking might well help out a lot of
> people coming from untyped backgrounds.
> 
>    http://ocaml.janestreet.com/?q=node/64
> 
> Ron's also wondering why exhaustive pattern checking isn't on ?
> 
> Anyone know why it isn't the default?

The answer to the latter is probably just that it is imprecise. Don't
read on if that's all you wanted to hear;-)

But before anyone goes suggesting to change the defaults, consider 
that it would be solving the wrong problem

- exhaustiveness of Haskell matching is not decidable, so you end up
    with an approximation; if you want to enforce this particular
    approximation, you end up with "defensive programming"

- defensive programming may look good in blame-driven development
    methods, but is actually bad for fault-tolerance (google for "Erlang
    defensive programming")

- if you are serious about correctness, rather than the appearance of
    correctness, there are other approaches:

    - non-defensive programming with external recovery management
        is the "industry-standard" (well, in Erlang industries, at least;-);

        the idea being that trying to handle all cases everywhere makes
        for unreadable code that doesn't address the problem, while the
        combination of checking cases on entry/handling the expected
        case locally/handling unexpected cases non-locally has been
        demonstrated to work in high-reliability scenarios.

    - checking for non-crashing rather than syntactic coverage is slightly
        better, but will likely only demonstrate that there are cases where
        you can't put in a useful answer locally; 

        you can add local information to the error before raising it (which
        would still be reported as a crash), but trying to handle a case 
        locally (just to avoid the local crash) when you don't have the 
        information to do so is just going to hide bugs (which is not what 
        a coding style or warning should encourage); 

        the way forward, if you believe Erlangers, is to embrace localised
        crashes (google for Erlang motto: "let it crash";-), and have the
        infrastructure in place to automatically add useful information to
        crash messages [1], to programatically deal with crashes (preferably 
        in a separate process on an independent machine), and to factor
        out the recovery code from the business-as-usual code (which is
        why Erlang programs with good fault-tolerance characteristics
        are often shorter than non-Erlang programs without). 

        The added benefit is that when a crash happens, in spite of serious
        efforts spent on proving/testing/model checking, the system doesn't 
        just go "oops!".

    - if you want coverage to have value semantically, you need to 
        restrict the expressiveness of matching. In Haskell/OCaml, that
        would probably mean extensible variants/records, or possibly 
        GADTs, (and no guards), so that you get a type error when 
        *using* a value that is not covered (instead of a syntax error 
        when not covering a value that may not ever be used).

        In particular, if you really care about such things, you should
        check for the good cases once, then convert to types that do
        not have the bad cases. That still doesn't help you with the entry
        check, where the best option for the unexpected case might still
        be to raise an exception, but such a typing strategy makes 
        exhaustiveness checking slightly more useful.

In management brief: enforcing exhaustiveness testing without any 
other measures just ensures that your team members will never tell
you about things they do not know how to handle, so the first sign
of trouble you see is when everything comes down. Establishing a
non-local framework for dealing with local non-exhaustiveness and
encouraging team members to raise alarms when they see trouble
gives your team a chance to notice and recover from issues before
they become critical.

Perhaps syntactic exhaustiveness testing should be renamed to
certified-everything-swept-under-the-rug testing?-)

Note: this is no excuse for using things like 'head' unguardedly in
a large code base! Just that the solution is not to add an empty
case to 'head', but to ensure that 'head' is never called with an
empty list, so "exhaustiveness everywhere" is the wrong target. 
Adding an empty case to 'head' is to raise an alert with useful 
info, working around current weaknesses in the error-reporting 
infrastructure [1], after we've already run into an unhandled 
unexpected case.

If we get a "non-exhaustive" warning, it might mean a variety
of things (you might want to document things like 3/4 in the
types):

1 nothing (error in exhaustiveness checking)
2 nothing (approximation in exhaustiveness checking)
3 nothing (missing cases are proven never to arrive here)
4 nothing (exceptional cases crash here, are dealt with properly
    elsewhere)
5 wrong type used (you shouldn't even try to make this match
    exhaustive, you should use a type that encodes only the
    cases you handle)

beside the naive expectation

- missing case (need to cover the case)

These warnings need to be used with care, and with a 
conscious choice of strategy/framework, not by default. If
you have a good strategy/framework in place, switch them
on, even make them errors, but doing so by default would
only encourage bad coding practices.

Just another opinion (should I've put it on a blog instead?-)
Claus

"Did you want to talk about the weather or were you just
 making chit-chat?" Weather man, Groundhog Day

[1] http://hackage.haskell.org/trac/ghc/wiki/ExplicitCallStack