7.3. Syntactic extensions

7.3.1. Unicode syntax

The language extension -XUnicodeSyntax enables Unicode characters to be used to stand for certain ASCII character sequences. The following alternatives are provided:

ASCIIUnicode alternativeCode pointName
::::0x2237PROPORTION
=>0x21D2RIGHTWARDS DOUBLE ARROW
forall0x2200FOR ALL
->0x2192RIGHTWARDS ARROW
<-0x2190LEFTWARDS ARROW
-<0x2919LEFTWARDS ARROW-TAIL
>-0x291ARIGHTWARDS ARROW-TAIL
-<< 0x291BLEFTWARDS DOUBLE ARROW-TAIL
>>- 0x291CRIGHTWARDS DOUBLE ARROW-TAIL
*0x2605BLACK STAR

7.3.2. The magic hash

The language extension -XMagicHash allows "#" as a postfix modifier to identifiers. Thus, "x#" is a valid variable, and "T#" is a valid type constructor or data constructor.

The hash sign does not change semantics at all. We tend to use variable names ending in "#" for unboxed values or types (e.g. Int#), but there is no requirement to do so; they are just plain ordinary variables. Nor does the -XMagicHash extension bring anything into scope. For example, to bring Int# into scope you must import GHC.Prim (see Section 7.2, “Unboxed types and primitive operations”); the -XMagicHash extension then allows you to refer to the Int# that is now in scope.

The -XMagicHash also enables some new forms of literals (see Section 7.2.1, “Unboxed types ”):

  • 'x'# has type Char#

  • "foo"# has type Addr#

  • 3# has type Int#. In general, any Haskell integer lexeme followed by a # is an Int# literal, e.g. -0x3A# as well as 32#

    .
  • 3## has type Word#. In general, any non-negative Haskell integer lexeme followed by ## is a Word#.

  • 3.2# has type Float#.

  • 3.2## has type Double#

7.3.3. Hierarchical Modules

GHC supports a small extension to the syntax of module names: a module name is allowed to contain a dot ‘.’. This is also known as the “hierarchical module namespace” extension, because it extends the normally flat Haskell module namespace into a more flexible hierarchy of modules.

This extension has very little impact on the language itself; modules names are always fully qualified, so you can just think of the fully qualified module name as the module name. In particular, this means that the full module name must be given after the module keyword at the beginning of the module; for example, the module A.B.C must begin

module A.B.C

It is a common strategy to use the as keyword to save some typing when using qualified names with hierarchical modules. For example:

import qualified Control.Monad.ST.Strict as ST

For details on how GHC searches for source and interface files in the presence of hierarchical modules, see Section 4.7.3, “The search path”.

GHC comes with a large collection of libraries arranged hierarchically; see the accompanying library documentation. More libraries to install are available from HackageDB.

7.3.4. Pattern guards

The discussion that follows is an abbreviated version of Simon Peyton Jones's original proposal. (Note that the proposal was written before pattern guards were implemented, so refers to them as unimplemented.)

Suppose we have an abstract data type of finite maps, with a lookup operation:

lookup :: FiniteMap -> Int -> Maybe Int

The lookup returns Nothing if the supplied key is not in the domain of the mapping, and (Just v) otherwise, where v is the value that the key maps to. Now consider the following definition:

clunky env var1 var2 | ok1 && ok2 = val1 + val2
| otherwise  = var1 + var2
where
  m1 = lookup env var1
  m2 = lookup env var2
  ok1 = maybeToBool m1
  ok2 = maybeToBool m2
  val1 = expectJust m1
  val2 = expectJust m2

The auxiliary functions are

maybeToBool :: Maybe a -> Bool
maybeToBool (Just x) = True
maybeToBool Nothing  = False

expectJust :: Maybe a -> a
expectJust (Just x) = x
expectJust Nothing  = error "Unexpected Nothing"

What is clunky doing? The guard ok1 && ok2 checks that both lookups succeed, using maybeToBool to convert the Maybe types to booleans. The (lazily evaluated) expectJust calls extract the values from the results of the lookups, and binds the returned values to val1 and val2 respectively. If either lookup fails, then clunky takes the otherwise case and returns the sum of its arguments.

This is certainly legal Haskell, but it is a tremendously verbose and un-obvious way to achieve the desired effect. Arguably, a more direct way to write clunky would be to use case expressions:

clunky env var1 var2 = case lookup env var1 of
  Nothing -> fail
  Just val1 -> case lookup env var2 of
    Nothing -> fail
    Just val2 -> val1 + val2
where
  fail = var1 + var2

This is a bit shorter, but hardly better. Of course, we can rewrite any set of pattern-matching, guarded equations as case expressions; that is precisely what the compiler does when compiling equations! The reason that Haskell provides guarded equations is because they allow us to write down the cases we want to consider, one at a time, independently of each other. This structure is hidden in the case version. Two of the right-hand sides are really the same (fail), and the whole expression tends to become more and more indented.

Here is how I would write clunky:

clunky env var1 var2
  | Just val1 <- lookup env var1
  , Just val2 <- lookup env var2
  = val1 + val2
...other equations for clunky...

The semantics should be clear enough. The qualifiers are matched in order. For a <- qualifier, which I call a pattern guard, the right hand side is evaluated and matched against the pattern on the left. If the match fails then the whole guard fails and the next equation is tried. If it succeeds, then the appropriate binding takes place, and the next qualifier is matched, in the augmented environment. Unlike list comprehensions, however, the type of the expression to the right of the <- is the same as the type of the pattern to its left. The bindings introduced by pattern guards scope over all the remaining guard qualifiers, and over the right hand side of the equation.

Just as with list comprehensions, boolean expressions can be freely mixed with among the pattern guards. For example:

f x | [y] <- x
    , y > 3
    , Just z <- h y
    = ...

Haskell's current guards therefore emerge as a special case, in which the qualifier list has just one element, a boolean expression.

7.3.5. View patterns

View patterns are enabled by the flag -XViewPatterns. More information and examples of view patterns can be found on the Wiki page.

View patterns are somewhat like pattern guards that can be nested inside of other patterns. They are a convenient way of pattern-matching against values of abstract types. For example, in a programming language implementation, we might represent the syntax of the types of the language as follows:

type Typ

data TypView = Unit
             | Arrow Typ Typ

view :: Type -> TypeView

-- additional operations for constructing Typ's ...

The representation of Typ is held abstract, permitting implementations to use a fancy representation (e.g., hash-consing to manage sharing). Without view patterns, using this signature a little inconvenient:

size :: Typ -> Integer
size t = case view t of
  Unit -> 1
  Arrow t1 t2 -> size t1 + size t2

It is necessary to iterate the case, rather than using an equational function definition. And the situation is even worse when the matching against t is buried deep inside another pattern.

View patterns permit calling the view function inside the pattern and matching against the result:

size (view -> Unit) = 1
size (view -> Arrow t1 t2) = size t1 + size t2

That is, we add a new form of pattern, written expression -> pattern that means "apply the expression to whatever we're trying to match against, and then match the result of that application against the pattern". The expression can be any Haskell expression of function type, and view patterns can be used wherever patterns are used.

The semantics of a pattern ( exp -> pat ) are as follows:

  • Scoping:

    The variables bound by the view pattern are the variables bound by pat.

    Any variables in exp are bound occurrences, but variables bound "to the left" in a pattern are in scope. This feature permits, for example, one argument to a function to be used in the view of another argument. For example, the function clunky from Section 7.3.4, “Pattern guards” can be written using view patterns as follows:

    clunky env (lookup env -> Just val1) (lookup env -> Just val2) = val1 + val2
    ...other equations for clunky...
    

    More precisely, the scoping rules are:

    • In a single pattern, variables bound by patterns to the left of a view pattern expression are in scope. For example:

      example :: Maybe ((String -> Integer,Integer), String) -> Bool
      example Just ((f,_), f -> 4) = True
      

      Additionally, in function definitions, variables bound by matching earlier curried arguments may be used in view pattern expressions in later arguments:

      example :: (String -> Integer) -> String -> Bool
      example f (f -> 4) = True
      

      That is, the scoping is the same as it would be if the curried arguments were collected into a tuple.

    • In mutually recursive bindings, such as let, where, or the top level, view patterns in one declaration may not mention variables bound by other declarations. That is, each declaration must be self-contained. For example, the following program is not allowed:

      let {(x -> y) = e1 ;
           (y -> x) = e2 } in x
      

      (For some amplification on this design choice see Trac #4061.)

  • Typing: If exp has type T1 -> T2 and pat matches a T2, then the whole view pattern matches a T1.

  • Matching: To the equations in Section 3.17.3 of the Haskell 98 Report, add the following:

    case v of { (e -> p) -> e1 ; _ -> e2 }
     =
    case (e v) of { p -> e1 ; _ -> e2 }
    

    That is, to match a variable v against a pattern ( exp -> pat ), evaluate ( exp v ) and match the result against pat.

  • Efficiency: When the same view function is applied in multiple branches of a function definition or a case expression (e.g., in size above), GHC makes an attempt to collect these applications into a single nested case expression, so that the view function is only applied once. Pattern compilation in GHC follows the matrix algorithm described in Chapter 4 of The Implementation of Functional Programming Languages. When the top rows of the first column of a matrix are all view patterns with the "same" expression, these patterns are transformed into a single nested case. This includes, for example, adjacent view patterns that line up in a tuple, as in

    f ((view -> A, p1), p2) = e1
    f ((view -> B, p3), p4) = e2
    

    The current notion of when two view pattern expressions are "the same" is very restricted: it is not even full syntactic equality. However, it does include variables, literals, applications, and tuples; e.g., two instances of view ("hi", "there") will be collected. However, the current implementation does not compare up to alpha-equivalence, so two instances of (x, view x -> y) will not be coalesced.

7.3.6. n+k patterns

n+k pattern support is enabled by default. To disable it, you can use the -XNoNPlusKPatterns flag.

7.3.7. The recursive do-notation

The do-notation of Haskell 98 does not allow recursive bindings, that is, the variables bound in a do-expression are visible only in the textually following code block. Compare this to a let-expression, where bound variables are visible in the entire binding group. It turns out that several applications can benefit from recursive bindings in the do-notation. The -XDoRec flag provides the necessary syntactic support.

Here is a simple (albeit contrived) example:

{-# LANGUAGE DoRec #-}
justOnes = do { rec { xs <- Just (1:xs) }
              ; return (map negate xs) }

As you can guess justOnes will evaluate to Just [-1,-1,-1,....

The background and motivation for recursive do-notation is described in A recursive do for Haskell, by Levent Erkok, John Launchbury, Haskell Workshop 2002, pages: 29-37. Pittsburgh, Pennsylvania. The theory behind monadic value recursion is explained further in Erkok's thesis Value Recursion in Monadic Computations. However, note that GHC uses a different syntax than the one described in these documents.

7.3.7.1. Details of recursive do-notation

The recursive do-notation is enabled with the flag -XDoRec or, equivalently, the LANGUAGE pragma DoRec. It introduces the single new keyword "rec", which wraps a mutually-recursive group of monadic statements, producing a single statement.

Similar to a let statement, the variables bound in the rec are visible throughout the rec group, and below it. For example, compare

do { a <- getChar              do { a <- getChar
   ; let { r1 = f a r2	           ; rec { r1 <- f a r2
         ; r2 = g r1 }	                 ; r2 <- g r1 }
   ; return (r1 ++ r2) }          ; return (r1 ++ r2) }

In both cases, r1 and r2 are available both throughout the let or rec block, and in the statements that follow it. The difference is that let is non-monadic, while rec is monadic. (In Haskell let is really letrec, of course.)

The static and dynamic semantics of rec can be described as follows:

  • First, similar to let-bindings, the rec is broken into minimal recursive groups, a process known as segmentation. For example:

    rec { a <- getChar      ===>     a <- getChar
        ; b <- f a c                 rec { b <- f a c
        ; c <- f b a                     ; c <- f b a }
        ; putChar c }                putChar c
    

    The details of segmentation are described in Section 3.2 of A recursive do for Haskell. Segmentation improves polymorphism, reduces the size of the recursive "knot", and, as the paper describes, also has a semantic effect (unless the monad satisfies the right-shrinking law).

  • Then each resulting rec is desugared, using a call to Control.Monad.Fix.mfix. For example, the rec group in the preceding example is desugared like this:

    rec { b <- f a c     ===>    (b,c) <- mfix (\~(b,c) -> do { b <- f a c
        ; c <- f b a }                                        ; c <- f b a
                                                              ; return (b,c) })
    

    In general, the statment rec ss is desugared to the statement

    vs <- mfix (\~vs -> do { ss; return vs })
    

    where vs is a tuple of the variables bound by ss.

    The original rec typechecks exactly when the above desugared version would do so. For example, this means that the variables vs are all monomorphic in the statements following the rec, because they are bound by a lambda.

    The mfix function is defined in the MonadFix class, in Control.Monad.Fix, thus:

    class Monad m => MonadFix m where
       mfix :: (a -> m a) -> m a
    

Here are some other important points in using the recursive-do notation:

  • It is enabled with the flag -XDoRec, which is in turn implied by -fglasgow-exts.

  • If recursive bindings are required for a monad, then that monad must be declared an instance of the MonadFix class.

  • The following instances of MonadFix are automatically provided: List, Maybe, IO. Furthermore, the Control.Monad.ST and Control.Monad.ST.Lazy modules provide the instances of the MonadFix class for Haskell's internal state monad (strict and lazy, respectively).

  • Like let and where bindings, name shadowing is not allowed within a rec; that is, all the names bound in a single rec must be distinct (Section 3.3 of the paper).

  • It supports rebindable syntax (see Section 7.3.11, “Rebindable syntax and the implicit Prelude import”).

7.3.7.2.  Mdo-notation (deprecated)

GHC used to support the flag -XRecursiveDo, which enabled the keyword mdo, precisely as described in A recursive do for Haskell, but this is now deprecated. Instead of mdo { Q; e }, write do { rec Q; e }.

Historical note: The old implementation of the mdo-notation (and most of the existing documents) used the name MonadRec for the class and the corresponding library. This name is not supported by GHC.

7.3.8. Parallel List Comprehensions

Parallel list comprehensions are a natural extension to list comprehensions. List comprehensions can be thought of as a nice syntax for writing maps and filters. Parallel comprehensions extend this to include the zipWith family.

A parallel list comprehension has multiple independent branches of qualifier lists, each separated by a `|' symbol. For example, the following zips together two lists:

   [ (x, y) | x <- xs | y <- ys ]

The behavior of parallel list comprehensions follows that of zip, in that the resulting list will have the same length as the shortest branch.

We can define parallel list comprehensions by translation to regular comprehensions. Here's the basic idea:

Given a parallel comprehension of the form:

   [ e | p1 <- e11, p2 <- e12, ...
       | q1 <- e21, q2 <- e22, ...
       ...
   ]

This will be translated to:

   [ e | ((p1,p2), (q1,q2), ...) <- zipN [(p1,p2) | p1 <- e11, p2 <- e12, ...]
                                         [(q1,q2) | q1 <- e21, q2 <- e22, ...]
                                         ...
   ]

where `zipN' is the appropriate zip for the given number of branches.

7.3.9. Generalised (SQL-Like) List Comprehensions

Generalised list comprehensions are a further enhancement to the list comprehension syntactic sugar to allow operations such as sorting and grouping which are familiar from SQL. They are fully described in the paper Comprehensive comprehensions: comprehensions with "order by" and "group by", except that the syntax we use differs slightly from the paper.

The extension is enabled with the flag -XTransformListComp.

Here is an example:

employees = [ ("Simon", "MS", 80)
, ("Erik", "MS", 100)
, ("Phil", "Ed", 40)
, ("Gordon", "Ed", 45)
, ("Paul", "Yale", 60)]

output = [ (the dept, sum salary)
| (name, dept, salary) <- employees
, then group by dept
, then sortWith by (sum salary)
, then take 5 ]

In this example, the list output would take on the value:

[("Yale", 60), ("Ed", 85), ("MS", 180)]

There are three new keywords: group, by, and using. (The function sortWith is not a keyword; it is an ordinary function that is exported by GHC.Exts.)

There are five new forms of comprehension qualifier, all introduced by the (existing) keyword then:

  • then f
    
    This statement requires that f have the type forall a. [a] -> [a]. You can see an example of its use in the motivating example, as this form is used to apply take 5.
  • then f by e
    

    This form is similar to the previous one, but allows you to create a function which will be passed as the first argument to f. As a consequence f must have the type forall a. (a -> t) -> [a] -> [a]. As you can see from the type, this function lets f "project out" some information from the elements of the list it is transforming.

    An example is shown in the opening example, where sortWith is supplied with a function that lets it find out the sum salary for any item in the list comprehension it transforms.

  • then group by e using f
    

    This is the most general of the grouping-type statements. In this form, f is required to have type forall a. (a -> t) -> [a] -> [[a]]. As with the then f by e case above, the first argument is a function supplied to f by the compiler which lets it compute e on every element of the list being transformed. However, unlike the non-grouping case, f additionally partitions the list into a number of sublists: this means that at every point after this statement, binders occurring before it in the comprehension refer to lists of possible values, not single values. To help understand this, let's look at an example:

    -- This works similarly to groupWith in GHC.Exts, but doesn't sort its input first
    groupRuns :: Eq b => (a -> b) -> [a] -> [[a]]
    groupRuns f = groupBy (\x y -> f x == f y)
    
    output = [ (the x, y)
    | x <- ([1..3] ++ [1..2])
    , y <- [4..6]
    , then group by x using groupRuns ]
    

    This results in the variable output taking on the value below:

    [(1, [4, 5, 6]), (2, [4, 5, 6]), (3, [4, 5, 6]), (1, [4, 5, 6]), (2, [4, 5, 6])]
    

    Note that we have used the the function to change the type of x from a list to its original numeric type. The variable y, in contrast, is left unchanged from the list form introduced by the grouping.

  • then group by e
    

    This form of grouping is essentially the same as the one described above. However, since no function to use for the grouping has been supplied it will fall back on the groupWith function defined in GHC.Exts. This is the form of the group statement that we made use of in the opening example.

  • then group using f
    

    With this form of the group statement, f is required to simply have the type forall a. [a] -> [[a]], which will be used to group up the comprehension so far directly. An example of this form is as follows:

    output = [ x
    | y <- [1..5]
    , x <- "hello"
    , then group using inits]
    

    This will yield a list containing every prefix of the word "hello" written out 5 times:

    ["","h","he","hel","hell","hello","helloh","hellohe","hellohel","hellohell","hellohello","hellohelloh",...]
    

7.3.10. Monad comprehensions

Monad comprehensions generalise the list comprehension notation, including parallel comprehensions (Section 7.3.8, “Parallel List Comprehensions”) and transform comprehensions (Section 7.3.9, “Generalised (SQL-Like) List Comprehensions”) to work for any monad.

Monad comprehensions support:

  • Bindings:

    [ x + y | x <- Just 1, y <- Just 2 ]
    

    Bindings are translated with the (>>=) and return functions to the usual do-notation:

    do x <- Just 1
       y <- Just 2
       return (x+y)
    
  • Guards:

    [ x | x <- [1..10], x <= 5 ]
    

    Guards are translated with the guard function, which requires a MonadPlus instance:

    do x <- [1..10]
       guard (x <= 5)
       return x
    
  • Transform statements (as with -XTransformListComp):

    [ x+y | x <- [1..10], y <- [1..x], then take 2 ]
    

    This translates to:

    do (x,y) <- take 2 (do x <- [1..10]
                           y <- [1..x]
                           return (x,y))
       return (x+y)
    
  • Group statements (as with -XTransformListComp):

    [ x | x <- [1,1,2,2,3], then group by x ]
    [ x | x <- [1,1,2,2,3], then group by x using GHC.Exts.groupWith ]
    [ x | x <- [1,1,2,2,3], then group using myGroup ]
    

    The basic then group by e statement is translated using the mgroupWith function, which requires a MonadGroup instance, defined in Control.Monad.Group:

    do x <- mgroupWith (do x <- [1,1,2,2,3]
                           return x)
       return x
    

    Note that the type of x is changed by the grouping statement.

    The grouping function can also be defined with the using keyword.

  • Parallel statements (as with -XParallelListComp):

    [ (x+y) | x <- [1..10]
            | y <- [11..20]
            ]
    

    Parallel statements are translated using the mzip function, which requires a MonadZip instance defined in Control.Monad.Zip:

    do (x,y) <- mzip (do x <- [1..10]
                         return x)
                     (do y <- [11..20]
                         return y)
       return (x+y)
    

All these features are enabled by default if the MonadComprehensions extension is enabled. The types and more detailed examples on how to use comprehensions are explained in the previous chapters Section 7.3.9, “Generalised (SQL-Like) List Comprehensions” and Section 7.3.8, “Parallel List Comprehensions”. In general you just have to replace the type [a] with the type Monad m => m a for monad comprehensions.

Note: Even though most of these examples are using the list monad, monad comprehensions work for any monad. The base package offers all necessary instances for lists, which make MonadComprehensions backward compatible to built-in, transform and parallel list comprehensions.

More formally, the desugaring is as follows. We write D[ e | Q] to mean the desugaring of the monad comprehension [ e | Q]:

Expressions: e
Declarations: d
Lists of qualifiers: Q,R,S

-- Basic forms
D[ e | ]               = return e
D[ e | p <- e, Q ]     = e >>= \p -> D[ e | Q ]
D[ e | e, Q ]          = guard e >> \p -> D[ e | Q ]
D[ e | let d, Q ]      = let d in D[ e | Q ]

-- Parallel comprehensions (iterate for multiple parallel branches)
D[ e | (Q | R), S ]    = mzip D[ Qv | Q ] D[ Rv | R ] >>= \(Qv,Rv) -> D[ e | S ]

-- Transform comprehensions
D[ e | Q then f, R ]                  = f D[ Qv | Q ] >>= \Qv -> D[ e | R ]

D[ e | Q then f by b, R ]             = f (\Qv -> b) D[ Qv | Q ] >>= \Qv -> D[ e | R ]

D[ e | Q then group using f, R ]      = f D[ Qv | Q ] >>= \ys ->
                                        case (fmap selQv1 ys, ..., fmap selQvn ys) of
                                 	     Qv -> D[ e | R ]

D[ e | Q then group by b, R ]         = D[ e | Q then group by b using mgroupWith, R ]

D[ e | Q then group by b using f, R ] = f (\Qv -> b) D[ Qv | Q ] >>= \ys ->
                                        case (fmap selQv1 ys, ..., fmap selQvn ys) of
                                           Qv -> D[ e | R ]

where  Qv is the tuple of variables bound by Q (and used subsequently)
       selQvi is a selector mapping Qv to the ith component of Qv

Operator     Standard binding       Expected type
--------------------------------------------------------------------
return       GHC.Base               t1 -> m t2
(>>=)        GHC.Base               m1 t1 -> (t2 -> m2 t3) -> m3 t3
(>>)         GHC.Base               m1 t1 -> m2 t2         -> m3 t3
guard        Control.Monad          t1 -> m t2
fmap         GHC.Base               forall a b. (a->b) -> n a -> n b
mgroupWith   Control.Monad.Group    forall a. (a -> t) -> m1 a -> m2 (n a)
mzip         Control.Monad.Zip      forall a b. m a -> m b -> m (a,b)

The comprehension should typecheck when its desugaring would typecheck.

Monad comprehensions support rebindable syntax (Section 7.3.11, “Rebindable syntax and the implicit Prelude import”). Without rebindable syntax, the operators from the "standard binding" module are used; with rebindable syntax, the operators are looked up in the current lexical scope. For example, parallel comprehensions will be typechecked and desugared using whatever "mzip" is in scope.

The rebindable operators must have the "Expected type" given in the table above. These types are surprisingly general. For example, you can use a bind operator with the type

(>>=) :: T x y a -> (a -> T y z b) -> T x z b

In the case of transform comprehensions, notice that the groups are parameterised over some arbitrary type n (provided it has an fmap, as well as the comprehension being over an arbitrary monad.

7.3.11. Rebindable syntax and the implicit Prelude import

GHC normally imports Prelude.hi files for you. If you'd rather it didn't, then give it a -XNoImplicitPrelude option. The idea is that you can then import a Prelude of your own. (But don't call it Prelude; the Haskell module namespace is flat, and you must not conflict with any Prelude module.)

Suppose you are importing a Prelude of your own in order to define your own numeric class hierarchy. It completely defeats that purpose if the literal "1" means "Prelude.fromInteger 1", which is what the Haskell Report specifies. So the -XRebindableSyntax flag causes the following pieces of built-in syntax to refer to whatever is in scope, not the Prelude versions:

  • An integer literal 368 means "fromInteger (368::Integer)", rather than "Prelude.fromInteger (368::Integer)".

  • Fractional literals are handed in just the same way, except that the translation is fromRational (3.68::Rational).

  • The equality test in an overloaded numeric pattern uses whatever (==) is in scope.

  • The subtraction operation, and the greater-than-or-equal test, in n+k patterns use whatever (-) and (>=) are in scope.

  • Negation (e.g. "- (f x)") means "negate (f x)", both in numeric patterns, and expressions.

  • Conditionals (e.g. "if e1 then e2 else e3") means "ifThenElse e1 e2 e3". However case expressions are unaffected.

  • "Do" notation is translated using whatever functions (>>=), (>>), and fail, are in scope (not the Prelude versions). List comprehensions, mdo (Section 7.3.7.2, “ Mdo-notation (deprecated) ”), and parallel array comprehensions, are unaffected.

  • Arrow notation (see Section 7.10, “Arrow notation ”) uses whatever arr, (>>>), first, app, (|||) and loop functions are in scope. But unlike the other constructs, the types of these functions must match the Prelude types very closely. Details are in flux; if you want to use this, ask!

-XRebindableSyntax implies -XNoImplicitPrelude.

In all cases (apart from arrow notation), the static semantics should be that of the desugared form, even if that is a little unexpected. For example, the static semantics of the literal 368 is exactly that of fromInteger (368::Integer); it's fine for fromInteger to have any of the types:

fromInteger :: Integer -> Integer
fromInteger :: forall a. Foo a => Integer -> a
fromInteger :: Num a => a -> Integer
fromInteger :: Integer -> Bool -> Bool

Be warned: this is an experimental facility, with fewer checks than usual. Use -dcore-lint to typecheck the desugared program. If Core Lint is happy you should be all right.

7.3.12. Postfix operators

The -XPostfixOperators flag enables a small extension to the syntax of left operator sections, which allows you to define postfix operators. The extension is this: the left section

  (e !)

is equivalent (from the point of view of both type checking and execution) to the expression

  ((!) e)

(for any expression e and operator (!). The strict Haskell 98 interpretation is that the section is equivalent to

  (\y -> (!) e y)

That is, the operator must be a function of two arguments. GHC allows it to take only one argument, and that in turn allows you to write the function postfix.

The extension does not extend to the left-hand side of function definitions; you must define such a function in prefix form.

7.3.13. Tuple sections

The -XTupleSections flag enables Python-style partially applied tuple constructors. For example, the following program

  (, True)

is considered to be an alternative notation for the more unwieldy alternative

  \x -> (x, True)

You can omit any combination of arguments to the tuple, as in the following

  (, "I", , , "Love", , 1337)

which translates to

  \a b c d -> (a, "I", b, c, "Love", d, 1337)

If you have unboxed tuples enabled, tuple sections will also be available for them, like so

  (# , True #)

Because there is no unboxed unit tuple, the following expression

  (# #)

continues to stand for the unboxed singleton tuple data constructor.

7.3.14. Record field disambiguation

In record construction and record pattern matching it is entirely unambiguous which field is referred to, even if there are two different data types in scope with a common field name. For example:

module M where
  data S = MkS { x :: Int, y :: Bool }

module Foo where
  import M

  data T = MkT { x :: Int }

  ok1 (MkS { x = n }) = n+1   -- Unambiguous
  ok2 n = MkT { x = n+1 }     -- Unambiguous

  bad1 k = k { x = 3 }  -- Ambiguous
  bad2 k = x k          -- Ambiguous

Even though there are two x's in scope, it is clear that the x in the pattern in the definition of ok1 can only mean the field x from type S. Similarly for the function ok2. However, in the record update in bad1 and the record selection in bad2 it is not clear which of the two types is intended.

Haskell 98 regards all four as ambiguous, but with the -XDisambiguateRecordFields flag, GHC will accept the former two. The rules are precisely the same as those for instance declarations in Haskell 98, where the method names on the left-hand side of the method bindings in an instance declaration refer unambiguously to the method of that class (provided they are in scope at all), even if there are other variables in scope with the same name. This reduces the clutter of qualified names when you import two records from different modules that use the same field name.

Some details:

  • Field disambiguation can be combined with punning (see Section 7.3.15, “Record puns ”). For example:

    module Foo where
      import M
      x=True
      ok3 (MkS { x }) = x+1   -- Uses both disambiguation and punning
    

  • With -XDisambiguateRecordFields you can use unqualified field names even if the corresponding selector is only in scope qualified For example, assuming the same module M as in our earlier example, this is legal:

    module Foo where
      import qualified M    -- Note qualified
    
      ok4 (M.MkS { x = n }) = n+1   -- Unambiguous
    

    Since the constructor MkS is only in scope qualified, you must name it M.MkS, but the field x does not need to be qualified even though M.x is in scope but x is not. (In effect, it is qualified by the constructor.)

7.3.15. Record puns

Record puns are enabled by the flag -XNamedFieldPuns.

When using records, it is common to write a pattern that binds a variable with the same name as a record field, such as:

data C = C {a :: Int}
f (C {a = a}) = a

Record punning permits the variable name to be elided, so one can simply write

f (C {a}) = a

to mean the same pattern as above. That is, in a record pattern, the pattern a expands into the pattern a = a for the same name a.

Note that:

  • Record punning can also be used in an expression, writing, for example,

    let a = 1 in C {a}
    

    instead of

    let a = 1 in C {a = a}
    

    The expansion is purely syntactic, so the expanded right-hand side expression refers to the nearest enclosing variable that is spelled the same as the field name.

  • Puns and other patterns can be mixed in the same record:

    data C = C {a :: Int, b :: Int}
    f (C {a, b = 4}) = a
    

  • Puns can be used wherever record patterns occur (e.g. in let bindings or at the top-level).

  • A pun on a qualified field name is expanded by stripping off the module qualifier. For example:

    f (C {M.a}) = a
    

    means

    f (M.C {M.a = a}) = a
    

    (This is useful if the field selector a for constructor M.C is only in scope in qualified form.)

7.3.16. Record wildcards

Record wildcards are enabled by the flag -XRecordWildCards. This flag implies -XDisambiguateRecordFields.

For records with many fields, it can be tiresome to write out each field individually in a record pattern, as in

data C = C {a :: Int, b :: Int, c :: Int, d :: Int}
f (C {a = 1, b = b, c = c, d = d}) = b + c + d

Record wildcard syntax permits a ".." in a record pattern, where each elided field f is replaced by the pattern f = f. For example, the above pattern can be written as

f (C {a = 1, ..}) = b + c + d

More details:

  • Wildcards can be mixed with other patterns, including puns (Section 7.3.15, “Record puns ”); for example, in a pattern C {a = 1, b, ..}). Additionally, record wildcards can be used wherever record patterns occur, including in let bindings and at the top-level. For example, the top-level binding

    C {a = 1, ..} = e
    

    defines b, c, and d.

  • Record wildcards can also be used in expressions, writing, for example,

    let {a = 1; b = 2; c = 3; d = 4} in C {..}
    

    in place of

    let {a = 1; b = 2; c = 3; d = 4} in C {a=a, b=b, c=c, d=d}
    

    The expansion is purely syntactic, so the record wildcard expression refers to the nearest enclosing variables that are spelled the same as the omitted field names.

  • The ".." expands to the missing in-scope record fields. Specifically the expansion of "C {..}" includes f if and only if:

    • f is a record field of constructor C.

    • The record field f is in scope somehow (either qualified or unqualified).

    • In the case of expressions (but not patterns), the variable f is in scope unqualified, apart from the binding of the record selector itself.

    For example

    module M where
      data R = R { a,b,c :: Int }
    module X where
      import M( R(a,c) )
      f b = R { .. }
    

    The R{..} expands to R{M.a=a}, omitting b since the record field is not in scope, and omitting c since the variable c is not in scope (apart from the binding of the record selector c, of course).

7.3.17. Local Fixity Declarations

A careful reading of the Haskell 98 Report reveals that fixity declarations (infix, infixl, and infixr) are permitted to appear inside local bindings such those introduced by let and where. However, the Haskell Report does not specify the semantics of such bindings very precisely.

In GHC, a fixity declaration may accompany a local binding:

let f = ...
    infixr 3 `f`
in
    ...

and the fixity declaration applies wherever the binding is in scope. For example, in a let, it applies in the right-hand sides of other let-bindings and the body of the letC. Or, in recursive do expressions (Section 7.3.7, “The recursive do-notation ”), the local fixity declarations of a let statement scope over other statements in the group, just as the bound name does.

Moreover, a local fixity declaration *must* accompany a local binding of that name: it is not possible to revise the fixity of name bound elsewhere, as in

let infixr 9 $ in ...

Because local fixity declarations are technically Haskell 98, no flag is necessary to enable them.

7.3.18. Package-qualified imports

With the -XPackageImports flag, GHC allows import declarations to be qualified by the package name that the module is intended to be imported from. For example:

import "network" Network.Socket

would import the module Network.Socket from the package network (any version). This may be used to disambiguate an import when the same module is available from multiple packages, or is present in both the current package being built and an external package.

Note: you probably don't need to use this feature, it was added mainly so that we can build backwards-compatible versions of packages when APIs change. It can lead to fragile dependencies in the common case: modules occasionally move from one package to another, rendering any package-qualified imports broken.

7.3.19. Safe imports

With the -XSafeImports flag, GHC extends the import declaration syntax to take an optional safe keyword after the import keyword. This feature is part of the Safe Haskell GHC extension. For example:

import safe qualified Network.Socket as NS

would import the module Network.Socket with compilation only succeeding if Network.Socket can be safely imported. For a description of when a import is considered safe see Section 7.20, “Safe Haskell”

7.3.20. Summary of stolen syntax

Turning on an option that enables special syntax might cause working Haskell 98 code to fail to compile, perhaps because it uses a variable name which has become a reserved word. This section lists the syntax that is "stolen" by language extensions. We use notation and nonterminal names from the Haskell 98 lexical syntax (see the Haskell 98 Report). We only list syntax changes here that might affect existing working programs (i.e. "stolen" syntax). Many of these extensions will also enable new context-free syntax, but in all cases programs written to use the new syntax would not be compilable without the option enabled.

There are two classes of special syntax:

  • New reserved words and symbols: character sequences which are no longer available for use as identifiers in the program.

  • Other special syntax: sequences of characters that have a different meaning when this particular option is turned on.

The following syntax is stolen:

forall

Stolen (in types) by: -XExplicitForAll, and hence by -XScopedTypeVariables, -XLiberalTypeSynonyms, -XRank2Types, -XRankNTypes, -XPolymorphicComponents, -XExistentialQuantification

mdo

Stolen by: -XRecursiveDo,

foreign

Stolen by: -XForeignFunctionInterface,

rec, proc, -<, >-, -<<, >>-, and (|, |) brackets

Stolen by: -XArrows,

?varid, %varid

Stolen by: -XImplicitParams,

[|, [e|, [p|, [d|, [t|, $(, $varid

Stolen by: -XTemplateHaskell,

[:varid|

Stolen by: -XQuasiQuotes,

varid{#}, char#, string#, integer#, float#, float##, (#, #),

Stolen by: -XMagicHash,