Smart constructors
From HaskellWiki
DonStewart (Talk  contribs) (An example where we perform a bounds check on the type level) 
DonStewart (Talk  contribs) m (Turn on syntax highlighting) 

Line 22:  Line 22:  
Code up the a typical [[Typedata type]] describing a resistor value: 
Code up the a typical [[Typedata type]] describing a resistor value: 

+  <haskell> 

data Resistor = Metal Bands 
data Resistor = Metal Bands 

 Ceramic Bands 
 Ceramic Bands 

Line 27:  Line 28:  
type Bands = Int 
type Bands = Int 

+  </haskell> 

This has a problem however, that the constructors of type ''Resistor'' are 
This has a problem however, that the constructors of type ''Resistor'' are 

Line 43:  Line 45:  
so: 
so: 

+  <haskell> 

metalResistor :: Bands > Resistor 
metalResistor :: Bands > Resistor 

metalResistor n  n < 4  n > 8 = error "Invalid number of resistor bands" 
metalResistor n  n < 4  n > 8 = error "Invalid number of resistor bands" 

 otherwise = Metal n 
 otherwise = Metal n 

+  </haskell> 

This function behaves like the constructor ''Metal'', but also performs 
This function behaves like the constructor ''Metal'', but also performs 

Line 69:  Line 73:  
reckless user could bypass the smart constructor: 
reckless user could bypass the smart constructor: 

+  <haskell> 

module Resistor ( 
module Resistor ( 

Resistor,  abstract, hiding constructors 
Resistor,  abstract, hiding constructors 

Line 75:  Line 80:  
... 
... 

+  </haskell> 

== Using assertions == 
== Using assertions == 

Line 82:  Line 88:  
rewrite the smart constructor as: 
rewrite the smart constructor as: 

+  <haskell> 

metalResistor :: Bands > Resistor 
metalResistor :: Bands > Resistor 

metalResistor n = assert (n >= 4 && n <= 8) $ Metal n 
metalResistor n = assert (n >= 4 && n <= 8) $ Metal n 

+  </haskell> 

And now obtain more detailed error messages, automatically generated for us: 
And now obtain more detailed error messages, automatically generated for us: 

Line 112:  Line 120:  
Firstly, define some [[Peano numbers]] to represent the number of bands as types: 
Firstly, define some [[Peano numbers]] to represent the number of bands as types: 

−  +  
+  <haskell> 

data Z = Z 
data Z = Z 

data S a = S a 
data S a = S a 

−  +  </haskell> 

+  
Now specify a class for cardinal numbers. 
Now specify a class for cardinal numbers. 

+  <haskell> 

class Card c where 
class Card c where 

instance Card Z where 
instance Card Z where 

instance (Card c) => Card (S c) where 
instance (Card c) => Card (S c) where 

−  +  </haskell> 

+  
Ok, now we're set. So encode a typelevel version of the bounds check. 
Ok, now we're set. So encode a typelevel version of the bounds check. 

Only resistors with bands >= 4 and <= 8 are valid: 
Only resistors with bands >= 4 and <= 8 are valid: 

+  <haskell> 

class Card size => InBounds size where 
class Card size => InBounds size where 

Line 133:  Line 143:  
instance InBounds (S (S (S (S (S (S (S Z))))))) where  seven 
instance InBounds (S (S (S (S (S (S (S Z))))))) where  seven 

instance InBounds (S (S (S (S (S (S (S (S Z)))))))) where  eight 
instance InBounds (S (S (S (S (S (S (S (S Z)))))))) where  eight 

−  +  </haskell> 

+  
Now define a new resistor type. Note that since the bounds is represented in the 
Now define a new resistor type. Note that since the bounds is represented in the 

type, ''we no longer need to store the bounds in the resistor value''. 
type, ''we no longer need to store the bounds in the resistor value''. 

+  <haskell> 

data Resistor size = Resistor deriving Show 
data Resistor size = Resistor deriving Show 

−  +  </haskell> 

+  
And, finally, a convenience constructor for us to use, encoding the bounds 
And, finally, a convenience constructor for us to use, encoding the bounds 

check in the type: 
check in the type: 

−  +  
+  <haskell> 

resistor :: InBounds size => size > Resistor size 
resistor :: InBounds size => size > Resistor size 

resistor _ = Resistor 
resistor _ = Resistor 

+  </haskell> 

== Examples == 
== Examples == 

Line 149:  Line 161:  
First, define some convenience values: 
First, define some convenience values: 

+  <haskell> 

d0 = undefined :: Z 
d0 = undefined :: Z 

d3 = undefined :: S (S (S Z)) 
d3 = undefined :: S (S (S Z)) 

Line 155:  Line 168:  
d8 = undefined :: S (S (S (S (S (S (S (S Z))))))) 
d8 = undefined :: S (S (S (S (S (S (S (S Z))))))) 

d10 = undefined :: S (S (S (S (S (S (S (S (S (S Z))))))))) 
d10 = undefined :: S (S (S (S (S (S (S (S (S (S Z))))))))) 

+  </haskell> 

Now try to construct some resistors: 
Now try to construct some resistors: 

Line 213:  Line 227:  
A ''newtype'' is useful for this: 
A ''newtype'' is useful for this: 

+  <haskell> 

newtype MetalResistor = Metal Bands 
newtype MetalResistor = Metal Bands 

newtype CeramicResistor = Ceramic Bands 
newtype CeramicResistor = Ceramic Bands 

+  </haskell> 

now, a function of resistors must have either a ''MetalResistor'' type, or a 
now, a function of resistors must have either a ''MetalResistor'' type, or a 

''CeramicResistor'' type: 
''CeramicResistor'' type: 

+  <haskell> 

foo :: MetalResistor > Int 
foo :: MetalResistor > Int 

foo (MetalResistor n) = n 
foo (MetalResistor n) = n 

+  </haskell> 

You can't write a function over both resistor types (other than a purely 
You can't write a function over both resistor types (other than a purely 

Line 240:  Line 258:  
Another use for smart constructors is to perform basic optimisations, often to obtain a normal form for constructed data. For example, consider a data structure representing addition and multiplication of variables. 
Another use for smart constructors is to perform basic optimisations, often to obtain a normal form for constructed data. For example, consider a data structure representing addition and multiplication of variables. 

+  <haskell> 

data Expression = Variable String 
data Expression = Variable String 

 Add [Expression] 
 Add [Expression] 

 Multiply [Expression] 
 Multiply [Expression] 

+  </haskell> 

In this data structure, it is possible to represent a value such as <tt>Add [Variable "a", Add [Variable "b", Variable "c"]]</tt> more compactly as <tt>Add [Variable "a", Variable "b", Variable "c"]</tt>. 
In this data structure, it is possible to represent a value such as <tt>Add [Variable "a", Add [Variable "b", Variable "c"]]</tt> more compactly as <tt>Add [Variable "a", Variable "b", Variable "c"]</tt>. 

Line 248:  Line 268:  
This can be done automatically with smart constructors such as: 
This can be done automatically with smart constructors such as: 

+  <haskell> 

add :: [Expression] > Expression 
add :: [Expression] > Expression 

add xs = Add (concatMap fromAdd xs) 
add xs = Add (concatMap fromAdd xs) 

Line 259:  Line 280:  
fromMultiply (Multiply xs) = xs 
fromMultiply (Multiply xs) = xs 

fromMultiply x = [x] 
fromMultiply x = [x] 

+  </haskell> 

[[Category:Idioms]] 
[[Category:Idioms]] 
Revision as of 04:15, 9 March 2006
Smart constructors
This is an introduction to a programming idiom for placing extra constraints on the construction of values by using smart constructors.
Sometimes you need guarantees about the values in your program beyond what can be accomplished with the usual type system checks. Smart constructors can be used for this purpose.
Consider the following problem: we want to be able to specify a data type for electronic resistors. The resistors come in two forms, metal and ceramic. Resistors are labelled with a number of bands, from 4 to 8.
We'd like to be able to
 ensure only resistors with the right number of bands are constructed.
Contents 
1 Runtime checking : smart constructors
1.1 A first attempt
Code up the a typical data type describing a resistor value:
data Resistor = Metal Bands  Ceramic Bands deriving Show type Bands = Int
This has a problem however, that the constructors of type Resistor are unable to check that only bands of size 4 to 8 are built. It is quite legal to say:
*Main> :t Metal 23 Metal 23 :: Resistor
for example.
1.2 Smart(er) constructors
Smart constructors are just functions that build values of the required type, but perform some extra checks when the value is constructed, like so:
metalResistor :: Bands > Resistor metalResistor n  n < 4  n > 8 = error "Invalid number of resistor bands"  otherwise = Metal n
This function behaves like the constructor Metal, but also performs a check. This check will be carried out at runtime, once, when the value is built.
Running this code:
*Main> metalResistor 4 Metal 4
*Main> metalResistor 7 Metal 7 *Main> metalResistor 9 *** Exception: Invalid number of resistor bands
*Main> metalResistor 0 *** Exception: Invalid number of resistor bands
One extra step has to be made though, to make the interface safe. When exporting the type Resistor we need to hide the (unsafe) constructors, and only export the smart constructors, otherwise a reckless user could bypass the smart constructor:
module Resistor ( Resistor,  abstract, hiding constructors metalResistor,  only way to build a metal resistor ) where ...
1.3 Using assertions
Handcoding error messages can be tedious when used often. Instead we can use the assert function, provided (at least with GHC). We rewrite the smart constructor as:
metalResistor :: Bands > Resistor metalResistor n = assert (n >= 4 && n <= 8) $ Metal n
And now obtain more detailed error messages, automatically generated for us:
*Main> metalResistor 0 *** Exception: A.hs:4:1823: Assertion failed
We at least now are given the line and column in which the error occured.
2 Compiletime checking : the type system
2.1 Enforcing the constraint statically
There are other ways to obtain numerical checks like this. The most interesting are probably the static checks that can be done with Type arithmetic, that enforce the number of bands at compile time, rather than runtime, by lifting the band count into the type level.
In the following example, instead of checking the band count at runtime, we instead lift the resistor band count into the type level, and have the typecheck perform the check statically, using phantom types and Peano numbers.
We thus remove the need for a runtime check, meaning faster code. A consequence of this decision is that since the band count is now represented in the type, it is no longer necessary to carry it around at runtime, meaning less data has to be allocated.
Firstly, define some Peano numbers to represent the number of bands as types:
data Z = Z data S a = S a
Now specify a class for cardinal numbers.
class Card c where instance Card Z where instance (Card c) => Card (S c) where
Ok, now we're set. So encode a typelevel version of the bounds check. Only resistors with bands >= 4 and <= 8 are valid:
class Card size => InBounds size where instance InBounds (S (S (S (S Z)))) where  four instance InBounds (S (S (S (S (S Z))))) where  five instance InBounds (S (S (S (S (S (S Z)))))) where  six instance InBounds (S (S (S (S (S (S (S Z))))))) where  seven instance InBounds (S (S (S (S (S (S (S (S Z)))))))) where  eight
Now define a new resistor type. Note that since the bounds is represented in the type, we no longer need to store the bounds in the resistor value.
data Resistor size = Resistor deriving Show
And, finally, a convenience constructor for us to use, encoding the bounds check in the type:
resistor :: InBounds size => size > Resistor size resistor _ = Resistor
2.2 Examples
First, define some convenience values:
d0 = undefined :: Z d3 = undefined :: S (S (S Z)) d4 = undefined :: S (S (S (S Z))) d6 = undefined :: S (S (S (S (S (S Z))))) d8 = undefined :: S (S (S (S (S (S (S (S Z))))))) d10 = undefined :: S (S (S (S (S (S (S (S (S (S Z)))))))))
Now try to construct some resistors:
> resistor d0 No instance for (InBounds Z)
So the value 0 isn't in bounds, as we want. And it is a compiletime error to try to create such a resistor.
> resistor d3 No instance for (InBounds (S (S (S Z))))
Ok, how about a valid resistor?
> resistor d4 Resistor
Great!
> :t resistor d4 resistor d4 :: Resistor (S (S (S (S Z))))
And it's type encodes the number of bands.
> resistor d6 Resistor > resistor d8 Resistor
> :t resistor d8 resistor d8 :: Resistor (S (S (S (S (S (S (S (S Z))))))))
Similar result for other valid resistors.
> resistor d10 No instance for (InBounds (S (S (S (S (S (S (S (S (S (S Z)))))))))))
And 10 is too big.
2.3 Summary
By using a standard encoding of numeric values on the type level we are able to encode a bounds check in the type of a value, thus removing a runtime check, and removing the need to store the numeric value at runtime. The code is safer, as it is impossible to compile the program unless all resistors have the correct number of bands.
An extension would be to use a decimal encoding for the integers (at the expense of longer code).
2.4 Extensions
Further checks can be obtained by separating the metal and ceramic values on the type level, so no function that takes a metal resistor can be accidentally passed a ceramic one.
A newtype is useful for this:
newtype MetalResistor = Metal Bands newtype CeramicResistor = Ceramic Bands
now, a function of resistors must have either a MetalResistor type, or a CeramicResistor type:
foo :: MetalResistor > Int foo (MetalResistor n) = n
You can't write a function over both resistor types (other than a purely polymorphic function).
2.5 Related work
These ideas are also discussed on the old wiki here and also here (for compiletime unit analysis error catching at the type level). More here too.
In general, the more information you place on the type level, the more static checks you get  and thus less chance for bugs.
3 Runtime Optimisation : smart constructors
Another use for smart constructors is to perform basic optimisations, often to obtain a normal form for constructed data. For example, consider a data structure representing addition and multiplication of variables.
data Expression = Variable String  Add [Expression]  Multiply [Expression]
In this data structure, it is possible to represent a value such as Add [Variable "a", Add [Variable "b", Variable "c"]] more compactly as Add [Variable "a", Variable "b", Variable "c"].
This can be done automatically with smart constructors such as:
add :: [Expression] > Expression add xs = Add (concatMap fromAdd xs) multiply :: [Expression] > Expression multiply xs = Multiply (concatMap fromMultiply xs) fromAdd (Add xs) = xs fromAdd x = [x] fromMultiply (Multiply xs) = xs fromMultiply x = [x]