Difference between revisions of "Constructor"

From HaskellWiki
Jump to navigation Jump to search
(Data Constructors are first class values, category, links.)
m (Data declarations always introduce a single type constructor, and zero or more data constructors)
 
(14 intermediate revisions by 6 users not shown)
Line 2: Line 2:
 
'''Constructor''' can mean:
 
'''Constructor''' can mean:
 
* Type constructor
 
* Type constructor
* Data constructor
+
* Data constructor (or value constructor)
  +
  +
A <hask>data</hask> declaration introduces a ''type'' constructor and some ''data'' constructors for this type constructor.
   
 
== Type constructor ==
 
== Type constructor ==
  +
A '''[[type]] constructor''' may have zero or more arguments, if it has zero arguments it is called a ''nullary'' type constructor (or simply a '''type'''). An example of a nullary type constructor <hask>Bool</hask> with two nullary data constructors <hask>True</hask> and <hask>False</hask>
A [[type]] constructor is used to construct new types from given ones.
 
  +
 
<haskell>
  +
data Bool = True | False
  +
</haskell>
  +
  +
An example of a ''unary'' type constructor <haskell>Tree</haskell>
  +
 
<haskell>
 
<haskell>
 
data Tree a = Tip | Node a (Tree a) (Tree a)
 
data Tree a = Tip | Node a (Tree a) (Tree a)
 
</haskell>
 
</haskell>
  +
 
illustrates how to define a data type with type constructors (and data constructors at the same time). The type constructor is named <hask>Tree</hask>, but a tree of what? Of any specific type <hask>a</hask>, be it <hask>Integer</hask>, <hask>Maybe String</hask>, or even <hask>Tree b</hask>, in which case it will be a tree of tree of <hask>b</hask>. The data type is polymorphic (and <hask>a</hask> is a type variable that is to be substituted by a specific type). So when used, the values will have types like <hask>Tree Int</hask> or <hask>Tree (Tree Boolean)</hask>.
 
illustrates how to define a data type with type constructors (and data constructors at the same time). The type constructor is named <hask>Tree</hask>, but a tree of what? Of any specific type <hask>a</hask>, be it <hask>Integer</hask>, <hask>Maybe String</hask>, or even <hask>Tree b</hask>, in which case it will be a tree of tree of <hask>b</hask>. The data type is polymorphic (and <hask>a</hask> is a type variable that is to be substituted by a specific type). So when used, the values will have types like <hask>Tree Int</hask> or <hask>Tree (Tree Boolean)</hask>.
   
 
== Data constructor ==
 
== Data constructor ==
A data constructor groups values together and tags alternatives in an algebraic data type,
+
A '''data constructor''' (or '''value constructor''') can have zero or more arguments where a data constructor taking zero arguments is called a ''nullary'' data constructor or simply a '''constant'''. They group values together and tag alternatives in an [[algebraic data type]]. E.g., the <hask>Tree a</hask> type from above
 
<haskell>
 
<haskell>
 
data Tree a = Tip | Node a (Tree a) (Tree a)
 
data Tree a = Tip | Node a (Tree a) (Tree a)
 
</haskell>
 
</haskell>
where there are two data constructors, <hask>Tip</hask> and <hask>Node</hask>. Any value that belongs to the type <hask>Tree a</hask> (I'm happy leaving the type parameter unspecified) will be a constructed by either <hask>Tip</hask> or <hask>Node</hask>. <hask>Tip</hask> is a constructor alright, but it groups no value whatsoever, that is, it's a nullary constructor. There can be only one value that will have this constructor, also conveniently denoted <hask>Tip</hask>. So nullary constructors contain no data apart from its name! For example, the <hask>Bool</hask> data type is defined to be
+
has two data constructors, <hask>Tip</hask> and <hask>Node</hask>. Any value that belongs to the type <hask>Tree a</hask> (again, we leave the type parameter unspecified) will be a constructed by either <hask>Tip</hask> or <hask>Node</hask>. <hask>Tip</hask> is a constructor that groups no value, i.e, it's a nullary constructor. There is only one value, conveniently denoted <hask>Tip</hask> that has this constructor. So, nullary constructors contain no data apart from their name! For example, the <hask>Bool</hask> data type is defined as
 
<haskell>
 
<haskell>
 
data Bool = True | False
 
data Bool = True | False
 
</haskell>
 
</haskell>
 
and for all practical purposes you can just think of <hask>True</hask> and <hask>False</hask> as ''constants'' belonging to a type. On the other hand, <hask>Node</hask> contains other data. The types of those data are its parameters. The first one has type <hask>a</hask>, so it's just a value of the parameter type <hask>a</hask>. This one is the value the tree node holds in it. The remaining two are the branches. Each of them have type <hask>Tree a</hask>, naturally.
and for all practical purposes you can just think of them as ''constants'' belonging to a type.
 
On the other hand, <hask>Node</hask> contains other data. The types of those data are its parameters. The first one has type <hask>a</hask>, so it's just a value of the parameter type <hask>a</hask>. This one is the value the tree node holds in it. The remaining two are the branches. Each of them have type <hask>Tree a</hask>, naturally.
 
   
 
===Data constructors as first class values===
 
===Data constructors as first class values===
 
Data constructors are first class values in Haskell and actually have a [[type]]. For instance, the type of the <hask>Left</hask> constructor of the <hask>Either</hask> data type is:
 
Data constructors are first class values in Haskell and actually have a [[type]]. For instance, the type of the <hask>Left</hask> constructor of the <hask>Either</hask> data type is:
 
<haskell>
 
<haskell>
Left :: forall b a. a -> Either a b
+
Left :: a -> Either a b
 
</haskell>
 
</haskell>
   
 
As first class values, they may be passed to functions, held in a list, be data elements of other algebraic data types and so forth.
 
As first class values, they may be passed to functions, held in a list, be data elements of other algebraic data types and so forth.
   
  +
=== Data constructors are not types===
 
 
As discussed above, they denote values. It is illegal to write <hask>Node a (Node a) (Node a)</hask> there, because the type is <hask>Tree</hask>, not <hask>Node</hask>.
=== Note ===
 
Data constructors are not types! They denote values. It would be illegal to write <hask>Node a (Node a) (Node a)</hask> there, because the type is <hask>Tree</hask>, not <hask>Node</hask>.
 
   
 
== Deconstructing data constructors ==
 
== Deconstructing data constructors ==
 
All a data constructor does is holding values together. But you want to separate them if you want to use them. This is done via [[pattern matching]],
 
All a data constructor does is holding values together. But you want to separate them if you want to use them. This is done via [[pattern matching]],
  +
 
<haskell>
 
<haskell>
depth Tip = 0
+
depth Tip = 0
 
depth (Node _ l r) = 1 + max (depth l) (depth r)
 
depth (Node _ l r) = 1 + max (depth l) (depth r)
 
</haskell>
 
</haskell>
So, the depth of a tip is zero. The depth of a node depends on its branches, but not it's content. See how the constructor in the left hand side names its parts? we don't need the content so we don't name it (using <hask>_</hask>). The left branch is named <hask>l</hask>, the right <hask>r</hask>, allowing us to use these values in the right hand side.
 
   
 
So, the depth of a tip is zero. The depth of a node depends on its branches, but not its content. See how the constructor in the left hand side names its parts? we don't need the content so we don't name it (using <hask>_</hask>). The left branch is named <hask>l</hask>, the right <hask>r</hask>, allowing us to use these values in the right hand side.
== Notes ==
 
  +
* You can declare a constructor (for both type and data) to be infix, and this can make your code a lot more readable.
 
 
== Notes and tips==
* Tuples are a built in feature of the syntax but are plain old algebraic data types! They have only one constructor though. Having the same name as their types (don't freak out, it's just a matter of convenience, as the type constructors and the data constructors have separate namespaces). So, <hask>(4, True)</hask> is really a value of the form <hask>(,) 4 True</hask> having the type <hask>(,) Int Bool</hask>, which, too, is written conveniently as <hask>(Int, Bool)</hask> to make it more readable. Incidentally, the empty tuple type <hask>()</hask> with its only value <hask>()</hask> is used throughout, and is called ''unit''.
 
 
* You can declare a constructor (for both type and data) to be an infix operator, and this can make your code a lot more readable. However, for alphanumeric names, the names of both the type constructor and the data constructor(s) must start with an uppercase letter.
 
* [[Tuples]] are a built in feature of the syntax but are plain old algebraic data types! They have only one constructor though. Having the same name as their types (don't freak out, it's just a matter of convenience, as the type constructors and the data constructors have separate namespaces). So, <hask>(4, True)</hask> is really a value of the form <hask>(,) 4 True</hask> having the type <hask>(,) Int Bool</hask>, which, too, is written conveniently as <hask>(Int, Bool)</hask> to make it more readable. Incidentally, the empty tuple type <hask>()</hask> with its only value <hask>()</hask> is used throughout, and is called ''unit''.
 
* You can, in fact, name the values grouped together, using the [[record]] syntax, <haskell>
 
* You can, in fact, name the values grouped together, using the [[record]] syntax, <haskell>
 
data Person = Person { name :: String, age :: Int, address :: String }
 
data Person = Person { name :: String, age :: Int, address :: String }
 
</haskell> so that for a person <hask>p</hask>, you can say <hask>age p</hask> to select his/her age, without resorting to pattern matching.
</haskell>
 
  +
* Sometimes you need a little editing or checking when constructing your data. If you do, check [[smart constructors]].
so that for a person <hask>p</hask>, you can say <hask>age p</hask> to select his/her age, without resorting to pattern matching.
 
  +
* Sometimes you do not want the user of your library to group arbitrary values together. This is achieved by ''hiding'' your constructor (not mentioning it in the export list of the module), creating an [[abstract data type]] as a result. Along with smart constructors mentioned above, one can achieve encapsulation.
  +
* Type constructors are used to denote the types of functions, e.g.,<br/><haskell>depth :: Tree a -> Int</haskell>Data or value constructors are used to pattern match, e.g.,<br/><haskell>depth (Node _ l r) = ...</haskell>
  +
* If you want to dig deeper into the Haskell type system: a data or value constructor has a ''type'', e.g., <hask>Node 3 Tip Tip</hask> has type <hask>Tree Int</hask>; a ''type'' of a type constructor itself is called [https://wiki.haskell.org/Kind kind].
  +
  +
== References ==
  +
* [http://www.cs.cmu.edu/~rwh/smlbook/book.pdf Programming in Standard ML]: Chapter 10 (Concrete Data Types)

Latest revision as of 16:51, 18 June 2021

Constructor can mean:

  • Type constructor
  • Data constructor (or value constructor)

A data declaration introduces a type constructor and some data constructors for this type constructor.

Type constructor

A type constructor may have zero or more arguments, if it has zero arguments it is called a nullary type constructor (or simply a type). An example of a nullary type constructor Bool with two nullary data constructors True and False

data Bool = True | False
An example of a unary type constructor
Tree
data Tree a = Tip | Node a (Tree a) (Tree a)

illustrates how to define a data type with type constructors (and data constructors at the same time). The type constructor is named Tree, but a tree of what? Of any specific type a, be it Integer, Maybe String, or even Tree b, in which case it will be a tree of tree of b. The data type is polymorphic (and a is a type variable that is to be substituted by a specific type). So when used, the values will have types like Tree Int or Tree (Tree Boolean).

Data constructor

A data constructor (or value constructor) can have zero or more arguments where a data constructor taking zero arguments is called a nullary data constructor or simply a constant. They group values together and tag alternatives in an algebraic data type. E.g., the Tree a type from above

data Tree a = Tip | Node a (Tree a) (Tree a)

has two data constructors, Tip and Node. Any value that belongs to the type Tree a (again, we leave the type parameter unspecified) will be a constructed by either Tip or Node. Tip is a constructor that groups no value, i.e, it's a nullary constructor. There is only one value, conveniently denoted Tip that has this constructor. So, nullary constructors contain no data apart from their name! For example, the Bool data type is defined as

data Bool = True | False

and for all practical purposes you can just think of True and False as constants belonging to a type. On the other hand, Node contains other data. The types of those data are its parameters. The first one has type a, so it's just a value of the parameter type a. This one is the value the tree node holds in it. The remaining two are the branches. Each of them have type Tree a, naturally.

Data constructors as first class values

Data constructors are first class values in Haskell and actually have a type. For instance, the type of the Left constructor of the Either data type is:

Left :: a -> Either a b

As first class values, they may be passed to functions, held in a list, be data elements of other algebraic data types and so forth.

Data constructors are not types

As discussed above, they denote values. It is illegal to write Node a (Node a) (Node a) there, because the type is Tree, not Node.

Deconstructing data constructors

All a data constructor does is holding values together. But you want to separate them if you want to use them. This is done via pattern matching,

depth Tip          = 0
depth (Node _ l r) = 1 + max (depth l) (depth r)

So, the depth of a tip is zero. The depth of a node depends on its branches, but not its content. See how the constructor in the left hand side names its parts? we don't need the content so we don't name it (using _). The left branch is named l, the right r, allowing us to use these values in the right hand side.

Notes and tips

  • You can declare a constructor (for both type and data) to be an infix operator, and this can make your code a lot more readable. However, for alphanumeric names, the names of both the type constructor and the data constructor(s) must start with an uppercase letter.
  • Tuples are a built in feature of the syntax but are plain old algebraic data types! They have only one constructor though. Having the same name as their types (don't freak out, it's just a matter of convenience, as the type constructors and the data constructors have separate namespaces). So, (4, True) is really a value of the form (,) 4 True having the type (,) Int Bool, which, too, is written conveniently as (Int, Bool) to make it more readable. Incidentally, the empty tuple type () with its only value () is used throughout, and is called unit.
  • You can, in fact, name the values grouped together, using the record syntax,
    data Person = Person { name :: String, age :: Int, address :: String }
    
    so that for a person p, you can say age p to select his/her age, without resorting to pattern matching.
  • Sometimes you need a little editing or checking when constructing your data. If you do, check smart constructors.
  • Sometimes you do not want the user of your library to group arbitrary values together. This is achieved by hiding your constructor (not mentioning it in the export list of the module), creating an abstract data type as a result. Along with smart constructors mentioned above, one can achieve encapsulation.
  • Type constructors are used to denote the types of functions, e.g.,
    depth :: Tree a -> Int
    
    Data or value constructors are used to pattern match, e.g.,
    depth (Node _ l r) = ...
    
  • If you want to dig deeper into the Haskell type system: a data or value constructor has a type, e.g., Node 3 Tip Tip has type Tree Int; a type of a type constructor itself is called kind.

References