User:Michiexile/MATH198/Lecture 5
Latest revision as of 06:32, 28 October 2009
1 Cartesian Closed Categories and typed lambda-calculus
A category is said to have pairwise products if for any objects A,B, there is a product object A × B.
A category is said to have pairwise coproducts if for any objects A,B, there is a coproduct object A + B.
Recall when we talked about internal homs in Lecture 2. We can now define what we mean, formally, by the concept:
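Definition An object in a category D is an internal hom object or an exponential object [A → B], also written B^{A}, if it comes equipped with an arrow ev: [A → B] × A → B, called the evaluation arrow, such that for any object C and any arrow f: C × A → B, there is a unique arrow λf: C → [A → B] such that the composite
C × A → [A → B] × A → B, given by λf × 1_{A} followed by ev,
is f.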
The idea here is that with something in an exponential object, and something in the source of the arrows we imagine live inside the exponential, we can evaluate the arrow at the source to produce something in the target. Using global elements, this reasoning comes through in a more natural manner: given f: 1 → [A → B] and x: 1 → A we can produce the global element f(x) = ev ∘ ⟨f, x⟩: 1 → B. Furthermore, we can always produce something in the exponential whenever we have something that looks as if it should be there.
And with this we can define
Definition A category C is a Cartesian Closed Category or a CCC if:
- C has a terminal object 1
- Each pair of objects A, B has a product A × B and projections p_{1}: A × B → A, p_{2}: A × B → B.
- For every pair of objects A, B, there is an exponential object [A → B] with an evaluation map ev: [A → B] × A → B.
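In the idealized Haskell category, the pieces of this definition can be sketched roughly as follows. This is only an illustration under assumptions: the names `unit`, `ev`, `lambda`, `composite` and `sampleF` are chosen here for the example and are not part of the lecture, and bottom values are ignored.

```haskell
-- Terminal object: the unit type (), with the unique arrow into it.
unit :: a -> ()
unit _ = ()

-- Evaluation arrow ev: [A -> B] x A -> B.
ev :: (a -> b, a) -> b
ev (f, x) = f x

-- For an arrow f: C x A -> B, the arrow lambda f: C -> [A -> B].
lambda :: ((c, a) -> b) -> (c -> (a -> b))
lambda f c = \x -> f (c, x)

-- The composite C x A -> [A -> B] x A -> B from the definition:
-- apply lambda f on the first factor, keep the second, then evaluate.
composite :: ((c, a) -> b) -> (c, a) -> b
composite f (c, x) = ev (lambda f c, x)

-- A hypothetical sample arrow Int x Int -> Int, used only for checking.
sampleF :: (Int, Int) -> Int
sampleF (c, x) = 10 * c + x
```

The defining condition says that `composite f` and `f` agree on every input; for `sampleF` this can be checked pointwise on a range of sample pairs.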
1.1 Currying
Note that the exponential as described here is exactly what we need in order to discuss the Haskell concept of multi-parameter functions. Consider the type of a binary function in Haskell:
binFunction :: a -> a -> a
Read as a -> (a -> a), this takes one value and returns a function awaiting the second, so it lives in the exponential object [a → [a → a]]. On the other hand, we can feed in both values at once, and get
binFunction' :: (a,a) -> a
which lives in the exponential object [a × a → a].
These are genuinely different objects, but they seem to do the same thing: consume two distinct values to produce a third value. The resolution of the difference lies, again, in a recognition from Set theory: there is an isomorphism
Hom(C × A, B) ≅ Hom(C, Hom(A, B))
which we can use as inspiration for an isomorphism valid in Cartesian Closed Categories.
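In Haskell the two directions of this isomorphism are the Prelude functions `curry` and `uncurry`. The sketch below specialises the type variable a to Int (an assumption made here so that results can be compared with ==) and builds both round trips; the function bodies and the names `roundTripCurried` and `roundTripPaired` are hypothetical.

```haskell
-- The curried presentation, specialised to Int for the sake of the example.
binFunction :: Int -> Int -> Int
binFunction x y = x - y

-- The paired presentation, obtained from the curried one.
binFunction' :: (Int, Int) -> Int
binFunction' = uncurry binFunction

-- Going to pairs and back again should give the original curried function.
roundTripCurried :: Int -> Int -> Int
roundTripCurried = curry (uncurry binFunction)

-- Going to curried form and back again should give the original paired function.
roundTripPaired :: (Int, Int) -> Int
roundTripPaired = uncurry (curry binFunction')
```

Since functions cannot be compared for equality directly, the two round trips can only be checked pointwise: `roundTripCurried x y` against `binFunction x y`, and `roundTripPaired (x, y)` against `binFunction' (x, y)`.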
1.2 Typed lambda-calculus
The lambda-calculus, and later the typed lambda-calculus both act as foundational bases for computer science, and computer programming in particular. The idea in both is that everything is a function, and we can reduce the act of programming to function application; which in turn can be analyzed using expression rewriting rules that encapsulate the act of computation in a sequence of formal rewrites.
Definition A typed lambda-calculus is a formal theory with types, terms, variables and equations. Each term a has a type A associated to it, and we write a:A or a ∈ A. The system is subject to a sequence of rules:
1. There is a type 1. Hence, the empty lambda calculus is excluded.
2. If A,B are types, then so are A × B and [A → B]. These are, initially, just additional symbols, not imbued with the associations we usually give the symbols used.
3. There is a term *:1. Hence, the lambda calculus without any terms is excluded.
4. For each type A, there is an infinite (countable) supply of terms x^{A}_{i}:A.
5. If a:A, b:B are terms, then there is a term (a,b):A × B.
6. If c:A × B then there are terms proj_{1}(c):A, proj_{2}(c):B.
7. If a:A and f:[A → B], then there is a term fa:B.
8. If x:A is a variable and φ(x):B is a term, then there is a term λ_{x∈A}φ(x):[A → B]. Note that here, φ(x) is a meta-expression, meaning we have SOME lambda-calculus expression that may include the variable x.
9. There is a relation a =_{X} a' for each set of variables X that occur freely in either a or a'. This relation is reflexive, symmetric and transitive. Recall that a variable is free in a term if it is not in the scope of a λ-expression naming that variable.
10. If a:1 then a =_{{}} *. In other words, up to lambda-calculus equality, there is only one value of type 1.
11. If X ⊆ Y, then a =_{X} a' implies a =_{Y} a'. Binding more variables gives less freedom, not more, and thus cannot suddenly make equal expressions differ.
12. a =_{X} a' implies fa =_{X} fa'.
13. f =_{X} f' implies fa =_{X} f'a. So equality plays nice with function application.
14. φ(x) =_{X∪{x}} φ'(x) implies λ_{x}φ(x) =_{X} λ_{x}φ'(x). Equality behaves well with respect to binding variables.
15. proj_{1}(a,b) =_{X} a, proj_{2}(a,b) =_{X} b, c =_{X} (proj_{1}(c),proj_{2}(c)) for all a,b,c,X.
16. λ_{x}φ(x)a =_{X} φ(a) if a is substitutable for x in φ(x) and φ(a) is what we get by substituting each occurrence of x by a in φ(x). A term is substitutable for another if by performing the substitution, no occurrence of any variable in the term becomes bound.
17. λ_{x∈A}fx =_{X} f, provided x ∉ X.
18. λ_{x∈A}φ(x) =_{X} λ_{x'∈A}φ(x') if x' is substitutable for x in φ(x) and each variable is not free in the other expression.
Note that =_{X} is just a symbol. The axioms above give it properties that work a lot like equality, but two lambda calculus-equal terms are not equal unless they are identical. However, a =_{X} b tells us that in any model of this lambda calculus - where terms, types, etc. are replaced with actual things (mathematical objects, say, or a programming language semantics embedding typed lambda calculus) - the things given by translating a and b into the model should end up being equal.
Any actual realization of typed lambda calculus is bound to have more rules and equalities than the ones listed here.
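To get a feeling for what the projection, substitution (beta) and extensionality (eta) equations say, one can look at their counterparts on ordinary Haskell values. The sketch below is an illustration only: it assumes that Haskell's == on sample values can stand in for the formal relation, which is a much weaker statement than derivability in the calculus, and the names `projLaws`, `phi`, `betaLaw` and `etaLaw` are made up for the example.

```haskell
-- Projection equations: proj_1(a,b) = a, proj_2(a,b) = b,
-- and c = (proj_1 c, proj_2 c).
projLaws :: (Int, Bool) -> Bool
projLaws c@(a, b) =
  fst (a, b) == a && snd (a, b) == b && c == (fst c, snd c)

-- A sample body phi(x) for the substitution equation.
phi :: Int -> Int
phi x = x * x + 1

-- Beta: applying (\x -> phi x) to a gives phi with a substituted for x.
betaLaw :: Int -> Bool
betaLaw a = (\x -> phi x) a == phi a

-- Eta: (\x -> f x) behaves like f. Functions have no Eq instance,
-- so the comparison is made pointwise at a sample argument.
etaLaw :: (Int -> Int) -> Int -> Bool
etaLaw f a = (\x -> f x) a == f a
```

Each of these returns True on any sample input; a failure would indicate that the chosen model (here, Haskell evaluation) does not respect the corresponding axiom.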
With these axioms in front of us, however, we can see how lambda calculus and Cartesian Closed Categories fit together: we can go back and forth between the two concepts in a natural manner:
1.2.1 Lambda to CCC
Given a typed lambda calculus L, we can define a CCC C(L). Its objects are the types of L. An arrow from A to B is an equivalence class (under =_{{x}}) of terms of type B, free in a single variable x:A.
We need the equivalence classes because for any variable x:A, we want λ_{x}x to be the global element of [A → A] corresponding to the identity arrow. Hence, that variable must itself correspond to an identity arrow.
And then the rules for the various constructions enumerated in the axioms correspond closely to what we need to prove the resulting category to be cartesian closed.
1.2.2 CCC to Lambda
To go in the other direction, starting out with a Cartesian Closed Category and finding a typed lambda calculus corresponding to it, we construct its internal language.
Given a CCC C, we can assume that we have chosen, somehow, one actual product for each finite set of factors. Thus, both all products and all projections are well defined entities, with no remaining choice to determine them.
The types of the internal language L(C) are just the objects of C. The existence of products, exponentials and terminal object covers axioms 1-2. We can assume the existence of variables for each type, and the remaining axioms correspond to definition and behaviour of the terms available.
Using the properties of a CCC, it is at this point possible to prove a resulting equivalence of categories C(L(C)) = C, and similarly, with suitable definitions for what it means for formal languages to be equivalent, one can also prove for a typed lambda-calculus L that L(C(L)) = L.
More on this subject can be found in:
- Lambek & Scott: Aspects of higher order categorical logic and Introduction to higher order categorical logic
More importantly, the benefit of stating λ-calculus in terms of a CCC instead of in terms of terms and rewriting rules is that you can escape worrying about variable clashes, alpha reductions and composability - the categorical translation ignores, at least superficially, the variables, reduces terms with morphisms that have equality built in, and provides associative composition for free.
At this point, I'd recommend reading more on Wikipedia [1] and [2], as well as in Lambek & Scott: Introduction to Higher Order Categorical Logic. The book by Lambek & Scott goes into great depth on these issues, but may be less than friendly to a novice.
2 Limits and colimits
One design pattern, as it were, that we have seen occur over and over in the definitions we've seen so far is for there to be some object, such that for every other object around, certain morphisms have unique existence.
We saw it in terminal and initial objects, where there's a unique map from or to every other object. And in products/coproducts, where a well-behaved map capturing any pair of maps has unique existence. And finally, above, in the CCC characterization of the internal hom, we had a similar uniqueness requirement for the lambda map.
One thing we can notice is that the isomorphism theorems for all these cases look very similar to each other: in each isomorphism proof, we produce the uniquely existing morphisms, and prove that their uniqueness and their other properties force the maps to really be isomorphisms.
Now, category theory has a philosophy slightly similar to design patterns - if we see something happening over and over, we'll want to generalize it. And there are generalizations available for these!
2.1 Diagrams, cones and limits
Definition A diagram D of the shape of an index category J (often finite or countable), in a category C is just a functor D: J → C. Objects in J will be denoted by i,j,k,... and their images in C by D_{i},D_{j},D_{k},....
This underlines that when we talk about diagrams, we tend to think of them less as just functors, and more as their images - the important part of a diagram D is the objects and their layout in C, and not the process of going to C from J.
Definition A cone over a diagram D in a category C is some object C equipped with a family c_{i}: C → D_{i} of arrows, one for each object i in J, such that for each arrow α: i → j in J, the following diagram
[Image: ConeDefDiagram.png]
commutes, or in equations, D_{α}c_{i} = c_{j}.
A morphism f: (C,c_{i}) → (C',c'_{i}) of cones is an arrow f: C → C' such that each triangle
[Image: ConeMorphismDiagram.png]
commutes, or in equations, such that c_{j} = c'_{j}f.
This defines a category of cones, that we shall denote by Cone(D). And we define, hereby:
Definition The limit of a diagram D in a category C is a terminal object in Cone(D). We often denote a limit by
p_{i}: lim D → D_{i}
so that the map from the limit object to one of the diagram objects D_{i} is denoted by p_{i}.
The limit being terminal in the category of cones nails down once and for all the uniqueness of any map into it, and the isomorphism of any two terminal objects carries over to a proof once and for all for the limit case.
Specifically, since the morphisms of cones are morphisms in C, and composition is carried straight over, proving a map is an isomorphism in the cone category implies it is one in the target category as well.
Definition A category C has all (finite) limits if all diagrams (of finite shape) have limit objects defined for them.
2.2 Limits we've already seen
The terminal object of a category is the limit object of an empty diagram. Indeed, it is an object, with no specified maps to any other objects, such that every other object that also maps to the same empty set of objects - which is to say all other objects - has a uniquely determined map to the limit object.
The product of some set of objects is the limit object of the diagram containing all these objects and no arrows; a diagram of the shape of a discrete category. The condition here becomes the requirement of maps to all factors such that any other cone factors through these maps.
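For the two-object discrete diagram this can be spelled out in Haskell. The following is a sketch under assumptions, not a general encoding of limits: the type `Cone` and the names `productCone`, `factor` and `sampleCone` are invented for the illustration, and the diagram is fixed to two objects with no arrows.

```haskell
-- A cone over the discrete diagram {A, B}: an apex c with one leg to each object.
data Cone c a b = Cone { legA :: c -> a, legB :: c -> b }

-- The limit cone: the pair type together with its two projections.
productCone :: Cone (a, b) a b
productCone = Cone fst snd

-- The morphism of cones from any cone into the limit cone.
-- Each component of the result is forced by the legs of the given cone.
factor :: Cone c a b -> c -> (a, b)
factor (Cone f g) x = (f x, g x)

-- A hypothetical sample cone with apex Int.
sampleCone :: Cone Int Bool String
sampleCone = Cone even show
```

The cone morphism condition reads `legA productCone . factor k = legA k` and likewise for `legB`; uniqueness holds because a pair is determined by its two components.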
To express the exponential as a limit, we need to go to a different category than the one we started in. Take the category with objects given by morphisms X × Y → Z for fixed objects Y,Z, and morphisms given by morphisms X → X' commuting with the 'objects' they run between and fixing Y. The exponential is a terminal object in this category.
Adding further arrows to diagrams amounts to adding further conditions on the products, as the maps from the product to the diagram objects need to factor through any arrows present in the diagram.
These added relations, however, are exactly what trips things up in Haskell. The idealized Haskell category does not have even all finite limits. At the core of the issue here is the lack of dependent types: there is no way for the type system to guarantee equations, and hence only the trivial limits - the products - can be guaranteed by the Haskell type checker.
In order to get those kinds of guarantees, the type checker would need an implementation of dependent types, something that can be simulated in several ways, but is not (yet) an actual part of Haskell. Other languages, however, cover this - most notably Epigram, Agda and Cayenne - the last of which is influenced even more strongly than Haskell by constructive type theory and category theory.
The kind of equations that show up in a limit, however, could be thought of as invariants for the type - and thus something that can be tested for. The resulting equations can be plugged into a testing framework - such as QuickCheck - to verify that the invariants hold under the functions applied.
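As a rough sketch of treating such an equation as a tested invariant, consider a type whose values are meant to satisfy f x = g x for two fixed functions. Everything here is hypothetical and chosen for the illustration: the functions `f` and `g`, the type `Agree`, and the operation `flipRoot`. To keep the example free of dependencies, a plain list of sample inputs stands in for the random generation a framework like QuickCheck would provide.

```haskell
-- Two functions whose agreement is the equation we care about.
f, g :: Int -> Int
f x = x * x
g x = x + 6

-- Values of Agree are meant to satisfy f x == g x; the type cannot say so.
newtype Agree = Agree Int deriving (Eq, Show)

-- Smart constructor: the only intended way to build an Agree.
mkAgree :: Int -> Maybe Agree
mkAgree x
  | f x == g x = Just (Agree x)
  | otherwise  = Nothing

-- The invariant, stated as a testable property.
invariant :: Agree -> Bool
invariant (Agree x) = f x == g x

-- A hypothetical operation that should preserve the invariant:
-- it swaps the two solutions of x * x == x + 6.
flipRoot :: Agree -> Agree
flipRoot (Agree x) = Agree (1 - x)

-- Check the invariant on everything the smart constructor accepts
-- in a sample range, before and after applying flipRoot.
checkInvariant :: Bool
checkInvariant =
  and [ invariant a && invariant (flipRoot a)
      | Just a <- map mkAgree [-100 .. 100] ]
```

The type checker accepts `flipRoot` whether or not it respects the equation; only the test distinguishes a correct implementation from one that, say, adds 1 to the wrapped value.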
2.3 Colimits
The dual concept to a limit is defined using the dual to the cones:
Definition A cocone over a diagram D: J → C is an object C with arrows c_{j}: D_{j} → C such that for each arrow α: i → j in J, the following diagram
[Image: CoConeDefDiagram.png]
commutes, or in equations, such that c_{j}D_{α} = c_{i}.
A morphism f: (C,c_{i}) → (C',c'_{i}) of cocones is an arrow f: C → C' such that each triangle
[Image: CoConeMorphismDiagram.png]
commutes, or in equations, such that c'_{j} = fc_{j}.
Just as with the category of cones, this yields a category of cocones, that we denote by Cocone(D), and with this we define:
Definition The colimit of a diagram D: J → C is an initial object in Cocone(D).
We denote the colimit by
i_{i}: D_{i} → colim D
so that the map from one of the diagram objects D_{i} to the colimit object is denoted by i_{i}.
Again, the isomorphism results for coproducts and initial objects follow from that for the colimit, and the same proof ends up working for all colimits.
And again, we say that a category has (finite) colimits if every (finite) diagram admits a colimit.
2.4 Colimits we've already seen
The initial object is the colimit of the empty diagram.
The coproduct is the colimit of the discrete diagram.
For both of these, the argument is almost identical to the one in the limits section above.
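The coproduct case can be sketched in Haskell for the two-object discrete diagram, with `Either` playing the role of the colimit object. As before this is an illustration under assumptions: the type `Cocone` and the names `coproductCocone`, `cofactor` and `sampleCocone` are made up for the example.

```haskell
-- A cocone under the discrete diagram {A, B}: one leg from each object into c.
data Cocone a b c = Cocone { fromA :: a -> c, fromB :: b -> c }

-- The colimit cocone: Either together with its two injections.
coproductCocone :: Cocone a b (Either a b)
coproductCocone = Cocone Left Right

-- The morphism of cocones out of the colimit cocone, given by case analysis.
cofactor :: Cocone a b c -> Either a b -> c
cofactor (Cocone h k) = either h k

-- A hypothetical sample cocone with target String.
sampleCocone :: Cocone Int Bool String
sampleCocone = Cocone show (\b -> if b then "yes" else "no")
```

Composing an injection with `cofactor k` gives back the corresponding leg of `k`, which is the cocone morphism condition; `cofactor k` is the only function with this property because every value of `Either a b` is either a `Left` or a `Right`.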
3 Homework
Credit will be given for up to 4 of the 6 exercises.
- Prove that currying/uncurrying are isomorphisms in a CCC. Hint: the map f ↦ λf is a map Hom(C × A, B) → Hom(C,[A → B]).
- Prove that in a CCC λev is λev = 1_{[A → B]}: [A → B] → [A → B].
- What is the limit of a diagram of the shape of the category 2?
- Is the category of Sets a CCC? Prove it.
- Is the category of vector spaces a CCC? Prove it.
- * Implement a typed lambda calculus as an EDSL in Haskell.