GHC.Generics
From HaskellWiki
Revision as of 07:03, 9 May 2011
GHC 7.2 includes support for datatype-generic programming. This means you can have more classes for which you do not have to write an instance by hand, much like the classes that can already be derived.
1 Serialization
Suppose you are writing a class for serialization of data. You have a type Bit representing bits, and a class Serialize:

    data Bit = O | I

    class Serialize a where
      put :: a -> [Bit]
You might have written some instances already:
    instance Serialize Int where
      put i = serializeInt i

    instance Serialize a => Serialize [a] where
      put []    = []
      put (h:t) = put h ++ put t
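The instance for Int relies on serializeInt, which the text does not define. As a minimal sketch, one possible definition is shown here, assuming a fixed-width encoding with the most significant bit first. The encoding is an assumption made for illustration; the original does not say which one it has in mind.

```haskell
import Data.Bits (finiteBitSize, testBit)

data Bit = O | I
  deriving (Eq, Show)

class Serialize a where
  put :: a -> [Bit]

-- Hypothetical helper: one Bit per machine bit of the Int,
-- most significant bit first (an assumed encoding).
serializeInt :: Int -> [Bit]
serializeInt n = [ if testBit n i then I else O | i <- [w - 1, w - 2 .. 0] ]
  where
    w = finiteBitSize n

instance Serialize Int where
  put i = serializeInt i

-- Lists are serialized by concatenating the serializations of the elements.
instance Serialize a => Serialize [a] where
  put []    = []
  put (h:t) = put h ++ put t
```

With this definition, put (0 :: Int) is a list of O bits whose length is the width of Int on the machine, and put [1, 2 :: Int] is the concatenation of the two fixed-width encodings.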
A user of your library, however, will have their own datatypes, like:
data UserTree a = Node a (UserTree a) (UserTree a) | Leaf
It is here that generic programming can help you. If you are familiar with SYB you could use it at this stage, but now we'll see how to do this with the new generic deriving mechanism.
1.1 Generic serialization
First you have to tell the compiler how to serialize any datatype, in general. Since Haskell datatypes have a regular structure, this means you can just explain how to serialize a few basic datatypes.
1.1.1 Representation types
We can represent most Haskell datatypes using only the following primitive types:
    -- | Unit: used for constructors without arguments
    data U1 p = U1

    -- | Constants, additional parameters and recursion of kind *
    newtype K1 i c p = K1 { unK1 :: c }

    -- | Meta-information (constructor names, etc.)
    newtype M1 i c f p = M1 { unM1 :: f p }

    -- | Sums: encode choice between constructors
    infixr 5 :+:
    data (:+:) f g p = L1 (f p) | R1 (g p)

    -- | Products: encode multiple arguments to constructors
    infixr 6 :*:
    data (:*:) f g p = f p :*: g p
For example, the UserTree type shown before can be represented, in simplified form, as:

    type RepUserTree a =
      -- A UserTree is either a Leaf, which has no arguments
          U1
      -- ... or it is a Node, which has three arguments that we put in a product
      :+: a :*: UserTree a :*: UserTree a
Simple, right? Different constructors become alternatives of a sum, and multiple arguments become products. In fact, we want to have some more information in the representation, like datatype and constructor names, and to know if a product argument is a parameter or a type. We use the other primitives for this, and the representation looks more like:
    type RealRepUserTree a =
      -- Information about the datatype
      M1 D Data_UserTree (
      -- Leaf, with information about the constructor
            M1 C Con_Leaf U1
      -- Node, with information about the constructor
        :+: M1 C Con_Node (
              -- Constructor argument, which could have information
              -- about a record selector label
              M1 S NoSelector (
                -- Argument, tagged with P because it is a parameter
                K1 P a)
              -- Another argument, tagged with R because it is
              -- a recursive occurrence of a type
          :*: M1 S NoSelector (K1 R (UserTree a))
              -- Idem
          :*: M1 S NoSelector (K1 R (UserTree a))
        ))
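To see the meta-information at work, here is a small sketch that reads the constructor name out of the representation. It assumes the names used in released versions of GHC, where the class is called Generic (with Rep and from) and is derived with DeriveGeneric, rather than the names used on this page. GConName and constructorName are hypothetical names introduced only for this illustration.

```haskell
{-# LANGUAGE DeriveGeneric, FlexibleContexts, FlexibleInstances, TypeOperators #-}
import GHC.Generics

data UserTree a = Node a (UserTree a) (UserTree a) | Leaf
  deriving Generic

-- Hypothetical generic function: the name of the constructor of a value.
class GConName f where
  gconName :: f p -> String

-- Skip the datatype-level meta-information.
instance GConName f => GConName (M1 D c f) where
  gconName (M1 x) = gconName x

-- Follow whichever alternative of the sum the value is in.
instance (GConName f, GConName g) => GConName (f :+: g) where
  gconName (L1 x) = gconName x
  gconName (R1 x) = gconName x

-- At the constructor level, conName reads the name from the meta-information.
instance Constructor c => GConName (M1 C c f) where
  gconName m = conName m

constructorName :: (Generic a, GConName (Rep a)) => a -> String
constructorName = gconName . from
```

Here constructorName (Leaf :: UserTree Int) evaluates to "Leaf" and constructorName (Node 'x' Leaf Leaf) to "Node". No instance is needed for products or constants, since the function stops at the constructor level.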
1.1.2 A generic function
Since GHC can represent user types using only those primitive types, all you have to do is to tell GHC how to serialize each of the individual primitive types. The best way to do that is to create a new type class:
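    class GSerialize f where
      gput :: f a -> [Bit]

Now we need an instance for each of the representation types. For U1 there is nothing to serialize:

    instance GSerialize U1 where
      gput U1 = []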
Serializing multiple arguments is simply the concatenation of the individual serializations:

    instance (GSerialize a, GSerialize b) => GSerialize (a :*: b) where
      gput (a :*: b) = gput a ++ gput b
The case for sums is the most interesting, as we have to record which alternative we are in. We will use a 0 for left injections and a 1 for right injections:
    instance (GSerialize a, GSerialize b) => GSerialize (a :+: b) where
      gput (L1 x) = O : gput x
      gput (R1 x) = I : gput x
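We don't need to encode the meta-information, so we just go over it recursively:

    instance (GSerialize a) => GSerialize (M1 i c a) where
      gput (M1 x) = gput x

Finally, constructor arguments are serialized with the original Serialize class. Note that K1 takes the tag and the argument type, so the instance head is K1 i a:

    instance (Serialize a) => GSerialize (K1 i a) where
      gput (K1 x) = put x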
1.1.3 Default implementations
We've seen how to represent user types generically, and how to define functions on representation types. However, we still have to tie these two together, explaining how to convert user types to their representation and then applying the generic function.
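The representation is tied to the user datatype by a type class, which names the representation type and provides conversions in both directions:

    class Representable0 a where
      -- Encode the representation of a user datatype
      type Rep0 a :: * -> *
      -- Convert from the datatype to its representation
      from0 :: a -> (Rep0 a) x
      -- Convert from the representation to the datatype
      to0 :: (Rep0 a) x -> a

For UserTree, using the simplified representation from before, the instance is:

    instance Representable0 (UserTree a) where
      type Rep0 (UserTree a) = RepUserTree a

      from0 Leaf         = L1 U1
      from0 (Node a l r) = R1 (a :*: l :*: r)

      to0 (L1 U1)              = Leaf
      to0 (R1 (a :*: l :*: r)) = Node a l r

With the conversion in place, a default serialization function can be defined:

    putDefault :: (Representable0 a, GSerialize (Rep0 a)) => a -> [Bit]
    putDefault a = gput (from0 a)

and used to give the instance for the user datatype:

    instance (Serialize a) => Serialize (UserTree a) where
      put = putDefault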
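The simplified representation used above leaves out the K1 wrappers, so it does not typecheck as written. The following is a sketch of a hand-written version that compiles, under some assumptions: the representation types are imported from GHC.Generics, the class Representable0 is defined locally (it is not imported from a library), every argument is wrapped in K1 R, the meta-information is omitted, and a Bool instance is added only so that there is something to serialize at the leaves.

```haskell
{-# LANGUAGE FlexibleContexts, TypeFamilies, TypeOperators #-}
import Data.Kind (Type)
import GHC.Generics (K1 (..), R, U1 (..), (:*:) (..), (:+:) (..))

data Bit = O | I
  deriving (Eq, Show)

data UserTree a = Node a (UserTree a) (UserTree a) | Leaf

class Serialize a where
  put :: a -> [Bit]

class GSerialize f where
  gput :: f p -> [Bit]

instance GSerialize U1 where
  gput U1 = []

instance (GSerialize f, GSerialize g) => GSerialize (f :*: g) where
  gput (x :*: y) = gput x ++ gput y

instance (GSerialize f, GSerialize g) => GSerialize (f :+: g) where
  gput (L1 x) = O : gput x
  gput (R1 x) = I : gput x

instance Serialize c => GSerialize (K1 i c) where
  gput (K1 x) = put x

-- Local stand-in for the representation class described in the text.
class Representable0 a where
  type Rep0 a :: Type -> Type
  from0 :: a -> Rep0 a x
  to0   :: Rep0 a x -> a

-- Leaf on the left of the sum, Node on the right, each argument in K1.
type RepUserTree a =
  U1 :+: (K1 R a :*: K1 R (UserTree a) :*: K1 R (UserTree a))

instance Representable0 (UserTree a) where
  type Rep0 (UserTree a) = RepUserTree a
  from0 Leaf         = L1 U1
  from0 (Node a l r) = R1 (K1 a :*: K1 l :*: K1 r)
  to0 (L1 U1)                       = Leaf
  to0 (R1 (K1 a :*: K1 l :*: K1 r)) = Node a l r

putDefault :: (Representable0 a, GSerialize (Rep0 a)) => a -> [Bit]
putDefault a = gput (from0 a)

-- Illustrative base instance (an assumption, not from the original).
instance Serialize Bool where
  put b = [if b then I else O]

instance Serialize a => Serialize (UserTree a) where
  put = putDefault
```

With these definitions, put (Leaf :: UserTree Bool) is [O], and put (Node True Leaf Leaf) is [I, I, O, O]: one bit for the Node alternative, one for the Bool, and one for each Leaf.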
1.2 Using GHC's new features
What we have seen so far could all already be done, at the cost of writing a lot of boilerplate code yourself (or spending hours writing Template Haskell code to do it for you). Now we'll see how the new features of GHC can help you.
1.2.1 Deriving representations
GHC can derive the Representable0 instance for you:

    {-# LANGUAGE DeriveRepresentable #-}

    data UserTree a = Node a (UserTree a) (UserTree a) | Leaf
      deriving Representable0

(Standalone deriving also works fine, and you can use it for types you have not defined yourself, but are imported from somewhere else.) You will need the new DeriveRepresentable language pragma.
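As a sketch of the standalone form, the following uses the names of released GHC versions, where the class is Generic and the extension is DeriveGeneric (this page uses the earlier names Representable0 and DeriveRepresentable). Pair is a hypothetical type standing in for one imported from another module; roundTrip is likewise only for illustration.

```haskell
{-# LANGUAGE DeriveGeneric, StandaloneDeriving #-}
import GHC.Generics (Generic, from, to)

-- Stand-in for a type defined elsewhere; its constructors must be in scope.
data Pair a = Pair a a
  deriving (Eq, Show)

-- The instance is derived separately from the type declaration.
deriving instance Generic (Pair a)

-- Converting to the representation and back gives the original value.
roundTrip :: Pair a -> Pair a
roundTrip = to . from
```

For a type from another module, the deriving instance line is the only part you would write yourself.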
1.2.2 More general default methods
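We don't want the user to have to write the body of the Serialize instance by hand, since most of the time it will just be putDefault. However, putDefault needs the constraints (Representable0 a, GSerialize (Rep0 a)), which the signature of put in the class does not provide, and adding them to the class itself would rule out hand-written instances for types that are not representable. We solved this by allowing the user to give a different signature for default methods:

    {-# LANGUAGE DefaultSignatures #-}

    class Serialize a where
      put :: a -> [Bit]

      default put :: (Representable0 a, GSerialize (Rep0 a)) => a -> [Bit]
      put a = gput (from0 a)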
Now the user can simply write:
instance (Serialize a) => Serialize (UserTree a)
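Putting the pieces together, here is a sketch of the whole example as it can be written with released versions of GHC. It assumes the released names Generic, Rep, from and DeriveGeneric in place of Representable0, Rep0, from0 and DeriveRepresentable used on this page. The Bool instance is an assumption, added so that the example has a base type to serialize.

```haskell
{-# LANGUAGE DefaultSignatures, DeriveGeneric, FlexibleContexts, TypeOperators #-}
import GHC.Generics

data Bit = O | I
  deriving (Eq, Show)

-- The generic function, defined once per representation type.
class GSerialize f where
  gput :: f p -> [Bit]

instance GSerialize U1 where
  gput U1 = []

instance (GSerialize f, GSerialize g) => GSerialize (f :*: g) where
  gput (x :*: y) = gput x ++ gput y

instance (GSerialize f, GSerialize g) => GSerialize (f :+: g) where
  gput (L1 x) = O : gput x
  gput (R1 x) = I : gput x

instance GSerialize f => GSerialize (M1 i c f) where
  gput (M1 x) = gput x

instance Serialize c => GSerialize (K1 i c) where
  gput (K1 x) = put x

-- The user-facing class, with a generic default for put.
class Serialize a where
  put :: a -> [Bit]

  default put :: (Generic a, GSerialize (Rep a)) => a -> [Bit]
  put a = gput (from a)

-- Illustrative base instance (an assumption, not from the original).
instance Serialize Bool where
  put b = [if b then I else O]

data UserTree a = Node a (UserTree a) (UserTree a) | Leaf
  deriving Generic

-- The empty instance picks up the generic default.
instance Serialize a => Serialize (UserTree a)
```

The derived representation lists constructors in declaration order, so here Node is the left alternative and Leaf the right one: put (Leaf :: UserTree Bool) is [I], and put (Node True Leaf Leaf) is [O, I, I, I].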
2 Different perspectives
We outline the changes introduced in 7.2 regarding support for generic programming from the perspective of three different types of users: the end-user, the generic programmer, and the GHC hacker.
2.1 The end-user
If you know nothing about generic programming and would like to keep it that way, then you will be pleased to know that using generics in GHC 7.2 is easier than ever. As soon as you encounter a class with a default signature (like Serialize above), you will be able to give empty instances for your datatypes, like this:
instance (Serialize a) => Serialize (UserTree a)
2.2 The generic programmer
If you are a library author and are eager to make your classes easy to instantiate by your users, then you should invest some time in defining instances for each of the representation types of GHC.Generics and defining a generic default method. See the example for Serialize above, and the original paper for many other examples (but make sure to check the changes from the paper).
2.3 The GHC hacker
3 Changes from the paper
4 Limitations
To be written.
