Changes to Data.Typeable

Gábor Lehel illissius at gmail.com
Fri Jul 8 18:36:45 CEST 2011


2011/7/7 Simon Marlow <marlowsd at gmail.com>:
> On 07/07/11 17:14, Gábor Lehel wrote:
>>
>> On Thu, Jul 7, 2011 at 5:44 PM, Simon Marlow<marlowsd at gmail.com>  wrote:
>>>
>>> Hi folks,
>>>
>>> In response to this ticket:
>>>
>>>  http://hackage.haskell.org/trac/ghc/ticket/5275
>>>
>>> I'm making some changes to Data.Typeable, some of which affect the API,
>>> so as per the new library guidelines I'm informing the list.
>>>
>>> The current implementation of Typeable is based on
>>>
>>>  mkTyCon :: String -> TyCon
>>>
>>> which internally keeps a table mapping Strings to Ints, so that each
>>> TyCon can be given a unique Int for fast comparison.  This means the
>>> String has to be unique across all types in the program.  Currently
>>> derived instances of typeable use the qualified original name (e.g.
>>> "GHC.Types.Int") which is not necessarily unique, is non-portable, and
>>> exposes implementation details.
>>>
>>> The String passed to mkTyCon is returned by
>>>
>>>  tyConString :: TyCon -> String
>>>
>>> which lets the user get at this non-portable representation (also the
>>> Show instance returns this String).
>>>
>>> So the new proposal is to store three Strings in TyCon.  The internal
>>> representation is this:
>>>
>>> data TyCon = TyCon {
>>>   tyConHash    :: {-# UNPACK #-} !Fingerprint,
>>>   tyConPackage :: String,
>>>   tyConModule  :: String,
>>>   tyConName    :: String
>>>  }
>>>
>>> the fields of this type are not exposed externally.  Together the three
>>> fields tyConPackage, tyConModule and tyConName uniquely identify a TyCon,
>>> and the Fingerprint is a hash of the concatenation of these three Strings
>>> (so no more internal cache to map strings to unique Ids). tyConString now
>>> returns the value of tyConName only.
>>>
>>> I've measured the performance impact of this change, and as far as I can
>>> tell performance is uniformly better.  This should improve things for SYB
>>> in particular.  Also, the size of the code generated for deriving
>>> Typeable is less than half as much as before.
>>>
>>> === Proposed API changes ===
>>>
>>> 1. DEPRECATE mkTyCon
>>>
>>>   mkTyCon is used by some hand-written instances of Typeable.  It
>>>   will work as before, but is deprecated in favour of...
>>>
>>> 2. Add
>>>
>>>   mkTyCon3 :: String -> String -> String -> TyCon
>>>
>>>   which takes the package, module, and name of the TyCon respectively.
>>>   Most users can just derive Typeable, there's no need to use mkTyCon3.
>>>
>>> In due course we can rename mkTyCon3 back to mkTyCon.
>>>
>>> Any comments?
>>>
>>> Cheers,
>>>        Simon
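
For anyone following along, the representation described above can be
modelled roughly as below. This is only a sketch under my own assumptions:
the types are local stand-ins rather than the real Data.Typeable ones,
fingerprintString from GHC.Fingerprint stands in for whatever hash the
actual implementation uses, and the space separator and the Eq instance
that compares hashes only are my guesses, not something stated in the
proposal.

```haskell
import GHC.Fingerprint (Fingerprint, fingerprintString)

-- Local model of the proposed TyCon; NOT the real Data.Typeable type.
data TyCon = TyCon
  { tyConHash    :: {-# UNPACK #-} !Fingerprint
  , tyConPackage :: String
  , tyConModule  :: String
  , tyConName    :: String
  }

-- Assumption: equality only needs to look at the fingerprint.
instance Eq TyCon where
  a == b = tyConHash a == tyConHash b

-- Takes package, module and name, in that order, and hashes their
-- concatenation (the separator is an assumption of this sketch).
mkTyCon3 :: String -> String -> String -> TyCon
mkTyCon3 pkg modl name =
  TyCon (fingerprintString (pkg ++ " " ++ modl ++ " " ++ name)) pkg modl name

-- As proposed, tyConString returns the unqualified name only.
tyConString :: TyCon -> String
tyConString = tyConName
```

With this model, two types called T in different packages get different
fingerprints even though tyConString gives "T" for both.
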
>>
>> Would this also mean typeRepKey could be taken out of the IO monad?
>> That would be nice.
>
> Ah yes, I forgot to mention the changes to typeRepKey.  So currently we have
>
>  typeRepKey :: TypeRep -> IO Int
>
> this API is difficult to support in the new library, I'd have to reintroduce
> the cache, and it wouldn't be very efficient.  I plan to change it to this:
>
>  data TypeRepKey -- abstract, instance of Eq, Ord
>  typeRepKey :: TypeRep -> IO TypeRepKey
>
> where TypeRepKey is a newtype of the internal Fingerprint.  Now, we could
> take typeRepKey out of IO, but the Ord instance of TypeRepKey is
> implementation-defined (it provides some total order, but we don't tell you
> what it is).  So arguably we should keep the IO.  What do people think?

Would the order be allowed to vary from run to run of the program
(which is why it's in IO now)? Could it be specified as
implementation-defined but non-varying? If so, I would favor that
option along with taking it out of IO. (Plenty of things are
implementation-defined, like the size of an Int.)
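
To make the non-IO option concrete, here is a rough sketch of what I mean.
TypeRepKey below is a hypothetical stand-in of my own: the real one would
wrap the internal Fingerprint, whereas this one just wraps the shown type
name (which is not guaranteed to be unique), and typeRepKeyPure, handlers
and describe are made-up names for illustration.

```haskell
import Data.Typeable (TypeRep, Typeable, typeOf)
import qualified Data.Map as Map

-- Hypothetical stand-in for the proposed abstract key type. Only Eq and
-- Ord are available, and the order is implementation-defined.
newtype TypeRepKey = TypeRepKey String
  deriving (Eq, Ord)

-- The pure variant under discussion: no IO in the result type.
typeRepKeyPure :: TypeRep -> TypeRepKey
typeRepKeyPure = TypeRepKey . show

-- A pure key can index an ordinary Map without any monadic plumbing.
handlers :: Map.Map TypeRepKey String
handlers = Map.fromList
  [ (typeRepKeyPure (typeOf (0 :: Int)), "an Int")
  , (typeRepKeyPure (typeOf True),       "a Bool")
  ]

-- Look up a description for the type of the argument.
describe :: Typeable a => a -> String
describe x =
  Map.findWithDefault "something else" (typeRepKeyPure (typeOf x)) handlers
```

Nothing here depends on which total order the keys have, only on it
staying fixed while the Map is in use.
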

That said, the use case I had in mind was using Template Haskell to
construct a case expression over the literal Int values of the keys as
determined at compile time (hopefully compiling down to something like
a C switch statement), and I'm not sure that will work if the keys are
no longer Ints. (Not compiling down to a switch statement is one
thing, but I'm not sure the code could even be written. Maybe it would
need a Lift instance?) Either way, I don't think it would hurt to take
it out of IO given the opportunity.
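
Written out by hand, the difference I am worried about looks something
like the following. Everything here is invented for illustration: the key
values 17 and 42, the Key newtype, and the function names are all
hypothetical, and no Template Haskell is involved in the sketch itself.

```haskell
-- With Int keys: the kind of code a splice could generate, matching on
-- literal key values (the numbers are made up).
dispatchInt :: Int -> String
dispatchInt key = case key of
  17 -> "Int"
  42 -> "Bool"
  _  -> "unknown"

-- Hypothetical abstract key: callers only get Eq and Ord, so literal
-- patterns are no longer an option.
newtype Key = Key Int
  deriving (Eq, Ord)

intKey, boolKey :: Key
intKey  = Key 17
boolKey = Key 42

-- With an abstract key the dispatch has to fall back to equality tests.
dispatchKey :: Key -> String
dispatchKey k
  | k == intKey  = "Int"
  | k == boolKey = "Bool"
  | otherwise    = "unknown"
```

The second form still works, but it is a chain of comparisons rather than
a match on literals.
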

>
> Obviously this is not a backwards compatible change either way.
>
> Cheers,
>        Simon
>



-- 
Work is punishment for failing to procrastinate effectively.


