Splitting SYB from the base package in GHC 6.10

Claus Reinke claus.reinke at talk21.com
Tue Sep 2 16:46:09 EDT 2008


> Indeed not changing the instances is definitely still an option. Maybe it's
> best, for now, to focus on choosing a splitting solution which allows for
> all possible flexibility in the future development of SYB. After 6.10 is
> out, we then have all the proposed changes to SYB (including the dubious
> instances issue) analyzed and discussed before making the changes.
> 
> I find Simon PJ's argument for keeping Data in base ("so that others can
> build on it without depending on the full glory of SYB") rather strong, and
> I guess any "interesting" change to the Data class would require a change in
> the internal deriving mechanism. I'm not saying that Data is perfect and
> should not be changed, but maybe it can be in base without fundamentally
> diminishing the opportunities for development in SYB. Do you agree with me,
> Claus?

I find it curious to see you ask that question, as I've spent quite a bit
of time answering it, and providing information needed to prepare a
workable plan. Strangely, I had assumed you'd actually use that material
to help you understand the issues involved in any decision made now.
But my questions haven't been answered, and now you simply repeat
your question without reference to my previous answers and suggestions
(have you tried playing with 'syb-utils', to see how trying to fix the
instance issues in a separate package on top of 'base' would work, or not?).

Instead of repeating myself in full, I'd prefer that you refer to my previous
emails (ask if they were unclear), but here is the gist of it (with no claim 
of completeness):

0 if 'Data' stays in 'base', and the types are in 'base' there is no good
    reason for the instances to be elsewhere, right?

    (personally, I still like Ashley's suggestion of putting 'Data', 'Typeable'
    and 'Dynamic' in a separate package)

1 'base' traditionally cannot be updated between ghc releases. I asked
    whether that had changed, because otherwise leaving anything in 'base'
    will keep you from changing it. So if the technical limitations fixing 'base'
    are not gone, your adventure ends right there.

2 the "standard"/"dubious" separation of instances was entirely preliminary;
    following the recent discussion, I would suggest splitting the instances into
    three groups, in three separate modules:

    [standard]: fully implemented 'Data' instances (no runtime errors).
                     one should probably reclassify 'Ratio a' in here.

    [partial]: partially implemented instances (usually for abstract types, 
                which 'Data' doesn't handle well; whether that can be mended
                without changing the class remains to be seen); these include 
                'Array a b', 'ThreadId', etc (previously in 'Standard') and the 
                pointer types (previously in 'Dubious'); if these instances can be 
                completed, existing clients will simply work better (fewer runtime 
                crashes)

    [misfits]: 'IO a', 'b -> a', and other types that enclose a substructure
                type in a contravariant context ("realworld", "b", "state", etc); 
                not only are the existing 'Data' instances incomplete, they skip 
                substructures on both transformations and queries, and the 
                type of query operators in 'Data' simply does not permit a 
                complete implementation of substructure queries for these 
                types (I think), so these instances are fairly hopeless at the 
                moment, and should only be available if explicitly requested.

3 if the implicit re-export of misfit instances from 'Data.Generics*' isn't
    stopped now, any attempt to fix 'syb' is doomed (until the next ghc 
    release). 

    unless anyone wants to claim that it is technically possible to implement 
    complete 'Data' instances for these types (in which case I'm all ears;-), 
    the existing incomplete instances should not be available by default 
    (requiring a specific import instead).

    I'd prefer the other partial instances also to be under explicit import 
    only, so that no one can run into runtime crashes without having been
    warned (just because they use 'gunfold' instead of 'gfoldl', say, and
    happen to work with arrays instead of lists), but it seems I'm in the 
    minority on this point.

4 even after providing selective import of non-standard instances, the
    existing importers of 'Data.Generics*' in core and extra libs need to
    be cleaned up now, so that they only import (and thus re-export) the
    minimum set of instances they depend on (in particular, no misfits);
    again, failure to do that now will doom any attempt to fix the 'syb' 
    instances later.

    If the misfit instances are moved out of 'Data.Generics.Instances'
    into a module not imported anywhere else, while the other partial
    instances remain, no changes might be needed to the existing 
    'Data.Generics*' importers, but that needs to be checked.
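
To illustrate the [misfits] group from 2, here is a minimal sketch. It is
not code from 'base' or 'syb': the 'Callback' type, its hand-written
instance and the 'collectInts' query are hypothetical, and a GHC recent
enough to ship 'Data.Data' is assumed. The instance only mimics the style
of the existing function instances by exposing no children at all.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}

import Data.Data (Data (..), cast, mkNoRepType)

-- Hypothetical type that hides everything behind a function.
newtype Callback = Callback (Int -> Int)

-- Hand-written instance in the style of the existing function instances
-- (an assumption made for this sketch): no children, no constructors.
instance Data Callback where
  gfoldl _ z c = z c
  toConstr _   = error "toConstr: Callback is opaque"
  gunfold _ _  = error "gunfold: Callback is opaque"
  dataTypeOf _ = mkNoRepType "Callback"

-- Two hypothetical containers, both with derived instances.
data Plain  = Plain Int Int       deriving Data
data Hidden = Hidden Int Callback deriving Data

-- Hypothetical query: collect every Int reachable through 'gmapQ'.
collectInts :: Data a => a -> [Int]
collectInts x = maybe [] (: []) (cast x) ++ concat (gmapQ collectInts x)
```

With these definitions, 'collectInts (Plain 1 2)' gives [1,2], while
'collectInts (Hidden 1 (Callback (+ 41)))' gives only [1]: the query
silently stops at the function, and nothing tells the caller that part of
the value was skipped.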

I have no idea whether these would give you sufficient flexibility to
work on 'syb' after the ghc release, but I'm pretty sure that they are
part of the necessary minimum of issues that need to be addressed 
now. 
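
The runtime-crash concern in 3 can be sketched as follows. This assumes a
current GHC with the 'array' package installed, and it assumes that the
'Array' instance still leaves 'toConstr' and 'gunfold' unimplemented, as it
does in the 'base' versions I know of. The helpers 'childCount' and
'constrName' are hypothetical names introduced for this example only.

```haskell
import Control.Exception (ErrorCall, evaluate, try)
import Data.Array (Array, listArray)
import Data.Data (Data, gmapQ, showConstr, toConstr)

-- Hypothetical helper: count immediate children, using traversal only.
childCount :: Data a => a -> Int
childCount x = length (gmapQ (const ()) x)

-- Hypothetical helper: name of the outermost constructor, or the error
-- raised by an instance that leaves 'toConstr' unimplemented.
constrName :: Data a => a -> IO (Either String String)
constrName x = do
  r <- try (evaluate (toConstr x))
  pure (either (\e -> Left (show (e :: ErrorCall))) (Right . showConstr) r)

sampleList :: String
sampleList = "ab"

sampleArray :: Array Int Char
sampleArray = listArray (0, 1) "ab"
```

Traversal works on both values ('childCount' returns a number for the list
and for the array), and 'constrName sampleList' yields Right "(:)". Under
the assumption above, 'constrName sampleArray' yields a Left carrying the
instance's error message, which is the kind of crash a client only
discovers at runtime.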

Even after working around #2182, you can observe the issues when 
using my 'syb-utils' package, eg, the 'Data.Generics.GPS' module 
uses 'Data.IntMap', so it accidentally re-exports all the old instances, 
even though it carefully avoids importing them itself. If it wasn't for 
#2356, that would never work (and if you happen to import 
'Data.Generics.GPS' before importing 'Data.Generics.Instances.*',
instead, it still might not, at least while building the package).

Claus

PS I'll update my 'syb-utils' package to reflect the new instance
    split, as given in 2 above.


