Data/Typeable/Uniplate instances for GHC types
claus.reinke at talk21.com
Sat Jul 19 09:21:56 EDT 2008
>> While you ignore the explicit parameter, in practice there's an
>> implicit dictionary parameter that is actually being used in res
>> (Data/Typeable). At first sight, that might sound okay -we want a caf per
>> type after all- but what happens is that we get *a caf per dictionary
>> parameter* instead.
> My tests show that is wrong. Inserting a trace around the type map
> creation shows it is only called twice, with Company/Salary and
> Salary/Salary. Profiling shows much the same result.
grmpf, back to the drawing board.. or, perhaps not. Note what you
said: type maps are only created for two of the types being traversed!
I finally found a difference between what I was trying to do and what
you are doing:
- trying to memoize one IntSet of TypeRepKeys per type (listing
all substructure types of that type) tends to fail, though it does
sometimes work, for the reasons given above
- trying to memoize one Map TypeRepKey TypeRepKeys per type
(listing all substructure types of that type, and all substructure types
for each substructure type) has the same issues
- but, combining the second approach *with passing the Map* as
a (hidden) parameter to the traversal scheme isolates us from
the issues, to some extent (and this seems to happen in PlateData,
- all subcalls in the traversal, instead of relying on their own
memoized Map, reuse the one from the top traversal type,
which contains the info they need by virtue of their working
on a substructure type of the top traversal type
- there is still duplication if one actually calls the traversal
scheme several times (try calling the instrumented Uniplate
bill several times, on an Employee) and the cafs aren't
commoned up between the calls, which isn't nice, but
not nearly as bad as duplicating the computation of
sub-structure info within each traversal:
*Main> uni_bill laemmel
*Main> uni_bill laemmel
*Main> (uni_bill laemmel,uni_bill laemmel)
I might still be on the wrong track, explanation-wise, but if I
apply this to the SYB traversal, I get the expected speedup.
So I'll give my head some rest from this sometimes-it-works
Now I need to clean up the mess I made of the source while
experimenting, and see how to extract this adaptation of the
Uniplate trick for general SYB use.
Btw, the fast SYB version now only creates one type map,
for the top type Company - why does Uniplate create two?
>> Try this variation of the Paradise benchmark data:
>> genCom :: Company
>> genCom = C $ take 100000 $ cycle
>> [D "Research" laemmel [PU joost, PU marlow],
>> D "Strategy" blair ]
> I did, and Uniplate underperforms. I'm not entirely sure why this is
> yet, but my thoughts are kind of drifting towards a space leak, but I
> do need to track it down. It seems as you increase the size of the
> expressions the PlateData technique performs worse at queries.
>> Am I on the right track here?
> My guess is no - I think in Uniplate the CAF inside an instance trick
> works. But you have certainly found _something_ ...
It may be that there are two parts to the caf-in-instance trick,
making cafs and passing cafs down the traversal, but anyway
Uniplate does manage to get by with only two type maps.
Something else, then - sorry to hear that.
Thanks again for a useful new technique in generic traversals,
More information about the Cvs-ghc