Data/Typeable/Uniplate instances for GHC types
ndmitchell at gmail.com
Wed Jul 16 07:01:31 EDT 2008
> You can implement Uniplate on top of Data/Typeable. (It's slightly slower than giving Uniplate instances directly, but if my memory serves, not a lot. See Neil's theis.) So you can think of Uniplate/Biplate as a rather nice API for generic programming, but one that shares a common set of instances underneath. I'll all for having Data/Typeable instances for the main GHC types, and a Uniplate API available on top.
The thesis suggests that using Uniplate with the fastest instances
will be around 30% slower than manual operations, but using SYB based
instances will be 300% slower. In practice, these are benchmarks on
generic traversals, and may not accurately reflect how people using
the libraries in practice - the bottlenecks may be elsewhere.
I recently talked to someone who had ported a program from using no
boilerplate removal code, to using Uniplate. They reported a
substantial reduction in the amount of code by using Uniplate, and
gave the following performance results:
No boilerplate removal code 1:30min
Uniplate with PlateDirect (fastest method) 1:45min
Uniplate with PlateData (based on SYB) 5min
And I suspect directly using SYB would have been somewhere in the
I have also previously talked with Ben Lippmeier about using SYB in
DCC (http://www.haskell.org/haskellwiki/DDC) and was told the
performance loss was "too great" and he had to switch back. However,
these are merely anecdotes.
For some applications having proper Uniplate instances changes the
feasibility of the task. However, this can be done afterwards if there
is a need, with minimal changes to the interface. I would suggest for
the moment that most GHC API objects derive Data/Typeable, and if
experience shows this to be too slow (which I suspect it might), then
explicit Uniplate instances could be given.
> Do readers have any arguments for or against including them inside GHC? I like
> Uniplate very much and I can imagine that there a probably a couple of places
> where it could be used to clean up some code.
My view is that anyone manipulating an abstract syntax tree inside a
compiler without using some form of generic programming is doing it
wrong. Uniplate (or SYB, or virtually any generic programming
technique) can make 30 line functions that obscure the real intent
into a simple and explicit 3 liner. There are good reasons that GHC
doesn't use generic programming (GHC predates the techniques,
portability requirements etc) but I think the benefits are too great
More information about the Cvs-ghc