Proposal: Performance improvements for Data.IntMap

Simon Marlow marlowsd at gmail.com
Thu Sep 23 06:40:47 EDT 2010


On 03/09/2010 17:17, Johan Tibell wrote:
> On Fri, Sep 3, 2010 at 6:11 PM, Ian Lynagh <igloo at earth.li
> <mailto:igloo at earth.li>> wrote:
>
>     On Tue, Aug 31, 2010 at 03:09:34AM -0700, Donald Bruce Stewart wrote:
>      >
>      > #4279: Proposal: Performance improvements for Data.IntMap
>      > http://hackage.haskell.org/trac/ghc/ticket/4279
>
>     Same .Internals comment as for Data.Map in #4277.
>
>
> I would go with .Internal (without the plural) as it seems to be the
> more widely used convention (e.g. in network, bytestring, text, etc).

For those who didn't follow this discussion on cvs-ghc: since the 
Data.Map/IntMap patches went into the repo (for testing and 
experimentation), we noticed some code blowup due to the extra INLINEs. 
  e.g. one module in GHC that makes heavy use of Data.Map tripled in 
size from 150k to 450k:

http://www.haskell.org/pipermail/cvs-ghc/2010-September/055944.html

we came up with a possible solution: a pragma that exports an inlining 
but doesn't force the unfolding to take place in all client code. 
Basically it defers the decision about whether to inline to the call 
site, which is the right thing: GHC's inlining heuristics are quite 
well-tuned to balance performance benefit against code blowup, and the 
client can always turn up the knob with -funfolding-use-threshold100 if 
they wish.  So in GHC 7.0 you can say

  {-# INLINABLE f #-}

to get the new behaviour.  Of course this isn't backwards-compatible: 
the pragma will be ignored by older GHCs, so you might want to use some 
CPP trickery.

So what to do for GHC 7.0.  The options are:

  - make the INLINABLE changes, re-do the measurements to make sure
    performance hasn't got worse, and push the patches.

  - roll back the first set of patches.

I have no strong opinions, but we have to do something soon: the GHC 7.0 
RC is due out tomorrow.  I be happy to make the INLINABLE changes myself 
and test that the code blowup problem is fixed, but I can't do the 
performance measurements easily.  The same applies to Data.Set/IntSet of 
course.

Cheers,
	Simon


More information about the Libraries mailing list