Why do we put GMP allocations on GHC heaps?

Wed Oct 23 08:57:55 UTC 2013

Gergely

Edward has it right.  Functional programs allocate a lot of intermediate stuff.  ((a+b)*c-d) allocates two intermediate Integers and discards them pretty soon afterwards.  GHC's code generator and garbage collector are good at both allocation and gc of young dead objects.
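To make the two intermediates concrete, here is a minimal sketch of the same expression with its temporaries named. The names expr, t1 and t2 are made up for illustration, and an optimising compiler is free to arrange the evaluation differently.

```haskell
-- Hypothetical helper: ((a+b)*c-d) with its temporaries spelled out.
expr :: Integer -> Integer -> Integer -> Integer -> Integer
expr a b c d =
  let t1 = a + b      -- first intermediate Integer
      t2 = t1 * c     -- second intermediate Integer
  in  t2 - d          -- the result; t1 and t2 are dead once it is computed

main :: IO ()
main = print (expr 1 2 3 4)
```

Both t1 and t2 are short-lived heap objects, which is exactly the case a generational collector handles cheaply.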

Using malloc/free would require a finaliser-style interface, which is significantly less efficient.  That might matter a lot for an arithmetic-intensive program, and not at all for one where Integer arithmetic was incidental.

However, you could perfectly well imagine a third package (alongside integer-gmp and integer-simple), let's call it integer-gmp-malloc.  This would use the malloc/free interface, with finalisers to free space when no longer referenced from the GHC heap. It might be faster than integer-simple, and more inter-operable than integer-gmp.
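As a rough sketch of what the finaliser-style interface could look like, the following uses only the FFI modules from base. MallocNat, fromLimbs and toLimbs are hypothetical names, not part of any existing package, and no real bignum arithmetic is shown: the point is only that every value needs its own malloc plus a registered finaliser.

```haskell
import Data.Word (Word64)
import Foreign.ForeignPtr (ForeignPtr, newForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (finalizerFree, mallocBytes)
import Foreign.Marshal.Array (peekArray, pokeArray)

-- Hypothetical type: limb count plus limbs that live outside the GHC heap.
data MallocNat = MallocNat !Int !(ForeignPtr Word64)

-- Allocate the limbs with C malloc and attach free() as a finaliser,
-- so the buffer is released once the GHC heap no longer references it.
fromLimbs :: [Word64] -> IO MallocNat
fromLimbs ws = do
  let n = length ws
  p <- mallocBytes (max 1 n * 8)       -- 8 bytes per Word64 limb
  pokeArray p ws
  fp <- newForeignPtr finalizerFree p
  return (MallocNat n fp)

-- Read the limbs back; withForeignPtr keeps the buffer alive meanwhile.
toLimbs :: MallocNat -> IO [Word64]
toLimbs (MallocNat n fp) = withForeignPtr fp (peekArray n)

main :: IO ()
main = fromLimbs [1, 2, 3] >>= toLimbs >>= print
```

Every intermediate result would go through fromLimbs-style allocation and finaliser registration, which is where the extra cost relative to plain GHC heap allocation comes from.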

Also, freeing us from the LGPL constraints of GMP, while offering better performance than integer-simple, would be great.

See also http://ghc.haskell.org/trac/ghc/wiki/ReplacingGMPNotes

Please do record the information or insights you get on a GHC wiki page!

Simon

| -----Original Message-----
| From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of
| Gergely Risko
| Sent: 22 October 2013 22:45
| To: ghc-devs at haskell.org
| Subject: Why do we put GMP allocations on GHC heaps?
| 
| Dear GHC gurus,
| 
| I've been looking into how GHC uses GMP (with the hidden agenda of
| taking on the work of replacing it with something that is BSD license
| compatible and so can be linked in and shipped statically by default).
| 
| I think I more or less understand how GMP memory is managed and how the
| GC works together with it.  Actually understanding the "how" was not
| that hard, everything is quite clearly commented.
| 
| What I couldn't find any reference for is the "why".  Why does GHC do
| this?  Would it be possible to just not do this and let GMP malloc and
| free for itself (obviously we would have to call mpz_clear here and
| there, but that seems doable)?
| 
| I have multiple ideas, but don't know which one (if any) is correct:
|   - performance reasons,
|   - memory defragmentation reasons,
|   - GHC GC moves around stuff and that would somehow break,
|   - the previous one + threads + race conditions.
| 
| Second question: let's assume for a moment that we don't have any
| licensing issues with a GMP-like library, so it can be linked
| statically into GHC, but we decide to go with the same allocation
| methodology as with GMP.  Am I right in thinking that linking
| statically solves the technical issues too?
| 
| More concretely: openssl BN uses the openssl_malloc function, which can
| only be overridden openssl-wide.  But if we link statically, then this
| override won't affect external openssl library bindings, because the
| openssl symbols in our object files won't even be exported, right?
| 
| Third question: is replacing GMP with openssl BN an acceptable path to
| you guys?  I saw that 6 years ago there was work on getting
| integer-simple good enough performance-wise to be the default and then
| adding other bignum libraries externally, but none of this happened.
| I see how sexy a project it is, computer-science-wise, to write a
| bignum library in Haskell, but will we really do it?
| 
| Thanks,
| Gergely
| (errge)