Replacement for GMP: Update
naur at post11.tele.dk
Thu Aug 24 14:39:00 EDT 2006
Sorry for the late reply. From your latest communication which seems to be
Date: Sat, 12 Aug 2006 21:12:05 -0400
From: Peter Tanski <p.tanski at gmail.com>
Subject: Re: OpenSSL License (was Replacement for GMP: Update)
To: John Goerzen <jgoerzen at complete.org>
I am a bit uncertain where the matter stands right now. Nevertheless, I wish
to thank you for your reply. Clearly, you have done a lot of work, not least
to investigate alternatives.
> I hope I have answered a few of your points; some of them have given
> me more work but that is all right, I guess. ...
And you most certainly have, much more detailed than I could have hoped for.
And hopefully, this additional work that you mention is not something that
you considered a waste of time.
I am also most happy to read that
> Before I actually bind any library with GHC, I will thoroughly test
> it as a stand-alone library ...
since "correctness" of Integer computation is by far its most important
property. (I would say that "performance" comes second, then what could be
termed "interoperability" as third.)
The support for Integer computation is a messy subject: On the one hand, even
children out of the first few years of school are capable of understanding
the basic functionality involved. On the other hand, providing efficient
support of this functionality, especially if required in a wide selection of
circumstances, requires a surprising amount of hard work and insight. I would
say, if only really large integers were routinely required in actual,
real-world computations, our CPUs would support these computations and there
would be no need to consider the question in the, admittedly, fairly limited
context of GHC. The situation seems to be that the functionality is perhaps
considered obviously available, but the reality is that it can only be
attained at significant cost (either real money or honest sweat).
I am truly unable to tell what I would consider the ideal situation. On the
one hand, I value greatly the freedom of choosing my circumstances without
restraints. And I appreciate the desire of noble people like the GHC
developers to be equally free of such restraints. This would point in the
direction of basing Integer computations on a fairly generic implementation.
Which insightful consideration would then force us to admit would probably
cost a factor of, say, 2 or more in performance in some cases.
On the other hand, the existence of libraries like GMP seems to offer an
economic path out of this jungle. However, as the discussion over the past
couple of weeks illustrates, the path may be unacceptably thorny.
Let me quote some additional statements from this debate:
On Thursday 10 August 2006 19:00, Simon Peyton-Jones wrote:
> OK so you are talking of taking a copy of the BN code, changing the
> function names, and statically linking its allocator to GHC's RTS.
> If someone wants to make a program that uses BN for some other purpose,
> they get the original BN, whose function names don't clash, and which
> uses malloc.
> Sounds ok to me.
Two problems seem to be lurking here: The first is that, although taking a
copy of some existing library and massaging it to become working under some
presently available circumstances may be fine, what is really needed is
something that keeps on working, even under changing and future
circumstances. The second is that if I wish to mix Haskell and non-Haskell
code that processes the same data, I may find myself in need of a way to
convert between whatever is required by this copy of some existing library
that has been created and the presently available, but potentially quite
different, version of that same library.
No offence is meant here: I know that I am simply re-iterating what I have
said earlier. However, I believe that this is sufficiently important to risk
being ridiculed for pointing out the obvious.
Then you ask
> ... If you have the time, would you
> be able to find any programs you might have that displayed poor
> Integer performance or possibly bottlenecks in the current system?
Let me say first that I have not experienced any really surprisingly poor
performance of the current system. To be sure, I have seen programs similar
to my own and not written in Haskell that performed better on comparable
tasks. But I have not been able to say specifically that this poorer
performance of my program was caused by some inadequacy in the Haskell (with
I better be specific here: What I do is integer factorization (splitting
integers into their prime factors). And, for example, I have implemented a
version of the ECM (Elliptic Curve Method) in Haskell. Having done that
(independently and mainly to learn what it was about), I eventually compared
it in performance to GMP-ECM which Paul Zimmermann wrote. And I was initially
disappointed to find that GMP-ECM was about 7 times faster than my ECM code.
Fortunately, various sessions of "hand-waving" brought this factor down to
about 1.5. So, whenever GMP-ECM used 2 minutes, my program would use 3
I must admit that I am neither satisfied nor done with this result. At the
outset, when I started using Haskell for these number theoretic computations,
I had hoped that the overhead of using Haskell, compared to the use of C with
GMP calls, would be insignificant, something like 10 or at worst 20 percent.
So this 50 percent of overhead seems a bit excessive. On the other hand,
there may be additional hands to wave that I haven't thought of.
And again: I am not done working with this, just holding an extended break.
Please indicate if you wish to see some detailed and hard results.
Some additional experience is this: To gain some potentially easier-to-compare
measurements between Haskell-with-GMP and C-with-GMP, I implemented the GMP
function mpz_powm (raising a number to a power modulo a third number) as a
primitive operation in GHC. And compared it to my own power-raising operation
that was implemented in Haskell.
Overall, the GMP implementation of the power operation was 2 times faster than
my Haskell implememntation. But, to make sure, using some techniques that my
code certainly didn't use.
An interesting thing happened, however, for very large numbers: The GMP power
operation started using a lot of memory, thrashing, and then, eventually,
with sufficiently large operands, crashed for lack of memory. Where my
Haskell-implemented version continued to provide usable results.
The explanation of this should undoubtably be found in the method (some
FFT-thing) used for the power operation by GMP for very large numbers: My
guess is that GMP when using this method issues a large number of
allocate/free requests. And, again I am guessing, assuming that each
allocate request causes the GHC runtime to allocate, whereas each free only
causes a "we'll handle this at the next GC", then such large computations
may very well display such behaviour.
An example where the "Very Good Thing" of using the GHC RTS memory management
could be questioned.
I will not bother you further, except for expressing my sincere respect for
the work that you are doing.
More information about the Glasgow-haskell-users