Replacement for GMP: Update

Peter Tanski p.tanski at gmail.com
Thu Aug 10 20:00:40 EDT 2006


Thorkil,

> I would like to mention a few things that I have not seen  
> discussed: Clearly,
> using an existing library unmodified would be preferable: New  
> developments,
> error corrections, documentation, wide exposure, all of these  
> things would be
> available.

Agreed.  Unfortunately for us it seems that, apart from GMP, none of  
the fully developed and fast libraries available have all the  
functionality Haskell requires: ARPREC (and its predecessor, MPFUN)  
lack all binary operators (AND, OR, XOR, left-shift, right-shift, bit- 
flip, word-flip); OpenSSL's BN library lacks AND, OR, XOR and the  
least common multiple operation (lcm), as do the libraries in Botan  
and Crypto++.  I did not mention this before, but they also lack  
conversion functions to and from floating point representations.  So  
it would not be possible to use these in GHC and still maintain the  
same functionality without writing those transformation functions  
either separately (slow) or in the libraries themselves (modification).

There are libraries that have the functionality we require: MIRACL  
(at http://indigo.ie/~mscott/) is free for educational use but  
requires a premium for commercial use; LibTomMath is essentially beta  
(version 0.38), slow (less than half as fast as OpenSSL, which is  
equivalent in speed to GMP) and from my experience when putting it  
together, a little hacked (after all, it is a beta version); MPI  
(from http://www.cs.dartmouth.edu/~sting/mpi/) is a good library and  
has all the math functions but lacks the binary operations and has  
not been updated since 2003; maybe it has not required updating (it  
was originally written in 1998).  I have read that MPI is not as fast  
as LibTomMath (I will do a real comparison if anyone is interested).   
I do know from timing tests I did run in the MPI distribution that  
MPI tends to operate better on small integers (128-256 bits).   
LibTomMath and MPI are both essentially public domain.  I have heard  
that LibTomMath is used as the Tcl bignum library; I do not know that  
library supports Perl's bignum (Math::BigInt) but I could look into it.

Bignum libraries are actually fairly widely available but few are  
portable across a wide variety of domains: Microsoft's .NET bignum  
library and Apple's vecLib BigNum library, for example.  (Apple's  
vecLib vBigNum is also limited since it offers big integers only at  
256, 512 and 1024 bits and it does not provide all the mathematical  
operations we require.)

You may have noticed from reading the discussions so far that an  
internal library *is* a possibility.

> I have looked briefly at the OpenSSL Bignum library and in the  
> areas of memory
> management, but also error handling, it seems clearly intertwined  
> to some
> extent with OpenSSL in ways which would appear to rule out the  
> direct use of
> this library for GHC Integers.

OpenSSL's BN library is primarily tuned to support cryptography,  
particularly the generation of very large primes for public key  
cryptosystems.  It is possible to separate the BN library out (I am  
having some success there already).  It is also possible to use the  
library separately from Haskell using ForeignPtr; essentially doing  
everything through Haskell's FFI.  I have honestly not benchmarked a  
FFI-ForeignPtr interface against the current internal-GMP  
implementation, partly because the overhead required to use  
ForeignPtr and the availability of garbage-collected memory for GMP  
indicate that an internal GHC Bignum library would clearly be  
faster.  Many people have given good arguments for using an FFI- 
interface to such a library (for one, GMP supports dll's and posix  
dynamic libraries, which would eliminate the licensing issue), so I  
think before I go on I will set up a benchmark to try the two out.   
The one thing I cannot do without taking GMP out of the GHC compiler  
is a side-to-side comparison of GMP-FFI and GMP-internal because GMP  
can use only one memory allocator at a time: either GHC's runtime  
system Garbage Collector or malloc.

> However, considering the advantages of using
> an existing library unchanged, we might consider another  
> possibility: Working
> with the OpenSSL people to modify their library to allow the sort of
> interfacing that is needed for its direct and efficient use in GHC.  
> While, of
> course, retaining its value as part of OpenSSL.

I haven't looked at that for reasons of time--perhaps this is my  
personal target and perhaps it is for the benefit of the next GHC  
release on 26 August: it may take a long time to work with OpenSSL to  
modify their library.  It might be worth looking into, just to cover  
all bases.  The problem for me is that I seem to be doing all the  
work, albeit rather slowly; I simply do not have the time to follow  
every possibility.

> (And way further back: Have we tried to discuss the LGPL licence of  
> GMP with
> the authors? I am not really into all these matters, sorry if this  
> doesn't
> make sense.)

That is actually a good idea; it has been suggested before for an  
individual developer.  I doubt very much the Free Software Foundation  
would want to give an exception for one of their flagship products  
but I admit, that kind of thinking is, well, cowardly.  As for  
whether I am the person to ask them, I could float the question but I  
could not represent the GHC Team.

> Failing that, I would suggest considering the development of the  
> modified
> library to a form that would allow independent use, apart from its  
> use in
> GHC. This would add valuable possibilities to your options when  
> choosing the
> precise mixture of Haskell and, perhaps, raw C code that best  
> balances your
> performance desires and needs for convenience.

Before I actually bind any library with GHC, I will thoroughly test  
it as a stand-alone library and I will develop some good benchmarks  
for it as an FFI-based library (using ForeignPtr).  The Creators  of  
the GHC, however, have given good arguments why it is Very Good to  
use GHC's runtime system Garbage Collector to handle the Integer- 
memory for bignum libraries: it is almost certainly faster, it allows  
the runtime system's Scheduler to operate on Integers without  
resorting to the creation or management of Operating System (OS)  
threads and in general the evaluation of Integers is more a "native"  
than a "foreign" operation in Haskell programs.

I hope I have answered a few of your points; some of them have given  
me more work but that is all right, I guess.  You did say that you  
frequently use Integers in Haskell.  If you have the time, would you  
be able to find any programs you might have that displayed poor  
Integer performance or possibly bottlenecks in the current system?   
It might be helpful for benchmarking the replacement--this is really  
supposed to be an "upgrade," after all.

Best regards,
Peter





More information about the Glasgow-haskell-users mailing list