From v-julsew@microsoft.com Tue May 1 12:55:08 2001 Date: Tue, 1 May 2001 04:55:08 -0700 From: Julian Seward (Intl Vendor) v-julsew@microsoft.com Subject: FW: Observation re Hugs GC problem on sparcs w/gcc -O2
Looking back through old mail. I mailed this to the Hugs maintainers about 2 years ago, but I think it is still relevant. It contains a suggestion about why GC doesn't always work right on Hugs on sparcs when type.c/static.c/whatever are compiled -O/-O2, and an easy fix. J | -----Original Message----- | From: Julian Seward (Intl Vendor) [mailto:v-julsew@microsoft.com] | Sent: Thursday, April 22, 1999 10:13 AM | To: 'hugs-bugs@cs.yale.edu' | Cc: 'D.Wakeling@exeter.ac.uk' | Subject: Observation re Hugs GC problem on sparcs w/gcc -O2 | | | | Mark | | I seem to gather that (normal) Hugs has GC problems on Sparc | with optimisation on. David W also reported this recently. | | I got the impression that you think this happens because the | register window mechanism magically obscures some registers | which could hold roots, at the point when GC occurs. | | I would like to offer a different explanation, plus an easy | fix. When I was working for the UFO project at Manchester, | we did a collector which took roots from the C stack. It too | had problems on sparcs, and the following observation saved | the day. I think it might equally apply to Hugs. | | In the code below, complexFnReturningAPair is obvious, | and ... compute_and_alloc ... refers to some code which | may cause a garbage collection. | | | Pair complexFnReturningAPair ( ... ) { .... } | | | Pair p; | | p = complexFnReturningAPair ( ... ); | for ( ... ) { | ... fst(p) ... snd(p) ... | ... compute_and_alloc ... | } | | So: p becomes a pair. We then enter a loop which refers to | fst(p) and/or snd(p), and in which GC could happen. After cppisation, | the code looks like | | p = complexFnReturningAPair ( ... ); | for ( ... ) { | ... heapTopFst[p] ... heapTopSnd[p] ... | ... compute_and_alloc ... | } | | Expanding the array address calculations gives | | p = complexFnReturningAPair ( ... ); | for ( ... ) { | ... heapTopFst + 4*p ... heapTopSnd + 4*p ... | ... compute_and_alloc ... | } | | Now the critical step, done by gcc when optimising: lift | 4*p out of the loop since it's loop-invariant, and | (incidentally) CSE the two 4*p's together: | | tmp = 4 * complexFnReturningAPair ( ... ); | for ( ... ) { | ... heapTopFst + tmp ... heapTopSnd + tmp ... | ... compute_and_alloc ... | } | | Blargh! Now, inside the loop, there is no mention of the | original p returned by complexFnReturningAPair, neither in | regs nor on the stack. So the pair won't be retained by GC | even though it's still live in the loop. | | The moral of the story is this: when using an | array-index-addressed heap, as Hugs does, we need to treat as | a root not only valid array indexes into the heap, but also | intermediates created by the array address calculations. That is: | | p sat -MAX_HEAP < p <= 0 | p*sizeof(int) sat -MAX_HEAP < p <= 0 | heapTopFst+ p*sizeof(int) sat -MAX_HEAP < p <= 0 | heapTopSnd+ p*sizeof(int) sat -MAX_HEAP < p <= 0 | | Looking in machdep.c:gcCStack(), I only see the first of | those four possibilities handled.
Perhaps the following | would be better: | | #define Blargh markWithoutMove(*ptr); \ | markWithoutMove((*ptr)/sizeof(Cell)); \ | markWithoutMove(( | (void*)(*ptr)-(void*)heapTopFst)/sizeof(Cell)); \ | markWithoutMove(( | (void*)(*ptr)-(void*)heapTopSnd)/sizeof(Cell)) | | #define StackGrowsDown { while (ptr<=CStackBase) { Blargh; | ptr++; }; } | #define StackGrowsUp { while (ptr>=CStackBase) { Blargh; | ptr--; }; } | #define GuessDirection if (ptr>CStackBase) StackGrowsUp else | StackGrowsDown | | I tried this on x86 and it made no difference at all, since | there isn't a problem for x86 anyway. I'd be interested to | hear if it makes any difference on Sparcs. (note -- the code | might not be exactly right). Check if you try it ... | | Finally, I think I know why the problem doesn't strike x86s. | That is because the x86 has addressing modes of the form | | (reg1 + k*reg2) for k as 1, 2, 4 or 8 | | So a reference to heapTopFst[p] becomes | | ; reg1 holds p | ; reg2 holds heapTopFst | movl (reg1 + 4*reg2), reg3 | | and there is never any point precomputing 4*p, since the | architecture can do it for free. | | The entire diatribe applies not only to sparcs but other | riscs which have lots of registers but only simple addressing | modes. Have you had problem reports for (eg) MIPS too? | | J |
From matth@mindspring.com Wed May 2 02:42:34 2001 Date: Tue, 01 May 2001 20:42:34 -0500 From: Matt Harden matth@mindspring.com Subject: User defined Ix instances potentially unsafe
Sorry if this has been reported before. I shouldn't be able to crash ghc or hugs without using any "unsafe" features, right? Well, here 'tis: > module CrashArray where > > import Array > import Ix > > newtype CrashIx = CrashIx Int deriving (Ord, Eq, Show) > > instance Enum CrashIx where > toEnum x = (CrashIx x) > fromEnum (CrashIx x) = x > > instance Ix CrashIx where > inRange (_,_) _ = True > index (_,_) (CrashIx x) = x > range (x,y) = [x..y] > > myArray = listArray (CrashIx 0, CrashIx 0) [0] > crash = myArray ! (CrashIx maxBound) In ghci-5.00, I get a segfault and hugs-feb-2000 says: INTERNAL ERROR: Error in graph Now, admittedly my Ix instance is broken, but I don't think I should be able to segfault the interpreter. Unfortunately, I think the only way to fix this without changing the Library Report would be to add another layer of range checking to the array implementation. Bleh. Note also that the (inefficient) implementation in the report wouldn't crash, but would get the "wrong" error: "Undefined array element" instead of "Index out of range". I think we might describe this as a bug in the Library Report, rather than in any particular Haskell implementation. Enjoy! Matt HardenFrom antony@apocalypse.org Thu May 3 00:09:57 2001 Date: Wed, 02 May 2001 19:09:57 -0400 From: Antony Courtney antony@apocalypse.org Subject: version skew between GreenCard and hugs
Hi, I am currently engaged in a frustrating battle to track down a stray pointer bug in some GreenCard'ed code under Win32. Extremely un-fun. On my system, I currently have installed: Hugs98Feb2001 (from sources) GreenCard 2.01 (from sources) I notice that, for some reason, both GreenCard and hugs include their own versions of "GreenCard.h". (in the "src/" directory for the Hugs distribution, and in "lib/hugs" in the GreenCard distribution). And it turns out that they are different (though it looks like the Hugs version is a strict extension of the GreenCard version). Questions: 1. Why is this source file duplicated? Presumably since the file lives in the "lib/hugs" directory of GreenCard, it is only useful with Hugs anyway. Perhaps GreenCard should just let Hugs own this file. 2. Are there any version compatibility issues to worry about here? I'm a little perplexed as to why the latest release of hugs has a more up-to-date "GreenCard.h" than GreenCard itself. Thanks, -antony -- Antony Courtney Grad. Student, Dept. of Computer Science, Yale University antony@apocalypse.org http://www.apocalypse.org/pub/u/antonyFrom matth@mindspring.com Thu May 3 03:39:13 2001 Date: Wed, 02 May 2001 21:39:13 -0500 From: Matt Harden matth@mindspring.com Subject: User defined Ix instances potentially unsafe
Matt Harden wrote: > blah, blah, blah, bug in the Library Report, blah, blah... OK, so I failed to read the Library Report. It clearly states: > An implementation is entitled to assume the following laws about these operations: > > range (l,u) !! index (l,u) i == i -- when i is in range > inRange (l,u) i == i `elem` range (l,u) So my "bug" is only in my mind. Sorry for bothering everyone. Regards, MattFrom simonpj@microsoft.com Thu May 3 09:42:06 2001 Date: Thu, 3 May 2001 01:42:06 -0700 From: Simon Peyton-Jones simonpj@microsoft.com Subject: possible bug
Looks to me that this should be ok. GHC typechecks it fine. Simon | -----Original Message----- | From: Jonathon Bell [mailto:jbell@mathsoft.com] | Sent: 18 April 2001 19:20 | To: 'hugs-bugs@haskell.org' | Subject: possible bug | | | Hello there, | | I've been experimenting with the use of type dependencies in | type classes and have come across something i find | surprising. Could it in fact be a bug in the implementation? | I'm using Hugs Feb 2001, with switches +o and -98: | | > class Bug f a r | f a -> r where | > | > bug::f->a->r | > | > instance Bug (Int->r) Int r | >--instance ... | > instance (Bug f a r) => Bug f (c a) (c r) | > | > f:: Bug(Int->Int) a r => a->r | > f = bug(id::Int->Int) | | The above compiles fine and at the prompt .. | | Main> f (f [0::Int]) | | ...runs with an expected program error that member 'bug' has | not been defined. Fine. But | | Main> f (f (f [0::Int])) | | -- ...fails to compile with an unresolved overloading: | | *** ERROR - Unresolved overloading | *** Type : (Bug (Int->Int) Int a, Bug(Int->Int) a v => [b] | *** Expression : f (f (f [0])) | | which is a surprise. it appears as though the compiler is | failing to exploit the dependency '|f a->r' from which it | could infer that 'a' in the above message must in fact be | 'Int', etc... | | Many thanks for investigating this... | | ________________________________ | Jonathon Bell jbell@mathsoft.com | MathSoft, Inc. www.mathsoft.com | 101 Main St, Cambridge, MA 02142 | (617) 577-1017 x745 | | | _______________________________________________ | Hugs-Bugs mailing list | Hugs-Bugs@haskell.org | http://www.haskell.org/mailman/listinfo/hugs-bugs |
From simonpj@microsoft.com Thu May 3 09:42:08 2001 Date: Thu, 3 May 2001 01:42:08 -0700 From: Simon Peyton-Jones simonpj@microsoft.com Subject: Possible bug?
This looks ok to me. The two instances do not unify, because (c1 a1) does not unify with (c2 a2, c2 b2). GHC is happy with it. Simon | -----Original Message----- | From: Jonathon Bell [mailto:jbell@mathsoft.com] | Sent: 16 April 2001 22:43 | To: 'hugs-bugs@haskell.org' | Subject: Possible bug? | | | Hi chaps, | | Hugs (Feb 2001) fails to compile the following, complaining | that the instances are not consistent with the dependencies: | | class Foo f a r | f a->r where | | foo::f->a->r | | instance Foo (a->r) (c a) (c r) | | instance Foo ((a,b)->r) (c a,c b)(c r) | | | My intention is to overload the uncurried function foo | based on both its arity and the kind of tuple it is passed. | Could you please explain what i am doing wrong here? | | Thank you so much, | | _______________________________ | Jonathon Bell jbell@mathsoft.com | MathSoft, Inc. www.mathsoft.com | 101 Main St, Cambridge, MA 02142 | (617) 577-1017 x745 | | | _______________________________________________ | Hugs-Bugs mailing list | Hugs-Bugs@haskell.org | http://www.haskell.org/mailman/listinfo/hugs-bugs |
From afie@cs.uu.nl Thu May 3 14:21:11 2001 Date: Thu, 3 May 2001 15:21:11 +0200 From: Arjan van IJzendoorn afie@cs.uu.nl Subject: Older (but not too old) versions of Hugs
Hello! Is it possible to put up older versions of Hugs? And I don't mean the really old 1.3 versions, but something like July 2000 or thereabouts. We would like to use the graphics library and it doesn't work in the latest (Feb2001) version. Regards, Arjan van IJzendoornFrom haberg@matematik.su.se Thu May 3 19:48:11 2001 Date: Thu, 3 May 2001 20:48:11 +0200 From: Hans Aberg haberg@matematik.su.se Subject: GMP unboxed number representation
[I am not on the GHC list, so replies please either cc hugs-bugs or me.] I am chatting with the GMP (GNU Multi-Precision number library) people about "unboxed" (no pointer) representations. What do you experts on such implementations think would be a suitable low-level GMP implementation; that is, one that would make the use in say GHC easier? (As I think GHC uses GMP, I thought this might be the right place to ask for input.) Hans Aberg
From qrczak@knm.org.pl Thu May 3 21:23:03 2001 Date: Thu, 3 May 2001 22:23:03 +0200 From: Marcin 'Qrczak' Kowalczyk qrczak@knm.org.pl Subject: GMP unboxed number representation
[Wiadomość wysłana również na grupę dyskusyjną.] Thu, 3 May 2001 20:48:11 +0200, Hans Aberg <haberg@matematik.su.se> pisze: > I am chatting with the GMP (GNU Multi-Precision number library) people > about "unboxed" (no pointer) representations. Do you mean storing smaller values directly in the structure? > What do you experts on such implementations think would be a suitable > low-level GMP implementation; that is, that would make the use in > say GHC easier? GHC does that optimization itself. Its Integer representation is thus: data Integer = S# Int# -- small integers | J# Int# ByteArray# -- large integers It keeps integers in the small variant where possible, switching to use the gmp variant on overflow. It's essential that these two variants are visible to ghc, so it can often generate appropriate dispatching code instead of physical allocation of these structures. The Int# in the J# variant corresponds to _mp_size in __mpz_struct, and the ByteArray# holds the pointer to Haskell heap block which contains: * a header pointer used by the GC, * _mp_alloc, * all the libs pointed to by _mp_d. Changing the representation in gmp would require massive rewriting in ghc. I don't know what improvement could be made by the way, so let me just describe what is currently going on. Code for primitive versions of (+)::Integer->Integer->Integer etc. has a comment: /* ToDo: this is shockingly inefficient */ It creates MP_INT variables, fills them from arguments of Haskell's J# objects (passed separately): arg1._mp_alloc = d1->words; arg1._mp_size = (s1); arg1._mp_d = (unsigned long int *) (BYTE_ARR_CTS(d1)); arg2._mp_alloc = d2->words; arg2._mp_size = (s2); arg2._mp_d = (unsigned long int *) (BYTE_ARR_CTS(d2)); calls the gmp function, and returns the _mp_size of the result together with its _mp_d-sizeof(two words) on the Haskell stack. ghc calls mp_set_memory_functions on startup to let gmp allocate ByteArray#s with appropriate header on the Haskell heap. ghc's Int# has always the same size as a pointer, either 32 or 64 bits. The runtime assumes that limbs have this size too, e.g. _mp_alloc here is just the number of words in the ByteArray#. There is no provision for turning the J# representation back into S# if it gets small enough. (If there was, it would not be enough to check abs(_mp_size)<=1, because mpz with abs(_mp_size)<=1 is able to represent one bit more than a single Int#, because the sign bit is stored in the _mp_size's sign.) I think it's essential to not make the S# case slower, but the representation of larger numbers could be changed if only somebody did the huge task of rewriting everything (with #ifdefs for older gmp). I'm not sure how to integrate gmp's view of optimizing small integers with ghc's view. -- __("< Marcin Kowalczyk * qrczak@knm.org.pl http://qrczak.ids.net.pl/ \__/ ^^ SYGNATURA ZASTĘPCZA QRCZAKFrom qrczak@knm.org.pl Thu May 3 21:27:09 2001 Date: Thu, 3 May 2001 22:27:09 +0200 From: Marcin 'Qrczak' Kowalczyk qrczak@knm.org.pl Subject: GMP unboxed number representation
[Wiadomość wysłana również na grupę dyskusyjną.] (The reply went to hugs-bugs and glasgow-haskell-bugs, but in separate mails because I'm reading glasgow-haskell-bugs through my mail<->news gateway. Please Cc: replies to a ghc list or me; I'm not subscribed to hugs-bugs.) -- __("< Marcin Kowalczyk * qrczak@knm.org.pl http://qrczak.ids.net.pl/ \__/ ^^ SYGNATURA ZASTĘPCZA QRCZAKFrom reid@cs.utah.edu Thu May 3 22:36:08 2001 Date: Thu, 3 May 2001 15:36:08 -0600 From: Alastair Reid reid@cs.utah.edu Subject: version skew between GreenCard and hugs
> 1. Why is this source file duplicated? For the benefit of those who want to use greencard but don't want to use a source distribution of Hugs. A better solution to this (rather lame) argument would have been adding GreenCard.h to the binary distributions. > 2. Are there any version compatibility issues to worry about here? Hugs uses version numbers for the API that Greencard uses. Greencard currently uses version 2 whilst Hugs supports 2 and 3. There is a dynamic check that the version number is ok. I believe that the check reflects reality - i.e., that it really is ok to use greencard and the greencard-supplied API with a modern version of Hugs. > I'm a little perplexed as to why the latest release of hugs > has a more up-to-date "GreenCard.h" than GreenCard itself. Because GreenCard.h is really a definition of the FFI-support API and the FFI support has been updated much more recently than GreenCard. Perhaps the name of the file should be changed and copies included in binary distributions. -- Alastair Reid ps The CVS repository for Hugs is online so you could track down who is to blame for changes like this. (Me, in this case.)From haberg@matematik.su.se Fri May 4 10:32:36 2001 Date: Fri, 4 May 2001 11:32:36 +0200 From: Hans Aberg haberg@matematik.su.se Subject: GMP unboxed number representation
At 22:23 +0200 2001/05/03, Marcin 'Qrczak' Kowalczyk wrote: >> I am chatting with the GMP (GNU Multi-Precision number library) people >> about "unboxed" (no pointer) representations. > >Do you mean storing smaller values directly in the structure? Yes, in one form or another. Let's focus on integers. Then GMP currently uses typedef struct { int _mp_alloc; int _mp_size; mp_limb_t *_mp_d; } __mpz_struct; where _mp_d is the pointer to the dynamically allocated number (mp_limb_t is an integral type of suitable size for the platform). My idea is to somehow augment it, so that when numbers are small, they do not use dynamical allocation. Say typedef struct { union { mp_int _mp_n; mp_limb_t* _mp_d; }; int unboxed; } __mpz_struct; or some variation of it. One problems though is the GC (Garbage Collector): My immediate concern is to write a C++ wrap, in which case the return of an integer is slow, as it will invoke the copy constructor, causing a dynamic allocation. One way around it is to use a reference count. Then, in order to avoid a dynamic allocation for the ref count as well, one wants the object pointed to by _mp_d above contain all data, say typedef struct { union { mp_int _mp_n; mp_limb_t* _mp_all; /* Pointer to all. */ }; int unboxed; } __mpz_struct; #define _mpz_alloc _mp_all[0] #define _mpz_count _mp_all[1] #define _mpz_size _mp_all[2] #define _mpz_d _mp_all + 3 But, after all, a ref count is just one primitive form of GC, used in C++ because it is easy to automate, and because it is difficult to implement a more advanced form of GC. So here comes the question in, if now GMP should serve say the implementation of compilers such as GHC, what kind of low number representations should one use? One advantage of having it in GMP is that it supports low-level assembler code. >GHC does that optimization itself. Its Integer representation is thus: > >data Integer > = S# Int# -- small integers > | J# Int# ByteArray# -- large integers > >It keeps integers in the small variant where possible, switching to >use the gmp variant on overflow. > >It's essential that these two variants are visible to ghc, so it >can often generate appropriate dispatching code instead of physical >allocation of these structures. So this suggest that it might be a disadvantage for GHC to have a GMP mergeing the two number representations. Or is it so that the dispatch code can only be generated if the input data is sufficiently static, in which it would be great advantage with it in GMP in the case the data is dynamic? >The Int# in the J# variant corresponds to _mp_size in __mpz_struct, and >the ByteArray# holds the pointer to Haskell heap block which contains: >* a header pointer used by the GC, >* _mp_alloc, >* all the libs pointed to by _mp_d. > >Changing the representation in gmp would require massive rewriting >in ghc. I don't know what improvement could be made by the way, >so let me just describe what is currently going on. You shouldn't worry that anything going on in GMP would not respect upwards compatibility. It seems that this type of GMP discussions have taken place from time to time; I am only the only bringing it up now. The GMP developers have so far judged that the effort does not outweigh the advantage of having a type which is faster for small numbers. Do you think that the speed-up for smaller numbers would be significant for such a rewrite? >Code for primitive versions of (+)::Integer->Integer->Integer etc. has >a comment: > /* ToDo: this is shockingly inefficient */ Yes, I think so to. 
This is one reason for moving it into GMP, because on an assembler level, one can do more efficient overflow checks, etc. >It creates MP_INT variables, fills them from arguments of Haskell's >J# objects (passed separately): > arg1._mp_alloc = d1->words; > arg1._mp_size = (s1); > arg1._mp_d = (unsigned long int *) (BYTE_ARR_CTS(d1)); > arg2._mp_alloc = d2->words; > arg2._mp_size = (s2); > arg2._mp_d = (unsigned long int *) (BYTE_ARR_CTS(d2)); >calls the gmp function, and returns the _mp_size of the result together >with its _mp_d-sizeof(two words) on the Haskell stack. > >ghc calls mp_set_memory_functions on startup to let gmp allocate >ByteArray#s with appropriate header on the Haskell heap. > >ghc's Int# has always the same size as a pointer, either 32 or 64 bits. >The runtime assumes that limbs have this size too, e.g. _mp_alloc here >is just the number of words in the ByteArray#. > >There is no provision for turning the J# representation back into S# >if it gets small enough. (If there was, it would not be enough to >check abs(_mp_size)<=1, because mpz with abs(_mp_size)<=1 is able to >represent one bit more than a single Int#, because the sign bit is >stored in the _mp_size's sign.) > >I think it's essential to not make the S# case slower, but the >representation of larger numbers could be changed if only somebody >did the huge task of rewriting everything (with #ifdefs for older >gmp). I'm not sure how to integrate gmp's view of optimizing small >integers with ghc's view. My guess is that the S# representation should be used for situations where small static size can be detected. Even overflow detection slows down these primitive a lot (or so I am told). Then when the static detection cannot be used, one would have to work with the J# representation. Hans AbergFrom reid@cs.utah.edu Fri May 4 12:22:13 2001 Date: Fri, 4 May 2001 05:22:13 -0600 (MDT) From: Alastair Reid reid@cs.utah.edu Subject: exceptions and forking
[Copied to hugs-bugs. Simon is replying to a query by me about a recent change in the Exception library (that the CVS repository version of Hugs now supports). The BlockedOnDeadMVar exception is an exception sent to all threads which are blocked on an unreachable MVar which is unreachable and, therefore, can never, ever be awakened by a live thread writing to the MVar.] "Simon Marlow" <simonmar@microsoft.com> writes: > Well, strictly speaking you should implement BlockedOnDeadMVar because > otherwise a thread can be silently GC'd without getting a chance to > clean up and release any locks it may be holding. I'm not sure how > difficult it would be to implement in Hugs, though. I'm absolutely convinced. Not dealing properly with issues like this is what makes killThread such a broken idea in Java (though not in Haskell thanks to ideas in your PLDI paper). Unfortunately, it's going to be somewhat unpleasant to implement in Hugs. It's obvious enough how to use ForeignObjs to detect when the GC is done with an MVar but will require yet another change to the list of runnable threads to be able to bring a dying thread back to life while we send it an exception. Hack upon hack upon hack.... Sigh! [Ummm, consider this as a bug report from me (Hugs doesn't implement BlockedOnDeadMVar) combined with me accepting responsibility for fixing it if no-one beats me to it.] -- Alastair Reid reid@cs.utah.edu http://www.cs.utah.edu/~reid/From qrczak@knm.org.pl Fri May 4 18:06:53 2001 Date: Fri, 4 May 2001 19:06:53 +0200 From: Marcin 'Qrczak' Kowalczyk qrczak@knm.org.pl Subject: GMP unboxed number representation
On Fri, May 04, 2001 at 11:32:36AM +0200, Hans Aberg wrote: > Let's focus on integers. ghc doesn't use other gmp types. > One problems though is the GC (Garbage Collector): Indeed. ghc solves it by using gmp format only temporarily to perform operations, and keeping numbers in its own structures. > Then, in order to avoid a dynamic allocation for the ref count > as well, one wants the object pointed to by _mp_d above contain > all data, Attaching data to _mp_d can be done without changing gmp, but also without the ability to mix other library using gmp in the same program: by providing appropriate allocation functions, which allocate more, fill the header and return a shifted pointer to gmp. This is what ghc does. I think you would have to be careful to not use gmp functions which change numbers in place. In Haskell there is no problem because the interface is functional anyway, but a C++ user might want to have ++z performed in place. When the number can be shared, it doesn't work. So either do it functionally or copy by value. In either case sometimes you lose. Leaving gmp memory management to the programmer would make using it awful. > So here comes the question in, if now GMP should serve say the > implementation of compilers such as GHC, what kind of low number > representations should one use? I don't know how to do it better than currently: that it lets the program/library which uses gmp manage gmp's memory. > So this suggest that it might be a disadvantage for GHC to have a > GMP mergeing the two number representations. It would be an unnecessary complication, but manageable. Since ghc creates mpz objects for each operation, it could use only the big representation of gmp on inputs, handling small numbers itself as currently. But it must be prepared to receive small result. It would be silly to wrap it in an allocated memory if gmp produced a small answer, so primops (implemented in C or assembler) should return either representation. It would be a bit ugly. Ghc's primops use a restriced set of types and rarely allocate memory themselves. For example plusInteger# has the type Int# -> ByteArr# -> Int# -> ByteArr# -> (# Int#, ByteArr# #) where (# x, y #) means to return both values on the stack. It is wrapped for integer addition thus: plusInteger :: Inteter -> Integer -> Integer plusInteger i1@(S# i) i2@(S# j) = case addIntC# i j of { (# r, c #) -> if c ==# 0# then S# r else toBig i1 + toBig i2 } plusInteger i1@(J# _ _) i2@(S# _) = i1 + toBig i2 plusInteger i1@(S# _) i2@(J# _ _) = toBig i1 + i2 plusInteger (J# s1 d1) (J# s2 d2) = case plusInteger# s1 d1 s2 d2 of (# s, d #) -> J# s d Returning one of two variants from a primop requires expressing it similarly to a C struct, without type punning. So it could be this: plusInteger# :: Int# -> ByteArr# -> Int# -> ByteArr# -> (# Int#, Int#, ByteArr# #) plusInteger (J# s1 d1) (J# s2 d2) = case plusInteger# s1 d1 s2 d2 of (# 0#, i, _ #) -> S# i (# _, s, d #) -> J# s d The third component in the S# case is ignored, but the primop must pass something - a pointer to a dummy Haskell object (this technique is used in some other primops). This assumes that the condition for having an alternative representation is the same in ghc and gmp: fitting into a single signed integer of the limb size (i.e. pointer size). Note that the J# representation currently relies on the fact that mpz can be reconstructed from an integer + an array of words of an explicit size. 
It's not possible to change the representation of mpz and keep the J# representation in ghc. > Or is it so that the dispatch code can only be generated if the > input data is sufficiently static, in which it would be great > advantage with it in GMP in the case the data is dynamic? It's dynamic too. An advantage of making the variants visible to the compiler is that a function returning an Integer is compiled to a code which takes an address of two continuations as an argument, and enters the continuation corresponding to the constructor returned, passing it constructor arguments as arguments. It doesn't necessarily allocate S# or J# node on the heap if it's to be evaluated right away. This happens to all algebraic types in general (perhaps there are size constraints, I don't know). So I guess that dispatching is best done by using an algebraic type. But I'm not sure. The ghc documentation says that ghc loves single-constructor types. It was written a long time ago, but perhaps it's still true. So how to distinguish representations in another way? Well, perhaps data Integer = J# Int# Int# ByteArray# with a dummy object for the ByteArray# in the short case would work. I don't know how well: it's ugly but maybe fast, and it seems easier to integrate with a variant representation in gmp. In any case better judging of performance of ghc for various representations should be done by somebody with more knowledge than me. I don't know what are exact the overheads in various cases. > You shouldn't worry that anything going on in GMP would not respect > upwards compatibility. ghc abuses gmp by using its internals directly. When the representation of mpz changes, ghc must be changed, even if the C interface remains compatible. Even renaming some gmp functions and providing old names as macros (or something like this) in gmp3 caused compatibility problem for ghc, and ghc-4.08 requires gmp2, because assembler code calls C functions directly and cpp couldn't translate the names. > The GMP developers have so far judged that the effort does not > outweigh the advantage of having a type which is faster for small > numbers. Do you think that the speed-up for smaller numbers would > be significant for such a rewrite? It is significant. I don't know any numbers, but ghc's handling Integers was told to get a big speedup after introducing the separate representation for small integers. But this might be partially because calling gmp functions has a lot of indirections, construction of mpz objects etc., so the speedup could result from avoiding calling gmp at all. Allocation in ghc is fast. Perhaps in your C++ wrappers you can distinguish the variant above gmp, and use gmp at all only for large numbers? C++ doesn't have such indirections, you would surely keep original mpz objects, but it might be easier than changing gmp. Note that I'm not against changing gmp. It's not obvious how the consequences would be... -- __("< Marcin Kowalczyk * qrczak@knm.org.pl http://qrczak.ids.net.pl/ \__/ ^^ SYGNATURA ZASTÄPCZA QRCZAKFrom haberg@matematik.su.se Fri May 4 22:08:59 2001 Date: Fri, 4 May 2001 23:08:59 +0200 From: Hans Aberg haberg@matematik.su.se Subject: GMP unboxed number representation
At 19:06 +0200 2001/05/04, Marcin 'Qrczak' Kowalczyk wrote: >> Let's focus on integers. > >ghc doesn't use other gmp types. GMP has a very interesting multi-precision floating number type as well, but it has the same problem as the integers: It does not use the native double's for small float, so it probably becomes slow. >> One problems though is the GC (Garbage Collector): > >Indeed. ghc solves it by using gmp format only temporarily to perform >operations, and keeping numbers in its own structures. Does that mean that you are getting extra memory allocations, or are you simply handling over a suitable mpz, and then picking up the _mp_d pointer in you own format? The latter sounds interesting, perhaps one can use it in a C++ wrap. >> Then, in order to avoid a dynamic allocation for the ref count >> as well, one wants the object pointed to by _mp_d above contain >> all data, > >Attaching data to _mp_d can be done without changing gmp, but also >without the ability to mix other library using gmp in the same program: >by providing appropriate allocation functions, which allocate more, >fill the header and return a shifted pointer to gmp. This is what >ghc does. Yes, this is the approach I suggested, but then as a C library the one with the different allocation function will become binary incompatible with the regular library. Perhaps I though in too simplistic patterns; if I put the pointer shifting in the C++ library, I might automate it. But then the C++ library becomes incompatible with the C library. Perhaps it does not matter. >I think you would have to be careful to not use gmp functions which >change numbers in place. In Haskell there is no problem because the >interface is functional anyway, but a C++ user might want to have >++z performed in place. When the number can be shared, it doesn't >work. So either do it functionally or copy by value. In either case >sometimes you lose. Leaving gmp memory management to the programmer >would make using it awful. I do not think that standard C++ library containers give such guarantees (say for std::string); one merely builds an interface. (As for the ref count, I use a function detach() which removes any object to be mutated from the reference cluster before mutation.) So it should not be a problem. But it could be interesting for optimization to have GMP functions that are performed in place. One can use some interesting mixtures, say only put the information about the pointer, but copying the sign, in order to make sign changes fast. >> So here comes the question in, if now GMP should serve say the >> implementation of compilers such as GHC, what kind of low number >> representations should one use? > >I don't know how to do it better than currently: that it lets the >program/library which uses gmp manage gmp's memory. > >> So this suggest that it might be a disadvantage for GHC to have a >> GMP mergeing the two number representations. > >It would be an unnecessary complication, but manageable. Since ghc >creates mpz objects for each operation, it could use only the big >representation of gmp on inputs, handling small numbers itself as >currently. But it must be prepared to receive small result. It would >be silly to wrap it in an allocated memory if gmp produced a small >answer, so primops (implemented in C or assembler) should return >either representation. Perhaps GMP should provide small number arithmetic, with a fast way to determine if the answer fits in a small number representation. 
For example mpz_si_si, taking two (long) int's, but with the result in a pointer. If one hands over at least two limbs long allocation, this function does not need to make an allocation, but the idea is that one should be able to look at the result to quickly determine if it fits into one int then. >It would be a bit ugly. Ghc's primops use a restriced set of types >and rarely allocate memory themselves. For example plusInteger# >has the type > Int# -> ByteArr# -> Int# -> ByteArr# -> (# Int#, ByteArr# #) >where (# x, y #) means to return both values on the stack. >It is wrapped for integer addition thus: > >plusInteger :: Integer -> Integer -> Integer >plusInteger i1@(S# i) i2@(S# j) = case addIntC# i j of { (# r, c #) -> > if c ==# 0# then S# r > else toBig i1 + toBig i2 } >plusInteger i1@(J# _ _) i2@(S# _) = i1 + toBig i2 >plusInteger i1@(S# _) i2@(J# _ _) = toBig i1 + i2 >plusInteger (J# s1 d1) (J# s2 d2) = case plusInteger# s1 d1 s2 d2 of > (# s, d #) -> J# s d > >Returning one of two variants from a primop requires expressing it >similarly to a C struct, without type punning. So it could be this: > plusInteger# :: Int# -> ByteArr# > -> Int# -> ByteArr# > -> (# Int#, Int#, ByteArr# #) >plusInteger (J# s1 d1) (J# s2 d2) = case plusInteger# s1 d1 s2 d2 of > (# 0#, i, _ #) -> S# i > (# _, s, d #) -> J# s d > >The third component in the S# case is ignored, but the primop must >pass something - a pointer to a dummy Haskell object (this technique >is used in some other primops). All this checking probably slows down the small format as well. >This assumes that the condition for having an alternative >representation is the same in ghc and gmp: fitting into a single >signed integer of the limb size (i.e. pointer size). So you are not using GMP as an interface. One should really try to get a better GMP interface so that program like GHC can use that instead. >Note that the J# representation currently relies on the fact that >mpz can be reconstructed from an integer + an array of words of an >explicit size. It's not possible to change the representation of >mpz and keep the J# representation in ghc. > >> Or is it so that the dispatch code can only be generated if the >> input data is sufficiently static, in which it would be great >> advantage with it in GMP in the case the data is dynamic? > >It's dynamic too. An advantage of making the variants visible to >the compiler is that a function returning an Integer is compiled to >a code which takes an address of two continuations as an argument, >and enters the continuation corresponding to the constructor returned, >passing it constructor arguments as arguments. It doesn't necessarily >allocate S# or J# node on the heap if it's to be evaluated right >away. This happens to all algebraic types in general (perhaps there >are size constraints, I don't know). So I guess that dispatching is >best done by using an algebraic type. > >But I'm not sure. The ghc documentation says that ghc loves >single-constructor types. It was written a long time ago, but perhaps >it's still true. So how to distinguish representations in another >way? Well, perhaps > data Integer = J# Int# Int# ByteArray# >with a dummy object for the ByteArray# in the short case would work. >I don't know how well: it's ugly but maybe fast, and it seems easier >to integrate with a variant representation in gmp. > >In any case better judging of performance of ghc for various >representations should be done by somebody with more knowledge than me. 
>I don't know what are exact the overheads in various cases. > >> You shouldn't worry that anything going on in GMP would not respect >> upwards compatibility. > >ghc abuses gmp by using its internals directly. When the representation >of mpz changes, ghc must be changed, even if the C interface remains >compatible. > >Even renaming some gmp functions and providing old names as macros >(or something like this) in gmp3 caused compatibility problem for ghc, >and ghc-4.08 requires gmp2, because assembler code calls C functions >directly and cpp couldn't translate the names. So this, in the end, suggests that one perhaps should get a better GMP interface for perhaps both small and large number representations. But it should then be so that GHC could use that interface, rather than abusing its internals. >> The GMP developers have so far judged that the effort does not >> outweigh the advantage of having a type which is faster for small >> numbers. Do you think that the speed-up for smaller numbers would >> be significant for such a rewrite? > >It is significant. I don't know any numbers, but ghc's handling >Integers was told to get a big speedup after introducing the separate >representation for small integers. > >But this might be partially because calling gmp functions has a lot of >indirections, construction of mpz objects etc., so the speedup could >result from avoiding calling gmp at all. Allocation in ghc is fast. What I am was told a long time ago is that as soon as one leaves the simple builtin types, one must perform a lot of checks: CPU's are simply not built for handling multiprecision. Those checks steal a lot of cycles, in addition to memory allocation. >Perhaps in your C++ wrappers you can distinguish the variant above >gmp, and use gmp at all only for large numbers? C++ doesn't have >such indirections, you would surely keep original mpz objects, but >it might be easier than changing gmp. There is a C++ GMP wrap library already using small number representations and a ref count for large numbers, on top of the _mp_d allocation. But the overflow checks are slow on a C/C++ language level, and better moved into assembler where available. >Note that I'm not against changing gmp. It's not obvious how the >consequences would be... One must have a good idea of how to get about it, before starting to write on something like that. It is an interesting question though. Hans AbergFrom fjh@cs.mu.oz.au Sat May 5 09:48:18 2001 Date: Sat, 5 May 2001 18:48:18 +1000 From: Fergus Henderson fjh@cs.mu.oz.au Subject: User defined Ix instances potentially unsafe
On 02-May-2001, Matt Harden <matth@mindspring.com> wrote: > Matt Harden wrote: > > > blah, blah, blah, bug in the Library Report, blah, blah... > > OK, so I failed to read the Library Report. It clearly states: > > > An implementation is entitled to assume the following laws about these operations: > > > > range (l,u) !! index (l,u) i == i -- when i is in range > > inRange (l,u) i == i `elem` range (l,u) > > So my "bug" is only in my mind. Sorry for bothering everyone. I don't think it's quite as straightforward as that. Hugs and ghc may conform to the Library Report, but the behaviour is still undesirable, and IMHO should be fixed. -- Fergus Henderson <fjh@cs.mu.oz.au> | "I have always known that the pursuit | of excellence is a lethal habit" WWW: <http://www.cs.mu.oz.au/~fjh> | -- the last words of T. S. Garp.
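A minimal sketch of the extra layer of range checking discussed in this thread (the wrapper name safeAt and the use of the hierarchical Data.Array/Data.Ix module names are my additions, not anything Hugs or GHC provides): it re-checks both inRange and the offset computed by index before consulting the underlying array, so a broken user-defined Ix instance like CrashIx gives an ordinary error call instead of a crash.

import Data.Array (Array, bounds, (!))
import Data.Ix (Ix, inRange, index, rangeSize)

-- Re-validate the index ourselves; only fall through to (!) when the
-- offset reported by the Ix instance lies inside the array's range size.
safeAt :: Ix i => Array i e -> i -> e
safeAt a i
  | inRange b i && p >= 0 && p < rangeSize b = a ! i
  | otherwise = error "safeAt: index out of range (broken Ix instance?)"
  where
    b = bounds a
    p = index b i

-- With the CrashArray module above, myArray `safeAt` CrashIx maxBound
-- calls error instead of reading outside the array.

From qrczak@knm.org.pl Mon May 7 17:45:26 2001 Date: Mon, 7 May 2001 18:45:26 +0200 From: Marcin 'Qrczak' Kowalczyk qrczak@knm.org.pl Subject: GMP unboxed number representation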
On Fri, May 04, 2001 at 11:08:59PM +0200, Hans Aberg wrote: > GMP has a very interesting multi-precision floating number type as > well, but it has the same problem as the integers: It does not use > the native double's for small float, so it probably becomes slow. I think that it's easier to check machine-size int for overflow than to check double for overflow or loss of precision, so it's impractical to use native double and keep predictable precision. > >Indeed. ghc solves it by using gmp format only temporarily to perform > >operations, and keeping numbers in its own structures. > > Does that mean that you are getting extra memory allocations, No. When mpz is constructed to perform an operation, _mp_d is set to point to an already allocated memory on the ghc heap. Ghc's heap address can generally move during GC. gmp is called from blocks of C or assembler code which don't cause GC in the middle. Using persistent mpz objects would require allocation of _mp_d on the C heap which is slower, and a finalization hook for each Integer (which would be an additional overhead given the way custom finalization hooks are implemented). > Perhaps GMP should provide small number arithmetic, with a fast > way to determine if the answer fits in a small number representation. Indeed. Ghc has those primops, used for Integer arithmetic, with tricky implementations in the case the code is generated via C: /* ----------------------------------------------------------------------------- * Int operations with carry. * -------------------------------------------------------------------------- */ /* With some bit-twiddling, we can define int{Add,Sub}Czh portably in * C, and without needing any comparisons. This may not be the * fastest way to do it - if you have better code, please send it! --SDM * * Return : r = a + b, c = 0 if no overflow, 1 on overflow. * * We currently don't make use of the r value if c is != 0 (i.e. * overflow), we just convert to big integers and try again. This * could be improved by making r and c the correct values for * plugging into a new J#. */ #define addIntCzh(r,c,a,b) \ { r = a + b; \ c = ((StgWord)(~(a^b) & (a^r))) \ >> (BITS_IN (I_) - 1); \ } #define subIntCzh(r,c,a,b) \ { r = a - b; \ c = ((StgWord)((a^b) & (a^r))) \ >> (BITS_IN (I_) - 1); \ } /* Multiply with overflow checking. * * This is slightly more tricky - the usual sign rules for add/subtract * don't apply. * * On x86 hardware we use a hand-crafted assembly fragment to do the job. * * On other 32-bit machines we use gcc's 'long long' types, finding * overflow with some careful bit-twiddling. * * On 64-bit machines where gcc's 'long long' type is also 64-bits, * we use a crude approximation, testing whether either operand is * larger than 32-bits; if neither is, then we go ahead with the * multiplication. */ #if i386_TARGET_ARCH #define mulIntCzh(r,c,a,b) \ { \ __asm__("xorl %1,%1\n\t \ imull %2,%3\n\t \ jno 1f\n\t \ movl $1,%1\n\t \ 1:" \ : "=r" (r), "=&r" (c) : "r" (a), "0" (b)); \ } #elif SIZEOF_VOID_P == 4 #ifdef WORDS_BIGENDIAN #define C 0 #define R 1 #else #define C 1 #define R 0 #endif typedef union { StgInt64 l; StgInt32 i[2]; } long_long_u ; #define mulIntCzh(r,c,a,b) \ { \ long_long_u z; \ z.l = (StgInt64)a * (StgInt64)b; \ r = z.i[R]; \ c = z.i[C]; \ if (c == 0 || c == -1) { \ c = ((StgWord)((a^b) ^ r)) \ >> (BITS_IN (I_) - 1); \ } \ } /* Careful: the carry calculation above is extremely delicate. Make sure * you test it thoroughly after changing it. 
*/ #else #define HALF_INT (1 << (BITS_IN (I_) / 2)) #define stg_abs(a) ((a) < 0 ? -(a) : (a)) #define mulIntCzh(r,c,a,b) \ { \ if (stg_abs(a) >= HALF_INT || \ stg_abs(b) >= HALF_INT) { \ c = 1; \ } else { \ r = a * b; \ c = 0; \ } \ } #endif > So this, in the end, suggests that one perhaps should get a better GMP > interface for perhaps both small and large number representations. But it > should then be so that GHC could use that interface, rather than abusing > its internals. Yes, except that I'm not sure how much harder it would be for ghc to use a GMP interface than to do it itself. BTW, it could be nice to have a better way for writing large integer literals. Ghc used to convert them from a decimal string, and now it builds them from pieces by * and + in base 2^31-1 or 2^63-1. Both approaches are a bit ugly. But probably large integer literals are not that common for it to matter much. -- __("< Marcin Kowalczyk * qrczak@knm.org.pl http://qrczak.ids.net.pl/ \__/ ^^ SYGNATURA ZASTĘPCZA QRCZAK
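For readers who would rather see the carry trick in Haskell than in C macros, here is a rough sketch of the addition case (the name addC, the fixed-width Int32 type, and the modern Data.Bits functions are my own choices, not GHC primops and not 2001-era Hugs library calls): the carry is set exactly when the operands share a sign and the wrapped sum's sign differs, mirroring the addIntCzh macro quoted above.

import Data.Bits (complement, xor, (.&.), testBit, finiteBitSize)
import Data.Int (Int32)

-- addC a b returns (r, c): r is the wrapped sum, c is 1 on signed
-- overflow and 0 otherwise.
addC :: Int32 -> Int32 -> (Int32, Int32)
addC a b = (r, c)
  where
    r = a + b                                  -- fixed-width Int32 wraps
    overflowed = testBit (complement (a `xor` b) .&. (a `xor` r))
                         (finiteBitSize r - 1)
    c = if overflowed then 1 else 0

-- e.g. addC 2 3 == (5, 0) and addC maxBound 1 == (minBound, 1)

From haberg@matematik.su.se Mon May 7 18:45:59 2001 Date: Mon, 7 May 2001 19:45:59 +0200 From: Hans Aberg haberg@matematik.su.se Subject: GMP unboxed number representation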
At 18:45 +0200 2001/05/07, Marcin 'Qrczak' Kowalczyk wrote: >> GMP has a very interesting multi-precision floating number type as >> well, but it has the same problem as the integers: It does not use >> the native double's for small float, so it probably becomes slow. > >I think that it's easier to check machine-size int for overflow >than to check double for overflow or loss of precision, so it's >impractical to use native double and keep predictable precision. It is easier, right. But for floats one can make in most cases an overflow estimate that will take care of most cases. For the rest of the cases, one can simply carry out the computations and check if it resulted in a NaN: If the CPU has a FPU, floats compute quickly, so this will probably be much faster than using multiprecision anyway. >> Perhaps GMP should provide small number arithmetic, with a fast >> way to determine if the answer fits in a small number representation. > >Indeed. Ghc has those primops, used for Integer arithmetic, with >tricky implementations in the case the code is generated via C: ... [Thanks for code snippet.] .. >> So this, in the end, suggests that one perhaps should get a better GMP >> interface for perhaps both small and large number representations. But it >> should then be so that GHC could use that interface, rather than abusing >> its internals. > >Yes, except that I'm not sure how much harder for ghc would be to >use a GMP's interface than to do it itself. I arrived at a suggestion for GMP: typedef struct { _mp_int _mp_s; /* Small number representation. */ mp_limb_t *_mp_d; /* Pointer to the limbs and all other data. */ } __mpz_struct; where simply _mp_d is set to NULL if one is using the small number representation. (Strictly speaking, this might a new type with a different name in order to keep GMP upwards compatibility; let's skip that aspect here.) I find it interesting for several reasons: First, the small number format is wholly untampered with, so all one is down to is overflow checks. If a CPU would support instructions for +, -, *: 1-register x 1-register -> 2-register then the small number representations could be made very fast. (But I am told that many CPU's do not support that.) But the conversion back and forth from native integral types to GMP types are also easy: For example, the signed integral type one merely puts into the _mp_s field and sets _mp_d to NULL. The unsigned integral type one checks the most significant bit; if it is 0, one converts it as a signed, if it is 1, one sets it to the one limb format. GMP would only need to define the basic arithmetic operations involving __mpz_struct; the ones involving native integral types could be macroed on top of them. Can you tell me the pros and cons relative GHC of this suggestion? >BTW, it could be nice to have a better way for writing large integer >literals. Ghc used to convert them from a decimal string, and now it >builds them from pieces by * and + in base 2^31-1 or 2^63-1. Both >approaches are a bit ugly. But probably large integer literals are >not that common for it to matter much. I am not sure what you mean here; in what context should these integer literals be entered? Hans AbergFrom qrczak@knm.org.pl Mon May 7 19:54:42 2001 Date: Mon, 7 May 2001 20:54:42 +0200 From: Marcin 'Qrczak' Kowalczyk qrczak@knm.org.pl Subject: GMP unboxed number representation
On Mon, May 07, 2001 at 07:45:59PM +0200, Hans Aberg wrote: > typedef struct { > _mp_int _mp_s; /* Small number representation. */ > mp_limb_t *_mp_d; /* Pointer to the limbs and all other data. */ > } __mpz_struct; > where simply _mp_d is set to NULL if one is using the small number > representation. Does _mp_s have a meaning when _mp_d is not NULL? Or do you keep the used size and allocated size under _mp_d? The latter is good in that it's not necessary to copy these sizes into gmp structures separately but they live inside the ByteArray#. There are no null ByteArray#s. The GC currently aborts if a null pointer is found when a Haskell object is expected (C pointers are tagged like Int# etc., i.e. as nonpointers, and ByteArray#s are tagged like Haskell objects, even though they are not exactly normal Haskell objects). This could probably be changed (unless the benefit of having this ASSERT for debugging the GC is large; I doubt it). Depending on this I see two choices for Integer representation in ghc, with the second looking really good: 1. data Integer = S# Int# | J# ByteArray# -- Size no longer needed here. Primops have a hard case for returning the appropriate data. This is ugly: case primSomething# args of (# 0, s, _ #) -> S# s (# _, _, d #) -> J# d and I'm not sure that using a dummy ByteArray# and checking the pointer equality would work well. 2. Let the GC ignore null pointers. Introduce nullByteArray# primop. Mirror the gmp representation exactly and let gmp perform arithmetic on all numbers: data Integer = J# Int# ByteArray# It's still best if _mp_int has the size of a pointer. > I am not sure what you mean here; in what context should these > integer literals be entered? When the programmer writes for example: f :: Integer -> Integer f x = 54321453245328452134 - x the compiler must generate code which makes a mpz constant of the right value. Currently it generates something like this: lvl = S# 2147483647# lit = ((((S# 11# `timesInteger` lvl) `plusInteger` S# 1673077741#) `timesInteger` lvl) `plusInteger` S# 914624008#) f x = lit `minusInteger` x It used to generate something like this: lit = case addr2Integer# "54321453245328452134"# of (# s, d #) -> J# s d f x = lit `minusInteger` x -- __("< Marcin Kowalczyk * qrczak@knm.org.pl http://qrczak.ids.net.pl/ \__/ ^^ SYGNATURA ZASTĘPCZA QRCZAK
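As a small illustration of the piece-by-piece literal construction described above (the helper name fromPieces is mine, not something GHC exposes), the chain of timesInteger/plusInteger calls is just Horner's rule in base 2^31-1:

-- Rebuild a large Integer literal from word-sized pieces, Horner style.
fromPieces :: Integer -> [Integer] -> Integer
fromPieces base = foldl (\acc piece -> acc * base + piece) 0

-- Matching the generated code above:
--   fromPieces 2147483647 [11, 1673077741, 914624008] == 54321453245328452134

From reid@cs.utah.edu Mon May 7 20:11:25 2001 Date: Mon, 7 May 2001 13:11:25 -0600 From: Alastair Reid reid@cs.utah.edu Subject: GMP unboxed number representation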
Could this discussion be moved to a different mailing list please? Hugs doesn't use the GMP library and if it ever did it would restrict its use to the most portable subset. It seems like this discussion is only relevant to GHC and, perhaps, HBC which are willing to pay almost any cost for a few cycles more. -- Alastair Reid
From haberg@matematik.su.se Tue May 8 10:32:44 2001 Date: Tue, 8 May 2001 11:32:44 +0200 From: Hans Aberg haberg@matematik.su.se Subject: GMP unboxed number representation
At 13:11 -0600 2001/05/07, Alastair Reid wrote: >Could this discussion be moved to a different mailing list please? Your wish is granted. >Hugs doesn't use the GMP library and if it ever did it would restrict >its use to the most portable subset. It seems like this discussion >is only relevant to GHC and, perhaps, HBC which are willing to >pay almost any cost for a few cycles more. Actually, there is a generic GMP library that only uses C and not assembler (incidentally, the version that I happen to use). And the things we are discussing are perfectly valid from that viewpoint only. The assembler stuff is then only an optimization add-on. Hans AbergFrom Ben
Dear hugs maintainers, I spent a few hours today writing code to manipulate polynomials for a project. The polynomials are of type newtype Polynomial a = Polynomial [a] deriving (Eq) and I have defined an instance of Num as follows instance Num a => Num (Polynomial a) where fromInteger x = toPolynomial [fromInteger x] negate xs = toPolynomial (map negate (fromPolynomial xs)) x + y = toPolynomial (fromPolynomial x `polyA` fromPolynomial y) x * y = toPolynomial (fromPolynomial x `polyM` fromPolynomial y) where polyA and polyM are appropriately defined. I also have defined, (@@) :: (Num a) => Polynomial a -> a -> a f @@ g = applyrec (fromPolynomial f) g 0 where applyrec (f:fs) g m = f * g^m + applyrec fs g (m+1) applyrec [] _ _ = 0 I originally wrote this to find values of the polynomials for a given integer. ie Polynomial [1,2,3] @@ 3 = 1 + 2*3 + 3*3^2 I also had to write a polynomial composition function, but I realized that one can use @@ for this since a polynomial is a Num. In fact this works in hugs when I type it, Polynomial [1,2,3] @@ Polynomial [3,2,1]. But I need to do more complicated things with this, something like set = [0..t] poly = [ toPolynomial [x1,x2] | x1<-set, x2<-set ] compo = [ f@@g | f <- poly, g <- poly, f /= g ] However this doesn't work and fails with an error: ERROR "/home/brain/school/current/poly.hs" (line 7): Type error in application *** Expression : f @@ g *** Term : f *** Type : Polynomial Integer *** Does not match : Polynomial (Polynomial Integer) My question is the following. Why is it performing the operation for me in some cases and not in others? Is there a way to say that the type of manipulation that I want to do is ok? Thank you, Ben.
From matth@mindspring.com Mon May 14 02:21:29 2001 Date: Sun, 13 May 2001 20:21:29 -0500 From: Matt Harden matth@mindspring.com Subject: confusion
Ben wrote: > > Dear hugs maintainers, Well, I'm not a hugs maintainer, but I'll give it my best shot... What you report is the correct behavior of the Haskell type system, not a hugs bug. > ... > (@@) :: (Num a) => Polynomial a -> a -> a > ... > > set = [0..t] > poly = [ toPolynomial [x1,x2] | x1<-set, x2<-set ] > compo = [ f@@g | f <- poly, g <- poly, f /= g ] > > However this doesn't work and fails with an error: > > ERROR "/home/brain/school/current/poly.hs" (line 7): Type error in application > *** Expression : f @@ g > *** Term : f > *** Type : Polynomial Integer > *** Does not match : Polynomial (Polynomial Integer) To understand this error, think carefully about the types of the operations involved. Use the :t command to get the types of (@@), poly, and (/=). Here they all are: Polynomial> :t (@@) (@@) :: (Num a) => Polynomial a -> a -> a Polynomial> :t poly poly :: [Polynomial Integer] Polynomial> :t (/=) (/=) :: Eq a => a -> a -> Bool Now consider the types of f and g in the definition of compo. Are they the same, or different? You should understand then why (f@@g) is not well-typed. I would suggest that the composition operator you seek is similar to (@@), but not the same, because its type would be different. I hope that helps. Matt
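A possible shape for that separately-typed composition operator, sketched here with a self-contained copy of the Polynomial type (the definitions of polyA, polyM and compose are my guesses at reasonable implementations, not Ben's actual code): composition substitutes g into f using Horner's rule over polynomial arithmetic, so both arguments and the result stay at type Polynomial a.

newtype Polynomial a = Polynomial [a] deriving (Eq, Show)

fromPolynomial :: Polynomial a -> [a]
fromPolynomial (Polynomial xs) = xs

toPolynomial :: [a] -> Polynomial a
toPolynomial = Polynomial

-- Coefficient-list addition and multiplication (lowest degree first).
polyA, polyM :: Num a => [a] -> [a] -> [a]
polyA xs [] = xs
polyA [] ys = ys
polyA (x:xs) (y:ys) = (x + y) : polyA xs ys
polyM [] _ = []
polyM (x:xs) ys = polyA (map (x *) ys) (0 : polyM xs ys)

instance Num a => Num (Polynomial a) where
  fromInteger n = toPolynomial [fromInteger n]
  negate = toPolynomial . map negate . fromPolynomial
  x + y = toPolynomial (fromPolynomial x `polyA` fromPolynomial y)
  x * y = toPolynomial (fromPolynomial x `polyM` fromPolynomial y)
  abs = error "abs: not defined for Polynomial"
  signum = error "signum: not defined for Polynomial"

-- Composition: substitute g for the indeterminate of f.
compose :: Num a => Polynomial a -> Polynomial a -> Polynomial a
compose f g = foldr (\c acc -> toPolynomial [c] + g * acc)
                    (toPolynomial [])
                    (fromPolynomial f)

-- e.g. compose (Polynomial [1,2,3]) (Polynomial [3,2,1]) represents
--   34 + 40*x + 32*x^2 + 12*x^3 + 3*x^4
-- (this naive polyM may leave trailing zero coefficients in the list).

From im@t956379207.demon.co.uk Mon May 14 13:12:12 2001 Date: Mon, 14 May 2001 13:12:12 +0100 From: Iain McNaughton im@t956379207.demon.co.uk Subject: A Little Learning is a Dangerous Thing !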
From im@t956379207.demon.co.uk Mon May 14 13:12:12 2001 Date: Mon, 14 May 2001 13:12:12 +0100 From: Iain McNaughton im@t956379207.demon.co.uk Subject: A Little Learning is a Dangerous Thing !

Hi ! I'm not quite sure if this is the right place to go for help with Hugs, but I'll give it a try...

Situation is this: I was running some scripts in WinHugs, and everything was running happily. I then attempted to add ( using ":a" ) a new script, and received the following error message: "Program code storage space exhausted".

Well, I thought to myself ( fool that I evidently was ! ) that there must be a parameter somewhere in Hugs that I can set to increase the program storage space, so I set out to find it. Unfortunately, having tweaked a couple of numbers ( heap size was one, I think, and the other was something similar-looking ), and restarted Hugs, I now have the following error as I start WinHugs: "Cannot allocate heap storage". Immediately after this error message within the WinHugs interface, a pop-up box appears, containing the message: "Fatal Error: unable to load prelude". At that point, WinHugs disappears.

I've tried uninstalling Hugs, and completely re-installing it, but that doesn't appear to do any good. So, I've clearly f*cked up somewhere, but I'm not sure where. Any comments or advice would be greatly appreciated.

I understand that this might not count as a bug in its own right, as the software is presumably behaving in a perfectly sensible way; rather, the bug is in me, the over-adventurous user. However, though I've read all the documentation, I'm stuck, and would appreciate any help or assistance you can give me. If you want to flame me for gross stupidity, that would be fine, too !

Thanks for your ( hoped for ) help.

Sincerely,
Iain McNaughton.
--
Iain McNaughton
From herrmann@infosun.fmi.uni-passau.de Mon May 14 13:54:55 2001 Date: Mon, 14 May 2001 14:54:55 +0200 From: Ch. A. Herrmann herrmann@infosun.fmi.uni-passau.de Subject: A Little Learning is a Dangerous Thing !
Hi Iain,

I work with UNIX only, but my change will likely solve your problem too.

Iain> I was running some scripts in WinHugs, and everything was
Iain> running happily. I then attempted to add ( using ":a" ) a new
Iain> script, and received the following error message: "Program
Iain> code storage space exhausted".

In the "src" directory, there are some parameters defined in "prelude.h". Our settings are (we use LARGE_HUGS):

    #define NUM_ADDRS Pick(28000, 60000, 1280000)
    #define NUM_STACK Pick(1800, 12000, 64000)

NUM_ADDRS is for the program code storage space and NUM_STACK for the stack size. Recompile, and maybe it'll work.

Cheers
--
 Christoph Herrmann
 Postdoctoral Research Associate
 University of Passau, Germany
 E-mail: herrmann@fmi.uni-passau.de
 WWW:    http://brahms.fmi.uni-passau.de/cl/staff/herrmann.html
From antony@apocalypse.org Mon May 14 14:29:21 2001 Date: Mon, 14 May 2001 09:29:21 -0400 From: Antony Courtney antony@apocalypse.org Subject: A Little Learning is a Dangerous Thing !
Hi Iain,

Iain McNaughton wrote:
>
> Well, I thought to myself ( fool that I evidently was ! ) that there
> must be a parameter somewhere in Hugs that I can set to increase the
> program storage space, so I set out to find it. Unfortunately, having
> tweaked a couple of numbers ( heap size was one, I think, and the other
> was something similar-looking ), and restarted Hugs, I now have the
> following error as I start WinHugs: "Cannot allocate heap storage".
> Immediately after this error message within the WinHugs interface, a
> pop-up box appears, containing the message: "Fatal Error: unable to load
> prelude". At that point, WinHugs disappears.
>
> I've tried uninstalling Hugs, and completely re-installing it, but that
> doesn't appear to do any good. So, I've clearly f*cked up somewhere, but
> I'm not sure where. Any comments or advice would be greatly appreciated.

Actually, I think you have stepped on a known design problem concerning how hugs stores its state to the registry under Windows. Unfortunately, every time you do a ":set" in hugs, the changes you make to the options settings are stored in the registry. This is great if you set the right option in exactly the right way, but is brutally unforgiving of mistakes (as you have discovered).

Your best bet (be VERY careful doing this) is to run "regedit" (by selecting "Run..." from the Windows "Start" menu and typing "regedit"), and navigate to the key:

    "HKEY_CURRENT_USER\Software\Haskell\Hugs"

Select the subkey folder for your hugs version (e.g. "February 2000"). The "Options" key contains the options settings you entered. You can try to edit this by hand to get rid of the options settings that accidentally caused your problem. I *think* you can also just delete this key altogether and hugs will revert to its default configuration. All of this is done at your own risk.

> I understand that this might not count as a bug in its own right, as the
> software is presumably behaving in a perfectly sensible way; rather, the
> bug is in me, the over-adventurous user. However, though I've read all
> the documentation, I'm stuck, and would appreciate any help or
> assistance you can give me. If you want to flame me for gross stupidity,
> that would be fine, too !

Whether this is a bug or a "design flaw" is a matter of opinion. But hopefully we all agree that an "adventurous user" should not be punished with an unusable hugs installation for all time! We are working on improving this situation.

Good luck,

 -antony

--
Antony Courtney
Grad. Student, Dept. of Computer Science, Yale University
antony@apocalypse.org          http://www.apocalypse.org/pub/u/antony
From e8gus@etek.chalmers.se Mon May 14 18:15:56 2001 From: e8gus@etek.chalmers.se (Gustav Andersson) Date: Mon, 14 May 2001 19:15:56 +0200 Subject: stack overflow Message-ID: <006d01c0dc99$9e844730$e5c31081@Mira>

Hello,

I think I've found a bug in Hugs 98 (February 2001 version). Hugs crashes (Stack Overflow) whenever I try to execute the following piece of code (well I know, the code is not very useful, it came from a typo):

    ----------------------
    f :: Int -> Int -> Int
    f a b = f (f' a b) b

    f' :: Int -> Int -> Int
    f' a b = a+b
    ----------------------

Then in Hugs execute f with two arbitrary integers, e.g.
    > f 2 2

Hugs Version February 2001 will crash with the following error (piece from drwatson's log-file in Windows 2000):

    Application exception occurred:
    App: (pid=996)
    When: 2001-05-14 @ 19:01:03.010
    Exception number: c00000fd (stack overflow)

I've also tried the program under hugs in unix (I don't know the version; it didn't have the ":version" command):

    Stack overflow: pid 9856, proc hugs, addr 0x11fdfffe0, pc 0x12001c124
    Segmentation fault

I hope this will help you with the next version of hugs.

Yours Sincerely,
Gustav Andersson
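The overflow follows directly from the definition: f has no base case, so f 2 2 unfolds forever and the interpreter eventually exhausts its stack, as the report shows. Below is a small sketch of the unfolding together with a terminating variant; the cut-off condition (a >= 100) and the name fStop are purely illustrative assumptions, not part of the original report.

    -- Unfolding the reported definition shows the non-termination:
    --   f 2 2 = f (f' 2 2) 2 = f 4 2 = f 6 2 = f 8 2 = ...
    -- No clause ever stops the recursion.

    f' :: Int -> Int -> Int
    f' a b = a + b

    f :: Int -> Int -> Int
    f a b = f (f' a b) b           -- as reported: no base case

    -- Terminating variant with an arbitrary, illustrative cut-off.
    fStop :: Int -> Int -> Int
    fStop a b
      | a >= 100  = a
      | otherwise = fStop (f' a b) b

With the cut-off in place, fStop 2 2 evaluates to 100 instead of overflowing; the crash with f is inherent in the definition rather than in the particular arguments chosen.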