unboxed types

Isaac Dupree isaacdupree at charter.net
Mon Dec 31 10:09:57 EST 2007


Neil Mitchell wrote:
> Hi Isaac,
> 
>> Or will I have to
>> #define UTopen (#
>> #defined UTclose #)
>>
>> and (UTopen x, y UTclose)
> 
> Yuk! There is a ticket on adding a prefix form of (#,#), which is
> currently lacking. Perhaps adding that first, then moving to the
> unboxed thingy would be best. Also your use of # in a CPP macro may
> confuse various CPP stuff in GHC.

UTopen/UTclose are no worse than the macros for unboxed string literals 
that GHC already uses:
#define SLIT(x)	 (FS.mkLitString# (x#))
#define FSLIT(x) (FS.mkFastString# (x#))

(although I wish it didn't even use those! Is there any way to use RULES 
or optimization for the same effect?)

and then I would probably have to use separate macros for different 
tuple sizes, since CPP doesn't like commas. Or use UT(COMMA COMMA) for a 
three-size tuple.  `TUPLE3 x y z` doesn't look so bad though -- just 
like a constructor, which it could be otherwise; data TUPLE3 a b c = 
TUPLE3 a b c

> 
> I'd also be shocked if it turns out that unboxed types make that much
> of a difference. They will make a difference in library functions
> (called a lot), or hot-spots, but in general they only add a constant
> improvement. Of course, numbers are better than speculation - and
> would be interesting in their own right!

could be, I guess I'll test.  They're only used in the gratuitous way I 
described in a few places, so I was guessing it was intentional and 
there might have been a performance bottleneck...

oh wait, I remember why... you can't return a boxed tuple of unboxed 
values.  Which might actually be important in some places, like 
utils/Encoding:

------------------------------------------------------------------
-- UTF-8

-- We can't write the decoder as efficiently as we'd like without
-- resorting to unboxed extensions, unfortunately.  I tried to write
-- an IO version of this function, but GHC can't eliminate boxed
-- results from an IO-returning function.
--
-- We assume we can ignore overflow when parsing a multibyte character here.
-- To make this safe, we add extra sentinel bytes to unparsed UTF-8 
sequences
-- before decoding them (see StringBuffer.hs).

{-# INLINE utf8DecodeChar# #-}
utf8DecodeChar# :: Addr# -> (# Char#, Addr# #)



Isaac



More information about the Cvs-ghc mailing list