ForeignPtr reloaded

Bulat Ziganshin bulat.ziganshin at gmail.com
Sun Jun 4 10:34:33 EDT 2006


Hello Donald,

Sunday, June 4, 2006, 11:22:24 AM, you wrote:

> I got around to writing a benchmark for testing out tuned foreign
> pointer implementations with byte strings. In tests allocating between
> 1M and 100M foreign pointers, I found around a 5% speed up and 5% less
> space usage, using the tuned finalizer-less ForeignPtr described in the
> attached patch, over the current design -- as Bulat has always suggested
> I would find :)

> The effect wasn't as easy to observe with fewer numbers. However, I now

hmmm.. each ForeignPtr should save about 10 bytes, so for 1M-100M
pointers the total gain should be 10 MB-1 GB. what percentage that
is depends entirely on the size of each string.
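
a rough sketch of that arithmetic, taking the 10-byte figure as an estimate rather than a measurement. the names here are made up for illustration, and the per-string footprint passed to percentSaved is a hypothetical input, not a number from the benchmark:

```haskell
-- Estimated saving per ForeignPtr (the rough figure from the text above).
bytesSavedPerPtr :: Integer
bytesSavedPerPtr = 10

-- Total bytes saved for a given number of live ForeignPtrs.
totalSaving :: Integer -> Integer
totalSaving nPtrs = nPtrs * bytesSavedPerPtr

-- Relative saving in percent, given a hypothetical total footprint
-- per string (payload plus all headers) before the change.
percentSaved :: Integer -> Double
percentSaved bytesPerString =
  100 * fromIntegral bytesSavedPerPtr / fromIntegral bytesPerString

main :: IO ()
main = do
  print (totalSaving 1000000)    -- 1M pointers
  print (totalSaving 100000000)  -- 100M pointers
  print (percentSaved 200)       -- assumed 200-byte footprint per string
```

the smaller the strings, the larger the share of memory the saved bytes represent.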

the speed improvement depends on the operations tested. afaik, this
change should make creation of ByteStrings faster, so i hope it will
speed up reading of ByteStrings, which is now 1.5-2 times slower
than writing in my Streams library (with ghc 6.4):

vPutChar: 0.891 secs (user: 0.851 secs)
vGetChar: 0.771 secs (user: 0.771 secs)

vPutBuf20: 0.491 secs (user: 0.391 secs)
vGetBuf20: 0.461 secs (user: 0.461 secs)

vPutByteStrLn20: 0.992 secs (user: 0.991 secs)         --- see the difference
vGetByteLine20 : 1.793 secs (user: 1.622 secs)         --- between these 2 lines
vGetByteLineContents20: 2.343 secs (user: 2.243 secs)

vPutStrLn20: 2.494 secs (user: 2.323 secs)
vGetLine20 : 7.701 secs (user: 7.130 secs)
vGetContents: 13.710 secs (user: 12.798 secs)


> have a good use case where we will care about this: lazy bytestrings.
> They used chunked lists of bytestrings to handle large amounts of input.
> Each chunk needs its own foreign pointer. This patch should thus improve
> their performance, by avoiding the redundant ioref for the finalizer.

afaik, the chunks of your lazy bytestrings are about 1 MB each anyway.
but when you apply, say, 'lines' to one, you get a lazy list of
ByteStrings, each representing just one line of the file. as i said
previously, counting lines via
(return . length . lines =<< getContents) should be a good benchmark
for your library. remember that this test immediately showed the
difference between foreign ptrs in ghc 6.4 and 6.5
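
a minimal sketch of such a line-counting test, assuming the strict Data.ByteString.Char8 module from the present-day bytestring package (the library under discussion may expose a different interface). countLines is a name chosen here for illustration:

```haskell
import qualified Data.ByteString.Char8 as B

-- Split the input into one ByteString per line and count them.
-- Each line is a separate small ByteString, so this stresses the
-- per-string overhead rather than the cost of the payload itself.
countLines :: B.ByteString -> Int
countLines = length . B.lines

-- Read all of stdin and print the number of lines.
main :: IO ()
main = B.getContents >>= print . countLines
```

run it on a large text file via stdin and compare timings and heap usage between the two ForeignPtr implementations.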


> So, here's the patch. Simon M -- what do you think? Good to apply?

i think the patch is essential, because small strings (and other
byte arrays) are used much more often than megabyte- and
gigabyte-sized ones



-- 
Best regards,
 Bulat                            mailto:Bulat.Ziganshin at gmail.com


