efficient Bytestring mapM_ for IO/ST?

Brandon Moore brandon_m_moore at yahoo.com
Tue Mar 22 04:50:16 CET 2011


> From: wren ng thornton <wren at freegeek.org>
> Sent: Mon, March 21, 2011 10:30:48 PM
> On 3/21/11 4:40 PM, Brandon Moore wrote:
> > Is there an efficient way to  iterate over the bytes of a ByteString?
> 
> The code I've been using (rather  similar to your unsafe map) is:
> 
>     import qualified  Data.ByteString.Internal as BSI
>     import qualified  Foreign.ForeignPtr       as FFI
> 
>     foldIO :: (a  -> Word8 -> IO a) -> a -> ByteString -> IO a
>      foldIO f z0 (BSI.PS fp off len) =
>          FFI.withForeignPtr fp $ \p0 -> do
>              let q = p0 `plusPtr` (off+len)
>              let go z p
>                      | z `seq` p `seq` False = undefined
>                      | p == q    = return  z
>                     |  otherwise = do
>                          w  <- peek p
>                          z' <- f z w
>                          go z' (p  `plusPtr` 1)
>             go z0 (p0 `plusPtr`  off)
>     {-# INLINE foldIO #-}

I don't need to pass an acumulating parameter.
I'll see how well my code runs if I do.

With a modified version of foldr, the simple definition

mapM_ f x = foldr (\b rest -> f b >> rest) (return ())

is a bit faster than the specialized code from before.
The changes to foldr improved performance of this definition
by almost 10x for my one benchmark. Here's the strict
ByteString code:

{-# INLINE foldr'' #-}
foldr'' :: (Word8 -> a -> a) -> a -> ByteString -> a
foldr'' k v (PS x s l) = inlinePerformIO $ withForeignPtr x $ \ptr ->
 let { !u = s + l
      ;  go ix | ix == u = v
         | otherwise = k (inlinePerformIO (do c  <- peekElemOff ptr ix; 
touchForeignPtr x; return c)) (go (ix+1))
     }
in return (go s)

The lazy ByteString version is built with foldrChunks

foldr f v = foldrChunks (\chunk rest -> foldr'' f rest chunk)

Looking for performance regressions, this seems
to be around 10% slower than the current foldr
in the test

main = do
  str <- readFile "1GiBFile"
  print $ length $ foldr (:) [] str

Surprisingly, Data.ByteString.Lazy.foldr (:) [] seems to be
about twice as fast as unpack!

Are there any other benchmarks I should try?
Are there any other uses for a lazy ByteString foldr?

The test suite from
http://darcs.haskell.org/bytestring
built after a bit of hacking on the Makefile, but doesn't
seem to do much timing.

Brandon


      



More information about the Libraries mailing list