Data.PackedString.hGetPS

Ian Lynagh igloo at earth.li
Sun Feb 29 04:13:54 EST 2004


Hi all,

If I have

\begin{code}
import Data.PackedString (hGetPS)
import System.IO (openBinaryFile, hClose, IOMode(ReadMode))

main :: IO ()
main = do h <- openBinaryFile "1000000x" ReadMode
          p <- hGetPS h 1000000
          hClose h
          return ()
\end{code}

(1000000x is a file containing 1000000 'x' characters) then:

and compile with ghc -O2 (either 6.2 or reasonably recent CVS) then I
get:

$ ./foo 
Stack space overflow: current size 1048576 bytes.
Use `+RTS -Ksize' to increase it.
$



The definition is:

-- | Read a 'PackedString' directly from the specified 'Handle'.
-- This is far more efficient than reading the characters into a 'String'
-- and then using 'packString'.  
-- NOTE: as with 'hPutPS', the string representation in the file is 
-- assumed to be ISO-8859-1.
hGetPS :: Handle -> Int -> IO PackedString
hGetPS h i = do
  arr <- newArray_ (0, i-1)
  l <- hGetArray h arr i
  chars <- mapM (\i -> readArray arr i >>= return.chr.fromIntegral) [0..l-1]
  return (packString chars)

(is the efficiency comment accurate? This definition is calling
packString on a String it makes, so it's not intuitively obvious to me).

I think the last line should be replaced with

  return (packNChars l chars)

and that the packNChars definition:

  packNChars len str = PS (array (0,len-1) (zip [0..] str))

is equivalent to:

  packNChars len str = PS (listArray (0,len-1) str)

which I assume is more efficient (due to not con/destructing tuples if
nothing else).

None of this solves the stack overflow, though.

While I'm here, David Roundy has a PackedString which looks to have been
forked from GHC or its libraries, now based around

  data PackedString = PS !(ForeignPtr Word8) !Int !Int

He said it can't serve as a replacement as it doesn't support large
unicode characters, though (would just changing to a Word32 or Char fix
that?). If you're interested it's in the darcs
(http://abridgegame.org/darcs/) darcs repo - selecting
"FastPackedString.hs" on http://abridgegame.org/cgi-bin/darcs?darcs*
should show you the latest version. It doesn't have this problem.


Thanks
Ian



More information about the Libraries mailing list