Builder -package

data Builder
bytestring Data.ByteString.Builder
Builders denote sequences of bytes. They are Monoids where mempty is the zero-length sequence and mappend is concatenation, which runs in O(1).
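As a minimal sketch of the Monoid interface described above, the following composes a few primitive builders with (<>) and runs the result with toLazyByteString. The names keyValue and rendered are hypothetical and chosen only for illustration; the sketch assumes a GHC recent enough that Prelude exports (<>).

```haskell
import qualified Data.ByteString.Lazy.Char8 as L8
import Data.ByteString.Builder (Builder, char7, intDec, string7, toLazyByteString)

-- Hypothetical helper: renders "key=value" purely by appending builders.
keyValue :: String -> Int -> Builder
keyValue k v = string7 k <> char7 '=' <> intDec v

-- mempty is the identity of (<>), so appending it on either side
-- leaves the resulting byte sequence unchanged.
rendered :: L8.ByteString
rendered = toLazyByteString (mempty <> keyValue "answer" 42 <> mempty)
```

No bytes are produced until toLazyByteString runs the builder; until then the appends only compose the builders themselves.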
data Builder
text Data.Text.Lazy.Builder
A Builder is an efficient way to build lazy Text values. There are several functions for constructing builders, but only one to inspect them: to extract any data, you have to turn them into lazy Text values using toLazyText. Internally, a builder constructs a lazy Text by filling arrays piece by piece. As each buffer is filled, it is 'popped' off, to become a new chunk of the resulting lazy Text. All this is hidden from the user of the Builder.
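A small sketch of that workflow, assuming the text package is available: builders are assembled from the constructor functions and the only way to get data back out is toLazyText. The names greeting and greetingText are hypothetical.

```haskell
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL
import Data.Text.Lazy.Builder (Builder, fromString, fromText, singleton, toLazyText)

-- Hypothetical helper: builds "Hello, <name>!" from three smaller builders.
greeting :: T.Text -> Builder
greeting name = fromString "Hello, " <> fromText name <> singleton '!'

-- A Builder cannot be inspected directly; toLazyText is how data is extracted.
greetingText :: TL.Text
greetingText = toLazyText (greeting (T.pack "world"))
```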
module Data.ByteString.Builder
bytestring Data.ByteString.Builder
IntC Int deriving( Eq, Ord, Show )
type Row = [Cell]
type Table = [Row]

We use the following imports and abbreviate mappend to simplify reading.

> import qualified Data.ByteString.Lazy as L
> import Data.ByteString.Builder
> import Data.ByteString.Builder.ASCII (intDec)
> import Data.Monoid
> import Data.Foldable (foldMap)
> import Data.List (intersperse)
>
> infixr 4 <>
> (<>) :: Monoid m => m -> m -> m
> (<>) = mappend

CSV is a character-based representation of tables. For maximal modularity, we could first render Tables as Strings and then encode this String using some Unicode character encoding. However, this sacrifices performance because the intermediate String representation is built and then thrown away right afterwards. We get rid of this intermediate String representation by fixing the character encoding to UTF-8 and using Builders to convert Tables directly to UTF-8 encoded CSV tables represented as lazy ByteStrings.

> encodeUtf8CSV :: Table -> L.ByteString
> encodeUtf8CSV = toLazyByteString . renderTable
>
> renderTable :: Table -> Builder
> renderTable rs = mconcat [renderRow r <> charUtf8 '\n' | r <- rs]
>
> renderRow :: Row -> Builder
> renderRow [] = mempty
> renderRow (c:cs) =
>     renderCell c <> mconcat [ charUtf8 ',' <> renderCell c' | c' <- cs ]
>
> renderCell :: Cell -> Builder
> renderCell (StringC cs) = renderString cs
> renderCell (IntC i) = intDec i
>
> renderString :: String -> Builder
> renderString cs = charUtf8 '"' <> foldMap escape cs <> charUtf8 '"'
>
> escape '\\' = charUtf8 '\\' <> charUtf8 '\\'
> escape '\"' = charUtf8 '\\' <> charUtf8 '\"'
> escape c = charUtf8 c

Note that the ASCII encoding is a subset of the UTF-8 encoding, which is why we can use the optimized function intDec to encode an Int as a decimal number with UTF-8 encoded digits. Using intDec is more efficient than stringUtf8 . show, as it avoids constructing an intermediate String. Avoiding this intermediate data structure significantly improves performance because encoding Cells is the core operation for rendering CSV tables. See Data.ByteString.Builder.Prim for further information on how to improve the performance of renderString.

We demonstrate our UTF-8 CSV encoding function on the following table.

> strings :: [String]
> strings = ["hello", "\"1\"", "λ-wörld"]
>
> table :: Table
> table = [map StringC strings, map IntC [-3..3]]

The expression encodeUtf8CSV table results in the following lazy ByteString.

> Chunk "\"hello\",\"\\\"1\\\"\",\"\206\187-w\195\182rld\"\n-3,-2,-1,0,1,2,3\n" Empty

We can clearly see that we are converting to a binary format. The 'λ' and 'ö' characters, which have a Unicode codepoint above 127, are expanded to their corresponding UTF-8 multi-byte representation.

We use the criterion library (http://hackage.haskell.org/package/criterion) to benchmark the efficiency of our encoding function on the following table.

> import Criterion.Main -- add this import to the ones above
>
> maxiTable :: Table
> maxiTable = take 1000 $ cycle table
>
> main :: IO ()
> main = defaultMain
>   [ bench "encodeUtf8CSV maxiTable (original)" $
>       whnf (L.length . encodeUtf8CSV) maxiTable
>   ]

On a Core2 Duo 2.20GHz on a 32-bit Linux, the above code takes 1ms to generate the 22'500 bytes long lazy ByteString. Looking again at the definitions above, we see that we took care to avoid intermediate data structures, as otherwise we would sacrifice performance. For example, the following (arguably simpler) definition of renderRow is about 20% slower.

> renderRow :: Row -> Builder
> renderRow = mconcat . intersperse (charUtf8 ',') . map renderCell

Similarly, using O(n) concatenations like ++ or the equivalent concat operations on strict and lazy ByteStrings should be avoided. The following definition of renderString is also about 20% slower.

> renderString :: String -> Builder
> renderString cs = stringUtf8 $ "\"" ++ concatMap escape cs ++ "\""
>
> escape '\\' = "\\\\"
> escape '\"' = "\\\""
> escape c = return c

Apart from removing intermediate data structures, encodings can be optimized further by fine-tuning their execution parameters using the functions in Data.ByteString.Builder.Extra and their "inner loops" using the functions in Data.ByteString.Builder.Prim.
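As one possible illustration of tuning execution parameters, the sketch below runs a builder with toLazyByteStringWith from Data.ByteString.Builder.Extra and an explicit allocation strategy instead of the defaults used by toLazyByteString. The builder numbers and the runner runTuned are hypothetical stand-ins, and the chosen sizes are an assumption rather than a recommendation; whether they help depends on the workload and should be measured.

```haskell
import qualified Data.ByteString.Lazy as L
import Data.ByteString.Builder (Builder, char7, intDec)
import Data.ByteString.Builder.Extra
  (defaultChunkSize, smallChunkSize, toLazyByteStringWith, untrimmedStrategy)

-- Hypothetical builder standing in for a table renderer:
-- the integers 1..n, each followed by a comma.
numbers :: Int -> Builder
numbers n = foldMap (\i -> intDec i <> char7 ',') [1 .. n]

-- Run a builder with an explicit allocation strategy: the first buffer has
-- smallChunkSize bytes, later ones defaultChunkSize bytes, and chunks are
-- never trimmed. The L.empty argument is the lazy ByteString that follows
-- the builder's output.
runTuned :: Builder -> L.ByteString
runTuned =
  toLazyByteStringWith (untrimmedStrategy smallChunkSize defaultChunkSize) L.empty
```

The strategy only changes how the output is split into chunks, not the bytes that are produced.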
module Data.ByteString.Lazy.Builder
bytestring Data.ByteString.Lazy.Builder
We decided to rename the Builder modules. Sorry about that. The old names will hang about for at least one release cycle before we deprecate them and then later remove them.
module Data.Text.Lazy.Builder
text Data.Text.Lazy.Builder
Efficient construction of lazy Text values. The principal operations on a Builder are singleton, fromText, and fromLazyText, which construct new builders, and mappend, which concatenates two builders. To get maximum performance when building lazy Text values using a builder, associate mappend calls to the right. For example, prefer

> singleton 'a' `mappend` (singleton 'b' `mappend` singleton 'c')

to

> singleton 'a' `mappend` singleton 'b' `mappend` singleton 'c'

as the latter associates mappend to the left.
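When the pieces come from a list, a right fold is one way to get the right-nested shape recommended above without writing the parentheses by hand. This is a minimal sketch; fromChars and abc are hypothetical names, and (<>) is assumed to be exported by Prelude.

```haskell
import qualified Data.Text.Lazy as TL
import Data.Text.Lazy.Builder (Builder, singleton, toLazyText)

-- foldr nests the appends to the right:
--   singleton 'a' <> (singleton 'b' <> (singleton 'c' <> mempty))
fromChars :: String -> Builder
fromChars = foldr (\c rest -> singleton c <> rest) mempty

abc :: TL.Text
abc = toLazyText (fromChars "abc")
```

A left fold over the same list would produce the same text but with the left-nested association that the documentation advises against.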
module Data.Generics.Builders
syb Data.Generics.Builders
This module provides generic builder functions. These functions construct values of a given type.
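A brief sketch of what such builder functions look like in use, assuming the syb package is installed and using its empty and constrs functions. For an algebraic datatype, empty picks the leftmost constructor, and constrs yields one value per constructor. The names defaultBool and allBools are hypothetical.

```haskell
import qualified Data.Generics.Builders as G

-- The "empty" value of Bool: built from its leftmost constructor.
defaultBool :: Bool
defaultBool = G.empty

-- One value for each constructor of Bool.
allBools :: [Bool]
allBools = G.constrs
```

The type annotation selects which type is built; both functions work for any instance of Data.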
module Generics.SYB.Builders
syb Generics.SYB.Builders
Convenience alias for Data.Generics.Builders.
hPutBuilder :: Handle -> Builder -> IO ()
bytestring Data.ByteString.Builder
Output a Builder to a Handle. The Builder is executed directly on the buffer of the Handle. If the buffer is too small (or not present), then it is replaced with a large enough buffer. It is recommended that the Handle is set to binary and BlockBuffering mode. See hSetBinaryMode and hSetBuffering. This function is more efficient than hPut . toLazyByteString because in many cases no buffer allocation has to be done. Moreover, the results of several executions of short Builders are concatenated in the Handle's buffer, therefore avoiding unnecessary buffer flushes.
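The recommended setup might look like the following sketch. The helper writeBuilder and the sample builder countdown are hypothetical; withBinaryFile already opens the Handle in binary mode, so only the buffering mode is set explicitly here. For an existing Handle such as stdout, hSetBinaryMode would be called as well.

```haskell
import Data.ByteString.Builder (Builder, char7, hPutBuilder, intDec)
import System.IO
  (BufferMode (BlockBuffering), IOMode (WriteMode), hSetBuffering, withBinaryFile)

-- Hypothetical helper: write a Builder to a file through a binary,
-- block-buffered Handle, as the documentation recommends.
writeBuilder :: FilePath -> Builder -> IO ()
writeBuilder path b =
  withBinaryFile path WriteMode $ \h -> do
    hSetBuffering h (BlockBuffering Nothing)
    hPutBuilder h b

-- Sample builder: the lines "3", "2", "1".
countdown :: Builder
countdown = foldMap (\i -> intDec i <> char7 '\n') [3, 2, 1 :: Int]
```

Calling writeBuilder "countdown.txt" countdown (the file name is arbitrary) writes the three lines to that file.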
runBuilder :: Builder -> BufferWriter
bytestring Data.ByteString.Builder.Extra
Turn a Builder into its initial BufferWriter action.
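To make the BufferWriter idea more concrete, here is a hedged sketch of a driver loop that feeds one fixed-size buffer to the writer until it reports Done. It assumes the Next constructors Done, More and Chunk exported by Data.ByteString.Builder.Extra. The function runToStrict and the builder sample are hypothetical, and the sketch simply fails when the builder asks for a buffer larger than the one supplied rather than growing it.

```haskell
import qualified Data.ByteString as S
import Data.ByteString.Builder (Builder, string7)
import Data.ByteString.Builder.Extra (BufferWriter, Next (..), runBuilder)
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr, castPtr)

-- Hypothetical driver: run a Builder against one reusable buffer of
-- bufSize bytes and collect everything it produces in a strict ByteString.
runToStrict :: Int -> Builder -> IO S.ByteString
runToStrict bufSize builder =
  allocaBytes bufSize $ \buf -> loop buf (runBuilder builder) []
  where
    loop :: Ptr Word8 -> BufferWriter -> [S.ByteString] -> IO S.ByteString
    loop buf writer acc = do
      -- The writer fills the buffer and reports how many bytes it wrote.
      (written, next) <- writer buf bufSize
      -- Copy the written bytes out so the buffer can be reused.
      filled <- S.packCStringLen (castPtr buf, written)
      let acc' = filled : acc
      case next of
        Done -> return (S.concat (reverse acc'))
        More minSize writer'
          | minSize > bufSize ->
              ioError (userError "buffer too small for this Builder")
          | otherwise -> loop buf writer' acc'
        -- A Chunk is a ByteString to be output after the buffer contents.
        Chunk bs writer' -> loop buf writer' (bs : acc')

-- Sample builder: 200 copies of the byte for 'x'.
sample :: Builder
sample = string7 (replicate 200 'x')
```

Running runToStrict 64 sample exercises the More case several times, since the output does not fit into a single 64-byte buffer.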