bytestring-0.10.4.0: Fast, compact, strict and lazy byte strings with a list interface

Copyright: (c) 2010 Jasper Van der Jeugt, (c) 2010-2011 Simon Meier
License: BSD3-style (see LICENSE)
Maintainer: Simon Meier <[email protected]>
Portability: GHC
Safe Haskell: Trustworthy
Language: Haskell98

Data.ByteString.Builder.Extra

Description

Extra functions for creating and executing Builders. They are intended for application-specific fine-tuning of the performance of Builders.

Execution strategies

toLazyByteStringWith
  :: AllocationStrategy  -- Buffer allocation strategy to use
  -> ByteString          -- Lazy ByteString to use as the tail of the generated lazy ByteString
  -> Builder             -- Builder to execute
  -> ByteString          -- Resulting lazy ByteString

Heavy inlining. Execute a Builder with custom execution parameters.

This function is inlined despite its heavy code-size to allow fusing with the allocation strategy. For example, the default Builder execution function toLazyByteString is defined as follows.

{-# NOINLINE toLazyByteString #-}
toLazyByteString =
  toLazyByteStringWith (safeStrategy smallChunkSize defaultChunkSize) L.empty

where L.empty is the zero-length lazy ByteString.

In most cases, the parameters used by toLazyByteString give good performance. One case where toLazyByteString performs poorly is executing short (<128 bytes) Builders: the allocation overhead for the first 4kb buffer and the cost of trimming it dominate the cost of executing the Builder. You can avoid this problem using

toLazyByteStringWith (safeStrategy 128 smallChunkSize) L.empty

This reduces the allocation and trimming overhead: all generated ByteStrings fit into the first buffer, and no trimming is required if more than 64 bytes and less than 128 bytes are written.
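As a minimal sketch of the tuning described above, the following wraps the custom strategy in a reusable function. The names toLazyByteStringShort and greeting are hypothetical and only serve the illustration; the library calls are the ones documented on this page plus string7 and intDec from Data.ByteString.Builder.

```haskell
import qualified Data.ByteString.Lazy as L
import Data.ByteString.Builder (Builder, intDec, string7)
import Data.ByteString.Builder.Extra
  (safeStrategy, smallChunkSize, toLazyByteStringWith)

-- | Hypothetical helper: execute Builders that are expected to produce
-- fewer than 128 bytes, starting with a 128-byte buffer instead of the
-- default first buffer.
toLazyByteStringShort :: Builder -> L.ByteString
toLazyByteStringShort =
  toLazyByteStringWith (safeStrategy 128 smallChunkSize) L.empty

-- | Hypothetical short Builder, well under 128 bytes of output.
greeting :: Int -> Builder
greeting n = mconcat [string7 "id=", intDec n]
```

Whether this is actually faster than toLazyByteString for a given workload is best confirmed by benchmarking; longer outputs simply continue into buffers of smallChunkSize.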

data AllocationStrategy

A buffer allocation strategy for executing Builders.

safeStrategy
  :: Int                 -- Size of first buffer
  -> Int                 -- Size of successive buffers
  -> AllocationStrategy  -- An allocation strategy that guarantees that at least half of the allocated memory is used for live data

Use this strategy for generating lazy ByteStrings whose chunks are likely to survive one garbage collection. This strategy trims buffers that are less than half full in order to avoid wasting too much memory.

untrimmedStrategy
  :: Int                 -- Size of the first buffer
  -> Int                 -- Size of successive buffers
  -> AllocationStrategy  -- An allocation strategy that does not trim any of the filled buffers before converting it to a chunk

Use this strategy for generating lazy ByteStrings whose chunks are discarded right after they are generated, for example, if you generate them only to write them to a network socket.
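A sketch of the short-lived-chunk use case, assuming a Handle stands in for the network socket mentioned above. The names toLazyByteStringUntrimmed, writeBuilder and logLine are hypothetical; the buffer sizes are simply the ones toLazyByteString uses.

```haskell
import qualified Data.ByteString.Lazy as L
import Data.ByteString.Builder (Builder, string7)
import Data.ByteString.Builder.Extra
  (defaultChunkSize, smallChunkSize, toLazyByteStringWith, untrimmedStrategy)
import System.IO (Handle)

-- | Hypothetical helper: execute a Builder without trimming any buffer.
toLazyByteStringUntrimmed :: Builder -> L.ByteString
toLazyByteStringUntrimmed =
  toLazyByteStringWith (untrimmedStrategy smallChunkSize defaultChunkSize) L.empty

-- | Hypothetical helper: write the chunks to a Handle as they are
-- produced. Each chunk becomes garbage right after it is written, so the
-- unused tail of a buffer is not retained for long.
writeBuilder :: Handle -> Builder -> IO ()
writeBuilder h = L.hPut h . toLazyByteStringUntrimmed

-- | Hypothetical sample output.
logLine :: Builder
logLine = string7 "ready\n"
```

For example, writeBuilder stdout logLine (with stdout from System.IO) writes the line to standard output.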

smallChunkSize :: Int

The recommended chunk size. Currently set to 4k, less the memory management overhead.

defaultChunkSize :: Int

The chunk size used for I/O. Currently set to 32k, less the memory management overhead.

Controlling chunk boundaries

byteStringCopy :: ByteString -> Builder

Construct a Builder that copies the strict ByteString.

Use this function to create Builders from smallish (<= 4kb) ByteStrings or if you need to guarantee that the ByteString is not shared with the chunks generated by the Builder.

byteStringInsert :: ByteString -> Builder

Construct a Builder that always inserts the strict ByteString directly as a chunk.

This implies flushing the output buffer, even if it contains just a single byte. You should therefore use byteStringInsert only for large (> 8kb) ByteStrings. Otherwise, the generated chunks are too fragmented to be processed efficiently afterwards.

byteStringThreshold :: Int -> ByteString -> Builder

Construct a Builder that copies the strict ByteString if its size is less than or equal to the threshold, and inserts it directly otherwise.

For example, byteStringThreshold 1024 copies strict ByteStrings whose size is less than or equal to 1kb, and inserts them directly otherwise. This implies that the average chunk size of the generated lazy ByteString may be as low as 513 bytes, as there could always be just a single byte between the directly inserted 1025-byte strict ByteStrings.
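The effect of the threshold on chunk boundaries can be observed with a small sketch. The names frame and chunkLengths are hypothetical; word8 and toLazyByteString come from Data.ByteString.Builder.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L
import Data.ByteString.Builder (Builder, toLazyByteString, word8)
import Data.ByteString.Builder.Extra (byteStringThreshold)

-- | Hypothetical framing function: a one-byte tag followed by a payload.
-- Payloads of at most 1024 bytes are copied into the output buffer;
-- larger payloads are inserted directly as their own chunk.
frame :: S.ByteString -> Builder
frame payload = mconcat [word8 0x01, byteStringThreshold 1024 payload]

-- | Lengths of the chunks produced when executing a Builder.
chunkLengths :: Builder -> [Int]
chunkLengths = map S.length . L.toChunks . toLazyByteString
```

With a 2000-byte payload one would expect the tag to end up in a one-byte chunk of its own, followed by the inserted payload, which is the fragmentation described above. With a 100-byte payload, the tag and the payload should share a single chunk.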

lazyByteStringCopy :: ByteString -> Builder

Construct a Builder that copies the lazy ByteString.

lazyByteStringInsert :: ByteString -> Builder

Construct a Builder that inserts all chunks of the lazy ByteString directly.

lazyByteStringThreshold :: Int -> ByteString -> Builder

Construct a Builder that uses the thresholding strategy of byteStringThreshold for each chunk of the lazy ByteString.

flush :: Builder

Flush the current buffer. This introduces a chunk boundary.
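A small sketch of the chunk boundary that flush introduces. The names twoMessages and chunkLengths are hypothetical; string7 and toLazyByteString come from Data.ByteString.Builder.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L
import Data.ByteString.Builder (Builder, string7, toLazyByteString)
import Data.ByteString.Builder.Extra (flush)

-- | Hypothetical Builder: two short messages separated by a flush.
twoMessages :: Builder
twoMessages = mconcat [string7 "first", flush, string7 "second"]

-- | Lengths of the chunks produced when executing a Builder.
chunkLengths :: Builder -> [Int]
chunkLengths = map S.length . L.toChunks . toLazyByteString
```

Here chunkLengths twoMessages should report two chunks, of 5 and 6 bytes. Without the flush, both strings would be written into the same buffer and end up in a single 11-byte chunk.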

Low level execution

type BufferWriter = Ptr Word8 -> Int -> IO (Int, Next)

A BufferWriter represents the result of running a Builder. It unfolds as a sequence of chunks of data. These chunks come in two forms:

  • an IO action for writing the Builder's data into a user-supplied memory buffer.
  • a pre-existing chunk of data represented by a strict ByteString

While this is rather low level, it provides you with full flexibility in how the data is written out.

The BufferWriter itself is an IO action: you supply it with a buffer (as a pointer and length) and it will write data into the buffer. It returns a number indicating how many bytes were actually written (which can be 0). It also returns a Next which describes what comes next.

data Next

After running a BufferWriter action there are three possibilities for what comes next:

Constructors

Done

This means we're all done. All the builder data has now been written.

More !Int BufferWriter

This indicates that there may be more data to write. It gives you the next BufferWriter action. You should call that action with an appropriate buffer. The Int indicates the minimum buffer size required by the next BufferWriter action. That is, if you call the next action you must supply it with a buffer length of at least this size.

Chunk !ByteString BufferWriter

In addition to the data that has just been written into your buffer by the BufferWriter action, it gives you a pre-existing chunk of data as a ByteString. It also gives you the following BufferWriter action. It is safe to run this following action using a buffer with as much free space as was left by the previous run action.

runBuilder :: Builder -> BufferWriter

Turn a Builder into its initial BufferWriter action.
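The protocol described by BufferWriter and Next can be sketched as a simple driver loop. This is only an illustration under stated assumptions, not how the library executes Builders: the names runToChunks and sample are hypothetical, a fresh scratch buffer is allocated for every step, and the written bytes are copied out into strict ByteStrings.

```haskell
import qualified Data.ByteString as S
import Data.ByteString.Builder (Builder, string7)
import Data.ByteString.Builder.Extra (BufferWriter, Next (..), runBuilder)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (castPtr)

-- | Hypothetical driver: run a Builder through scratch buffers of at
-- least the given size and collect the output as strict chunks.
runToChunks :: Int -> Builder -> IO [S.ByteString]
runToChunks bufSize = fmap (filter (not . S.null)) . go bufSize . runBuilder
  where
    go :: Int -> BufferWriter -> IO [S.ByteString]
    go size writer = do
      (written, next) <- allocaBytes size $ \buf -> do
        (n, nxt) <- writer buf size
        -- Copy the n bytes just written out of the scratch buffer.
        bytes <- S.packCStringLen (castPtr buf, n)
        return (bytes, nxt)
      rest <- case next of
        Done -> return []
        -- Honour the minimum buffer size requested by the next action.
        More minSize writer' -> go (max size minSize) writer'
        -- A pre-existing chunk follows the bytes written in this step.
        Chunk bs writer' -> fmap (bs :) (go size writer')
      return (written : rest)

-- | Hypothetical sample Builder.
sample :: Builder
sample = string7 "hello, low-level world"
```

For example, runToChunks 8 sample should yield several chunks of at most 8 bytes each that concatenate to the sample string. A production driver would more likely reuse one buffer and hand the filled region to its consumer without copying.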

Host-specific binary encodings

intHost :: Int -> Builder

Encode a single native machine Int. The Int is encoded in the byte order (endianness) of the host machine. On a 64-bit machine the Int is an 8-byte value; on a 32-bit machine, 4 bytes. Values encoded this way are not portable to machines with a different endianness or Int size without conversion.

int16Host :: Int16 -> Builder

Encode an Int16 in native host order and host endianness.

int32Host :: Int32 -> Builder

Encode an Int32 in native host order and host endianness.

int64Host :: Int64 -> Builder

Encode an Int64 in native host order and host endianness.

wordHost :: Word -> Builder

Encode a single native machine Word. The Word is encoded in the byte order (endianness) of the host machine. On a 64-bit machine the Word is an 8-byte value; on a 32-bit machine, 4 bytes. Values encoded this way are not portable to machines with a different endianness or Word size without conversion.

word16Host :: Word16 -> Builder

Encode a Word16 in native host order and host endianness.

word32Host :: Word32 -> Builder

Encode a Word32 in native host order and host endianness.

word64Host :: Word64 -> Builder

Encode a Word64 in native host order and host endianness.

floatHost :: Float -> Builder

Encode a Float in native host order. Values encoded this way are not portable to machines with a different endianness without conversion.

doubleHost :: Double -> Builder

Encode a Double in native host order.
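The host dependence of these encodings can be made visible with a short sketch that compares word32Host against the fixed-endianness encoders word32LE and word32BE from Data.ByteString.Builder. The names Endianness, hostEndianness and intHostWidth are hypothetical.

```haskell
import qualified Data.ByteString.Lazy as L
import Data.ByteString.Builder (toLazyByteString, word32BE, word32LE)
import Data.ByteString.Builder.Extra (intHost, word32Host)
import Data.Word (Word32)
import Foreign.Storable (sizeOf)

-- | Hypothetical result type for the probe below.
data Endianness = Little | Big | Unknown
  deriving (Eq, Show)

-- | Detect the host byte order by comparing the host encoding of a probe
-- value with its little-endian and big-endian encodings.
hostEndianness :: Endianness
hostEndianness
  | host == toLazyByteString (word32LE probe) = Little
  | host == toLazyByteString (word32BE probe) = Big
  | otherwise = Unknown
  where
    probe :: Word32
    probe = 0x01020304
    host = toLazyByteString (word32Host probe)

-- | Number of bytes that intHost emits on this machine.
intHostWidth :: Int
intHostWidth = fromIntegral (L.length (toLazyByteString (intHost 0)))
```

On a given machine, intHostWidth should agree with sizeOf (0 :: Int). Data that must be read on other machines is better written with the fixed-endianness encoders.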