[Haskell-cafe] blaze-builder and FlexibleInstances in code that aims to become part of the Haskell platform

Johan Tibell johan.tibell at gmail.com
Mon May 23 17:43:04 CEST 2011


On Fri, May 20, 2011 at 11:12 PM, Simon Meier <iridcode at gmail.com> wrote:
> There, seems to be a historical artefact here. The new Write
> abstraction in system-io-write is different from the one used in
> blaze-builder. It's type is
>
>  data Write a = Write Int (a -> Ptr Word8 -> IO (Ptr Word8))
>
> This definition ensures that the bound on the number of bytes written
> is independent of the value being encoded. That's crucial for the
> implementation of `mapWriteByteString`. It also benefits the other
> Write combinators, as the bound can always be computed in a
> data-independent fashion. Inlining, is therefore really sufficient to
> arrive at a constant bound during compile time.

I don't see why this makes a difference, you could still do

    myWrite x = Write (length x) (\ _ p -> pokePokePoke p x)

> I don't see how this Write type can be emulated using `writeAtMost`, do you?

There's no difference, as I showed above. Both can result in data
dependent lengths. It's up to the programmer to make sure the length
is independent of the value being written, when so desired.

> Hmm, all my Writes are top-level function definitions annotated with
> {-# INLINE #-}. Moreover, all combinators for Writes are also inlined
> and all their calls are saturated. Therefore, I thought GHC is capable
> of optimizing away the pattern matches on the Write constructor.

You also need to make all top-level functions non-recursive but from
what I remember you did so. The case for Writes is the same as for
higher-order arguments, the call site must meet the definition site.
So if you have something like:

    myWrite :: Write Word8

    writeList :: Write a -> [a] -> ...

    f xs = writeList myWrite xs

we need to make sure both myWrite and writeList are inlined into f.
The case is similar for writeAtMost. The question is what happens if
the user ever fails to get everything to inline optimally. In the
writeAtMost case just have an indirect function call instead of a
direct one. In the Write case we also have extra allocation and
indirection. We've had such problems in e.g. attoparsec. While things
should inline properly in big programs they rarely do. Same problem
exists for fusion where fusion constructors end up in the final
program although they should have been eliminated.

> I'm happy to remove Writes, if there's a superior way of sharing the
> low-level encoding code that they abstract. However, I did peek at
> Core from time to time and found that the Write constructors were
> optimized away. I currently see Writes as an expert domain to be used
> by authors of libraries like bytestring, text, aeson, blaze-html, etc.
> With appropriate documentation and benchmarks I expect them to be able
> to make good choices w.r.t. inlining and partial application.

I agree. Writes (and writeAtMost) would be the domain of experts.

If we expects write to be reused a lot it might make sense to have a
separate Write type. Note that I'd be reluctant to see dependencies
that involve I/O underneath bytestring as it's designed as a pure data
structure library (and is likely to have things involving I/O on top
of it).

Cheers,
Johan



More information about the Haskell-Cafe mailing list