[Haskell-cafe] Efficient use of ByteString and type classes in template system

Tue Apr 17 08:39:03 EDT 2007

Great!

I've written some QuickCheck tests now (not commited) so I can start
to swap out the implementation and benchmark it. After I get it to run
fast enough and some nice utility methods (like the possibility of
using records as context) I'll announce a version 1.0.

Johan

On 4/17/07, Thomas Hartman <tphyahoo at gmail.com> wrote:
> Created wiki page
>
> http://haskell.org/haskellwiki/String_Interpolation
>
> and referenced various topics mentioned in this thread, there.
>
> 2007/4/16, Donald Bruce Stewart <dons at cse.unsw.edu.au>:
> > johan.tibell:
> > > Hi Haskell Caf?!
> > >
> > > I'm writing a perl/python like string templating system which I plan
> > > to release soon:
> > >
> > > darcs get http://darcs.johantibell.com/template
> > >
> > > The goal is to provide simple string templating; no inline code, etc..
> > > An alternative to printf and ++.
> >
> > Ok. You might also want to briefly look at the other templating system I
> > know of in Haskell, this small module by Stefan Wehr,
> >
> >     http://www.cse.unsw.edu.au/~dons/code/icfp05/tests/unit-tests/VariableExpansion.hs
> >
> > Just a quick thing he did for the ICFP contest, but does indicate one
> > way to do it (i.e. via pretty printing).
> >
> > >
> > > Example usage:
> > >
> > > >import qualified Data.ByteString as B
> > > >import Text.Template
> > > >
> > > >helloTemplate = "Hello, $name! Would you like some ${fruit}s?"
> > > >helloContext = [("name", "Johan"), ("fruit", "banana")]
> > > >
> > > >test1 = B.putStrLn $ substitute (B.pack helloTemplate) helloContext
> > >
> > > I want to make it perform well, especially when creating a template
> > > once and then rendering it multiple times. "Compiling" the template is
> > > a separate step from rendering in this use case:
> > >
> > > >compiledTemplate = template $ B.pack helloTemplate
> > > >
> > > >test2 = B.putStrLn $ render compiledTemplate helloContext
> > >
> > > A template is represented by a list of template fragments, each
> > > fragment is either a ByteString literal or a variable which is looked
> > > up in the "context" when rendered.
> > >
> > > >data Frag = Lit ByteString | Var ByteString
> > > >newtype Template = Template [Frag]
> > >
> > > This leads me to my first question. Would a lazy ByteString be better
> > > or worse here? The templates are of limited length. I would say the
> > > length is usually between one paragraph and a whole HTML page. The
> > > Template data type already acts a bit like a lazy ByteString since it
> > > consists of several chunks (although the chunck size is not adjusted
> > > to the CPU cache size like with the lazy ByteString).
> >
> > Probably lazy bytestrings are better here, since you get O(n/k) append
> > cost, rather than O(n).  If most strings are small, it mightn't be
> > noticeable.
> >
> > > Currently the context in which a template is rendered is represented
> > > by a type class.
> > >
> > > >class Context c where
> > > >    lookup :: ByteString -> c -> Maybe ByteString
> > > >
> > > >instance Context (Map String String) where
> > > >    lookup k c = liftM B.pack (Map.lookup (B.unpack k) c)
> > > >
> > > >instance Context (Map ByteString ByteString) where
> > > >    lookup = Map.lookup
> > > >
> > > >-- More instance, for [(String, String)], etc.
> > >
> > > I added this as a convenience for the user, mainly to work around the
> > > problem of not having ByteString literals. A typical usage would have
> > > the keys in the context being literals and the values some variables:
> >
> > note sure if it is relevant, but:
> >
> >     pack "Foo"
> >
> > will be converted via rewrite rules to a bytestring literal at compile
> > time. So there's no overhead for having String literals.
> >
> > >
> > > >someContext = Map.fromList [("name", name), ("fruit", fruit)]
> > >
> > > I'm not sure if this was a good decision, With this I'm halfway to the
> > > (in)famous Stringable class and it seems like many smarter people than
> >
> > Yes, seems a little worrying.
> >
> > > me have avoided introducing such a class. How will this affect
> > > performace? Take for example the rendering function:
> > >
> > > >render :: Context c => Template -> c -> ByteString
> > > >render (Template frags) ctx = B.concat $ map (renderFrag ctx) frags
> > > >
> > > >renderFrag :: Context c => c -> Frag -> ByteString
> > > >renderFrag ctx (Lit s) = s
> > > >renderFrag ctx (Var x) = case Text.Template.lookup x ctx of
> > > >                           Just v  -> v
> > > >                           Nothing -> error $ "Key not found: " ++
> > > >                           (B.unpack x)
> > >
> > > How will the type dictionary 'c' hurt performance here? Would
> > > specializing the function directly in render help?
> >
> > Hmm. Hard to say: look at the Core code and we will know.
> >
> > Really though, you'll need some stress test cases to be able to make
> > resonable conclusions about performance.
> >
> > >
> > > >render (Template frags) ctx = B.concat $ map (renderFrag f) frags
> > > >    where f = flip Text.Template.lookup ctx
> > > >
> > > >renderFrag f (Var x) = case f x of
> > >
> > > I can see the implementation taking one of the following routes:
> > > - Go full Stringable, including for the Template
> > > - Revert to Context = Map ByteString ByteString which was the original
> > > implementation.
> > > - Some middle road, without MPTC, for example:
> > > >class Context c where
> > > >    lookup :: ByteString -> c ByteString ByteString -> Maybe ByteString
> > > This would allow the user to supply some more efficient data type for
> > > lookup but not change the string type. Having a type class would allow
> > > me to provide things like the possibility to create a Context from a
> > > record where each record accessor function would server as key.
> > > Something like:
> > >
> > > >data Person { personName :: String, personAge :: Int }
> > > would get converted (using Data?) to:
> > > >personContext = [("personName", show $ personName aPerson),
> > > >                 ("personAge", show $ personAge aPerson)]
> > > but not actually using a Map but the record itself.
> > >
> > > I guess my more general question is: how do I reason about the
> > > performance of my code or any code like this? Are there any other
> > > performance improvements that could be made?
> > >
> > > Also, I would be grateful if someone could provide some feedback on
> > > the implementation, anything goes!
> > >
> > > I still have some known TODOs:
> > >
> > > - Import error messages for invalid uses of "$".
> > > - Improve the regex usage overall.
> > > - Add some more functions; the plan is to add those function which
> > > could be expressed in efficiently with the current interface. An
> > > example is things like renderAndWrite, when writing doing a B.concat
> > > first is unnecessary.
> >
> > I'd suggest: keep it simple and fast. Then work out what extra stuff you
> > need.
> >
> > -- Don
> > _______________________________________________
> > Haskell-Cafe mailing list
> > Haskell-Cafe at haskell.org
> > http://www.haskell.org/mailman/listinfo/haskell-cafe
> >
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>