[Haskell-cafe] monadic DSL for compile-time parser generator, not possible?

dag.odenhall at gmail.com dag.odenhall at gmail.com
Tue Mar 12 21:36:27 CET 2013


Why not the parsers package [1]? Write the parser against the Parsing class
and then use trifecta or write instances for attoparsec or parsec. With
enough inlining perhaps the overhead of the class gets optimized away?

[1] http://hackage.haskell.org/package/parsers


On Tue, Mar 12, 2013 at 9:06 PM, Jeremy Shaw <jeremy at n-heptane.com> wrote:

> It would be pretty damn cool if you could create a data type for
> generically describing a monadic parser, and then use template haskell
> to generate a concrete parser from that data type. That would allow
> you to create your specification in a generic way and then target
> different parsers like parsec, attoparsec, etc. There is some code
> coming up in a few paragraphs that should describe this idea more
> clearly.
>
> I would like to suggest that while it would be cool, it is
> impossible. As proof, I will attempt to create a simple monadic parser
> that has only one combinator:
>
>     anyChar :: ParserSpec Char
>
> We need the GADTs extension and some imports:
>
> > {-# LANGUAGE GADTs, TemplateHaskell #-}
> > import Control.Monad              (join)
> > import qualified Text.Parsec.Char as P
> > import Language.Haskell.TH        (ExpQ, appE)
> > import Language.Haskell.TH.Syntax (Lift(lift))
> > import Text.Parsec                (parseTest)
> > import qualified Text.Parsec.Char as P
> > import Text.Parsec.String         (Parser)
>
> Next we define a type that has a constructor for each of the different
> combinators we want to support, plus constructors for the functor and
> monad methods:
>
> > data ParserSpec a where
> >     AnyChar :: ParserSpec Char
> >     Return  :: a -> ParserSpec a
> >     Join    :: ParserSpec (ParserSpec a) -> ParserSpec a
> >     FMap    :: (a -> b) -> ParserSpec a -> ParserSpec b
> >
> > instance Lift (ParserSpec a) where
> >     lift _ = error "not defined because we are screwed later anyway."
>
> In theory, we would extend that type with things like `Many`, `Some`,
> `Choice`, etc.
>
> In Haskell, we are used to seeing a `Monad` defined in terms of
> `return` and `>>=`. But, we can also define a monad in terms of
> `fmap`, `return` and `join`. We will do that in `ParserSpec`, because
> it makes the fatal flaw more obvious.
>
> Now we can define the `Functor` and `Monad` instances:
>
> > instance Functor ParserSpec where
> >     fmap f p = FMap f p
>
> > instance Monad ParserSpec where
> >     return a = Return a
> >     m >>= f  = Join ((FMap f) m)
>
> and the `anyChar` combinator:
>
> > anyChar :: ParserSpec Char
> > anyChar = AnyChar
>
> And now we can define a simple parser that parses two characters and
> returns them:
>
> > charPair :: ParserSpec (Char, Char)
> > charPair =
> >     do a <- anyChar
> >        b <- anyChar
> >        return (a, b)
>
> Now, we just need to define a template haskell function that generates
> a `Parser` from a `ParserSpec`:
>
> > genParsec :: (Lift a) => ParserSpec a -> ExpQ
> > genParsec AnyChar    = [| anyChar |]
> > genParsec (Return a) = [| return a |]
> > genParsec (Join p)   = genParsec p
> > -- genParsec (FMap f p) = appE [| f |] (genParsec p) -- uh-oh
>
> Looking at the `FMap` case we see the fatal flaw. In order to
> generate the parser we would need some way to transform any arbitrary
> Haskell function of type `a -> b` into Template Haskell. Obviously,
> that is impossible (for some definition of obvious).
>
> Therefore, we can assume that it is not possible to use Template
> Haskell to generate a monadic parser from a monadic specification.
>
> We can also assume that `Applicative` is not available either. Seems
> likely that `Category` based parsers would also be out.
>
> Now, we can, of course, do the transformation at runtime:
>
> > interpretParsec :: ParserSpec a -> Parser a
> > interpretParsec AnyChar    = P.anyChar
> > interpretParsec (Return a) = return a
> > interpretParsec (FMap f a) = fmap f (interpretParsec a)
> > interpretParsec (Join mm)  = join (fmap interpretParsec (interpretParsec
> mm))
>
> > test = parseTest (interpretParsec charPair) "ab"
>
> My fear is that doing that will result in added runtime overhead. One
> reason for wanting to create a compile-time parser generator is to have
> the opportunity to generate very fast parsing code. It seems like here
> we can only be slower than the parser we are targeting? Though..
> perhaps not? Perhaps the parser returned by `interpretParsec` has all
> the interpret stuff removed and is as fast as if we have constructed
> it by hand? I don't have an intuitive feel for it.. I guess criterion
> would know..
>
> Any thoughts?
>
> - jeremy
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20130312/5cc998a5/attachment.htm>


More information about the Haskell-Cafe mailing list