[Haskell-cafe] monadic DSL for compile-time parser generator, not possible?

Jeremy Shaw jeremy at n-heptane.com
Tue Mar 12 21:53:05 CET 2013


Mostly, because I want to do other sorts of compile-time inspections
on the parser. Being able to generate the parser is just the easiest
part to get started with.

- jeremy

On Tue, Mar 12, 2013 at 3:36 PM, dag.odenhall at gmail.com
<dag.odenhall at gmail.com> wrote:
> Why not the parsers package [1]? Write the parser against the Parsing class
> and then use trifecta or write instances for attoparsec or parsec. With
> enough inlining perhaps the overhead of the class gets optimized away?
>
> [1] http://hackage.haskell.org/package/parsers
>
>
> On Tue, Mar 12, 2013 at 9:06 PM, Jeremy Shaw <jeremy at n-heptane.com> wrote:
>>
>> It would be pretty damn cool if you could create a data type for
>> generically describing a monadic parser, and then use template haskell
>> to generate a concrete parser from that data type. That would allow
>> you to create your specification in a generic way and then target
>> different parsers like parsec, attoparsec, etc. There is some code
>> coming up in a few paragraphs that should describe this idea more
>> clearly.
>>
>> I would like to suggest that while it would be cool, it is
>> impossible. As proof, I will attempt to create a simple monadic parser
>> that has only one combinator:
>>
>>     anyChar :: ParserSpec Char
>>
>> We need the GADTs extension and some imports:
>>
>> > {-# LANGUAGE GADTs, TemplateHaskell #-}
>> > import Control.Monad              (join)
>> > import qualified Text.Parsec.Char as P
>> > import Language.Haskell.TH        (ExpQ, appE)
>> > import Language.Haskell.TH.Syntax (Lift(lift))
>> > import Text.Parsec                (parseTest)
>> > import qualified Text.Parsec.Char as P
>> > import Text.Parsec.String         (Parser)
>>
>> Next we define a type that has a constructor for each of the different
>> combinators we want to support, plus constructors for the functor and
>> monad methods:
>>
>> > data ParserSpec a where
>> >     AnyChar :: ParserSpec Char
>> >     Return  :: a -> ParserSpec a
>> >     Join    :: ParserSpec (ParserSpec a) -> ParserSpec a
>> >     FMap    :: (a -> b) -> ParserSpec a -> ParserSpec b
>> >
>> > instance Lift (ParserSpec a) where
>> >     lift _ = error "not defined because we are screwed later anyway."
>>
>> In theory, we would extend that type with things like `Many`, `Some`,
>> `Choice`, etc.
>>
>> In Haskell, we are used to seeing a `Monad` defined in terms of
>> `return` and `>>=`. But, we can also define a monad in terms of
>> `fmap`, `return` and `join`. We will do that in `ParserSpec`, because
>> it makes the fatal flaw more obvious.
>>
>> Now we can define the `Functor` and `Monad` instances:
>>
>> > instance Functor ParserSpec where
>> >     fmap f p = FMap f p
>>
>> > instance Monad ParserSpec where
>> >     return a = Return a
>> >     m >>= f  = Join ((FMap f) m)
>>
>> and the `anyChar` combinator:
>>
>> > anyChar :: ParserSpec Char
>> > anyChar = AnyChar
>>
>> And now we can define a simple parser that parses two characters and
>> returns them:
>>
>> > charPair :: ParserSpec (Char, Char)
>> > charPair =
>> >     do a <- anyChar
>> >        b <- anyChar
>> >        return (a, b)
>>
>> Now, we just need to define a template haskell function that generates
>> a `Parser` from a `ParserSpec`:
>>
>> > genParsec :: (Lift a) => ParserSpec a -> ExpQ
>> > genParsec AnyChar    = [| anyChar |]
>> > genParsec (Return a) = [| return a |]
>> > genParsec (Join p)   = genParsec p
>> > -- genParsec (FMap f p) = appE [| f |] (genParsec p) -- uh-oh
>>
>> Looking at the `FMap` case we see the fatal flaw. In order to
>> generate the parser we would need some way to transform any arbitrary
>> Haskell function of type `a -> b` into Template Haskell. Obviously,
>> that is impossible (for some definition of obvious).
>>
>> Therefore, we can assume that it is not possible to use Template
>> Haskell to generate a monadic parser from a monadic specification.
>>
>> We can also assume that `Applicative` is not available either. Seems
>> likely that `Category` based parsers would also be out.
>>
>> Now, we can, of course, do the transformation at runtime:
>>
>> > interpretParsec :: ParserSpec a -> Parser a
>> > interpretParsec AnyChar    = P.anyChar
>> > interpretParsec (Return a) = return a
>> > interpretParsec (FMap f a) = fmap f (interpretParsec a)
>> > interpretParsec (Join mm)  = join (fmap interpretParsec (interpretParsec
>> > mm))
>>
>> > test = parseTest (interpretParsec charPair) "ab"
>>
>> My fear is that doing that will result in added runtime overhead. One
>> reason for wanting to create a compile-time parser generator is to have
>> the opportunity to generate very fast parsing code. It seems like here
>> we can only be slower than the parser we are targeting? Though..
>> perhaps not? Perhaps the parser returned by `interpretParsec` has all
>> the interpret stuff removed and is as fast as if we have constructed
>> it by hand? I don't have an intuitive feel for it.. I guess criterion
>> would know..
>>
>> Any thoughts?
>>
>> - jeremy
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>



More information about the Haskell-Cafe mailing list