[Haskell-cafe] attoparsec and backtracking

Sat Mar 16 04:26:57 CET 2013

On 16 March 2013 12:54, Erik de Castro Lopo <mle+hs at mega-nerd.com> wrote:
> Evan Laforge wrote:
>
>> The first is that it's hard to get the right error msg out.  For
>> instance, I have a parser that tries to parse a number with an
>> optional type suffix.  It's an error if the suffix is unrecognized:
>>
>> p_num :: A.Parser Score.TypedVal
>> p_num = do
>>     num <- p_untyped_num
>>     suffix <- A.option "" ((:"") <$> A.letter_ascii)
>>     case Score.code_to_type suffix of
>>         Nothing -> fail $ "p_num expected suffix in [cdsr]: " ++ show suffix
>>         Just typ -> return $ Score.Typed typ num
>
> I think the mistake here is to parse something and then decide if
> its it valid. It should be the parser which decides whether its
> valid. So rather than:
>
>      suffix <- A.option "" ((:"") <$> A.letter_ascii)
>
> try:
>
>      typ <- A.choice [ {- list or valid suffix parsers -} ]
>      return $ Score.Typed typ num
>
>> However, which error msg shows up depends on the order of the (<|>)
>> alternatives, and in general the global structure of the entire
>> parser, because I think it just backtracks and then picks the last
>> failing backtrack.
>
> I'm not sure if what I've offered will help, but its worth a try.
>
>> Even after carefully rearranging all the parsers
>> it seems impossible to get this particular error to bubble up to the
>> top.
>
> Yes, I've found it impossible to force attoparsec to fail a parse.
> I think that is intended as a feature.

I don't know about a "feature", but I tried adding polyparse-style
commit semantics to attoparsec and couldn't do so without making it
rather noticeably slower.

>
>> The thing is, as soon as I see an unexpected suffix I know I can
>> fail entirely right there, with exactly that error msg, but since
>> there's no way to turn off backtracking I think there's no way to do
>> that.
>
> Yes, that's my impression.
>
> <snip>
>
>> I
>> originally used parsec, but parsing speed is my main bottleneck, so I
>> don't want to give ground there.
>
> We you using Parsec as a token parser or as a Char parser. Obviously
> the second is going to be slow in comparison to the first.
>
>> I've heard some good things about traditional alex+happy...
>> of course it would mean a complete rewrite but might be interesting.
>
> I've user Alex with both Parsec and Happy, where speed was strong
> secondary goal. Personally I much prefer Parsec; IMO its easier to debug
> and extend and modify that Happy based parsers. I also know some people
> prefer Happy.
>
>> Has anyone compared the performance of attoparsec vs. alex+happy?
>
> I haven't, nor have I compared those two with alex+parsec. It would
> be an interesting experiment.
>
> Erik
> --
> ----------------------------------------------------------------------
> Erik de Castro Lopo
> http://www.mega-nerd.com/
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe

-- 
Ivan Lazar Miljenovic
Ivan.Miljenovic at gmail.com
http://IvanMiljenovic.wordpress.com