[Haskell-cafe] parsec manyTill stack overflow

Badea Daniel badeadaniel at yahoo.com
Fri Jul 4 18:15:34 EDT 2008


The file I'm trying to parse contains mixed sections like:

...

<start_section=

... script including arithmetic expressions ...

/end_section>

...

so I defined two parsers: one for the 'outer' language and 
the other one for the 'inner' language.  I used (manyTill 
inner_parser end_section_parser) but I got a stack overflow
because there's just too much text between section begin and
end.

With getInput I can switch from the outer parser to the inner
parser, but the inner parser then tries to parse until eof and
fails when it hits the '/end_section>' marker.
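
One way around this might be to cut the section body out of the input before the inner parser ever sees it, so that the inner parser's eof is the end of the section. What follows is only a minimal sketch under some assumptions: the input is a String, the text '/end_section>' never occurs inside the script itself, and a section opens with the literal '<start_section='. The names endMarker, splitAtMarker, sectionBody and section are made up for illustration and are not part of Parsec.

```haskell
import Data.List (foldl', isPrefixOf)
import Text.Parsec
import Text.Parsec.Pos (updatePosChar)
import Text.Parsec.String (Parser)

-- Hypothetical end marker, taken from the example above.
endMarker :: String
endMarker = "/end_section>"

-- Split the input at the first occurrence of the marker.
-- Returns Nothing when the marker does not occur at all.
splitAtMarker :: String -> String -> Maybe (String, String)
splitAtMarker marker = go []
  where
    go acc rest
      | marker `isPrefixOf` rest = Just (reverse acc, drop (length marker) rest)
      | otherwise = case rest of
          []     -> Nothing
          (c:cs) -> go (c : acc) cs

-- Return everything up to the end marker and resume parsing after it.
-- No manyTill is involved; the search is a plain loop over the String.
sectionBody :: Parser String
sectionBody = do
  input <- getInput
  case splitAtMarker endMarker input of
    Nothing -> parserFail ("missing " ++ endMarker)
    Just (body, rest) -> do
      pos <- getPosition
      -- Keep the source position in step with the text we skipped.
      setPosition (foldl' updatePosChar pos (body ++ endMarker))
      setInput rest
      return body

-- Run the inner parser on the section body only, so it may safely end with eof.
section :: Parser a -> Parser a
section inner = do
  _ <- string "<start_section="
  startPos <- getPosition
  body <- sectionBody
  case parse (setPosition startPos *> inner <* eof) (sourceName startPos) body of
    Left err -> parserFail (show err)
    Right x  -> return x
```

The outer parser would then call something like (section inner_parser) wherever a section may start. The whole section body is held in memory at once, and an error from the inner parser is reported as a plain message from the outer one.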


--- On Fri, 7/4/08, Derek Elkins <derek.a.elkins at gmail.com> wrote:

> From: Derek Elkins <derek.a.elkins at gmail.com>
> Subject: Re: [Haskell-cafe] parsec manyTill stack overflow
> To: "Badea Daniel" <badeadaniel at yahoo.com>
> Cc: haskell-cafe at haskell.org
> Date: Friday, July 4, 2008, 2:22 PM
> On Fri, 2008-07-04 at 13:31 -0700, Badea Daniel wrote:
> > I'm trying to parse a large file looking for instructions on each
> > line and for a section end marker but Parsec's manyTill function
> > causes stack overflow, as you can see in the following example (I'm
> > using ghci 6.8.3):
> > 
> > > parse (many anyChar) "" ['a'|x<-[1..1024*64]]
> > 
> > It almost immediately starts printing "aaaaaaaaaaa...." and runs to
> > completion.
> > 
> > > parse (manyTill anyChar eof) "" ['a'|x<-[1..1024*1024]]
> > *** Exception: stack overflow
> > 
> > I guess this happens because manyTill recursively accumulates output
> > from the first parser and returns only when it hits the 'end' parser.
> > Is it possible to write a version of 'manyTill' that works like 'many'
> > returning output from 'anyChar' as soon as it advances through the
> > list of tokens?
> 
> No, manyTill doesn't know whether it is going to return anything at all
> until its second argument succeeds.  I can make manyTill not stack
> overflow, but it will never immediately start returning results.  For
> the particular case above you can use getInput and setInput to get a
> result that does what you want.
> 
> parseRest = do
>     rest <- getInput
>     setInput []
>     return rest
> 
> That should probably update the position as well though it's not so
> crucial in the likely use-cases of such a function.
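
For reference, a variant of the quoted parseRest that also updates the position could look roughly like the sketch below. It assumes String input. The step through updatePosChar is my own addition, not something from the quoted message.

```haskell
import Data.List (foldl')
import Text.Parsec
import Text.Parsec.Pos (updatePosChar)
import Text.Parsec.String (Parser)

-- Consume and return all remaining input, and move the source
-- position to the end of that input.
parseRest :: Parser String
parseRest = do
  rest <- getInput
  pos  <- getPosition
  -- Advance the position one character at a time over the consumed text.
  setPosition (foldl' updatePosChar pos rest)
  setInput []
  return rest
```

Computing the new position walks the whole remaining input, so this variant probably gives up the "starts returning immediately" behaviour of the version without position tracking.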


      

