[Haskell-beginners] parsec and source material with random order lines

Brent Yorgey byorgey at seas.upenn.edu
Tue Dec 25 03:28:42 CET 2012


Hi Emmanuel,

Sounds like you want a permutation parser, perhaps?  Check out

  http://hackage.haskell.org/packages/archive/parsec/latest/doc/html/Text-Parsec-Perm.html
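
A minimal sketch of how that could look. The Event record with three
String fields and the "field" helper are hypothetical names, not taken from
your code. It also assumes the event body contains only these three lines,
each exactly once; skipping unknown lines in between would need extra
handling.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)
import Text.Parsec.Perm (permute, (<$$>), (<||>))

-- Hypothetical record; the field names follow the question.
data Event = Event
  { eventStart :: String
  , eventEnd   :: String
  , eventDesc  :: String
  } deriving (Show, Eq)

-- Hypothetical helper: parse one "NAME:value" line and return the value.
-- The 'try' lets the parser backtrack when another field shares a prefix
-- (DTSTART and DTEND both begin with "DT").
field :: String -> Parser String
field name = try (string name *> char ':') *> manyTill anyChar newline

-- The three fields may appear in any order, each exactly once.
eventBody :: Parser Event
eventBody = permute (Event <$$> field "DTSTART"
                           <||> field "DTEND"
                           <||> field "DESCRIPTION")

parseEvent :: Parser Event
parseEvent =
  string "BEGIN:VEVENT" *> newline *> eventBody <* string "END:VEVENT" <* newline
```

The order of the arguments to Event is fixed by the order of the parsers in
the permute expression, not by the order of the lines in the input, so an
event with DTEND before DTSTART still fills eventStart and eventEnd
correctly.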

-Brent

On Tue, Dec 25, 2012 at 12:18:37AM +0100, Emmanuel Touzery wrote:
> Hi,
> 
>  I'm trying to parse ical files but the source material doesn't matter
> much. First, I know there is an icalendar library on hackage, but I'm
> trying to learn as well through this.
> 
>  The format is really quite simple and my parser already works, but I
> don't like the code I'm writing: it feels wrong and I'm sure there is a
> better way. For now I'm parsing it into a list of lists, but I want to
> fill a proper "data" structure.
> 
>  For my purpose the file contains a bunch of records like this:
> 
> BEGIN:VEVENT
> DTSTART:20121218T103000Z
> DTEND:20121218T120000Z
> [..]
> DESCRIPTION:
> [..]
> END:VEVENT
> 
> There are a bunch of lines I don't care about, and I also want to parse
> the record no matter what order the directives appear in (so I want it to
> parse even if DTEND appears before DTSTART, for instance).
> 
> That last part is my one problem. I can't do:
> 
> parseEvent = do
>     parseBegin
>     start <- startDate
>     end <- endDate
>     skipRows
>     desc <- description
>     skipRows
>     parseEnd
>     return Event { eventStart = start, eventEnd = end ...}
> 
> my current working code is:
> 
> parseEvent = do
>     parseBegin
>     contents <- many1 $ (try startDate)
>             <|> (try endDate)
>             <|> (try description)
>             <|> unknownCalendarInfo
>     parseEnd
>     return contents
> 
> But then contents is of course a list, while I want to return a single
> Event here.
> 
> SOMEHOW what I would like is:
> 
> parseEvent = do
>     parseBegin
>     contents <- many1 $ (start <- T.try startDate)
>             <|> (end <- T.try endDate)
>             <|> (desc <- T.try description)
>             <|> unknownCalendarInfo
>     parseEnd
>     return Event { eventStart = start, eventEnd = end ...}
> 
>  But obviously as far as Parsec is concerned startDate could occur several
> times and also it's just not valid Haskell syntax.
> 
>  So, any hint about this problem? Parsing multi-line records with Parsec,
> when I don't know the order in which the lines will appear? I mean sure I
> can convert my array to the proper data structure... I find which element
> in the array contains the start date and then which contains the end
> date... and build my data structure.. But I'm sure something much nicer can
> be done... I just can't find how.
> 
>  I see the author of iCalendar fixed the problem but I can't completely
> understand his source, it's too many things at the same time for me, I need
> to take this one step at a time.
> 
>  Thank you!
> 
> Emmanuel
