[Haskell-cafe] Re: Proper round-trip HughesPJ/Parsec for Doubles?

Christian Maeder Christian.Maeder at dfki.de
Fri Feb 26 07:18:28 EST 2010


Andy Gimblett schrieb:
> Hi Christian,
[...]
>> It may make sense to use something like readMaybe (which is missing in
>> the Prelude) instead of "read" to allow the parser to fail more nicely.
> 
> It seems to be kicking up reasonable errors as it is, e.g.:
> 
> *Main> parse aFloat "" "2e-h"
> Left (line 1, column 4):
> unexpected "h"
> expecting digit

yes, this is fine, because you reject "-h", but suppose it was passed to
read (due to a programming error).

> I haven't seen any uncaught exceptions propagating, if that's what
> you're worried about...?

Yes, "read" will always work for you. But you could use a parser
*almost* as simple as
  many1 $ oneOf "NaInfity+-.eE0123456789"
and ask "readMaybe" if the parsed String can be read as Double.

> So here's what I have now:
> 
> float' :: TokenParser st -> GenParser Char st Double
> float' t =
>   do n <- maybeChar "-"
>      spaces
>      fs <- choice [symbol t "NaN",
>                    symbol t "Infinity",
>                    do whole <- many1 digit
>                       frac <- option "" $ do char '.'
>                                              ds <- many1 digit
>                                              return $ '.' : ds
>                       ex <- option "" $ do choice [char 'e', char 'E']
>                                            s <- maybeChar "+-"
>                                            ds <- many1 digit
>                                            return $ concat ["e", s, ds]
>                       return $ concat [whole, frac, ex]
>                   ]
>      whiteSpace t
>      return $ read $ n ++ fs
>   where maybeChar :: String -> GenParser Char st String
>         maybeChar as = option "" (choice (map char as) >>= \a -> return
> [a])

I would omit handling of spaces (that's a separate lexing step). It's
enough to be able to parse those numbers, that are possible results of
"show" (for round-trip).

"symbol t" could be replaced by "(try . string)" in order to get rid of
the TokenParser (that I don't like).

Spaces following an initial minus sign are quite unusual and rather
indicate that the sign does not belong to number, but that the sign is a
separate operation.

>>> You can also break it immediately before do, which I think is
>>> sometimes more clear.
>>
>> If not an extra space is added following "do" this leads to an "odd"
>> indentation of at least one line.
> 
> I'm curious: which line in the above is indented oddly?  Oh, wait: you
> don't mean odd as in "strange", do you?  You mean odd as in "not even"? 
> So, e.g. the "spaces" line starts at column 5?  What's wrong with that?

Right, again a matter of taste.

Cheers Christian

P.S. below is my parser for tptp numbers (as comparison).
It rejects leading zeros, but allows an initial + sign.

"fmap read real" would work if the input does not start with +
(or is NaN or Infinity).

-- | does not allow leading zeros
natural :: Parser String
natural = string "0" <|> many1 digit

decimal :: Parser String
decimal = do
  s <- option "" $ string "+" <|> string "-"
  ds <- natural
  return $ s ++ ds

real :: Parser String
real = do
  d <- decimal
  f <- option "" $ do
    p <- char '.'
    n <- many1 digit
    return $ p : n
  e <- option "" $ do
    x <- char 'e' <|> char 'E'
    g <- decimal
    return $ x : g
  return $ d ++ f ++ e




More information about the Haskell-Cafe mailing list