[Haskell-cafe] Badly designed Parsec combinators?

Thu Feb 16 14:31:37 EST 2006

Udo Stenzel wrote:
> Juan Carlos Arevalo Baeza wrote:
>   
>> myParser :: Parser ()
>> myParser =
>>        do  string "Hello"
>>            optional (string ", world!")
>>
>>   It makes no sense for myParser to generate any values, especially not 
>> the result from the optional statement, so it is set to return ().
>>     
>
> Don't you think this will interfere somehow with type inference?

   With type inference? No, why? I mean... specifying the type of a 
function (as is recommended practice in multiple places) places a 
hard-point in the type system. It is even sometimes critical to make the 
types of a program totally predictable (or decidable), as when it's 
needed to resolve the the monomorphism restriction.

   Therefore, the language/compiler can do things to types at those hard 
points. If, say, the compiler needs to match some expression with "IO 
()" (or else it'll throw an error), and it infers "IO String", it can 
unambiguously resolve it by adding the ">> return ()". In my opinion, 
this would make the program better by removing chaff. In the function 
above, the "string" statement returns a value which is ignored because 
it is in the middle of a do-notation sequence. The second statement, 
after my proposed change, also returns a value. But this value currently 
cannot be ignored like the others in the do-sequence, even though it 
could without ambiguity of any kind.

   If I hadn't specified the type of "myParser", it would have gotten 
the inferred type "Parser (Maybe String)". But I should be able to 
specify the more general one "Parser ()" because that change is decidable.

   In some conceptual way, this is no different than this:

max :: Int -> Int -> Int
max a b = if a > b then a else b

   In this case, I've forced the type of the function to be more 
restrictive (and definitely different) than what it would have had if 
the type signature weren't there.

>   I
> wouldn't like a function that might decide to throw away its result if
> (erroneously) used in a context that wouldn't need it.  I also think
> almost every function has a sensible result, and written with the right
> combinator, can return it without too much hassle.  So I'd probably
> write:
>
> yourParser :: Parser String
> yourParser = liftM2 (++) (string "Hello")
>                          (option "" (string ", world!")
>   

   Personally, that style is way too functional (and a bit lisp-ish) for 
me. I prefer just using:

yourParser :: Parser String
yourParser =
        do  helloResult <- string "Hello"
            worldResult <- option "" $ string ", world!"
            return $ helloResult ++ worldResult

   But that's just a matter of style. In this case, that might even be a 
reasonable thing to do, returning this value from this function. But 
sometimes isn't. Sometimes, dropping results is the right thing to do.

> I also find it very convenient to have a combinator that does a bind and
> return the unmodified result of the first computation.  With that you
> get:
>
> (*>) :: Monad m => m a -> m b -> m a
> m *> n = do a <- m ; n ; return a
>
> ourParser :: Parser String
> ourParser = string "Hello" *> optional (string ", world!")
>   

   So you do drop returned values now and then? But with that function 
you lose out on the do-notation convenience.

> Therefore, implicit (return ()) is selsdom useful, has the potential to
> cause annoying silent failures and is generally not worth the hassle.
>   

   Useful? No more than the do-notation. They are both conveniences. No 
more than the "liftM2" function you used above: that's another 
convenience. All languages are full of conveniences that are not 
strictly necessary.

   Annoying silent failures? No more than the ">>" monad combinator.

>>   Another case where I encounter this is with the "when" function:
>>
>> myParser2 :: Bool -> Parser ()
>> myParser2 all =
>>        do  string "Hello"
>>            when all $
>>                do  string ", world"
>>            string "!"
>>     
>
> A better fix would be more flexible when:
>
> when :: Monad m => Bool -> m a -> m (Maybe a)
> when True  m = Just `liftM` m
> when False _ = return Nothing
>
> ...which is quite similar to the proposed change to Parsec's 'optional'.
> I'd support both.
>   

   I like that.

>> It resembles a lot the 
>> automatic conversions that C++ does.
>>     
>
> I'm more reminded of Perl...
>   

   I don't know perl :)

   Thanx!

JCAB