[Haskell-cafe] Has character changed in GHC 6.8?

Johan Tibell johan.tibell at gmail.com
Wed Jan 23 07:43:54 EST 2008


> > The benefit would be that if the input is not in latin-1 an exception
> > could be thrown rather than returning a Char representing the wrong
> > Unicode code point.
>
> I'm not sure what you mean here. All 256 possible values have a meaning.

You're of course right. So we don't have a problem here. Maybe I was
thinking of an encoding (7-bit ASCII?) where some of the 256 values
are invalid.
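
To make the difference concrete, here is a minimal sketch in plain Haskell. The names decodeLatin1 and decodeAscii are made up for illustration and are not taken from any library. Latin-1 decoding is total because each byte maps to the code point with the same number, while a 7-bit ASCII decoder has to reject every byte from 0x80 upwards.

```haskell
import Data.Char (chr)
import Data.Word (Word8)

-- Latin-1: total. Byte n is simply Unicode code point n.
decodeLatin1 :: [Word8] -> String
decodeLatin1 = map (chr . fromIntegral)

-- 7-bit ASCII: partial. Bytes >= 0x80 have no meaning, so the
-- first offending byte is reported in Left.
decodeAscii :: [Word8] -> Either Word8 String
decodeAscii = traverse step
  where
    step b
      | b < 0x80  = Right (chr (fromIntegral b))
      | otherwise = Left b

main :: IO ()
main = do
  print (decodeLatin1 [0x48, 0x69, 0xE9])  -- "Hi\233"
  print (decodeAscii  [0x48, 0x69])        -- Right "Hi"
  print (decodeAscii  [0x48, 0xE9])        -- Left 233
```

Only the second decoder has a failure case, so it is the only one where throwing an exception (or returning Left) could ever come up.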

> > My proposal is for I/O functions to specify the encoding they use if
> > they accept or return Chars (and Strings). If they deal in terms of
> > bytes (e.g. socket functions) they should accept and return Word8s.
>
> I would be more inclined to suggest they default to a particular
> well-understood encoding, almost certainly UTF-8. Another interface
> could give access to other encodings.

That might be a good option. However, it would be nice if beginners
could write simple console programs using System.IO and have them work
correctly even if their system's encoding is not byte compatible with
UTF-8. People who do I/O over the network etc. need to be more careful
and should specify the encoding used. How would a UTF-8 default work
on different Windows versions?

> > Optionally, text I/O functions could default to the system locale
> > setting.
>
> That is a disastrous idea.

I'm not sure about that, as long as decode is called on the input to
check that the bytes are valid in that encoding. Same point as above.
What I would like to avoid is having to write:

main = do
  putStrLn systemLocalEncoding "What's your name?"
  name <- getLine systemLocalEncoding
  putStrLn systemLocalEncoding $ "Hi " ++ name ++ "!"

I guess we could solve this by putting the functions in different modules:

System.IO  -- requires explicit encoding
System.IO.DefaultEncoding  -- implicit use of system locale setting

And have the modules export the same functions. Another option would
be to include the fact that encoding is implied in the name of the
function. Maybe we should start by giving some type signatures and
function names. That often helps my thinking. I'll try to write
something down when I get home from work.
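
As a rough first stab at such signatures, the two flavours could look like the sketch below. It assumes a base library newer than GHC 6.8, one where System.IO exports TextEncoding, hSetEncoding and localeEncoding. The names putStrLnEnc, getLineEnc, putStrLnLocale and getLineLocale are hypothetical, and both flavours are shown in one file here rather than in two modules exporting the same names.

```haskell
import System.IO

-- Explicit flavour (hypothetical names): every call names its encoding.
putStrLnEnc :: TextEncoding -> String -> IO ()
putStrLnEnc enc s = hSetEncoding stdout enc >> putStrLn s

getLineEnc :: TextEncoding -> IO String
getLineEnc enc = hSetEncoding stdin enc >> getLine

-- Default flavour (hypothetical names): the same functions with the
-- system locale's encoding already applied.
putStrLnLocale :: String -> IO ()
putStrLnLocale = putStrLnEnc localeEncoding

getLineLocale :: IO String
getLineLocale = getLineEnc localeEncoding

main :: IO ()
main = do
  putStrLnLocale "What's your name?"
  eof  <- isEOF  -- tolerate a closed or empty stdin
  name <- if eof then return "stranger" else getLineLocale
  putStrLnLocale ("Hi " ++ name ++ "!")
```

With this split the beginner's console program only touches the default flavour, while network code would call the explicit functions with utf8 or whatever encoding its protocol specifies.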

-- Johan

