H98 Text IO

Matthew Bentham matthew_bentham at yahoo.com
Wed Feb 27 06:35:38 EST 2008


----- Original Message ----
> From: Johan Tibell <johan.tibell at gmail.com>
> To: Duncan Coutts <duncan.coutts at worc.ox.ac.uk>
> Cc: Haskell Libraries <libraries at haskell.org>; GHC-users list <Glasgow-haskell-users at haskell.org>; Simon Marlow <simonmarhaskell at gmail.com>
> Sent: Wednesday, February 27, 2008 9:25:39 AM
> Subject: Re: H98 Text IO
> 
> On Wed, Feb 27, 2008 at 1:06 AM, Duncan Coutts wrote:
> >  As a data point, Java and python use "always locale" as default if you
> >  don't specify an encoding when opening a text stream.
> >
> >  I think personally I'm coming round to the "always locale" point of
> >  view. We already have no cross-platform consistency for text files
> >  because of the lf vs cr/lf issue and we have no cross-implementation
> >  consistency.
> 
> I think following Java and Python in this matter is a good idea and
> leads to fewer surprises for developers. If you want files created on
> one machine to work on another you have to be explicit about encoding.

I'm a newcomer to Haskell, so I don't want to goof by offering any naive opinions, but I would like to inform the discussion by linking to a couple of useful things.

The first is this blog post on encodings in Python (mostly about the behaviour of the standard input/output streams):

http://drj11.wordpress.com/2007/05/14/python-how-is-sysstdoutencoding-chosen/

Particularly:

"It’s not unreasonable to have a program that wants to encode its output in a particular encoding. The example I gave earlier still seems reasonable to be, a program that takes input in one encoding and recodes to a different encoding on its output, with both the input and output encoding specified on the command line. Clearly such a program should be able to use stdin and stdout so that it can form part of a pipe. So how in Python do I change sys.stdout to use a particular encoding?  It’s a right pain."

And the reference to the POSIX documentation related to locale, mentioned earlier in this thread:

http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap07.html

The general problem I see is that, because stdin/out/err are _already_open_ by the time they reach the Haskell program, it is pretty hard for the developer "to be explicit about encoding" unless a library is provided to change the encoding used by an already-open stream.
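
For illustration only, here is a minimal sketch of the recoding filter described in the quoted blog post, written against an API that did not exist when this message was sent: later GHC releases added hSetEncoding and mkTextEncoding to System.IO, which can change the encoding of a handle that is already open. The usage string and the program name "recode" are assumptions made for the example.

```haskell
import System.Environment (getArgs)
import System.Exit (exitFailure)
import System.IO

-- Read stdin in one encoding and write it to stdout in another.
-- Both encoding names (e.g. "ISO-8859-1", "UTF-8") come from the command line.
main :: IO ()
main = do
  args <- getArgs
  case args of
    [inName, outName] -> do
      -- mkTextEncoding looks up an encoding by name and throws an
      -- IOError if the name is not recognised.
      inEnc  <- mkTextEncoding inName
      outEnc <- mkTextEncoding outName
      -- Change the encoding of the already-open standard handles.
      hSetEncoding stdin inEnc
      hSetEncoding stdout outEnc
      -- Copy the decoded characters through unchanged; the handles
      -- take care of decoding and re-encoding.
      interact id
    _ -> do
      hPutStrLn stderr "usage: recode INPUT-ENCODING OUTPUT-ENCODING"
      exitFailure
```

Compiled as "recode" (a hypothetical name), it could sit in a pipe, for example: recode ISO-8859-1 UTF-8 < in.txt > out.txt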





