patch applied (cabal): First pass at parsing .cabal files as UTF8

Ross Paterson ross at soi.city.ac.uk
Mon Feb 25 06:53:23 EST 2008


On Sun, Feb 24, 2008 at 05:46:35PM +0000, Duncan Coutts wrote:
> I've added readTextFile and writeTextFile to the Utils module and
> checked all other uses of readFile and writeFile.
> 
> I've also switched the rawSystemStdout to assume UTF8 output format.

The read and write functions ought to open their files in binary mode.
It's just wrong to read Unicode characters (which is what a plain text
Handle promises you) and treat them as bytes.  There's a similar problem
with using toUTF8 on stdout and stderr.  Haskell 98 is very clear that
putChar on those Handles takes Unicode characters, though it does not
specify how these are encoded in the environment.  GHC has historically
assumed an ISO-8859-1 encoding, truncating larger characters, but other
implementations could map them to the current locale (as Hugs does).
Perhaps a future GHC will map them to UTF-8.  I think you should just
hand the characters to putChar and leave their presentation to the
implementation, flawed though GHC's current one is.
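
To make the binary-mode suggestion concrete, here is a minimal sketch of
what I mean, not the actual Cabal code.  The names decodeUTF8 and
readUTF8File and the scratch file name are made up for illustration.  It
assumes a GHC where a binary-mode Handle yields one Char per byte
(0..255), and the decoder is deliberately simple: it does not reject
overlong encodings or surrogates.

```haskell
import Data.Bits ((.&.), (.|.), shiftL)
import Data.Char (chr, ord)
import System.IO

-- | Decode a list of byte values (0..255) as UTF-8.
-- Malformed sequences become U+FFFD.  Illustrative only: overlong
-- encodings and surrogate code points are not rejected.
decodeUTF8 :: [Int] -> String
decodeUTF8 [] = []
decodeUTF8 (b:bs)
  | b < 0x80  = chr b : decodeUTF8 bs          -- plain ASCII
  | b < 0xC0  = replacement : decodeUTF8 bs    -- stray continuation byte
  | b < 0xE0  = multi 1 (b .&. 0x1F) bs        -- 2-byte sequence
  | b < 0xF0  = multi 2 (b .&. 0x0F) bs        -- 3-byte sequence
  | b < 0xF8  = multi 3 (b .&. 0x07) bs        -- 4-byte sequence
  | otherwise = replacement : decodeUTF8 bs    -- invalid lead byte
  where
    replacement = '\xFFFD'
    -- Consume n continuation bytes and fold their low 6 bits into acc.
    multi n acc rest =
      let (cont, rest') = splitAt n rest
      in if length cont == n && all isCont cont
           then toChar (foldl step acc cont) : decodeUTF8 rest'
           else replacement : decodeUTF8 rest
    isCont c = c .&. 0xC0 == 0x80
    step a c = (a `shiftL` 6) .|. (c .&. 0x3F)
    toChar cp
      | cp > 0x10FFFF = replacement
      | otherwise     = chr cp

-- | Hypothetical reader: open the file in binary mode so that the
-- Handle hands us bytes, then decode those bytes ourselves.
readUTF8File :: FilePath -> IO String
readUTF8File path = do
  h   <- openBinaryFile path ReadMode
  raw <- hGetContents h            -- lazy; each Char is one byte here
  return (decodeUTF8 (map ord raw))

main :: IO ()
main = do
  let path = "utf8-sketch.tmp"     -- made-up scratch file
  h <- openBinaryFile path WriteMode
  -- The six bytes of "naïve" encoded as UTF-8.
  hPutStr h (map chr [0x6E, 0x61, 0xC3, 0xAF, 0x76, 0x65])
  hClose h
  s <- readUTF8File path
  print (s == "na\239ve")          -- prints True
```

The point is that the byte-to-character step happens exactly once, on
data that really is bytes.  For stdout and stderr the decoded String
would go straight to putStr with no extra encoding step on our side.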


