[GHC] #1079: refinement for GHC's support of UTF-8 encoding

GHC trac at galois.com
Tue Jan 2 10:05:12 EST 2007


#1079: refinement for GHC's support of UTF-8 encoding
--------------------------------+-------------------------------------------
    Reporter:  mukai at jmuk.org   |       Owner:         
        Type:  feature request  |      Status:  new    
    Priority:  normal           |   Milestone:         
   Component:  Compiler         |     Version:  6.6    
    Severity:  major            |    Keywords:         
  Difficulty:  Unknown          |    Testcase:         
Architecture:  Unknown          |          Os:  Unknown
--------------------------------+-------------------------------------------
Since version 6.6, GHC supports UTF-8 encoding in source programs: it can
 read UTF-8 files and decode them into Unicode characters.  However, there
 is no corresponding support for reading or printing those characters.

 For example, we can compile the following program,
 {{{
 main = putStrLn "あ"
 }}}
 but it prints only `B', the low 8 bits of the character `あ' (U+3042).
 Because of this gap, a Haskell program written in UTF-8 cannot print any
 non-ASCII character without converting it explicitly.  Although such
 conversion functions are easy to write, the conversion should be
 supported by the compiler.
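
 As an illustration of the kind of conversion function meant here, the
 following is a minimal sketch, not part of GHC.  The names truncateChar,
 encodeChar and encodeString are hypothetical.  It assumes the GHC 6.6
 behaviour described above, where output keeps only the low 8 bits of each
 Char, so each UTF-8 octet is packed into a Char below 256 before printing.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- | Models the reported behaviour: only the low 8 bits of a Char survive.
-- (Hypothetical helper, for illustration only.)
truncateChar :: Char -> Char
truncateChar c = chr (ord c .&. 0xFF)

-- | Encode a single Char as its UTF-8 octets.
-- (Hypothetical helper; surrogates and other edge cases are not checked.)
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6,  cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6,  cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n      = ord c
    -- Leading bits of the code point, placed in the first octet.
    hi s   = fromIntegral (n `shiftR` s)
    -- A continuation octet: 10xxxxxx carrying six bits of the code point.
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

-- | Encode a String so that each resulting Char holds one UTF-8 octet.
-- Assumption: the output routine truncates each Char to 8 bits.
encodeString :: String -> String
encodeString = map (chr . fromIntegral) . concatMap encodeChar
```

 Under that assumption, the example program would become
 main = putStrLn (encodeString "あ").  With an output routine that already
 encodes characters itself, this workaround would encode the text twice.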

 IMHO, the desired approach is similar to that of Hugs.  When printing
 non-ASCII characters, Hugs first converts them to UTF-8 octets and then
 prints those octets.  With a binary-mode Handle, however, it just prints
 the characters without conversion.  This behavior would be acceptable to
 many Haskell programmers.

-- 
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/1079>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler