[Haskell-cafe] Re: hSetEncoding on socket handles

Simon Marlow marlowsd at gmail.com
Wed May 12 09:14:45 EDT 2010


On 12/05/2010 01:56, David Powell wrote:
> Greetings,
>
> I am having trouble sending unicode characters as utf8 over a socket handle.
> Despite setting the encoding on the socket handle to utf8, it still seems to
> use some other encoding when writing to the socket.  It works correctly when
> writing to stdout, but not to a socket handle.  I am using ghc 6.12.1 and
> network-2.2.1.7.  I can get it to work using System.IO.UTF8, but I was under
> the impression this was no longer necessary?
>
> I also don't seem to understand the interaction between hSetEncoding and
> hSetBinaryMode, because if I set the binary mode to 'False' and the encoding
> to utf8 on the socket, then when writing to the socket the string seems to be
> truncated at the first non-ascii codepoint.
>
> Here is a test snippet, which can be used with netcat as a listening server
> (i.e. nc -l 1234).
>
>  > import System.IO
>  > import Network
>  > main = do
>  >  let a="λ"
>  >  s <- connectTo "127.0.0.1" (PortNumber 1234)
>  >  hSetEncoding s utf8
>  >  hSetEncoding stdout utf8
>  >  hPutStrLn s a
>  >  putStrLn a
>  >  hClose s

You've found a bug, thanks.  A socket handle is bidirectional, and
hSetEncoding currently sets the encoding on only one side (the read side),
when it should set it on both sides.

I just created a ticket:

http://hackage.haskell.org/trac/ghc/ticket/4066

Expect a fix in GHC 6.12.3.  In the meantime you can work around it by
creating a write-only socket handle, on which hSetEncoding works.  For
example, this replacement for connectTo worked for me:

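import Control.Exception (bracketOnError)
import Network (PortID(..))  -- PortID only, so Network.connectTo doesn't clash
import Network.BSD (getProtocolNumber, getHostByName, hostAddress)
import Network.Socket
import System.IO (IOMode(..))

connectTo hostname (PortNumber port) = do
    proto <- getProtocolNumber "tcp"
    bracketOnError
        (socket AF_INET Stream proto)
        sClose  -- only done if there's an error
        (\sock -> do
            he <- getHostByName hostname
            connect sock (SockAddrInet port (hostAddress he))
            -- WriteMode gives a write-only handle rather than a duplex one
            socketToHandle sock WriteMode)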
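
One way to check that the workaround behaves is to compare what arrives at
the listener with the bytes hSetEncoding produces on an ordinary file
handle, where the encoding is applied as expected.  The following is only a
sketch: the helper name utf8BytesViaHandle and the scratch file name
utf8-check.tmp are made up for illustration, and it uses nothing beyond
System.IO and Data.Char.  It writes the test string through a utf8 handle
and reads the raw bytes back in binary mode.

```haskell
import Data.Char (ord)
import System.IO

-- Hypothetical helper: write a string through a UTF-8 encoded file handle,
-- then reopen the file in binary mode and return the raw byte values.
utf8BytesViaHandle :: FilePath -> String -> IO [Int]
utf8BytesViaHandle path str = do
    h <- openFile path WriteMode
    hSetEncoding h utf8
    hPutStr h str
    hClose h
    r <- openBinaryFile path ReadMode
    s <- hGetContents r
    -- In binary mode every Char read back is one byte (0..255).
    let bytes = map ord s
    -- Force the lazy read to finish before closing the handle.
    length bytes `seq` hClose r
    return bytes

main :: IO ()
main = do
    -- "\955" is the lambda character (U+03BB) from the test snippet.
    bytes <- utf8BytesViaHandle "utf8-check.tmp" "\955"
    print bytes  -- prints [206,187], i.e. the UTF-8 bytes 0xCE 0xBB
```

If the write-only socket handle is encoding correctly, the listener should
receive those same two bytes for the lambda, followed by the newline that
hPutStrLn adds.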

Cheers,
	Simon

