[Haskell-beginners] hPutChar: invalid argument (Invalid or incomplete multibyte or wide character

Daniel Fischer daniel.is.fischer at web.de
Sun Jun 13 07:07:03 EDT 2010


On Sunday 13 June 2010 08:00:15, Erik de Castro Lopo wrote:
> HI all,
>
> I've managed to use the Curl bindings to pull down a web page, and I'm
> using TagSoup to parse it, but when I try to print the text in a TagText
> I get
>
>    hPutChar: invalid argument (Invalid or incomplete multibyte or wide
>    character)
>
> The code looks like:
>
>     parsePage :: String -> IO ()
>     parsePage page = do
>         let tags = map deTag $ filter isTagText $ parseTags page
>         mapM_ putStrLn tags
>       where
>         deTag (TagText s) = s
>         deTag x = error $ "Bad Tag '" ++ show x ++ "' in deTag."
>
>
> This is with ghc-6.12.1 on Debian Linux.
>
> Any clues appreciated.
>
> Cheers,
> Erik

Probably the page you tried it on wasn't encoded in your locale's 
encoding. If the page is in latin1 and your locale is UTF-8, it will 
likely contain byte sequences that are invalid as UTF-8.
For a locally stored page, the code above worked fine with tagsoup-0.6 and 
tagsoup-0.10 when the page was UTF-8-encoded, but when it was latin1-encoded 
(and contained non-ASCII characters), it raised an

invalid argument (Invalid or incomplete multibyte or wide character)

error (on hGetContents rather than hPutChar, though; I suppose that's 
because I used readFile and not the Curl bindings).
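
If the encoding mismatch is indeed the cause, one way to sidestep the locale 
for a locally stored page is to set the handle encodings explicitly. The 
following is only a minimal sketch under that assumption: hSetEncoding, 
latin1 and utf8 come from System.IO in the base shipped with ghc-6.12, while 
readFileLatin1 and the command-line handling are names made up for this 
example. It does not cover the Curl case, where the String never passes 
through a Handle on the way in.

```haskell
import System.Environment (getArgs)
import System.IO

-- Hypothetical helper: read a file as latin1, ignoring the locale encoding.
-- Every byte is a valid latin1 character, so decoding cannot fail here.
readFileLatin1 :: FilePath -> IO String
readFileLatin1 path = do
    h <- openFile path ReadMode
    hSetEncoding h latin1
    hGetContents h   -- lazy; the handle is closed once the input is consumed

main :: IO ()
main = do
    args <- getArgs
    case args of
        [path] -> do
            -- Write UTF-8 regardless of the locale, so non-ASCII
            -- characters can always be encoded on output.
            hSetEncoding stdout utf8
            page <- readFileLatin1 path
            putStr page
        _ -> hPutStrLn stderr "usage: pass the path of a latin1-encoded file"
```

With the input decoded as latin1, the resulting String can be handed to 
parseTags as in the code above. If the page is actually in some other 
encoding, the text will come out garbled rather than raise an error, so the 
encoding passed to hSetEncoding has to match the page.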

