[Haskell-cafe] invalid character encoding

Glynn Clements glynn at gclements.plus.com
Thu Mar 17 21:20:23 EST 2005


Marcin 'Qrczak' Kowalczyk wrote:

> > E.g. Gtk-2.x uses UTF-8 almost exclusively, although you can force the
> > use of the locale's encoding for filenames (if you have filenames in
> > multiple encodings, you lose; filenames using the "wrong" encoding
> > simply don't appear in file selectors).
> 
> Actually they do appear, even though you can't type their names
> from the keyboard. The name shown in the GUI used to be escaped in
> different ways by different programs or even different places in one
> program (question marks, %hex escapes, \oct escapes), but recently
> they added some functions to glib to make the behavior uniform.

In the last version of Gtk-2.x which I tried, "invalid" filenames are
just omitted from the list. Gtk-1.x displayed them (I think with
question marks, but it may have been a box).

I've just tried with a more recent version (2.6.2); the default
behaviour is similar, although you can now get around the issue by
using G_FILENAME_ENCODING=ISO-8859-1. Of course, if your locale is
a long way from ISO-8859-1, that isn't a particularly good solution.
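
As a rough illustration of why that setting sidesteps the problem, here
is a minimal sketch using Python's codecs rather than glib itself (the
helper name and the sample filename are hypothetical, and glib's own
conversion code may differ in detail). A name written under an
ISO-8859-1 locale is typically not valid UTF-8, whereas every possible
byte value decodes under ISO-8859-1:

```python
def decodes_as(raw: bytes, encoding: str) -> bool:
    """Return True if raw is a valid byte sequence in the given encoding."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False

# Hypothetical filename created under an ISO-8859-1 locale ("cafe" with e-acute).
latin1_name = "caf\u00e9.txt".encode("iso-8859-1")

# The byte 0xE9 followed by "." is not a valid UTF-8 sequence.
utf8_ok = decodes_as(latin1_name, "utf-8")
latin1_ok = decodes_as(latin1_name, "iso-8859-1")

# Every one of the 256 byte values decodes under ISO-8859-1.
any_byte_ok = all(decodes_as(bytes([b]), "iso-8859-1") for b in range(256))
```

Since nothing can fail to decode as ISO-8859-1, no name is rejected as
"invalid"; the cost is that names in any other encoding are shown as
the wrong characters.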

The best test case would be a system used predominantly by Japanese users,
where (apparently) it's common to have a mixture of both EUC-JP and
Shift-JIS filenames (occasionally wrapped in ISO-2022, but usually
raw).
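
To make the mixed-encoding case concrete, the following sketch (again
Python's codecs, with a hypothetical helper name and sample string; it
is not a test of Gtk) encodes the same name both ways and checks which
candidate encodings accept each byte string. Any single filename
encoding setting can match at most one of the two:

```python
CANDIDATES = ("euc_jp", "shift_jis")

def plausible_encodings(raw, candidates=CANDIDATES):
    """Return the candidate encodings under which raw decodes without error."""
    result = []
    for enc in candidates:
        try:
            raw.decode(enc)
        except UnicodeDecodeError:
            continue
        result.append(enc)
    return result

# Hypothetical sample name: the three kanji for "Japanese language".
name = "\u65e5\u672c\u8a9e"

# The same name yields different bytes on disk under each encoding.
euc_bytes = name.encode("euc_jp")
sjis_bytes = name.encode("shift_jis")

euc_guess = plausible_encodings(euc_bytes)
sjis_guess = plausible_encodings(sjis_bytes)
```

Decoding without error is only a necessary condition: a byte string can
be accepted by more than one encoding, so a check like this cannot
reliably identify the encoding of an arbitrary name.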

-- 
Glynn Clements <glynn at gclements.plus.com>

