[Haskell-cafe] Core packages and locale support

Felipe Lessa felipe.lessa at gmail.com
Sat Jun 26 08:44:20 EDT 2010


On Sat, Jun 26, 2010 at 09:29:29AM +0300, Roman Beslik wrote:
> Incorrect encoding of filepaths is common in e.g. Cyrillic Linux
> (because of multiple possible encodings --- CP1251, KOI8-R, UTF-8)
> and is solved by fiddling with the current locale and media mount
> options. No need to change a program, or to tell character encoding
> to a program. It is not a programming language issue.

If your program saves files using filepaths given by the user or
created programmatically from another filepath, then you don't
need to decode/encode anything and the problem isn't in the
programming language.

However, suppose your program needs to create a file whose name
is based on information from a database.  Your database is UTF-8.
How do you translate that UTF-8 data into a filepath?  This is the
problem we have in Haskell.  We have a nice encoding-agnostic
String datatype, but we don't know how to create a file with this
very name.
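
To make the question concrete, here is a minimal sketch of one
possible workaround.  It assumes two things that the program cannot
verify: that the filesystem expects UTF-8 names, and that the
runtime hands each Char of a FilePath to the OS as a single byte.
The names encodeChar, encodeUtf8 and utf8Path are made up for this
illustration and use only the base library.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Encode one code point as a UTF-8 byte sequence (1 to 4 bytes).
-- Simplified: surrogate code points are not rejected.
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n = ord c
    -- Leading bits of the code point, placed in the first byte.
    hi s = fromIntegral (n `shiftR` s)
    -- A continuation byte: 10xxxxxx carrying six bits of the code point.
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

-- Encode a whole String as UTF-8 bytes.
encodeUtf8 :: String -> [Word8]
encodeUtf8 = concatMap encodeChar

-- ASSUMPTION: the runtime writes one byte per Char, so every byte is
-- packed into a Char in the range 0..255.
utf8Path :: String -> FilePath
utf8Path = map (chr . fromIntegral) . encodeUtf8
```

For example, encodeUtf8 "\233" (the single character U+00E9)
yields [195,169].  The sketch only works if the UTF-8 assumption
holds for the target filesystem, which is exactly the information
the program does not have.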

The opposite may also be a problem.  Okay, you got an already
correctly-encoded filepath.  But you want to store this
information in your database.  Now, you have two options:

  a) Save the encoded filepath.  Each record of your database
  will potentially have a different encoding, which is very bad.

  b) Recode into, say, UTF-8.  But to do that you need to know
  the original encoding used in the filepath, so we have the same
  problem as above.
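
Option (b) can be sketched as follows, to show why the source
encoding has to be known: the same bytes decode to different
strings, or fail to decode at all, depending on the encoding you
assume.  The Encoding type and decodeWith are hypothetical names
for this illustration.  Only Latin-1 and UTF-8 are covered, since
CP1251 and KOI8-R would need lookup tables, and the UTF-8 decoder
is simplified (it does not reject overlong forms or surrogates).

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)

-- The encodings this sketch knows about (hypothetical type).
data Encoding = Latin1 | Utf8
  deriving (Eq, Show)

-- Decode raw filepath bytes under an assumed encoding.
-- Returns Nothing when the bytes are not valid in that encoding.
decodeWith :: Encoding -> [Word8] -> Maybe String
decodeWith Latin1 = Just . map (chr . fromIntegral)
decodeWith Utf8   = go
  where
    go [] = Just []
    go (b : bs)
      | b < 0x80           = fmap (chr (fromIntegral b) :) (go bs)
      | b .&. 0xE0 == 0xC0 = multi 1 (b .&. 0x1F) bs
      | b .&. 0xF0 == 0xE0 = multi 2 (b .&. 0x0F) bs
      | b .&. 0xF8 == 0xF0 = multi 3 (b .&. 0x07) bs
      | otherwise          = Nothing
    -- Read k continuation bytes and combine them with the lead bits.
    multi k lead bs =
      let (conts, rest) = splitAt k bs
          isCont c = c .&. 0xC0 == 0x80
          step acc c = (acc `shiftL` 6) .|. fromIntegral (c .&. 0x3F)
          cp = foldl step (fromIntegral lead) conts :: Int
      in if length conts == k && all isCont conts && cp <= 0x10FFFF
           then fmap (chr cp :) (go rest)
           else Nothing
```

With the bytes [0xC3, 0xA9], decodeWith Utf8 gives one character
(U+00E9) while decodeWith Latin1 gives two (U+00C3, U+00A9).
Nothing in the bytes themselves says which reading is the right
one.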

Even if we said "we don't care", we should at least change
FilePath to be [Word8] instead of String.  Currently each
character of a filepath is silently truncated to its low 8 bits,
so any codepoint beyond 255 is mangled.
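
The truncation can be modelled in a few lines.  This is only a
sketch of the behaviour described above, not a call into the
runtime itself, and truncatePath and isLossy are made-up names.

```haskell
import Data.Char (ord)
import Data.Word (Word8)

-- Model of the truncation: keep only the low 8 bits of each
-- code point (fromIntegral from Int to Word8 wraps modulo 256).
truncatePath :: String -> [Word8]
truncatePath = map (fromIntegral . ord)

-- True if at least one character would lose information.
isLossy :: String -> Bool
isLossy = any ((> 255) . ord)
```

Under this model the Cyrillic letter U+044F becomes the byte 0x4F,
which is the ASCII letter 'O', so two different names can end up
as the same file on disk.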

Cheers,

--
Felipe.

