<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
<br>
<br>
On 25.06.10 20:09, Jason Dagit wrote:
<blockquote
cite="mid:AANLkTik_s8rr23Lgq7-olobiTvChxfNLX3c7i0j18YVw@mail.gmail.com"
type="cite"><br>
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt
0.8ex; border-left: 1px solid rgb(204, 204, 204);
padding-left: 1ex;">
you got everything right here. So, as you said, there is a
mismatch<br>
between representation in Haskell (list of code points) and<br>
representation in the operating system (list of bytes), so we
need to<br>
know the encoding. Encoding is supplied by the user via locale<br>
(<a moz-do-not-send="true"
href="https://secure.wikimedia.org/wikipedia/en/wiki/Locale"
target="_blank">https://secure.wikimedia.org/wikipedia/en/wiki/Locale</a>),
particularly the<br>
LC_CTYPE variable.<br>
<br>
The problem with encodings is not new -- it was already solved
e.g. for<br>
input/output.<br>
</blockquote>
<div><br>
</div>
<div>This is the part where I don't understand the problem well.
I thought that with IO the program assumes the locale of the
environment but that with filepaths you don't know what locale
(more specifically which encoding) they were created with. So
if you try to treat them as having the locale of the current
environment you run the risk of misunderstanding their
encoding.</div>
</div>
<br>
</blockquote>
Incorrectly decoded file paths are common on, e.g., Cyrillic Linux
systems, where several encodings are in use (CP1251, KOI8-R, UTF-8).
The problem is solved by adjusting the current locale and the media
mount options. There is no need to change a program or to tell it the
character encoding explicitly. It is not a programming language issue.<br>
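<br>
A minimal sketch of the mismatch, written in Python 3 only for brevity.
It assumes a hypothetical file name that was created on a KOI8-R system;
the name and the helper <code>decode_as</code> are illustrative, not
taken from any real program. The same bytes are decoded under each of
the three encodings mentioned above:<br>

```python
# Hypothetical file name: "privet.txt" spelled in Cyrillic letters,
# written with escapes so the source itself stays plain ASCII.
name = "\u043f\u0440\u0438\u0432\u0435\u0442.txt"

# Assumption: the name was created under a KOI8-R locale, so these are
# the bytes the file system actually stores.
raw = name.encode("koi8_r")


def decode_as(data, encoding):
    """Decode file-name bytes the way a program would under `encoding`.

    Undecodable bytes become U+FFFD instead of raising an error.
    """
    return data.decode(encoding, errors="replace")


# One decoding per candidate locale encoding.
views = {enc: decode_as(raw, enc) for enc in ("koi8_r", "cp1251", "utf_8")}

for enc, text in views.items():
    # repr() keeps the output unambiguous on any terminal.
    print(enc, repr(text), "matches" if text == name else "garbled")
```

Only the decoding that matches the encoding the name was created with
recovers the original text. Switching the locale or the mount option
changes which decoding is applied; the program itself stays the same.<br>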
<pre class="moz-signature" cols="72">--
Best regards,
Roman Beslik.
</pre>
</body>
</html>