<div dir="ltr"><br><br><div class="gmail_quote">On Tue, Aug 17, 2010 at 10:08 AM, Ketil Malde <span dir="ltr"><<a href="mailto:ketil@malde.org">ketil@malde.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
Benedikt Huber <<a href="mailto:benjovi@gmx.net">benjovi@gmx.net</a>> writes:<br>
<br>
> Despite all this, I think the performance of the text<br>
> package is very promising, and hope it will improve further!<br>
<br>
I agree, Data.Text is great. Unfortunately, its internal use of UTF-16<br>
makes it inefficient for many purposes.<br>
<br>[..] </blockquote><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
From a quick glance, it appears that utf8-string is the most complete<br>
and well maintained of the crowd, but I could be wrong. It'd be nice if<br>
a similar effort as Data.Text has seen could be applied to<br>
e.g. utf8-string, to produce a similarly efficient and effective library<br>
and allow the deprecation of the others. IMO, this could in time<br>
replace .Char8 as the default ByteString string representation.<br>
Hackathon, anyone?<br>
<br></blockquote><div>Let me ask the question a different way: what are the motivations for having the text package use UTF-16 internally? I know that some system APIs in Windows use it (at least, I think they do), and perhaps it's more efficient for certain types of processing, but overall, do those benefits outweigh all of the reasons for UTF-8 pointed out in this thread?</div>
<div><br></div><div>Michael </div></div></div>