String != [Char]

Gabriel Dos Reis gdr at integrable-solutions.net
Mon Mar 26 16:30:05 CEST 2012


On Mon, Mar 26, 2012 at 8:35 AM, Gábor Lehel <illissius at gmail.com> wrote:
> On Sun, Mar 25, 2012 at 5:19 AM, Greg Weber <greg at gregweber.info> wrote:
>> On Sat, Mar 24, 2012 at 7:26 PM, Gabriel Dos Reis
>> <gdr at integrable-solutions.net> wrote:
>>> On Sat, Mar 24, 2012 at 9:09 PM, Greg Weber <greg at gregweber.info> wrote:
>>
>>>> Problem: we want to write beautiful (and possibly inefficient) code
>>>> that is easy to explain. If nothing else, this is pedagologically
>>>> important.
>>>> The goals of this code are to:
>>>>  * use list processing pattern matching and functions on a string type
>>>
>>> I may have missed this question so I will ask it (apologies if it is a
>>> repeat):  Why is it believed that list processing pattern matching is
>>> appropriate or the right tool for text processing?
>>
>> Nobody said it is the right tool for text processing. In fact, I think
>> we all agreed it is the wrong tool for many cases. But it is easy for
>> students to understand since they are already being taught to use
>> lists for everything else.  It would be great if you can talk with
>> teachers of Haskell and figure out a better way to teach text
>> processing.
>>
>
> I think a helpful question might be whether [Char] is mainly used to
> teach about lists, or whether it's mainly used to teach about how to
> do Unicode text processing correctly. If it's mainly used to teach
> about lists, pattern matching, etc., as I suspect, then the fine
> details of Unicode don't matter so much, you could even work with
> ASCII-only strings and it would work equally well for teaching about
> lists.

I agree that if the purpose is to teach lists and list pattern matching,
it does not matter much what the element type is as long as it follows
reasonable constraints.  However, as someone observed earlier, the
Haskell Report is not a vehicle to prescribe how Haskell should be
taught or for what reasons Haskell should be taught.  That argument,
while it was made to support String = [Char] for pedagogical purposes,
is in fact a good argument against it.

>  How to do Unicode text processing correctly is a topic that
> seems like it would become important much later, when someone's going
> to write code that's meant to be used in a production environment.
> Most students in an introductory university course probably don't get
> close to that point. If you do want to teach about how to do Unicode
> text processing correctly (which, for the record, is an important
> issue irrespective of which programming language you're using) then
> presumably you want to teach about Text, but hopefully your students
> will be more advanced by then and it won't be so much of a problem.

The Haskell Report claims very prominently that it uses the Unicode
character set.  The question is whether it should use Unicode correctly
at all and, if so, whether it should even try to pretend that its default
string type handles those characters correctly.
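
To make the concern concrete, here is a minimal sketch (the names
decomposed, precomposed and firstChar are made up for illustration, and
this is only one of several ways [Char] and user-perceived text diverge).
It assumes only the Prelude.  A Char is a Unicode code point, so list
functions and list patterns operate on code points, not on what a reader
would call a character:

```haskell
module Main where

-- "e" followed by U+0301 COMBINING ACUTE ACCENT: one user-perceived
-- character, but two Char values.
decomposed :: String
decomposed = ['e', '\x0301']

-- The same letter as the single precomposed code point U+00E9.
precomposed :: String
precomposed = ['\x00E9']

-- Ordinary list pattern matching: takes the first code point only.
firstChar :: String -> Maybe Char
firstChar (c:_) = Just c
firstChar []    = Nothing

main :: IO ()
main = do
  print (length decomposed)            -- 2
  print (length precomposed)           -- 1
  print (decomposed == precomposed)    -- False
  print (firstChar decomposed)         -- Just 'e' (the accent is dropped)
  -- reverse moves the accent in front of the 'e' it belonged to.
  print (reverse ('n' : decomposed) == ['\x0301', 'e', 'n'])  -- True
```

None of this is a bug in the list functions; they do exactly what they
promise on lists.  The point is only that the results are not what a
reader of the text would expect.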

I do not subscribe to the notion that simple correct text processing is
something students would have to learn only in "advanced" classes
dedicated to Unicode.  In the region of this side of the Atlantic Ocean
where I teach, the student population is very diverse and I do not think
it would be responsible to stand in front of students and say:
     You are all welcome; this class is open to all cultures and we are
     committed to diversity and equal opportunity.  However, for the
     purpose of simplicity and pedagogy, we would refrain from looking
     at texts from this and other students.  If you are really
     interested, you should take an advanced class.  I hope you enjoy
     the class.

Furthermore, I am not convinced it is a good strategy to try hard to
reflect the notion that text processing is hard, either in the language
or in its presentation (e.g. it is an advanced topic, you need to be
advanced before we talk about it.)

> I'm not really sure what that recommends in terms of policy. Mainly
> what you need is "it should be possible to work with lists of
> characters" and "it should be possible to work with Text", which we
> more-or-less have already. The important bits seem to be
> OverloadedStrings and ideally some way to avoid a pervasive API bias
> towards the wrong type (the tradeoffs there are probably more tricky).
> (So... basically what Simon M. said.)

-- Gaby


