[web-devel] xml-types IsString instance for Name causes crashes

Michael Snoyman michael at snoyman.com
Sun Jun 12 14:23:43 CEST 2011


On Sun, Jun 12, 2011 at 2:27 PM, Aristid Breitkreuz
<aristidb at googlemail.com> wrote:
> I don't think there is any kind of consensus for removing it.

I agree. If I can suddenly join the fray, let's take a step back for a
second and reanalyze the issue here. We have this incredibly useful
IsString instance for Name, which in all honesty is a lie: the type of
fromString is "String -> Name", when not all Strings can be properly
converted into XML names. There are two *separate* reasons for this:

1) Not all characters can be used in a name. Simple example: a
less-than sign (<) is not allowed. For full information, see [1] and
[2].
2) xml-types implements Clark notation, which allows a very convenient
way to define namespaces. (This is a feature I use a lot in my own
code.) But missing a right brace invalidates the Clark notation.
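
To make the two failure modes concrete, here is a minimal sketch of Clark
notation parsing. The Name record, its field names, and parseClark are
hypothetical stand-ins for illustration, not the actual xml-types
definitions.

```haskell
-- Hypothetical stand-in for an XML name: a local part plus an
-- optional namespace URI. Not the real xml-types type.
data Name = Name
  { localName :: String
  , namespace :: Maybe String
  } deriving (Eq, Show)

-- Parse "{namespace}local" (Clark notation) or a plain "local".
-- Only reason (2) is detected: an opening brace with no closing brace.
parseClark :: String -> Either String Name
parseClark ('{' : rest) =
  case break (== '}') rest of
    (ns, '}' : local) -> Right (Name local (Just ns))
    _                 -> Left "missing '}' in Clark notation"
parseClark local = Right (Name local Nothing)
```

Note that parseClark "a<b" still succeeds in this sketch: checking the
characters of the local name, reason (1), is a separate validation step.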

Ideally, the compiler would catch invalid Names and error out.
Unfortunately, due to the way OverloadedStrings works, this isn't
possible currently. (Though I think such a solution would be ideal,
and is something we should consider separately.) I think we have three
possible responses to the dilemma:

a) Ignore invalid Names, and simply allow invalid XML to be generated.
b) Throw an asynchronous exception.
c) Realize that the IsString instance is not correct, and therefore remove it.

Currently, xml-types follows option (a) for (1) above, and (b) for (2)
above. I personally don't think either option is obviously better or
worse, but I do in general prefer consistency. And I think that
writing the validation rules for an XML name is outside the scope of
xml-types, so I prefer option (a)... but not by any great margin.
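
As a rough illustration of how options (a) and (b) differ for the
Clark-notation case, the sketch below defines two IsString instances over
a hypothetical Name type. The names Lenient, Throwing and parseClark are
assumptions made for this example and are not part of xml-types.

```haskell
import Data.String (IsString (..))

-- Hypothetical stand-in for an XML name; not the real xml-types type.
data Name = Name
  { localName :: String
  , namespace :: Maybe String
  } deriving (Eq, Show)

-- Parse "{namespace}local" or a plain "local".
parseClark :: String -> Either String Name
parseClark ('{' : rest) =
  case break (== '}') rest of
    (ns, '}' : local) -> Right (Name local (Just ns))
    _                 -> Left "missing '}' in Clark notation"
parseClark local = Right (Name local Nothing)

-- Option (a): never fail. A malformed literal is kept verbatim as the
-- local name, so invalid XML can be produced later.
newtype Lenient = Lenient Name deriving (Eq, Show)

instance IsString Lenient where
  fromString s = Lenient (either (const (Name s Nothing)) id (parseClark s))

-- Option (b): call 'error' on a malformed literal. The failure only
-- shows up at run time, when the value is forced.
newtype Throwing = Throwing Name deriving (Eq, Show)

instance IsString Throwing where
  fromString s = Throwing (either error id (parseClark s))
```

With OverloadedStrings enabled, a literal such as "{oops" would type-check
against either wrapper; the compiler has no chance to reject it, which is
the limitation described above.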

The one thing I'd hate to see happen is option (c). It's true that the
instance of IsString is not really "correct," but the same argument
could be made for ByteStrings as well, where characters over 255 get
truncated (I believe). The fact is that invalid input here should be
*incredibly* rare.
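
The ByteString comparison can be checked with a small sketch using
Data.ByteString.Char8.pack, which keeps only the low 8 bits of each
character. The binding name truncatedBytes is just for illustration.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import Data.Word (Word8)

-- 'λ' is U+03BB (decimal 955). Char8.pack truncates each Char to
-- 8 bits, so only the low byte 0xBB (decimal 187) survives.
truncatedBytes :: [Word8]
truncatedBytes = B.unpack (BC.pack "\955")
```

Here truncatedBytes evaluates to [187]: the conversion silently loses
information instead of failing, much like option (a) for Name.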

I suppose a fourth option would be to force the String into a valid
name, either through some escape mechanism or removing characters. But
again, I personally think it's outside the scope of xml-types.
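
For completeness, a sketch of what "removing characters" could look like.
It is deliberately restricted to an ASCII subset of the NameStartChar and
NameChar productions in [1] and [2]; the function names are hypothetical
and nothing like this exists in xml-types.

```haskell
import Data.Char (isAlpha, isAscii, isDigit)

-- ASCII-only approximation of NameStartChar (the spec allows far more,
-- including ':' which is left out here).
isNameStart :: Char -> Bool
isNameStart c = (isAscii c && isAlpha c) || c == '_'

-- ASCII-only approximation of NameChar.
isNameChar :: Char -> Bool
isNameChar c = isNameStart c || isDigit c || c `elem` "-."

-- Drop every character outside the subset, then make sure the result
-- is non-empty and begins with a valid start character.
sanitize :: String -> String
sanitize s =
  case filter isNameChar s of
    [] -> "_"
    cs@(c : _)
      | isNameStart c -> cs
      | otherwise     -> '_' : cs
```

One drawback of this approach: distinct inputs such as "a<b" and "ab" both
map to "ab", so sanitizing can silently merge names that were meant to
differ.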

Michael

PS: Quasi-quoting is actually a great fit here as well, but it's just
not nearly as convenient as OverloadedStrings.
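
A minimal sketch of the quasi-quoting idea, assuming a simplified
ASCII-only validity check. The quoter name and isValidLocalName are
hypothetical, and for brevity the quoter produces a plain String
expression rather than a real Name value.

```haskell
import Data.Char (isAlpha, isAscii, isDigit)
import Language.Haskell.TH (litE, stringL)
import Language.Haskell.TH.Quote (QuasiQuoter (..))

-- ASCII-only approximation of the XML Name production; the real
-- NameStartChar/NameChar ranges are much broader.
isValidLocalName :: String -> Bool
isValidLocalName []       = False
isValidLocalName (c : cs) = isStart c && all isRest cs
  where
    isStart x = (isAscii x && isAlpha x) || x == '_'
    isRest x  = isStart x || isDigit x || x `elem` "-."

-- Expression quoter: the check runs at compile time, so an invalid
-- name becomes a compile error instead of a run-time surprise.
name :: QuasiQuoter
name = QuasiQuoter
  { quoteExp  = \s ->
      if isValidLocalName s
        then litE (stringL s)
        else fail ("invalid XML name: " ++ show s)
  , quotePat  = unsupported "pattern"
  , quoteType = unsupported "type"
  , quoteDec  = unsupported "declaration"
  }
  where
    unsupported ctx _ =
      fail ("the name quoter cannot be used in a " ++ ctx ++ " context")
```

Usage would look like [name|svg|] in a different module with the
QuasiQuotes extension enabled (GHC's stage restriction prevents using a
quoter in the module that defines it). That extra ceremony is the
convenience cost compared with a plain string literal.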

[1] http://www.w3.org/TR/xml/#NT-NameStartChar
[2] http://www.w3.org/TR/xml/#NT-NameChar

> On 12.06.2011 at 13:21, "Yitzchak Gale" <gale at sefer.org> wrote:
>> John Millikin wrote:
>>> To me, the choice is between raising an exception
>>> or removing IsString.
>>
>> That would be a shame, but removing it may be the
>> only way out of this conundrum.
>>
>>> IsString without namespaces is pointless.
>>
>> I am making good use of it in a project that doesn't
>> involve namespaces at all. It would actually be
>> a lot of work to back out at this point.
>>
>>> IsString without input checking is dangerous. If fromString cannot
>>> fail on invalid input, then it shouldn't be defined.
>>
>> I appreciate your concerns, but Haskell has other means of
>> providing such guarantees. Raising an asynchronous exception
>> is just not an option in an IsString instance.
>>
>>>> The Name type already produces invalid XML.
>>
>>> You're right -- it is already possible for Names to be invalid. There
>>> should probably be stricter input checking on names, to ensure they
>>> match the XML spec. Something like this...
>>
>> Yes, as I mentioned earlier, newtype wrappers with hidden
>> constructors are the way we would do that if we wanted to
>> guarantee those kinds of things at the type level.
>> You could then provide several constructor functions that
>> either do or do not raise exceptions. See, for example,
>> Data.Text.Encoding, Neil Mitchell's Safe library, Michael's
>> xml-enumerator.
>>
>> But you certainly could not use the version that raises an
>> exception for an IsString instance.
>>
>> In fact, I don't think an IsString instance makes sense at
>> all for a validating type. So maybe just removing it
>> really is the right thing to do after all.
>>
>> Thanks,
>> Yitz
>>
>> _______________________________________________
>> web-devel mailing list
>> web-devel at haskell.org
>> http://www.haskell.org/mailman/listinfo/web-devel