TextEncoding
A TextEncoding is a specification of a conversion scheme between sequences of bytes and sequences of Unicode characters.
For example, UTF-8 is an encoding of Unicode characters into a sequence of bytes. The TextEncoding for UTF-8 is utf8.
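As a minimal sketch of how a TextEncoding is typically used, the program below sets the encoding of stdout to utf8 with hSetEncoding from System.IO. The sample string is purely illustrative.

```haskell
import System.IO (hSetEncoding, stdout, utf8)

main :: IO ()
main = do
  -- Use UTF-8 on stdout regardless of the current locale encoding.
  hSetEncoding stdout utf8
  -- The non-ASCII characters below are written out as UTF-8 byte sequences.
  putStrLn "na\239ve caf\233"
```

Setting the encoding explicitly makes the output bytes independent of the locale the program happens to run under.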
A TextEncoding also has a name: a string that can be passed to mkTextEncoding to create an equivalent TextEncoding.
mkTextEncoding looks up the named Unicode encoding. It may fail with:
* isDoesNotExistError if the encoding is unknown
The set of known encodings is system-dependent, but includes at least:
* UTF-8
* UTF-16, UTF-16BE, UTF-16LE
* UTF-32, UTF-32BE, UTF-32LE
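The lookup and its failure mode can be sketched as follows. The helper lookupEncoding and the name "NO-SUCH-ENCODING" are hypothetical, introduced only for illustration; mkTextEncoding, try, and isDoesNotExistError are the standard functions from System.IO, Control.Exception, and System.IO.Error.

```haskell
import Control.Exception (try)
import System.IO (TextEncoding, mkTextEncoding)
import System.IO.Error (isDoesNotExistError)

-- Hypothetical helper: look up an encoding by name, returning Nothing
-- when the name is unknown on this system and rethrowing any other error.
lookupEncoding :: String -> IO (Maybe TextEncoding)
lookupEncoding name = do
  result <- try (mkTextEncoding name)
  case result of
    Right enc -> pure (Just enc)
    Left err
      | isDoesNotExistError err -> pure Nothing
      | otherwise               -> ioError err

main :: IO ()
main = do
  -- "UTF-16LE" is in the list of encodings that are always known.
  known <- lookupEncoding "UTF-16LE"
  putStrLn (maybe "UTF-16LE: unknown" (const "UTF-16LE: found") known)
  -- An illustrative name that is assumed not to exist on any system.
  missing <- lookupEncoding "NO-SUCH-ENCODING"
  putStrLn (maybe "NO-SUCH-ENCODING: unknown" (const "NO-SUCH-ENCODING: found") missing)
```

Because the set of known encodings is system-dependent, checking for isDoesNotExistError is safer than assuming that a name outside the guaranteed list will resolve.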
On systems using GNU iconv (e.g. Linux), there is additional notation for specifying how illegal characters are handled:
* a suffix of //IGNORE, e.g. UTF-8//IGNORE, will cause all illegal sequences on input to be ignored, and on output will drop all code points that have no representation in the target encoding.
* a suffix of //TRANSLIT will choose a replacement character for illegal sequences or code points.
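A short sketch of the suffix notation, assuming a system where it is supported: the program copies stdin to stdout while decoding the input with "UTF-8//IGNORE". How the illegal sequences are actually treated depends on the platform's encoding support.

```haskell
import System.IO (hSetEncoding, mkTextEncoding, stdin)

main :: IO ()
main = do
  -- Assumption: the //IGNORE suffix is understood on this system.
  enc <- mkTextEncoding "UTF-8//IGNORE"
  -- Decode stdin with the lenient encoding.
  hSetEncoding stdin enc
  -- Copy the decoded input to stdout.
  contents <- getContents
  putStr contents
```

On a system without support for the suffix, the call to mkTextEncoding may fail with isDoesNotExistError instead.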
On Windows, you can access supported code pages with the prefix CP; for example, "CP1250".