[Haskell-i18n] SourceForge Project Active

Sven Moritz Hallberg pesco@gmx.de
03 Sep 2002 01:06:40 +0200


On Thu, 2002-08-29 at 10:22, Martin Norbäck wrote:
> On Thu, 2002-08-29 at 04:20, Ashley Yakeley wrote:
> > A SourceForge project for the internationalisation effort is active at
> > <http://sourceforge.net/projects/haskell-i18n/>
> >
> > I've added my Unicode character properties code. Check it out (cvs co).
>
> Nice. Who will supply good UTF-8 code? I have some at
> http://www.dtek.chalmers.se/~d95mback/gettext/ but it is not in good
> shape.

With the ICFP contest finally over, I have just committed mine to CVS
(thanks for setting it up, Ashley!). I hope it is of reasonable quality,
though I have not performance-tested it. I'm looking forward to any
feedback.


> Where should a UTF-8 module be put? Text.UTF8?

In accordance with Simon's hierarchy page, I've put it into
Text.Encoding.UTF8.


> something like (just drafting here):
>
> Text.UTF8.encodeChar   :: Char -> [Word8] -- (or Array?)
> Text.UTF8.encodeString :: String -> [Word8] -- (or Array?)
> Text.UTF8.decodeChar   :: [Word8] -> Either (Char, [Word8]) Error
> Text.UTF8.decodeString :: [Word8] -> (String, [Word8], [Error])

Pretty much! I have:

encodeOne :: Char -> [Word8]     -- encodeChar is probably prettier
encode    :: String -> [Word8]   -- encodeString? I don't care.
decodeOne :: [Word8] -> (Either Error Char, Int, [Word8])
   -- 2nd. component: number of bytes consumed,
   -- 3rd. component: rest of bytes
decode    :: [Word8] -> (String, [(Error,Int)])
   -- 2nd. component: list of errors and their index in the byte stream
   --                 Maybe we should reverse the order of error/index
   --                 so it looks like any association list?
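
To make the shape of this interface concrete, here is a minimal sketch of
functions with the signatures above. It is an illustration only, not the
code committed to CVS. The Error constructors are hypothetical (the type is
not spelled out in this message), and the recovery policy (skip one byte
after a malformed lead byte, drop the bad character from the output) is an
assumption as well. The encoder does not reject surrogate code points.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical error type; the constructors are assumptions.
data Error = Truncated | InvalidByte | Overlong | OutOfRange
  deriving (Eq, Show)

-- Encode one character as 1 to 4 bytes.
encodeOne :: Char -> [Word8]
encodeOne c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [lead 0xC0 6, cont 0]
  | n < 0x10000 = [lead 0xE0 12, cont 6, cont 0]
  | otherwise   = [lead 0xF0 18, cont 12, cont 6, cont 0]
  where
    n          = ord c
    lead tag s = fromIntegral (tag .|. (n `shiftR` s))
    cont s     = fromIntegral (0x80 .|. ((n `shiftR` s) .&. 0x3F))

encode :: String -> [Word8]
encode = concatMap encodeOne

-- Decode one character.
-- Result: (character or error, bytes consumed, remaining bytes).
decodeOne :: [Word8] -> (Either Error Char, Int, [Word8])
decodeOne [] = (Left Truncated, 0, [])
decodeOne (b:bs)
  | b < 0x80  = (Right (chr (fromIntegral b)), 1, bs)
  | b < 0xC0  = (Left InvalidByte, 1, bs)  -- stray continuation byte
  | b < 0xE0  = multi 1 (fromIntegral b .&. 0x1F) 0x80
  | b < 0xF0  = multi 2 (fromIntegral b .&. 0x0F) 0x800
  | b < 0xF8  = multi 3 (fromIntegral b .&. 0x07) 0x10000
  | otherwise = (Left InvalidByte, 1, bs)
  where
    -- k continuation bytes expected, hi = payload bits of the lead byte,
    -- minVal = smallest code point that needs this sequence length.
    multi :: Int -> Int -> Int -> (Either Error Char, Int, [Word8])
    multi k hi minVal
      | length conts < k       = (Left Truncated, 1, bs)
      | not (all isCont conts) = (Left InvalidByte, 1, bs)
      | n < minVal             = (Left Overlong, k + 1, rest)
      | n > 0x10FFFF           = (Left OutOfRange, k + 1, rest)
      | n >= 0xD800 && n <= 0xDFFF = (Left OutOfRange, k + 1, rest)
      | otherwise              = (Right (chr n), k + 1, rest)
      where
        (conts, rest) = splitAt k bs
        isCont x      = x .&. 0xC0 == 0x80
        n = foldl (\acc x -> (acc `shiftL` 6) .|. fromIntegral (x .&. 0x3F))
                  hi conts

-- Decode a whole byte list, collecting each error together with the
-- byte index at which it occurred.
decode :: [Word8] -> (String, [(Error, Int)])
decode = go 0
  where
    go _ [] = ([], [])
    go i bytes =
      let (r, used, rest) = decodeOne bytes
          (s, errs)       = go (i + used) rest
      in case r of
           Right c -> (c : s, errs)
           Left e  -> (s, (e, i) : errs)
```

With this sketch, decode [0x41, 0xFF, 0x42] yields ("AB", [(InvalidByte, 1)]).
If the pairs were stored as (Int, Error) instead, the error list could be
searched by byte index with the standard lookup function, which is the
argument for reversing the order.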

Comments welcome.


Regards,
Sven Moritz