[Haskell-cafe] haskell i18n best practices

Felipe Almeida Lessa felipe.lessa at gmail.com
Fri Sep 30 12:29:35 CEST 2011


On Thu, Sep 29, 2011 at 7:54 PM, Paulo Pocinho <pocinho at gmail.com> wrote:
> Still uncomfortable with i18n, I learned about the article "I18N in
> Haskell" in yesod blog [4]. I'd like to hear more about it.

Yesod's approach is pretty nice [1].  The idea is to have a data type
with all your messages, like

  data Message =
    Hello |
    WhatsYourName |
    MyNameIs String |
    Ihave_apples Int |
    GoodBye

For each of your supported languages, you provide a rendering function
(they may be in separate source files)

  render_en_US :: Message -> String
  render_en_US Hello = "Hello!"
  render_en_US WhatsYourName = "What's your name?"
  render_en_US (MyNameIs name) = "My name is " ++ name ++ "."
  render_en_US (Ihave_apples 0) = "I don't have any apples."
  render_en_US (Ihave_apples 1) = "I have one apple."
  render_en_US (Ihave_apples n) = "I have " ++ show n ++ " apples."
  render_en_US GoodBye = "Good bye!"

  render_pt_BR :: Message -> String
  render_pt_BR Hello = "Olá!"
  render_pt_BR WhatsYourName = "Como você se chama?"
  render_pt_BR (MyNameIs name) = "Eu me chamo " ++ name ++ "."
  render_pt_BR (Ihave_apples 0) = "Não tenho nenhuma maçã."
  render_pt_BR (Ihave_apples 1) = "Tenho uma maçã."
  render_pt_BR (Ihave_apples 2) = "Tenho duas maçãs."
  render_pt_BR (Ihave_apples n) = "Tenho " ++ show n ++ " maçãs."
  render_pt_BR GoodBye = "Tchau!"

Given those functions, you can construct something like

  type Lang = String

  render :: [Lang] -> Message -> String
  render ("pt"   :_) = render_pt_BR
  render ("pt_BR":_) = render_pt_BR
  render ("en"   :_) = render_en_US
  render ("en_US":_) = render_en_US
  render (_:xs) = render xs
  render _ = render_en_US

So 'r = render ["fr", "pt"]' will skip the unsupported "fr" and pick
render_pt_BR.  You just need to pass this 'r' around in your code.
Using it is easy and clear:

  putStrLn $ r Hello
  putStrLn $ r WhatsYourName
  name <- getLine
  putStrLn $ r (MyNameIs "Alice")
  putStrLn $ r (Ihave_apples $ length name `mod` 4)
  putStrLn $ r GoodBye
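
As a rough check of the fallback behaviour and of what "passing 'r' around" can look like, here is a minimal self-contained sketch.  It uses a cut-down copy of the Message type to stay short, and the helper 'appleReport' is a hypothetical name invented for this illustration, not something from Yesod.

```haskell
-- Cut-down copy of the Message type from above, to keep the sketch short.
data Message
  = Hello
  | Ihave_apples Int

type Lang = String

render_en_US :: Message -> String
render_en_US Hello            = "Hello!"
render_en_US (Ihave_apples 0) = "I don't have any apples."
render_en_US (Ihave_apples 1) = "I have one apple."
render_en_US (Ihave_apples n) = "I have " ++ show n ++ " apples."

render_pt_BR :: Message -> String
render_pt_BR Hello            = "Olá!"
render_pt_BR (Ihave_apples 0) = "Não tenho nenhuma maçã."
render_pt_BR (Ihave_apples 1) = "Tenho uma maçã."
render_pt_BR (Ihave_apples n) = "Tenho " ++ show n ++ " maçãs."

-- The first supported language wins; unsupported ones are skipped and
-- an exhausted list falls back to English.
render :: [Lang] -> Message -> String
render ("pt"   :_) = render_pt_BR
render ("pt_BR":_) = render_pt_BR
render ("en"   :_) = render_en_US
render ("en_US":_) = render_en_US
render (_:xs)      = render xs
render []          = render_en_US

-- Hypothetical helper: code that produces user-visible text takes the
-- renderer as an argument and never inspects the language list itself.
appleReport :: (Message -> String) -> [Int] -> [String]
appleReport r counts = r Hello : map (r . Ihave_apples) counts
```

Calling 'appleReport (render ["fr", "pt"]) [0, 3]' goes through render_pt_BR, while 'appleReport (render ["de"]) [1]' ends up with the English default.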

This approach is nice for several reasons:

 - Built-in support for complicated messages.  Making something like
Ihave_apples in gettext would be hard.  Each language has its own
rules, and you need to encode all of them in your code.  In this
example, my render_pt_BR recognizes the 2 apples case and treats it
differently.  If you didn't think about it when you wrote your code
(using gettext), you'd need to change your code for pt_BR.

 - Fast processing.  "render" as I've coded above looks at the
language list just once.  After that, it's just GHC's pattern
matching.

 - Fast startup.  No need to look for strings on the hard drive.

 - Flexible.  You may try several extensions, depending on your needs:

    (a) Using a type class (like Yesod) if you don't want one big data type.

    (b) Using Text instead of String.  Or even Builder.
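
Extensions (a) and (b) could be combined roughly as in the sketch below.  It assumes the 'text' package is available.  The class 'Translatable', its method 'renderMsg', the helper 'wantsPt' and the split into 'Greeting' and 'Apples' are all hypothetical names made up for this illustration; Yesod's real class differs in its details, so check its documentation before relying on any of this.

```haskell
import           Data.Text (Text)
import qualified Data.Text as T

type Lang = String

-- Hypothetical class (not Yesod's actual API): one instance per group
-- of messages instead of one big data type.
class Translatable msg where
  renderMsg :: [Lang] -> msg -> Text

-- True if the first supported language in the list is Portuguese;
-- unsupported languages are skipped and English is the default.
wantsPt :: [Lang] -> Bool
wantsPt (l:ls)
  | l `elem` ["pt", "pt_BR"] = True
  | l `elem` ["en", "en_US"] = False
  | otherwise                = wantsPt ls
wantsPt [] = False

data Greeting = Hello | GoodBye

instance Translatable Greeting where
  renderMsg langs msg = T.pack (if wantsPt langs then pt msg else en msg)
    where
      pt Hello   = "Olá!"
      pt GoodBye = "Tchau!"
      en Hello   = "Hello!"
      en GoodBye = "Good bye!"

newtype Apples = Ihave_apples Int

instance Translatable Apples where
  renderMsg langs (Ihave_apples n) =
    T.pack (if wantsPt langs then pt n else en n)
    where
      pt, en :: Int -> String
      pt 0 = "Não tenho nenhuma maçã."
      pt 1 = "Tenho uma maçã."
      pt k = "Tenho " ++ show k ++ " maçãs."
      en 0 = "I don't have any apples."
      en 1 = "I have one apple."
      en k = "I have " ++ show k ++ " apples."
```

Each module can then define its own message type and instance, at the cost of threading the language list (or a partially applied 'renderMsg langs') through the code instead of a single 'r'.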

The biggest drawback is lack of tool support and lack of "translators'
expertise".  gettext has a lot of inertia and is used everywhere on a
FLOSS system.  But as Ertugrul Soeylemez said, if you're targeting
Windows, _not_ using gettext should be an advantage (less pain when
creating installers).

HTH,

[1] http://hackage.haskell.org/packages/archive/yesod-core/0.9.2/doc/html/Yesod-Message.html

-- 
Felipe.


