Instantbird speaks your language

Instantbird 1.3 became available on November 15, and it comes with new features (you can see them by clicking here). The Instantbird team wants to offer this IM client to everyone, so there’s a new language in this release: Brazilian Portuguese. Sim, nós falamos Português! Brazilian users can now enjoy Instantbird in their own language.

Additionally, this release brings back the Russian language. Russian had been available in Instantbird since version 0.2, but it was missing from version 1.2. Thanks go out to our Russian localization team for making this possible!

Since the 0.2 release, Instantbird has been available in five languages: English, Finnish, French, Polish and Russian. The 1.0 release brought six more languages, and now Instantbird is available in 13 languages!

The Instantbird team has a goal: “to redefine the way instant messaging is used, to work the way you want.” To reach this goal, Instantbird needs to be usable by as many people as possible, in every country, so we encourage interested people to translate it into their languages. If you’re interested in helping us achieve this goal by creating new translations or improving Instantbird, you can get more information here.

Translations

We have been contacted by lots of individuals who volunteered to translate Instantbird into their native language and were eager to start working on it. As we were not ready to host the translations, we asked people to wait before starting their work on localized versions of Instantbird.

As we plan to release the next beta of Instantbird 0.2 in several languages, we feel that now is a good time to start translating the UI of Instantbird. Please note that the development work is not finished yet, and that there will still be string changes before we are ready to release this next milestone.

You will find information on the translation process on our wiki at http://wiki.instantbird.org/Instantbird:Translation. As usual, if you have any questions, feel free to ask them in #instantbird on irc.mozilla.org or to contact us at contact AT instantbird DOT org.

Instantbird 0.2 Feature Preview: Localizability

As you may (or may not) know, we previously wrote that Instantbird 0.1.* was not localizable. The reason given for this was the use of gettext by libpurple, which is not compatible with the way XUL applications are localized. I’m going to give more details about the issue, and explain how we solved it for Instantbird 0.2.

Comparison of translation systems used by Mozilla and libpurple:

Inside libpurple, localizable strings are just marked by _("string"). For example, you can find this in the code:

description = _("Unknown error");

During compilation, _() is expanded by the C preprocessor to a call to a gettext function. Gettext tools can analyze the source code, find all strings enclosed in _() markers, and produce a translation template. This template (a .pot file) is then handed to translators, who translate the strings and provide a .po file for their language.
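To make the extraction step more concrete, here is a rough Python sketch of what a gettext-style extractor does with the line above. It is only an illustration: the names extract_msgids and to_pot are hypothetical, the regular expression only handles the simple _("...") form, and the real gettext tools cover far more (plural forms, translator comments, other markers, file headers).

```python
import re

# Matches _("...") but not N_("...") or foo_("..."); escaped quotes are allowed.
MARKER = re.compile(r'(?<![A-Za-z0-9_])_\(\s*"((?:[^"\\]|\\.)*)"\s*\)')


def extract_msgids(source):
    """Return the distinct marked strings, in order of first appearance."""
    return list(dict.fromkeys(MARKER.findall(source)))


def to_pot(msgids):
    """Render a minimal template: each msgid paired with an empty msgstr."""
    return "\n".join(f'msgid "{m}"\nmsgstr ""\n' for m in msgids)


if __name__ == "__main__":
    source = 'description = _("Unknown error");'
    print(to_pot(extract_msgids(source)))
```

A translator's .po file has the same shape, with each msgstr filled in for their language.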

The translation system for XUL applications is quite different. Here are 2 significant differences:

  • localizable strings are not directly in the source code. The source code uses unique identifiers, and these identifiers are used to find the actual string in the locale files.
  • the strings are spread across several localized files. Usually each window has its separate files, which makes it easy to decide at a later point that something will become an extension, and makes it easy to localize an extension like any other part of the application.

How do we deal with this in Instantbird?

Obviously, we don’t want Instantbird to use both of these localization systems, so one had to be removed. In Instantbird 0.1.*, we just removed gettext without replacing it. This means that the gettext _() macro was defined as a no-op, and the string used was simply the one written directly in the source code.

For Instantbird 0.2, this is no longer acceptable, so we worked on a way to emulate the behavior of gettext while hiding the 2 differences I’ve just explained.

Splitting the translation into different files wasn’t very difficult. Gettext actually has a concept of packages that makes it possible to split the translation of an application into several parts; libpurple simply doesn’t use this feature. With a little bit of build system tweaking, I got a translation file for the core of libpurple, and a separate translation file for each protocol plugin. This was needed so that libpurple protocol plugins packaged as extensions can be localized.

Creating a unique identifier for each localizable string was a bit more work. The solution we have settled on is:

  • Take the original string and remove all string formatters (words starting with %), hexadecimal numbers (words starting with 0x) and, more generally, all non-alphanumeric characters.
  • Keep only the first 7 words of what remains, join them without whitespace, and convert the result to camel case.

At this point, we have an identifier for the original string, but it is not unique. Long strings that differ only at the end result in the same identifier, and strings that don’t contain any real word (‘%s:%s’ for instance) all result in an empty string. To disambiguate in these cases, and only in these cases, we append the first 8 characters of the hexadecimal MD5 hash of the original string to the identifier.
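The scheme described above can be sketched in Python roughly as follows. This is one reading of the steps, not the actual Instantbird code: the function names are hypothetical, the order of operations (drop the formatter and hexadecimal words, strip the remaining punctuation, then keep 7 words) is an interpretation, and "camel case" is assumed here to mean a lowercase first letter with each following word capitalized.

```python
import hashlib
import re

MAX_WORDS = 7  # number of words kept, as described in the text
HASH_LEN = 8   # number of hex characters of the MD5 hash appended


def base_identifier(text):
    """Build the (possibly ambiguous) identifier for one en-US string."""
    words = []
    for word in text.split():
        # Drop string formatters (%s, %d, ...) and hexadecimal numbers.
        if word.startswith("%") or word.lower().startswith("0x"):
            continue
        # Strip every remaining non-alphanumeric character.
        word = re.sub(r"[^A-Za-z0-9]", "", word)
        if word:
            words.append(word)
    words = words[:MAX_WORDS]
    if not words:
        return ""
    # Assumed camel-case convention: firstWordLower, FollowingWordsUpper.
    head = words[0][0].lower() + words[0][1:]
    return head + "".join(w[0].upper() + w[1:] for w in words[1:])


def hash_suffix(text):
    """First 8 hex characters of the MD5 hash of the original string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()[:HASH_LEN]


def assign_identifiers(strings):
    """Map each string to an identifier, hashing only the ambiguous ones."""
    groups = {}
    for s in dict.fromkeys(strings):  # ignore exact duplicates
        groups.setdefault(base_identifier(s), []).append(s)
    ids = {}
    for base, members in groups.items():
        ambiguous = base == "" or len(members) > 1
        for s in members:
            ids[s] = base + hash_suffix(s) if ambiguous else base
    return ids
```

With this sketch, "Unknown error" keeps a readable identifier, while ‘%s:%s’ ends up with nothing but the 8-character hash.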

Now, how do we use this?

We have a .properties file for libpurple and one for each protocol plugin. When libpurple is compiled for Instantbird, the gettext macros are modified to point to some of our code instead of the gettext library. Our code uses the en-US string to build the identifier, and attempts to find it in the .properties file. If it isn’t found, it tries again with the identifier plus the first 8 characters of the MD5 hash of the string. If it still isn’t found, then it returns the en-US string as a fallback (and emits a warning in debug builds).
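The lookup order can be illustrated with a short Python sketch. The real code is compiled into libpurple, so this is only a model of the logic: translate is a hypothetical name, the bundle is represented as a plain dictionary, and make_id stands in for whatever function builds the identifier from the en-US string.

```python
import hashlib
import logging


def translate(en_us, bundle, make_id, debug=False):
    """Resolve an en-US string against a {identifier: translation} bundle."""
    key = make_id(en_us)
    # First attempt: the plain identifier.
    if key in bundle:
        return bundle[key]
    # Second attempt: identifier plus the first 8 hex chars of the MD5 hash.
    hashed = key + hashlib.md5(en_us.encode("utf-8")).hexdigest()[:8]
    if hashed in bundle:
        return bundle[hashed]
    # Fallback: the untranslated string, with a warning in debug mode only.
    if debug:
        logging.warning("no translation found for %r", en_us)
    return en_us
```

Trying the plain identifier first means the readable keys are used whenever they exist, and the hash is only computed on a miss.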

How do we make the .properties files for libpurple?

I wrote a Python script that automatically generates the appropriate .properties files for the en-US language from the source code of libpurple. Additionally, it uses the various .po files of Pidgin to produce files that can be used as a base for localizing this part of Instantbird.
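The script itself is not shown here, but its output stage might look something like the sketch below. Everything in it is an assumption made for illustration: the function names are hypothetical, the identifiers are taken as already computed, the .po data is taken as already parsed into a dictionary from en-US string to translation, and only backslashes and newlines are escaped.

```python
def escape_value(text):
    """Escape backslashes and newlines so each value stays on one line."""
    return text.replace("\\", "\\\\").replace("\n", "\\n")


def render_properties(entries):
    """Render {identifier: string} as sorted key=value lines."""
    return "".join(
        f"{key}={escape_value(value)}\n" for key, value in sorted(entries.items())
    )


def localize(en_us_entries, po_translations):
    """Swap each en-US string for its translation when one is available."""
    return {
        key: po_translations.get(text) or text
        for key, text in en_us_entries.items()
    }


if __name__ == "__main__":
    en_us = {"unknownError": "Unknown error", "ok": "OK"}
    french = localize(en_us, {"Unknown error": "Erreur inconnue"})
    print(render_properties(french))
```

Strings with no translation keep their en-US text, so the generated file is always complete and a translator only has to fill in the gaps.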

Does this mean I can start translating Instantbird into my own language?

No, not yet, but very soon! Once we are ready to accept contributions from translators, we will ask translators who volunteer to localize Instantbird to contact us so that we can provide them with these generated files.

An alpha build of Instantbird 0.2 will be available soon. We will provide an experimental French translation of this build (most people in our team are French, so French was the logical choice for testing all of this ourselves).