
DevBest LocalizationGuide

jayallen edited this page Jun 29, 2011 · 5 revisions

Internationalization and Localization

Software intended for world-wide use must provide for easy, fluent translation of the application's text strings and/or images containing text into other languages. The two parts of this process are known as Internationalization and Localization.

  • Internationalization (I18N) ensures that all text intended for output to a user is abstracted from the code that produces it for purposes of translation, that the output can be dynamically switched based on the language of the user, and that it can be properly transformed according to the rules of each language.

  • Localization (L10N) of an application is the process of translating each abstracted string into a particular language.

In short, I18N is making an application aware that other languages exist while L10N is making it as fluent as necessary in any one language.

10 Minute Primer

You have probably reached this page because you are interested in learning how to localize your plugin, and/or create a translation for your plugin or theme. That is great! Thank you. So let's get you started.

Localizing Templates

Localizing templates requires you to wrap every string in special markup that passes the string through a translation layer when it is rendered. Strings that are NOT wrapped in this way will remain in the language they are written in and will NOT be translated. Here is an example of how to wrap the strings in your templates:

* `<__trans phrase="Hello World">`
* `<__trans phrase="Hello, [_1]" params="<mt:AuthorName>">`
* `<__trans phrase="Hello, [_1], your username is [_2]" params="<mt:AuthorName>%%<mt:AuthorUsername>">`

Those three examples show just about everything you need to know about preparing your templates for translation. The first is a simple string. The second is a parameterized string, where [_1] is replaced by the value given in params. The third shows how to demarcate multiple parameters in a parameterized string by separating them with %%.

Done. Easy, right?
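To make the substitution concrete, here is a minimal sketch of what a translation layer does with the wrapped strings above. It is not Melody's implementation: the %lexicon table and the translate_phrase and translate_template helpers are hypothetical names invented for this illustration, and the German strings are sample data.

```perl
use strict;
use warnings;

# Hypothetical lookup table standing in for a real translation layer.
my %lexicon = (
    'Hello World' => 'Hallo Welt',
    'Hello, [_1]' => 'Hallo, [_1]',
    'Hello, [_1], your username is [_2]' =>
        'Hallo, [_1], dein Benutzername ist [_2]',
);

# Look up a phrase and fill [_1], [_2], ... with the given parameters.
# Phrases without a translation fall back to the original text.
sub translate_phrase {
    my ($phrase, @params) = @_;
    my $text = exists $lexicon{$phrase} ? $lexicon{$phrase} : $phrase;
    $text =~ s/\[_(\d+)\]/ defined $params[$1 - 1] ? $params[$1 - 1] : '' /ge;
    return $text;
}

# Replace every <__trans ...> tag in a template with its translation.
# Multiple parameters are separated by %% inside the params attribute.
sub translate_template {
    my ($template) = @_;
    $template =~ s{<__trans\s+phrase="([^"]*)"(?:\s+params="([^"]*)")?\s*>}{
        my ($phrase, $params) = ($1, $2);
        my @params = defined $params ? split(/%%/, $params) : ();
        translate_phrase($phrase, @params);
    }ge;
    return $template;
}

print translate_template(
    '<p><__trans phrase="Hello, [_1], your username is [_2]" '
  . 'params="<mt:AuthorName>%%<mt:AuthorUsername>"></p>'
), "\n";
```

Note that the template tags passed as parameters survive the translation step untouched, so they can still be evaluated when the template is rendered.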

Localizing Application Code

Localizing application code is done under the same principle. Strings are encapsulated in a wrapper that routes the string through the translation system before it is rendered on the screen. So if you are writing Perl code, this is how you prep your app logic for translation:

  • print MT->translate("Hello World");
  • print MT->translate("Hello, [_1]",$app->user->name);
  • print MT->translate("Hello, [_1], your username is [_2]",$app->user->name,$app->user->username);

That pretty much sums it up. Every developer should wrap their strings in these tags and method calls so that, even if they cannot do the translation themselves, the code is ready for someone else to come in behind them and create one.
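The [_1] and [_2] placeholders are the bracket notation of Perl's Locale::Maketext module, so the behaviour of the calls above can be tried out without a Melody installation. The following is a sketch under that assumption: the Demo::L10N classes, the translate_to helper and the German strings are hypothetical and stand in for Melody's own classes and lexicons.

```perl
use strict;
use warnings;

# Hypothetical base class for this demo (not a Melody class).
package Demo::L10N;
use parent 'Locale::Maketext';

# Hypothetical German lexicon: source string => translation.
package Demo::L10N::de;
use parent -norequire, 'Demo::L10N';
our %Lexicon = (
    'Hello World' => 'Hallo Welt',
    'Hello, [_1]' => 'Hallo, [_1]',
    'Hello, [_1], your username is [_2]' =>
        'Hallo, [_1], dein Benutzername ist [_2]',
);

package main;

# Translate a phrase into the given language, filling [_1], [_2], ...
sub translate_to {
    my ($lang, $phrase, @params) = @_;
    my $lh = Demo::L10N->get_handle($lang)
        or die "No lexicon available for '$lang'";
    return $lh->maketext($phrase, @params);
}

print translate_to('de', 'Hello World'), "\n";
print translate_to('de', 'Hello, [_1]', 'Anna'), "\n";
print translate_to('de', 'Hello, [_1], your username is [_2]',
    'Anna', 'anna42'), "\n";
```

Because the parameters are passed separately from the phrase, a translator is free to reorder [_1] and [_2] in the translated string without any change to the calling code.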

Generating a Translation File

Now that all your code is in shape and ready for translation, it is time to generate your translation files. This is done using a simple command-line tool (available in Melody 1.1) that scans your plugin for translation strings and creates the files for you. Here is how it is run:

prompt> cd $MELODY_HOME
prompt> perl ./build/make-l10n -dir=plugins/MyPlugin -lang=de -base=MyPlugin

The above command will generate a file in the working directory called de.pm that will contain all of the strings to be translated that were found in the plugins/MyPlugin directory hierarchy.

Once that file has been generated, place it in the following directory:

prompt> mv de.pm plugins/MyPlugin/lib/MyPlugin/L10N/

Internationalization in Melody

The Melody community is already a highly international one, thanks in large part to its Movable Type roots, from which Melody also inherits an excellent I18N/L10N framework and localizations in Spanish, French, German, Dutch, Japanese and Russian (Melody only).

For this reason, it is essential that all code developed for Melody (core and plugin) be designed and implemented in accordance with Melody's Internationalization standards. We don't expect you to suddenly have the talents of a 3PO-series protocol droid and translate all of your code's text into six million languages but we do expect you to structure your code in such a way that other fluent individuals easily can.

Doing so in a reliable and consistent way can (and, for monoglots, will) require a significant shift in perspective:

You must develop with the absolute certainty that the user's language will not be your own.

MT Base Class Methods

To start with, Melody's internationalization framework is integrated directly into the heart of both the MT and MT::Component base classes, meaning that the methods it provides are inherited by pretty much everything, including:

  • All MT::App-subclassed applications (e.g. MT::App::CMS, MT::App::Comments, etc)
  • All addons and plugins
  • All command-line utilities which subclass MT
  • Any MT object instance

The last item means that even the most basic scripts which interact with Melody have access to these methods through the object instance returned in the bootstrapping step.

These top-level methods are documented in the MT POD documentation (FIXME: link to POD) but are listed and briefly described below:

set_language($tag)

Globally sets the language and L10N context for the current program's execution affecting all localized strings and error messages output by the system. The system default is specified by the DefaultLanguage config directive.

translate($str[, $param, ...])

Used for translating strings (with optional parameters for variable sections) into the currently-set language.

translate_templatized($text)

Used for translating localized strings in Melody template code (i.e. <__trans> tags). The returned template code will contain localized strings in place of the <__trans> tags.

trans_error( $str[, $arg1, $arg2] )

Shortcut for $class->error( $class->translate( $str[, $arg1, $arg2]) ). errtrans is another alias for the same.

current_language

Used to get the language tag for the currently-set language.

supported_languages

Used to retrieve a mapping of the known language tags to their proper names.

language_handle

Used to retrieve an MT::L10N object instance of the currently active language.
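How these methods relate to one another can be sketched with Locale::Maketext alone. Everything named Demo:: below is hypothetical: Demo::App only mirrors the method names listed above and makes no claim about how the MT class implements them, and the lexicon contents are sample data.

```perl
use strict;
use warnings;

{
    # Hypothetical base class for this demo.
    package Demo::L10N;
    use parent 'Locale::Maketext';
}
{
    # Source language: _AUTO makes every phrase its own translation.
    package Demo::L10N::en_us;
    use parent -norequire, 'Demo::L10N';
    our %Lexicon = ( _AUTO => 1 );
}
{
    # Hypothetical German lexicon.
    package Demo::L10N::de;
    use parent -norequire, 'Demo::L10N::en_us';
    our %Lexicon = ( 'Welcome, [_1]' => 'Willkommen, [_1]' );
}
{
    # Hypothetical application class mirroring the method names above.
    package Demo::App;

    my $handle;    # the currently active language handle

    # Switch the language for everything translated from now on.
    sub set_language {
        my ($class, $tag) = @_;
        $handle = Demo::L10N->get_handle($tag)
            or die "Unsupported language '$tag'";
        return $handle;
    }

    sub language_handle  { return $handle }
    sub current_language { return $handle->language_tag }

    sub translate {
        my ($class, $str, @params) = @_;
        return $handle->maketext($str, @params);
    }
}

for my $tag ('en-us', 'de') {
    Demo::App->set_language($tag);
    printf "%s: %s\n", Demo::App->current_language,
        Demo::App->translate('Welcome, [_1]', 'Anna');
}
```

The point of the design is that calling code never names a language: it asks for a translation and the currently set handle decides which lexicon answers.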

MT::I18N Base Class Methods

Most of Perl's built-in string-related functions and operators (e.g. substr, length, regex operators, etc) are not compatible with non-Western character sets and trying to use them with languages like Japanese and Russian will lead to highly inconsistent results and many garbled strings.
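The problem is easiest to see when multi-byte text is held as raw bytes, which is how it typically arrives from a file or a form submission. The sketch below uses only Perl's core Encode module; the first_chars_utf8 helper is a hypothetical illustration and is not one of the MT::I18N functions.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The three-character Japanese word "nihongo" as raw UTF-8 bytes.
# Each character occupies three bytes, nine bytes in total.
my $bytes = "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E";

# The same text decoded into Perl's character representation.
my $chars = decode('UTF-8', $bytes);

# Return the first $n characters of UTF-8 encoded bytes without splitting
# a multi-byte sequence. Hypothetical helper for this demo.
sub first_chars_utf8 {
    my ($utf8_bytes, $n) = @_;
    my $decoded = decode('UTF-8', $utf8_bytes);
    return encode('UTF-8', substr($decoded, 0, $n));
}

# Results are printed as hex so the output is independent of the terminal.
printf "length in bytes:       %d\n", length $bytes;
printf "length in characters:  %d\n", length $chars;
printf "naive substr, 4 bytes: %s\n", unpack 'H*', substr($bytes, 0, 4);
printf "first 2 characters:    %s\n",
    unpack 'H*', first_chars_utf8($bytes, 2);
```

The naive substr call returns one complete character followed by the first byte of the next one, which is exactly the kind of garbled string described above.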

Instead, we rely on a set of utility functions provided by MT::I18N to consistently transform text across all supported languages. MT::I18N provides an abstract interface to language-specific subclasses that define the gory implementation details and actually handle the transformations.

Melody developers should strictly rely on these methods for their respective actions. Failure to do so will very likely cause entire nations and even global regions to curse your name and Melody's name. We don't want that.

These methods are fully covered in MT::I18N's POD documentation (FIXME: link to POD) but are listed and described briefly below.

guess_encoding($text)

Attempt to determine the character encoding of the given text.

encode_text($text, $from, $to)

Transcode the given text from one encoding to another.

substr_text($text, $offset, $length)

Return the substring of the given text.

wrap_text($text, $columns, $tab_init, $tab_width)

Return the wrapped version of the given text.

length_text($text)

Return the length of the given text.

first_n($text, $n)

Return the first n characters of the given text.

first_n_text($text, $n)

Return the first n characters of the given text.

break_up_text($text, $max_length)

Return the text up to the given max_length.

convert_high_ascii($text)

Convert the given text from "high ASCII" encoding.

decode_utf8($text)

Decode UTF-8 in the given text.

utf8_off($text)

Turn off UTF-8 encoding in the given text.

languages_list($app, $current)

A convenience method used for retrieving data about the application's supported languages that can be easily rendered by the Melody templating engine as a dropdown list for choosing a language (as seen on the user profile, system settings page, etc).

const($id)

Return the value of the given id method from the MT::I18N package for the current language.

decode($enc, $text)

Decode the given text from the charset specified in enc to a UTF-8 string.

encode($enc, $text)

Encode the given text that is a UTF-8 string to the charset specified in enc.

lowercase($text, $enc)

Convert text to lowercase if the current language and enc have such a concept.

uppercase($text, $enc)

Convert text to uppercase if the current language and enc have such a concept.
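The decode, encode and encode_text entries above describe a two-step idea: decode bytes from a source charset into characters, then encode those characters into a target charset. Here is a sketch of that idea using Perl's core Encode module. Whether MT::I18N delegates to Encode is an assumption, and the transcode helper is a hypothetical name.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical transcoding helper: decode from one charset into
# characters, then encode those characters into another charset.
sub transcode {
    my ($text, $from, $to) = @_;
    return Encode::encode($to, Encode::decode($from, $text));
}

# "cafe" with an e-acute, as ISO-8859-1 bytes (the accent is one byte).
my $latin1 = "caf\xE9";

# The same word as UTF-8 bytes (the accent becomes two bytes).
my $utf8 = transcode($latin1, 'ISO-8859-1', 'UTF-8');

# Print both as hex so the result does not depend on the terminal.
printf "%s -> %s\n", unpack('H*', $latin1), unpack('H*', $utf8);
```

Going through characters as the intermediate form means any pair of supported charsets can be converted with the same two calls.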

Localization in Melody

Localization in Melody is driven by the MT::L10N class, a Melody-specific subclass of Locale::Maketext, a CPAN module that is bundled with Perl. This class (and MT::L10N by extension) creates a framework by which localizable strings can be easily:

  • Abstracted from the code which outputs them
  • Translated into multiple languages
  • Applied dynamically to output when needed

It is the class that does all of the heavy lifting for the I18N/L10N methods in the MT base class (e.g. set_language, translate, translate_templatized, supported_languages, etc) and is the provider of the "language handler" through which all of this is performed.

Given all of that, it might not be a terrible idea to put the Locale::Maketext POD on your short reading list.

Core Code Localization

Because of the internationalization work done in the core of Melody and Locale::Maketext's feature set, localizing the application is fairly simple. There are only two things that you need to do:

  • Wrap all localizable strings with one of the three translate method calls
  • Maintain the list of localizable strings in the U.S. English localization file which others will translate

Localization of the core code is abstracted by MT::L10N and implemented by its language-specific subclasses which, by convention, take the form MT::L10N::LANG where LANG is the language code (e.g. en_us, de, fr, ja, etc).

Each of these classes has, as its key characteristic, a package variable %Lexicon which serves as a lookup table mapping each localizable string in the application to its translation for that language. For example:
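A minimal sketch of such lexicon classes, written against Locale::Maketext directly, might look like the following. The Demo::L10N class names, the inheritance of the German class from the U.S. English one, and the sample strings are all assumptions made for illustration and are not copied from Melody's source.

```perl
use strict;
use warnings;

{
    # Hypothetical base class; in Melody this role is played by MT::L10N.
    package Demo::L10N;
    use parent 'Locale::Maketext';
}
{
    # Source-language lexicon. With _AUTO set, any phrase that has no
    # explicit entry is used as its own translation.
    package Demo::L10N::en_us;
    use parent -norequire, 'Demo::L10N';
    our %Lexicon = ( _AUTO => 1 );
}
{
    # German lexicon. It inherits from en_us, so phrases that have not
    # been translated yet fall back to the English source string.
    package Demo::L10N::de;
    use parent -norequire, 'Demo::L10N::en_us';
    our %Lexicon = (
        'Sign out'    => 'Abmelden',
        'Hello, [_1]' => 'Hallo, [_1]',
    );
}

my $de = Demo::L10N->get_handle('de') or die "No handle for 'de'";

print $de->maketext('Sign out'), "\n";               # translated entry
print $de->maketext('Hello, [_1]', 'Anna'), "\n";    # entry with a parameter
print $de->maketext('Settings saved for [_1]', 'Anna'), "\n";   # fallback
```

The fallback is what makes partial translations usable: a lexicon with gaps still produces readable output, just not fully localized.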

Adding strings to the above wrapping


To be continued - Jay


Plugin Code Localization

Miscellaneous Notes

 


Questions, comments, can't find something? Let us know at our community outpost on Get Satisfaction.

Credits

  • Author: Jay Allen
  • Edited by: Violet Bliss Dietz