
3.9.1 Setup MMIR for Internationalization

russa edited this page Jan 15, 2019 · 3 revisions

Set Up MMIR Applications for Internationalization

Only a few steps are necessary to add a new dictionary to your application and set up MMIR to use a new language. These steps are described below.

Language-specific resources are located in the www/config/ directory of your application (see, for example, the config directory in the StarterKit example).

Configuration

To configure your application to use a specific language, open config/configuration.json and edit the value of the property language. The value should match the name of one of the language sub-directories; by convention, each sub-directory is named after its language code. For example, use en for English, de for German, etc.
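
The naming convention can be illustrated with a small sketch that checks a configured language code against the names of the language sub-directories. The helper resolveLanguage and its fallback behaviour are assumptions made for this illustration and are not part of the MMIR API; only the language property is taken from the description above.

```javascript
// Hypothetical helper (not part of MMIR): pick the language code to use,
// given the parsed configuration.json and the names of the language sub-directories.
function resolveLanguage(config, availableLanguages, fallback) {
  var code = config && config.language;
  if (typeof code === 'string' && availableLanguages.indexOf(code) !== -1) {
    return code;
  }
  // the configured code has no matching sub-directory: use the fallback instead
  return fallback;
}

// example: the StarterKit provides language directories for en, de and ja
var available = ['en', 'de', 'ja'];
console.log(resolveLanguage({ language: 'de' }, available, 'en')); // "de"
console.log(resolveLanguage({ language: 'fr' }, available, 'en')); // "en"
```

A check like this can make a mistyped language code visible early, before any dictionary lookup takes place.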

Add a new Dictionary

The first step of supporting a new language in your application is to add a new dictionary to it. MMIR searches for dictionaries in config/languages/... and automatically loads all found dictionaries.

As an example, see the StarterKit which provides three example dictionaries for English, German, and Japanese language support (see its config/languages directory: en/dictionary.json, de/dictionary.json, ja/dictionary.json).

Dictionaries are well-formed JSON files. The example below shows the English dictionary of the StarterKit:

{
	"mmig": "MMIG",
	"login_header": "Please login",
	"password_place_holder": "password",
	"user_name_place_holder": "user name",
	"login_label": "Login",
	"registration_text": "or register yourself",
	"registration_label": "Sign Up",
	"mainPanelAudioConfirmation": "Audio Confirmation",
	"buttonOk": "Ok",
	"buttonCancel": "Cancel",
	"buttonBack": "back",
	"ratingStar": "star",
	"ratingStars": "stars",
	"dialogCapture": "Capture",
	"dialogPlay": "Play",
	"welcome_header": "MMIG",
	"welcome_date": "some Date",
	"welcome_text":"Welcome to MMIG"
}

To add support for French to your application, create a new dictionary file in the appropriate folder, i.e. config/languages/fr/dictionary.json. The new dictionary is then available under its folder name, in this case fr. You can use this language code e.g. in the config/configuration.json file or with the LanguageManager.

For this example, first copy the contents of the English dictionary into the newly created file (or just copy the English dictionary file into the new sub-folder) and replace the English values with French translations:

{
  "mmig": "MMIG",
  "login_header": "S'il vous plaît connectez-vous",
  "password_place_holder": "mot de passe"
}
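
When translating a copied dictionary, it is easy to miss entries. The following sketch lists the keys of a reference dictionary that are absent from a translated one. The function findMissingKeys is a hypothetical helper written for this illustration and is not provided by MMIR; the dictionaries are excerpts of the examples above.

```javascript
// Hypothetical helper (not part of MMIR): list the keys of the reference
// dictionary that have no entry in the translated dictionary.
function findMissingKeys(reference, translated) {
  return Object.keys(reference).filter(function (key) {
    return !Object.prototype.hasOwnProperty.call(translated, key);
  });
}

// excerpt of the English dictionary and the French example from above
var en = {
  "mmig": "MMIG",
  "login_header": "Please login",
  "password_place_holder": "password",
  "user_name_place_holder": "user name",
  "login_label": "Login"
};
var fr = {
  "mmig": "MMIG",
  "login_header": "S'il vous plaît connectez-vous",
  "password_place_holder": "mot de passe"
};

console.log(findMissingKeys(en, fr));
// -> [ 'user_name_place_holder', 'login_label' ]
```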

Using Translations

Translations are looked up by their keys, as defined in the dictionary.json.

You can reference translations in eHTML templates using the @localize statement:

  <label for="login">
    @localize("login_label")
  </label>

When the template is rendered, the statement is replaced by the value for the currently set language. With en set as the language and the example dictionary from above, the result would be:

  <label for="login">
     Login
  </label>
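
The effect of the statement can be mimicked with a few lines of JavaScript. This is an illustrative sketch only: the function renderLocalized is hypothetical, it handles nothing but the @localize("key") form shown above, and it makes no claim about how the MMIR template engine works internally.

```javascript
// Illustrative sketch only: replace @localize("key") statements in a template
// string with the matching entry of a dictionary object.
function renderLocalized(template, dictionary) {
  return template.replace(/@localize\(\s*"([^"]+)"\s*\)/g, function (match, key) {
    // leave the statement untouched if the dictionary has no entry for the key
    return Object.prototype.hasOwnProperty.call(dictionary, key) ? dictionary[key] : match;
  });
}

var template = '<label for="login">@localize("login_label")</label>';
console.log(renderLocalized(template, { "login_label": "Login" }));
// -> <label for="login">Login</label>
```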

In JavaScript code, you can use translations via the LanguageManager. For example, in the StarterKit getting the translation for login_label would look like the following:

  //retrieve the translation for login_label (for current language):
  var translation = mmir.lang.getText('login_label');
  // translation -> "Login" ...

Add a new Grammar

If you want to use speech interactions in your application, you should also provide grammars for all languages. Grammars are used to process the ASR result from a speech recognizer: a grammar "translates" natural language input into "programmatic instructions". For example, for the input phrase please find movie XY, the result of executing the grammar could be something like

  {
    "search": "XY",
    "displayResult": true
  }

This is only an example. The concrete result is defined within the grammar and generally depends on the application, i.e. you will have to encode the mapping phrase → result by specifying the grammar.


After creating a new language directory (in config/languages/) or a new file in a language directory, you need to re-generate the file list directories.json. Use the Ant task generateFileListJSONFile to generate the directories.json file automatically.

Grammar definitions in MMIR are similar to context-free grammars: input sentences are matched against grammar rules (in MMIR: utterances/phrases). The grammar rules may refer to other rules and/or to token definitions. The following example shows a pseudo grammar for illustration:

TOKEN1:	"some","few"
TOKEN2:	"thing","things","object","objects"
TOKEN3:	"else"
...
RULE1:  TOKEN1 TOKEN3 TOKEN1
RULE2:  RULE1 TOKEN2
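
To make the relation between rules and tokens concrete, the following sketch encodes the pseudo grammar as plain JavaScript objects and checks sentences against it. It is an illustration under simplifying assumptions (each rule has a single alternative and rules do not refer to themselves), not the way MMIR parses input; the names expand and matches are made up for this example.

```javascript
// Illustrative sketch only (not MMIR's parser): the pseudo grammar as plain objects
var tokens = {
  TOKEN1: ['some', 'few'],
  TOKEN2: ['thing', 'things', 'object', 'objects'],
  TOKEN3: ['else']
};
var rules = {
  RULE1: ['TOKEN1', 'TOKEN3', 'TOKEN1'],
  RULE2: ['RULE1', 'TOKEN2']
};

// replace references to other rules until only token names remain
// (assumes that rules are not recursive)
function expand(ruleName) {
  var result = [];
  rules[ruleName].forEach(function (symbol) {
    if (rules[symbol]) {
      result = result.concat(expand(symbol));
    } else {
      result.push(symbol);
    }
  });
  return result;
}

// a sentence matches if each word belongs to the token at the same position
function matches(ruleName, sentence) {
  var words = sentence.toLowerCase().split(/\s+/).filter(Boolean);
  var sequence = expand(ruleName);
  return words.length === sequence.length && words.every(function (word, i) {
    return tokens[sequence[i]].indexOf(word) !== -1;
  });
}

console.log(expand('RULE2'));                          // [ 'TOKEN1', 'TOKEN3', 'TOKEN1', 'TOKEN2' ]
console.log(matches('RULE2', 'some else few things')); // true
console.log(matches('RULE1', 'some things'));          // false
```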

MMIR searches for grammars in config/languages/[language] and automatically loads all found grammars.

The StarterKit example provides small example grammars for its supported languages (config/languages/[de|en|ja]/grammar.json) as well as a larger German example grammar (grammar.json_large-example: you would need to rename it to grammar.json in order to enable it when running the StarterKit).

Grammars are defined as well-formed JSON files. The following example is an excerpt from the larger German example grammar.json_large-example in the StarterKit. It illustrates a possible grammar definition for "translating" natural language phrases like "bitte abspielen" (please play), "spiele ab" (play) etc. into an event object Play, i.e. the result that would be returned when the grammar matches the phrase: {semantic: {Play: {}}}. This event object can then be used for further processing:

{
   "stop_word": [
        "bitte",
        "doch",
        "der",
        "möchte",
        ...
    ],
    "tokens": {
        "PREPOSITION": [
            "an",
            "um",
            "am",
            "ab",
            ...
        ],
        "V_PLAY_IMP": [
            "spiel",
            "spiele",
            ...
        ],
        "V_PLAY_INF": [
            "spielen",
            "abspielen",
            "hören",
            ...
        ],
        ...
    },
    "utterances": {
        "PLAY": {
            "phrases": [
                "V_PLAY_INF",
                "V_PLAY_IMP PREPOSITION"
            ],
            "semantic": {
                "Play": {}
            }
        },
        ...
    }
}
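
How stop words, tokens, phrases and the semantic result play together can be sketched with a naive matcher. This is an illustration only: MMIR compiles grammar.json with a parser generator, whereas the hypothetical function interpret below simply compares token sequences, and the grammar object contains only the entries visible in the excerpt (the elided parts are left out).

```javascript
// Illustrative sketch only: a naive matcher for the visible part of the excerpt
var grammar = {
  stop_word: ['bitte', 'doch', 'der', 'möchte'],
  tokens: {
    PREPOSITION: ['an', 'um', 'am', 'ab'],
    V_PLAY_IMP: ['spiel', 'spiele'],
    V_PLAY_INF: ['spielen', 'abspielen', 'hören']
  },
  utterances: {
    PLAY: {
      phrases: ['V_PLAY_INF', 'V_PLAY_IMP PREPOSITION'],
      semantic: { Play: {} }
    }
  }
};

// hypothetical helper: returns {semantic: ...} for the first matching utterance, or null
function interpret(grammar, input) {
  // 1. normalize to lower case and drop stop words
  var words = input.toLowerCase().split(/\s+/).filter(function (word) {
    return word && grammar.stop_word.indexOf(word) === -1;
  });
  // 2. replace each word by the name of the first token that lists it
  var tokenNames = words.map(function (word) {
    return Object.keys(grammar.tokens).filter(function (name) {
      return grammar.tokens[name].indexOf(word) !== -1;
    })[0];
  });
  var phrase = tokenNames.join(' ');
  // 3. compare the token sequence with the phrases of each utterance
  var match = Object.keys(grammar.utterances).filter(function (name) {
    return grammar.utterances[name].phrases.indexOf(phrase) !== -1;
  })[0];
  return match ? { semantic: grammar.utterances[match].semantic } : null;
}

console.log(JSON.stringify(interpret(grammar, 'bitte abspielen'))); // {"semantic":{"Play":{}}}
console.log(JSON.stringify(interpret(grammar, 'spiele ab')));       // {"semantic":{"Play":{}}}
console.log(interpret(grammar, 'ab spiele'));                       // null
```

The sketch lower-cases the input before matching, which is one reason why token values in the grammar should be written in lower case.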

TBD: descriptions for the following details

  • grammar definition details:
    • stopwords, tokens, utterances, semantic results
    • lower-case "restriction": token values should all be lower case
    • umlaut encoding (e.g. ä →)
  • grammar loading/selection mechanism
  • grammar compilation (compile grammar.json with build.xml)

In the current version of MMIR we use the JS/CC parser compiler (alternatively, jison and PEG.js are available by configuring /www/config/configuration.json). The grammar-editor branch of the starter-kit repository contains an HTML file for editing and testing grammars, www/testSemanticInterpreter.html (see also the online demo of the grammar editor). When running cordova prepare or cordova build, the grammars are compiled automatically if their JSON definition has changed. Alternatively, you can use the Ant build file /mmir-build.xml, which provides tasks for compiling an executable grammar from the JSON grammar file.



< previous: "Template Expressions" | next: "Speech Processing in MMIR" >
