3.9.1 Setup MMIR for Internationalization
Only a few steps are necessary to add a new dictionary to your application and set up MMIR to use a new language. These steps are described below.
Language-specific resources are located in the www/config/ directory of your application (e.g. see the config-dir in the StarterKit example).
- /config/configuration.json (see sect. Configuration)
- /config/languages/<LANGUAGE CODE>/dictionary.json (see sect. Add a new Dictionary)
- /config/languages/<LANGUAGE CODE>/grammar.json (see sect. Add a new Grammar)
To configure your application for a specific language, open config/configuration.json and edit the value of the property language: the value should match one of the language sub-directories (by convention, each language sub-directory is named after its language code, e.g. en for English).
For example, if you want to use English as the language of your application, use en; for German, use de, etc.
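The rule above can be sketched as a small check. This is only an illustration under stated assumptions: the helper isConfiguredLanguageAvailable is hypothetical (it is not part of MMIR), and the configuration object and the list of directory names are example values that would, in a real application, come from config/configuration.json and the language sub-directories.

```javascript
// Hypothetical helper (not part of MMIR): checks that the configured
// language matches one of the language sub-directory names.
function isConfiguredLanguageAvailable(configuration, languageDirs) {
  return languageDirs.indexOf(configuration.language) !== -1;
}

// Assumed example values, mirroring the StarterKit's en/de/ja directories
var configuration = { language: 'en' };
var languageDirs = ['en', 'de', 'ja'];

// true: there is a sub-directory named "en"
var ok = isConfiguredLanguageAvailable(configuration, languageDirs);
// false: no sub-directory named "fr" exists (yet)
var missing = isConfiguredLanguageAvailable({ language: 'fr' }, languageDirs);
```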
The first step of supporting a new language in your application is to add a new dictionary to it.
MMIR searches for dictionaries in config/languages/...
and automatically loads all found dictionaries.
As an example, see the StarterKit, which provides three example dictionaries for English, German, and Japanese language support (see its config/languages directory: en/dictionary.json, de/dictionary.json, ja/dictionary.json).
Dictionaries are well-formed JSON files. The example below shows the English dictionary of the StarterKit:
{
  "mmig": "MMIG",
  "login_header": "Please login",
  "password_place_holder": "password",
  "user_name_place_holder": "user name",
  "login_label": "Login",
  "registration_text": "or register yourself",
  "registration_label": "Sign Up",
  "mainPanelAudioConfirmation": "Audio Confirmation",
  "buttonOk": "Ok",
  "buttonCancel": "Cancel",
  "buttonBack": "back",
  "ratingStar": "star",
  "ratingStars": "stars",
  "dialogCapture": "Capture",
  "dialogPlay": "Play",
  "welcome_header": "MMIG",
  "welcome_date": "some Date",
  "welcome_text": "Welcome to MMIG"
}
If you want to add support for French in your application, create a new dictionary file in the appropriate folder, i.e. config/languages/fr/dictionary.json.
The new dictionary will be available by its folder name, in this case fr.
You can use this name e.g. in the config/configuration.json file, or with the LanguageManager.
For this example, first copy the contents of the English dictionary into the newly created file (or just copy the English dictionary file into the new sub-folder) and replace the English values with French translations:
{
  "mmig": "MMIG",
  "login_header": "S'il vous plaît connectez-vous",
  "password_place_holder": "mot de passe"
}
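When translating a copied dictionary, it is easy to drop a key. The following sketch lists the keys that exist in a reference dictionary but not in a translated one. The helper findMissingKeys is hypothetical (it is not part of MMIR), and the two objects are shortened excerpts of the dictionaries shown above.

```javascript
// Hypothetical helper (not part of MMIR): lists the keys of a reference
// dictionary that have no entry in a translated dictionary.
function findMissingKeys(referenceDict, translatedDict) {
  return Object.keys(referenceDict).filter(function (key) {
    return !Object.prototype.hasOwnProperty.call(translatedDict, key);
  });
}

// Shortened excerpts of the English and French example dictionaries
var en = {
  "mmig": "MMIG",
  "login_header": "Please login",
  "password_place_holder": "password",
  "login_label": "Login"
};
var fr = {
  "mmig": "MMIG",
  "login_header": "S'il vous plaît connectez-vous",
  "password_place_holder": "mot de passe"
};

var missingKeys = findMissingKeys(en, fr);
// missingKeys -> ["login_label"]
```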
Translations are looked up by their keys, as defined in the dictionary.json.
You can reference translations in eHTML templates using the @localize statement:
<label for="login">
@localize("login_label")
</label>
On rendering, the statement is replaced by the value for the currently set language.
With en set as the language and the example dictionary from above, the result would be:
<label for="login">
Login
</label>
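Conceptually, the replacement can be pictured as a key lookup per statement. The sketch below is an illustration only: renderLocalized is a hypothetical function operating on a plain string, and it makes no claim about how MMIR's template engine works internally.

```javascript
// Illustration only: a simplified stand-in for what rendering does with
// @localize statements (hypothetical function, not MMIR's template engine).
function renderLocalized(template, dictionary) {
  return template.replace(/@localize\("([^"]+)"\)/g, function (match, key) {
    // keep the statement unchanged if the dictionary has no such key
    return Object.prototype.hasOwnProperty.call(dictionary, key) ? dictionary[key] : match;
  });
}

var dictionary = { "login_label": "Login" };
var template = '<label for="login">@localize("login_label")</label>';
var html = renderLocalized(template, dictionary);
// html -> '<label for="login">Login</label>'
```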
In JavaScript code, you can use translations via the LanguageManager. For example, in the StarterKit, getting the translation for login_label would look like the following:
//retrieve the translation for login_label (for current language):
var translation = mmir.lang.getText('login_label');
// translation -> "Login" ...
If you want to use speech interactions in your application, you should also provide grammars for all languages. Grammars are used to process the ASR (automatic speech recognition) results from a speech recognizer: a grammar "translates" natural language input into "programmatic instructions", e.g. for the input phrase please find movie XY, the result of executing the grammar could be something like:
{
  "search": "XY",
  "displayResult": true
}
This is only an example; the concrete result is defined within the grammar and generally depends on the application, i.e. you will have to encode the mapping phrase → result by specifying the grammar.
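To make the idea of a phrase → result mapping concrete, the following sketch hand-codes the mapping for the example phrase. It is an illustration only: interpretPhrase is a hypothetical function, and in MMIR the mapping is specified in the grammar rather than written by hand like this.

```javascript
// Illustration only: a hand-written stand-in for a grammar, mapping one
// kind of phrase to a result object (hypothetical function, not MMIR API).
function interpretPhrase(phrase) {
  var match = /^(?:please )?find movie (.+)$/i.exec(phrase.trim());
  if (!match) {
    return null; // phrase is not covered by this toy "grammar"
  }
  return { search: match[1], displayResult: true };
}

var result = interpretPhrase('please find movie XY');
// result -> { search: "XY", displayResult: true }
var noResult = interpretPhrase('what time is it');
// noResult -> null
```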
After creating a new language directory (in config/languages/) or after creating a new file in a language directory, you need to re-generate the file list directories.json: use the ANT task generateFileListJSONFile for automatically generating the directories.json file.
Grammar definitions in MMIR are similar to context-free grammars: input sentences are matched against grammar rules (in MMIR: utterances/phrases). The grammar rules (MMIR: phrases) may refer to other rules and/or to token definitions. The following example shows a pseudo grammar for illustration:
TOKEN1: "some","few"
TOKEN2: "thing","things","object","objects"
TOKEN3: "else"
...
RULE1: TOKEN1 TOKEN3 TOKEN1
RULE2: RULE1 TOKEN2
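How rules refer to other rules and to tokens can be sketched with a toy matcher for the pseudo grammar above. This is an illustration only, not MMIR's grammar engine: the functions expandRule and matchesRule are hypothetical, and the approach only works here because each rule has a single alternative and no recursion.

```javascript
// Illustration only: a toy matcher for the pseudo grammar above
// (hypothetical helpers, not MMIR's grammar engine).
var tokens = {
  TOKEN1: ['some', 'few'],
  TOKEN2: ['thing', 'things', 'object', 'objects'],
  TOKEN3: ['else']
};
var rules = {
  RULE1: ['TOKEN1', 'TOKEN3', 'TOKEN1'],
  RULE2: ['RULE1', 'TOKEN2']
};

// Resolve rule references recursively until only token names remain
function expandRule(name) {
  if (tokens[name]) {
    return [name];
  }
  return rules[name].reduce(function (sequence, part) {
    return sequence.concat(expandRule(part));
  }, []);
}

// A sentence matches if each word is a value of the token at its position
function matchesRule(ruleName, sentence) {
  var words = sentence.toLowerCase().split(/\s+/);
  var sequence = expandRule(ruleName);
  return words.length === sequence.length && words.every(function (word, i) {
    return tokens[sequence[i]].indexOf(word) !== -1;
  });
}

var expanded = expandRule('RULE2');
// expanded -> ["TOKEN1", "TOKEN3", "TOKEN1", "TOKEN2"]
var matched = matchesRule('RULE2', 'some else few things');
// matched -> true
```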
MMIR searches for grammars in config/languages/[language] and automatically loads all found grammars.
The StarterKit example provides small example grammars for its supported languages (config/languages/[de|en|ja]/grammar.json) as well as a larger German example grammar (grammar.json_large-example: you would need to rename it to grammar.json in order to enable it when running the StarterKit).
Grammars are defined in the form of well-formed JSON files. The following example is an excerpt from the larger German example grammar.json_large-example in the StarterKit and illustrates a possible grammar definition for "translating" natural language phrases like "bitte abspielen" (please play), "spiele ab" (play) etc. into an event object Play (i.e. the result that would be returned when the grammar matches the phrase: {semantic: {Play: {}}}). This event object can then be used for further processing:
{
  "stop_word": [
    "bitte",
    "doch",
    "der",
    "möchte",
    ...
  ],
  "tokens": {
    "PREPOSITION": [
      "an",
      "um",
      "am",
      "ab",
      ...
    ],
    "V_PLAY_IMP": [
      "spiel",
      "spiele",
      ...
    ],
    "V_PLAY_INF": [
      "spielen",
      "abspielen",
      "hören",
      ...
    ],
    ...
  },
  "utterances": {
    "PLAY": {
      "phrases": [
        "V_PLAY_INF",
        "V_PLAY_IMP PREPOSITION"
      ],
      "semantic": {
        "Play": {}
      }
    },
    ...
  }
}
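How the three parts of such a grammar definition work together can be sketched with a toy interpreter. This is an illustration only and rests on assumptions: the functions tokenNameOf and interpret are hypothetical, the elided ("...") entries of the excerpt are simply left out, and MMIR itself compiles the grammar into an executable grammar instead of interpreting the JSON like this.

```javascript
// Illustration only: a toy interpreter for the grammar structure above
// (hypothetical helpers; the elided "..." entries are left out).
var grammar = {
  stop_word: ['bitte', 'doch', 'der', 'möchte'],
  tokens: {
    PREPOSITION: ['an', 'um', 'am', 'ab'],
    V_PLAY_IMP: ['spiel', 'spiele'],
    V_PLAY_INF: ['spielen', 'abspielen', 'hören']
  },
  utterances: {
    PLAY: {
      phrases: ['V_PLAY_INF', 'V_PLAY_IMP PREPOSITION'],
      semantic: { Play: {} }
    }
  }
};

// Map a word to the name of the first token that lists it
function tokenNameOf(word) {
  var names = Object.keys(grammar.tokens);
  for (var i = 0; i < names.length; i++) {
    if (grammar.tokens[names[i]].indexOf(word) !== -1) {
      return names[i];
    }
  }
  return word; // unknown words are kept as-is and will not match a phrase
}

function interpret(text) {
  // 1. lower-case and split the input, 2. drop stop words,
  // 3. replace the remaining words by their token names
  var tokenSequence = text.toLowerCase().split(/\s+/)
    .filter(function (word) { return grammar.stop_word.indexOf(word) === -1; })
    .map(tokenNameOf)
    .join(' ');
  // 4. return the semantic of the first utterance with a matching phrase
  var ids = Object.keys(grammar.utterances);
  for (var i = 0; i < ids.length; i++) {
    var utterance = grammar.utterances[ids[i]];
    if (utterance.phrases.indexOf(tokenSequence) !== -1) {
      return { semantic: utterance.semantic };
    }
  }
  return null;
}

var playInf = interpret('bitte abspielen');
// playInf -> { semantic: { Play: {} } }
var playImp = interpret('spiele ab');
// playImp -> { semantic: { Play: {} } }
```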
TBD: descriptions for the following details
- grammar definition details:
  - stopwords, tokens, utterances, semantic results
  - lower-case "restriction": token values should all be lower case
  - umlaut encoding (e.g. ä →)
- grammar loading/selection mechanism
- grammar compilation (compile grammar.json with build.xml)
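Until the details are described, the lower-case "restriction" from the list above can at least be checked mechanically. The following is a minimal sketch under assumptions: findNonLowerCaseTokens is a hypothetical helper (not part of MMIR), and the grammar object is a made-up example with one deliberately wrong value.

```javascript
// Hypothetical helper (not part of MMIR): collects all token values of a
// grammar definition that are not completely lower case.
function findNonLowerCaseTokens(grammar) {
  var offending = [];
  Object.keys(grammar.tokens).forEach(function (name) {
    grammar.tokens[name].forEach(function (value) {
      if (value !== value.toLowerCase()) {
        offending.push(name + ': ' + value);
      }
    });
  });
  return offending;
}

// Made-up example with one upper-case value ("Spiele")
var offending = findNonLowerCaseTokens({
  tokens: {
    V_PLAY_IMP: ['spiel', 'Spiele'],
    V_PLAY_INF: ['spielen', 'hören']
  }
});
// offending -> ["V_PLAY_IMP: Spiele"]
```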
In the current version of MMIR we use the JS/CC parser compiler (alternatively, jison and PEG.js are available by configuring /www/config/configuration.json). The grammar-editor branch of the starter-kit repository contains an HTML file for editing/testing the grammar in www/testSemanticInterpreter.html (see also the online demo for the grammar editor). When running cordova prepare or cordova build, the grammars are automatically compiled if their JSON definition has changed. Alternatively, you can use the Ant build file /mmir-build.xml, which provides tasks for compiling an executable grammar from the JSON grammar file.