source: add 2 lines expressing the language script code and language code #31

Open
michal-fre opened this issue Sep 25, 2015 · 4 comments

Comments

@michal-fre
Contributor

Parsing the language columns will be easier if we add two lines to the source: one with the script code and one with the language code of each column.

We need to discuss which codes to use, as there are several systems.
ISO 639-3 (3 letters) and ISO 15924 (4 letters) are good candidates because they also cover dialects and language variants.

The codes used in a translation can be supplied by the translators or looked up manually.

ToDo: how to handle phonetics. For now I propose using
Latn as the script code
the ISO 639 code + _phon as the language code: fas_phon, prs_phon

Example using ISO 15924 (script code) and ISO 639 (language code):
German | English | Arabic (farsi) | Dari | Dari-phonetics
Latn | Latn | Arab | Arab | Latn
ger | eng | fas | prs | prs_phon
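A minimal sketch of how a parser could read the three header lines from the example above. The names `Column` and `parse_header` are hypothetical, and the pipe separator is assumed from the example; the actual source file may use a different delimiter.

```python
from dataclasses import dataclass


@dataclass
class Column:
    title: str     # human-readable column title
    script: str    # ISO 15924 script code, e.g. "Latn"
    language: str  # language code, e.g. "prs" or "prs_phon"


def parse_header(title_row: str, script_row: str, lang_row: str,
                 sep: str = "|") -> list[Column]:
    """Combine the title, script-code and language-code rows into columns."""
    rows = [[cell.strip() for cell in row.split(sep)]
            for row in (title_row, script_row, lang_row)]
    # All three rows must describe the same number of columns.
    if len({len(row) for row in rows}) != 1:
        raise ValueError("header rows have different column counts")
    return [Column(t, s, l) for t, s, l in zip(*rows)]


columns = parse_header(
    "German | English | Arabic (farsi) | Dari | Dari-phonetics",
    "Latn | Latn | Arab | Arab | Latn",
    "ger | eng | fas | prs | prs_phon",
)
```

With the rows parsed this way, each column carries its own script and language code, so later steps no longer need to guess from the free-text title.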

ref:
http://www-01.sil.org/iso639-3/scope.asp#M
http://www.unicode.org/iso15924/iso15924-num.html
http://www-01.sil.org/iso639-3/macrolanguages.asp

@michal-fre
Contributor Author

We can use this for hyphenation, font switching, typographic fine-tuning, localized formatting (e.g. dates and numbers), and of course writing direction.
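As an illustration of the writing-direction use case, here is a small sketch that derives the direction from the script code. `RTL_SCRIPTS` and `text_direction` are hypothetical names, and the set lists only a few right-to-left ISO 15924 codes, so it is not exhaustive.

```python
# A few ISO 15924 codes of right-to-left scripts (assumption: not a complete list).
RTL_SCRIPTS = {"Arab", "Hebr", "Syrc", "Thaa"}


def text_direction(script_code: str) -> str:
    """Return "rtl" for known right-to-left scripts, otherwise "ltr"."""
    return "rtl" if script_code in RTL_SCRIPTS else "ltr"
```

A phonetic column written in Latn would come out as left-to-right even when the language itself is normally written right-to-left, which is one reason to store the script code separately from the language code.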

@michal-fre
Contributor Author

This is a quick list of the languages used. Duplicates come from differing naming conventions or added comments. Parsing the codes mentioned above will make it possible to replace titles with variables in the future.

German
--THIS IS BULGARIAN!!--Kurdish (Sorani)
Albanian
Amharic
Amharic Phonetic
Arabic
Arabic (Syrian)
Arabic / Syrian Phonetic
Arabic(Fusha)
Armenian
Armenian 80% ready /proofreed needed Please mark in green
Armenian phonetic
Bangla
Bangla / বাংলা
Bangla Phonetic
Bosnian / Croatian / Serbian
Bosnian/Croatian/Serbian
Bulgarian
Croatian/Bosnian
Czech
Czech / Slovak
Dari
Dari Phonetic
Dutch
English
Farsi
Farsi Phonetic
Farsi/Dari
Filipino
Finnish
French
German
Greek alphabet
Greek phonetic
Hindi
Hungarian
Hungarian
Icelandic
Icons
Italian
Korean
Kurdish (Kurmancî)
Kurdish (Kurmanji)
Kurdish (Sorani)
Kurdish / (Sorani)
Lithuanian
Macedonian
Macedonian
Macedonian (preliminary! / has to be proofread!)
Macedonian phonetical (PLEASE ADD !)
Mandinka
Norwegian / Danish
Numbers for short section
Pashto
Pashto Phonetic
Polish
Polish
Polish 1
Portuguese
Romanian
Russian
Serbian
Slovak
Slovak / Czech
Slovenian
Somali
Spanish
Swedish
Swedish / Norwegian / Danish
Syrian phonetic/Fusha phonetics
Syrian/Arabic alphabet
Syrian/Arabic phonetic
Tigrinya
Turkish
Twi
Urdu
Urdu Phonetic
Vietnamese
Vietnamese
Woloff
amharic
mandarin / (chinese)
አማርኛ Amharic
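Until codes are in place, purely formal duplicates in a list like this can be detected by normalizing the titles. The following is a rough sketch with hypothetical helper names (`normalize_title`, `group_titles`); it only evens out case and spacing around slashes and parentheses.

```python
import re
from collections import defaultdict


def normalize_title(title: str) -> str:
    """Normalize case and spacing so formally identical titles compare equal."""
    t = title.casefold()
    t = re.sub(r"\s*/\s*", "/", t)     # "a / b"  -> "a/b"
    t = re.sub(r"\s*\(\s*", " (", t)   # "a(b)"   -> "a (b)"
    t = re.sub(r"\s+", " ", t)         # collapse runs of whitespace
    return t.strip()


def group_titles(titles: list[str]) -> dict[str, list[str]]:
    """Group the original titles by their normalized form."""
    groups: dict[str, list[str]] = defaultdict(list)
    for title in titles:
        groups[normalize_title(title)].append(title)
    return dict(groups)
```

This does not merge spelling variants such as Kurmancî and Kurmanji, or titles with added comments; those still need explicit codes or manual cleanup.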

@michal-fre
Contributor Author

Partially implemented in 05_get_the_columns.sh; see #50.

OUTPUT:
deu ara ara_PHONETIC ara-ara_PHONETIC eng fra slv nld tir tir_PHONETIC tir-tir_PHONETIC amh amh_PHONETIC som som urd urd-PHONETIC urd-urd_PHONETIC ben ben_PHONETIC ben-ben_PHONETIC fas fas_PHONETIC prs prs_PHONETIC pus pus_PHONETIC sqi hbs pol mkd rus slk-ces sdh bul kmh swe-nor-dan fin isl ita tur spa hun por ell_PHONETIC ell ron hye hye_PHONETIC lit fil vie a a
1 deu 2 3 ara 4 ara_PHONETIC 5 ara-ara_PHONETIC 6 eng 7 fra 8 slv 9 nld 10 tir 11 tir_PHONETIC 12 tir-tir_PHONETIC 13 amh 14 amh_PHONETIC 15 som 16 som 17 urd 18 urd-PHONETIC 19 urd-urd_PHONETIC 20 ben 21 ben_PHONETIC 22 ben-ben_PHONETIC 23 fas 24 fas_PHONETIC 25 prs 26 prs_PHONETIC 27 pus 28 pus_PHONETIC 29 sqi 30 hbs 31 pol 32 mkd 33 rus 34 slk-ces 35 sdh 36 bul 37 kmh 38 swe-nor-dan 39 fin 40 isl 41 ita 42 tur 43 spa 44 hun 45 por 46 ell_PHONETIC 47 ell 48 ron 49 hye 50 hye_PHONETIC 51 lit 52 fil 53 vie

@michal-fre
Contributor Author

single language: 3-letter code, e.g. deu
multiple languages: codes joined with a dash as delimiter, e.g. swe-nor-dan
phonetic: 3-letter language code followed by an underscore and PHONETIC, e.g. ara_PHONETIC
unknown language: leave empty
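The convention above could be checked with a small parser like the sketch below. `Part` and `parse_column_code` are hypothetical names, not taken from 05_get_the_columns.sh, and the sketch assumes that every dash-separated part is either a plain 3-letter code or a 3-letter code with the _PHONETIC suffix.

```python
import re
from typing import NamedTuple, Optional


class Part(NamedTuple):
    language: str   # 3-letter language code
    phonetic: bool  # True if the part carries the _PHONETIC suffix


_PART = re.compile(r"([a-z]{3})(_PHONETIC)?")


def parse_column_code(code: str) -> Optional[list[Part]]:
    """Parse a column code; return None for an empty (unknown) language."""
    code = code.strip()
    if not code:
        return None
    parts = []
    for chunk in code.split("-"):
        match = _PART.fullmatch(chunk)
        if match is None:
            raise ValueError(f"invalid column code part: {chunk!r}")
        parts.append(Part(match.group(1), match.group(2) is not None))
    return parts
```

Under this reading, swe-nor-dan yields three plain parts and ara-ara_PHONETIC yields a plain part plus a phonetic one, while urd-PHONETIC from the output above is rejected because it uses a dash where the convention expects an underscore.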
