The trans module

This module transliterates national characters into similar-sounding Latin characters. Currently the Czech, Greek, Latvian, Polish, Turkish, Russian, Ukrainian, Kazakh and Farsi alphabets are supported, which should cover most common needs.

Contents

Simple usage
Define user tables
ChangeLog
Finally

Simple usage

It's very easy to use

Python 3:

>>> from trans import trans
>>> trans('Привет, Мир!')
'Privet, Mir!'

Python 2:

>>> import trans
>>> u'Привет, Мир!'.encode('trans')
u'Privet, Mir!'
>>> trans.trans(u'Привет, Мир!')
u'Privet, Mir!'

Works only with Unicode strings

>>> 'Hello World!'.encode('trans')
Traceback (most recent call last):
    ...
TypeError: trans codec support only unicode string, <type 'str'> given.
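
Byte strings therefore have to be decoded before they reach the codec. A minimal sketch of one way to do that, assuming the input is UTF-8 encoded; `ensure_text` is a hypothetical helper and not part of trans:

```python
def ensure_text(value, encoding='utf-8'):
    """Return `value` as a unicode string, decoding byte strings first.

    Hypothetical helper, not part of trans. The UTF-8 default is an
    assumption; pass the real encoding of your data instead.
    """
    if isinstance(value, bytes):  # on Python 2, `bytes` is an alias of `str`
        return value.decode(encoding)
    return value


text = ensure_text(b'Hello World!')
```

The resulting unicode string can then be transliterated as shown in the examples above.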

The result is readable

>>> s = u'''\
...    -- Раскудрить твою через коромысло в бога душу мать
...             триста тысяч раз едрену вошь тебе в крыло
...             и кактус в глотку! -- взревел разъяренный Никодим.
...    -- Аминь, -- робко добавил из склепа папа Пий.
...                 (c) Г. Л. Олди, "Сказки дедушки вампира".'''
>>>
>>> print s.encode('trans')
   -- Raskudrit tvoyu cherez koromyslo v boga dushu mat
            trista tysyach raz edrenu vosh tebe v krylo
            i kaktus v glotku! -- vzrevel razyarennyy Nikodim.
   -- Amin, -- robko dobavil iz sklepa papa Piy.
                (c) G. L. Oldi, "Skazki dedushki vampira".

Table "slug"

Use the "slug" table to keep only Latin characters, digits and underscores:

>>> print u'1 2 3 4 5 \n6 7 8 9 0'.encode('trans')
1 2 3 4 5
6 7 8 9 0
>>> print u'1 2 3 4 5 \n6 7 8 9 0'.encode('trans/slug')
1_2_3_4_5__6_7_8_9_0
>>> s.encode('trans/slug')[-42:-1]
u'_c__G__L__Oldi___Skazki_dedushki_vampira_'
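
As the output above shows, every replaced character becomes its own underscore, so runs of underscores can appear. If a tidier slug is wanted, the result can be post-processed. A minimal sketch, where `tidy_slug` is a hypothetical helper that is not part of trans and operates on a string already produced by the "slug" table:

```python
import re


def tidy_slug(slug):
    """Collapse runs of underscores and trim them from both ends.

    Hypothetical post-processing step, not part of trans.
    """
    return re.sub(r'_+', '_', slug).strip('_')


# Input copied from the 'trans/slug' example output above.
tidy = tidy_slug(u'_c__G__L__Oldi___Skazki_dedushki_vampira_')
```

Whether collapsing is desirable depends on the use case: it loses the one-to-one correspondence between input and output positions.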

Table "id"

The "id" table is deprecated and has been renamed to "slug". The old name is still available, but its use is not recommended.

Define user tables

Simple variant

>>> u'1 2 3 4 5 6 7 8 9 0'.encode('trans/my')
Traceback (most recent call last):
    ...
ValueError: Table "my" not found in tables!
>>> trans.tables['my'] = {u'1': u'A', u'2': u'B'}
>>> u'1 2 3 4 5 6 7 8 9 0'.encode('trans/my')
u'A_B________________'
>>>

A little harder

A table can consist of two parts: a map of diphthongs and a map of characters. Diphthongs are processed first, by simple substring replacement. Then each character of the resulting string is replaced according to the map of characters. If a character is absent from the map of characters, the key None is looked up. If the key None is not present either, the default character u'_' is used.

>>> diphthongs = {u'11': u'AA', u'22': u'BB'}
>>> characters = {u'a': u'z', u'b': u'y', u'c': u'x', None: u'-',
...               u'A': u'A', u'B': u'B'}  # See below...
>>> trans.tables['test'] = (diphthongs, characters)
>>> u'11abc22cbaCC'.encode('trans/test')
u'AAzyxBBxyz--'

Characters produced by the diphthong replacement are themselves processed by the map of characters:

>>> diphthongs = {u'11': u'AA', u'22': u'BB'}
>>> characters = {u'a': u'z', u'b': u'y', u'c': u'x', None: u'-'}
>>> trans.tables['test'] = (diphthongs, characters)
>>> u'11abc22cbaCC'.encode('trans/test')
u'--zyx--xyz--'
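
The lookup rules described in this section can be sketched in plain Python. This is an illustrative re-implementation under stated assumptions, not the code trans actually uses: `apply_table` is a hypothetical name, and the order in which diphthongs are replaced (dictionary order here) is an assumption.

```python
def apply_table(text, table, default=u'_'):
    """Apply a user table to `text` following the rules described above.

    Hypothetical sketch, not part of trans. `table` is either a plain
    character map or a (diphthongs, characters) pair.
    """
    if isinstance(table, dict):
        diphthongs, characters = {}, table
    else:
        diphthongs, characters = table

    # Step 1: replace diphthongs by simple substring replacement.
    # Replacement order follows the dictionary order (an assumption).
    for source, target in diphthongs.items():
        text = text.replace(source, target)

    # Step 2: map every character; fall back to the None key, then to `default`.
    fallback = characters.get(None, default)
    return u''.join(characters.get(ch, fallback) for ch in text)


diphthongs = {u'11': u'AA', u'22': u'BB'}
characters = {u'a': u'z', u'b': u'y', u'c': u'x', None: u'-'}
result = apply_table(u'11abc22cbaCC', (diphthongs, characters))
```

Because step 2 runs on the output of step 1, the `AA` and `BB` produced by the diphthong map fall through to the None key here, which matches the second example above.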

Without the diphthongs

These two tables are equivalent:

>>> characters = {u'a': u'z', u'b': u'y', u'c': u'x', None: u'-'}
>>> trans.tables['t1'] = characters
>>> trans.tables['t2'] = ({}, characters)
>>> u'11abc22cbaCC'.encode('trans/t1') == u'11abc22cbaCC'.encode('trans/t2')
True

ChangeLog

2.1 2016-09-19

Add Farsi alphabet (thx rodgar-nvkz)

Use pytest

Some code style refactoring

2.0 2013-04-01

Python 3 support

Class Trans for creating separate table spaces

1.5 2012-09-12

Add support for the Kazakh alphabet.

1.4 2011-11-29

Change license to BSD.

1.3 2010-05-18

Table "id" renamed to "slug". Old name also available.

Some speed optimizations (thx to AndyLegkiy <andy.legkiy at gmail.com>).

1.2 2010-01-10

First public release.

Translate documentation to English.

Finally

Special thanks to Yuri Yurevich aka j2a for the kick in the right direction.
- http://python.su/forum/viewtopic.php?pid=28965
- http://code.djangoproject.com/browser/django/trunk/django/contrib/admin/media/js/urlify.js
I apologize for my bad English. I promise to correct it.
