Merge pull request #259
Added support for Multi-person skintones
Removed support for Python 2, 3.4, 3.5
TahirJalilov committed Jun 8, 2023
2 parents 839fd34 + fa8c5a0 commit 3b2324c
Showing 27 changed files with 29,450 additions and 28,621 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/pythonTests.yml
@@ -13,7 +13,7 @@ jobs:
strategy:
max-parallel: 8
matrix:
python-version: [2.7, 3.5, 3.6, 3.7, 3.8, 3.9, "3.10", "3.11", "3.12-dev"]
python-version: [3.6, 3.7, 3.8, 3.9, "3.10", "3.11", "3.12-dev"]
steps:
- uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
2 changes: 2 additions & 0 deletions README.rst
@@ -84,6 +84,8 @@ Developing
$ cd emoji
$ python -m pip install -e .\[dev\]
$ pytest
$ coverage run -m pytest
$ coverage report
The ``utils/get_codes_from_unicode_emoji_data_files.py`` script is used to generate
``unicode_codes/data_dict.py``. Generally speaking it scrapes a table on the
68 changes: 41 additions & 27 deletions docs/api.rst
@@ -7,33 +7,47 @@ API Reference
:noindex:


+-----------------------------+--------------------------------------------------------------+
| Table of Contents | |
+=============================+==============================================================+
| **Functions:** | |
+-----------------------------+--------------------------------------------------------------+
| :func:`emojize` | Replace emoji names with Unicode codes |
+-----------------------------+--------------------------------------------------------------+
| :func:`demojize` | Replace Unicode emoji with emoji shortcodes |
+-----------------------------+--------------------------------------------------------------+
| :func:`replace_emoji` | Replace Unicode emoji with a customizable string |
+-----------------------------+--------------------------------------------------------------+
| :func:`emoji_list` | Location of all emoji in a string |
+-----------------------------+--------------------------------------------------------------+
| :func:`distinct_emoji_list` | Distinct list of emojis in the string |
+-----------------------------+--------------------------------------------------------------+
| :func:`emoji_count` | Number of emojis in a string |
+-----------------------------+--------------------------------------------------------------+
| :func:`is_emoji` | Check if a string/character is a single emoji |
+-----------------------------+--------------------------------------------------------------+
| :func:`version` | Find Unicode/Emoji version of an emoji |
+-----------------------------+--------------------------------------------------------------+
| **Module variables:** | |
+-----------------------------+--------------------------------------------------------------+
| :data:`EMOJI_DATA` | Dict of all emoji |
+-----------------------------+--------------------------------------------------------------+
| :data:`STATUS` | Dict of Unicode/Emoji status |
+-----------------------------+--------------------------------------------------------------+
+-------------------------------+--------------------------------------------------------------+
| Table of Contents | |
+===============================+==============================================================+
| **Functions:** | |
+-------------------------------+--------------------------------------------------------------+
| :func:`emojize` | Replace emoji names with Unicode codes |
+-------------------------------+--------------------------------------------------------------+
| :func:`demojize` | Replace Unicode emoji with emoji shortcodes |
+-------------------------------+--------------------------------------------------------------+
| :func:`analyze` | Find Unicode emoji in a string |
+-------------------------------+--------------------------------------------------------------+
| :func:`replace_emoji` | Replace Unicode emoji with a customizable string |
+-------------------------------+--------------------------------------------------------------+
| :func:`emoji_list` | Location of all emoji in a string |
+-------------------------------+--------------------------------------------------------------+
| :func:`distinct_emoji_list` | Distinct list of emojis in the string |
+-------------------------------+--------------------------------------------------------------+
| :func:`emoji_count` | Number of emojis in a string |
+-------------------------------+--------------------------------------------------------------+
| :func:`is_emoji` | Check if a string/character is a single emoji |
+-------------------------------+--------------------------------------------------------------+
| :func:`version` | Find Unicode/Emoji version of an emoji |
+-------------------------------+--------------------------------------------------------------+
| **Module variables:** | |
+-------------------------------+--------------------------------------------------------------+
| :data:`EMOJI_DATA` | Dict of all emoji |
+-------------------------------+--------------------------------------------------------------+
| :data:`STATUS` | Dict of Unicode/Emoji status |
+-------------------------------+--------------------------------------------------------------+
| :class:`config` | Module-wide configuration |
+-------------------------------+--------------------------------------------------------------+
| **Classes:** | |
+-------------------------------+--------------------------------------------------------------+
| :class:`EmojiMatch` | |
+-------------------------------+--------------------------------------------------------------+
| :class:`EmojiMatchZWJ` | |
+-------------------------------+--------------------------------------------------------------+
| :class:`EmojiMatchZWJNonRGI` | |
+-------------------------------+--------------------------------------------------------------+
| :class:`Token` | |
+-------------------------------+--------------------------------------------------------------+


.. automodule:: emoji
144 changes: 114 additions & 30 deletions docs/index.rst
@@ -18,7 +18,7 @@ emoji

Release v\ |version|. (:ref:`Installation <install>`)

emoji supports Python 2.7 and 3.4+
emoji supports Python 3.6+. The last version to support Python 2.7 and 3.5 was v2.4.0.

.. contents:: Table of Contents

@@ -72,6 +72,39 @@ Spanish (``'es'``), Portuguese (``'pt'``), Italian (``'it'``), French (``'fr'``)
Extracting emoji
^^^^^^^^^^^^^^^^

The function :func:`analyze` finds all emoji in a string and yields each emoji
together with its position and the available meta information about the emoji.

:func:`analyze` returns a generator, so you need to iterate over it or
convert the output to a list.

.. doctest::

>>> first_token = next(emoji.analyze('Python is 👍'))
>>> first_token
Token(chars='👍', value=EmojiMatch(👍, 10:11))
>>> emoji_match = first_token.value
>>> emoji_match
EmojiMatch(👍, 10:11)
>>> emoji_match.data
{'en': ':thumbs_up:', 'status': 2, 'E': 0.6, 'alias': [':thumbsup:', ':+1:'], 'variant': True, 'de': ':daumen_hoch:', 'es': ':pulgar_hacia_arriba:', 'fr': ':pouce_vers_le_haut:', 'ja': ':サムズアップ:', 'ko': ':올린_엄지:', 'pt': ':polegar_para_cima:', 'it': ':pollice_in_su:', 'fa': ':پسندیدن:', 'id': ':jempol_ke_atas:', 'zh': ':拇指向上:'}
>>> list(emoji.analyze('A 👩‍🚀 aboard a 🚀'))
[Token(chars='👩\u200d🚀', value=EmojiMatch(👩‍🚀, 2:5)), Token(chars='🚀', value=EmojiMatch(🚀, 15:16))]
>>> list(emoji.analyze('A👩‍🚀B🚀', non_emoji=True))
[Token(chars='A', value='A'), Token(chars='👩\u200d🚀', value=EmojiMatch(👩‍🚀, 1:4)), Token(chars='B', value='B'), Token(chars='🚀', value=EmojiMatch(🚀, 5:6))]
..
The parameter ``join_emoji`` controls whether `non-RGI emoji <#non-rgi-zwj-emoji>`_ are handled as a single token or as multiple emoji:

.. doctest::

>>> list(emoji.analyze('👨‍👩🏿‍👧🏻‍👦🏾', join_emoji=True))
[Token(chars='👨\u200d👩🏿\u200d👧🏻\u200d👦🏾', value=EmojiMatchZWJNonRGI(👨‍👩🏿‍👧🏻‍👦🏾, 0:10))]

>>> list(emoji.analyze('👨‍👩🏿‍👧🏻‍👦🏾', join_emoji=False))
[Token(chars='👨', value=EmojiMatch(👨, 0:1)), Token(chars='👩🏿', value=EmojiMatch(👩🏿, 2:4)), Token(chars='👧🏻', value=EmojiMatch(👧🏻, 5:7)), Token(chars='👦🏾', value=EmojiMatch(👦🏾, 8:10))]

..

The function :func:`emoji_list` finds all emoji in a string and their positions.
Keep in mind that an emoji can span multiple characters:

@@ -227,6 +260,43 @@ You can find the version of an emoji with :func:`version`:
..

Non-RGI ZWJ emoji
^^^^^^^^^^^^^^^^^

Some emoji contain multiple people, and each person can have an individual skin tone.

Unicode supports `Multi-Person Skin Tones <http://www.unicode.org/reports/tr51/#multiperson_skintones>`__ as of Emoji 11.0.
Skin tones can be added to the nine characters known as `Multi-Person Groupings <https://www.unicode.org/reports/tr51/#MultiPersonGroupingsTable>`__.

Multi-person groups with different skin tones can be represented with Unicode, but are not yet RGI (recommended for general interchange). This means Unicode.org recommends against showing them in emoji keyboards.
However, some browsers and platforms already support some of them:

.. figure:: 1F468-200D-1F469-1F3FF-200D-1F467-1F3FB-200D-1F466-1F3FE.png
:height: 4em
:alt: A family emoji 👨‍👩🏿‍👧🏻‍👦🏾 with four different skin tone values

The emoji 👨‍👩🏿‍👧🏻‍👦🏾 as it appears in Firefox on Windows 11

It consists of eleven Unicode characters: four person emoji and four different skin tones, joined together by three ``\u200d`` **Z**\ ero-\ **W**\ idth **J**\ oiner characters:

#. 👨 ``:man:``
#. 🏽 ``:medium_skin_tone:``
#. ``\u200d``
#. 👩 ``:woman:``
#. 🏿 ``:dark_skin_tone:``
#. ``\u200d``
#. 👧 ``:girl:``
#. 🏻 ``:light_skin_tone:``
#. ``\u200d``
#. 👦 ``:boy:``
#. 🏾 ``:medium-dark_skin_tone:``

On platforms that don't support it, it might appear as separate emoji: 👨🏽👩🏿👧🏻👦🏾
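
The structure listed above can be checked with the standard library alone. The following sketch builds the eleven-character sequence from explicit escapes (the variable names are illustrative only), prints each code point with its Unicode name, and derives the fallback rendering by dropping the joiners.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER

# Person + skin tone modifier pairs, in the order given in the list above.
family = (
    "\U0001F468\U0001F3FD" + ZWJ    # man, medium skin tone
    + "\U0001F469\U0001F3FF" + ZWJ  # woman, dark skin tone
    + "\U0001F467\U0001F3FB" + ZWJ  # girl, light skin tone
    + "\U0001F466\U0001F3FE"        # boy, medium-dark skin tone
)

# A Python str is a sequence of code points, so len() counts all eleven.
print(len(family))

for ch in family:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Splitting on the joiner yields the four person-plus-modifier pairs.
members = family.split(ZWJ)

# Without the joiners, renderers show the members as separate emoji.
fallback = "".join(members)
print(fallback)
```

This only inspects code points; it does not decide whether a sequence is RGI, which requires the Unicode emoji data.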

In the module configuration :class:`config` you can control how such emoji are handled.



Migrating to version 2.0.0
--------------------------

@@ -270,11 +340,11 @@ expression yourself like this:
# Sort emoji by length to make sure multi-character emojis are
# matched first
emojis = sorted(emoji.EMOJI_DATA, key=len, reverse=True)
pattern = u'(' + u'|'.join(re.escape(u) for u in emojis) + u')'
pattern = '(' + '|'.join(re.escape(u) for u in emojis) + ')'
return re.compile(pattern)
exp = get_emoji_regexp()
print(exp.sub(repl='[emoji]', string=u'A 🏌️‍♀️ is eating a 🥐'))
print(exp.sub(repl='[emoji]', string='A 🏌️‍♀️ is eating a 🥐'))
..
Output:
@@ -313,33 +383,47 @@ Reference documentation of all functions and properties in the module:

api

+-----------------------------+--------------------------------------------------------------+
| API Reference | |
+=============================+==============================================================+
| **Functions:** | |
+-----------------------------+--------------------------------------------------------------+
| :func:`emojize` | Replace emoji names with Unicode codes |
+-----------------------------+--------------------------------------------------------------+
| :func:`demojize` | Replace Unicode emoji with emoji shortcodes |
+-----------------------------+--------------------------------------------------------------+
| :func:`replace_emoji` | Replace Unicode emoji with a customizable string |
+-----------------------------+--------------------------------------------------------------+
| :func:`emoji_list` | Location of all emoji in a string |
+-----------------------------+--------------------------------------------------------------+
| :func:`distinct_emoji_list` | Distinct list of emojis in the string |
+-----------------------------+--------------------------------------------------------------+
| :func:`emoji_count` | Number of emojis in a string |
+-----------------------------+--------------------------------------------------------------+
| :func:`is_emoji` | Check if a string/character is a single emoji |
+-----------------------------+--------------------------------------------------------------+
| :func:`version` | Find Unicode/Emoji version of an emoji |
+-----------------------------+--------------------------------------------------------------+
| **Module variables:** | |
+-----------------------------+--------------------------------------------------------------+
| :data:`EMOJI_DATA` | Dict of all emoji |
+-----------------------------+--------------------------------------------------------------+
| :data:`STATUS` | Dict of Unicode/Emoji status |
+-----------------------------+--------------------------------------------------------------+
+-------------------------------+--------------------------------------------------------------+
| API Reference | |
+===============================+==============================================================+
| **Functions:** | |
+-------------------------------+--------------------------------------------------------------+
| :func:`emojize` | Replace emoji names with Unicode codes |
+-------------------------------+--------------------------------------------------------------+
| :func:`demojize` | Replace Unicode emoji with emoji shortcodes |
+-------------------------------+--------------------------------------------------------------+
| :func:`analyze` | Find Unicode emoji in a string |
+-------------------------------+--------------------------------------------------------------+
| :func:`replace_emoji` | Replace Unicode emoji with a customizable string |
+-------------------------------+--------------------------------------------------------------+
| :func:`emoji_list` | Location of all emoji in a string |
+-------------------------------+--------------------------------------------------------------+
| :func:`distinct_emoji_list` | Distinct list of emojis in the string |
+-------------------------------+--------------------------------------------------------------+
| :func:`emoji_count` | Number of emojis in a string |
+-------------------------------+--------------------------------------------------------------+
| :func:`is_emoji` | Check if a string/character is a single emoji |
+-------------------------------+--------------------------------------------------------------+
| :func:`version` | Find Unicode/Emoji version of an emoji |
+-------------------------------+--------------------------------------------------------------+
| **Module variables:** | |
+-------------------------------+--------------------------------------------------------------+
| :data:`EMOJI_DATA` | Dict of all emoji |
+-------------------------------+--------------------------------------------------------------+
| :data:`STATUS` | Dict of Unicode/Emoji status |
+-------------------------------+--------------------------------------------------------------+
| :class:`config` | Module-wide configuration |
+-------------------------------+--------------------------------------------------------------+
| **Classes:** | |
+-------------------------------+--------------------------------------------------------------+
| :class:`EmojiMatch` | |
+-------------------------------+--------------------------------------------------------------+
| :class:`EmojiMatchZWJ` | |
+-------------------------------+--------------------------------------------------------------+
| :class:`EmojiMatchZWJNonRGI` | |
+-------------------------------+--------------------------------------------------------------+
| :class:`Token` | |
+-------------------------------+--------------------------------------------------------------+


Links
9 changes: 4 additions & 5 deletions emoji/__init__.py
@@ -1,6 +1,3 @@
# -*- coding: UTF-8 -*-


"""
emoji for Python
~~~~~~~~~~~~~~~~
@@ -20,8 +17,10 @@

__all__ = [
# emoji.core
'emojize', 'demojize', 'emoji_count', 'emoji_list',
'distinct_emoji_list', 'replace_emoji', 'version', 'is_emoji',
'emojize', 'demojize', 'analyze', 'config',
'emoji_list', 'distinct_emoji_list', 'emoji_count',
'replace_emoji', 'is_emoji', 'version',
'Token', 'EmojiMatch', 'EmojiMatchZWJ', 'EmojiMatchZWJNonRGI',
# emoji.unicode_codes
'EMOJI_DATA', 'STATUS', 'LANGUAGES',
]