Multi-person skin tones #204
Comments
Could you open this again? It's actually still a bug. Not sure why I referenced it in v2.0.0; that was wrong.
@TahirJalilov The problem: For example a family emoji can look like this
The That means when we do
Or remove the
If we keep the
Also
emoji.replace_emoji(family_emoji, 'X') == 'XXXX'
# OR
emoji.replace_emoji(family_emoji, 'X') == 'X\u200dX\u200dX\u200dX'
Note that
My current solution could support both behaviors. So I would suggest having a switch/parameter to control it:
emoji.demojize(family_emoji, keep_zwj=True)
emoji.replace_emoji(family_emoji, keep_zwj=False)
# Or a global switch
import emoji
emoji.config.demojize.keep_zwj = True
emoji.config.replace_emoji.keep_zwj = False
I think the global config is better, because it is not something that you want to change more than once.
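To make the two behaviors concrete, here is a minimal sketch in plain Python. It does not use the emoji package: `replace_components` is a hypothetical helper written only for this illustration, and it assumes the family emoji is a sequence of person-plus-skin-tone parts joined by U+200D.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER


def replace_components(sequence, replacement, keep_zwj):
    """Hypothetical helper (not part of the emoji package).

    Replaces every ZWJ-separated part of `sequence` with `replacement`
    and either keeps or drops the joiners between the parts.
    """
    parts = sequence.split(ZWJ)
    separator = ZWJ if keep_zwj else ""
    return separator.join(replacement for _ in parts)


# man + medium, woman + dark, girl + light, boy + medium-dark
family_emoji = ZWJ.join([
    "\U0001F468\U0001F3FD",
    "\U0001F469\U0001F3FF",
    "\U0001F467\U0001F3FB",
    "\U0001F466\U0001F3FE",
])

dropped = replace_components(family_emoji, "X", keep_zwj=False)
kept = replace_components(family_emoji, "X", keep_zwj=True)

print(dropped)      # XXXX
print(ascii(kept))  # 'X\u200dX\u200dX\u200dX'
```

With the joiners dropped the four replacements run together; with the joiners kept the original structure of the sequence stays visible in the output.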
Let me preface this by saying that what I am doing is replacing emoji with alternative ones, e.g. replacing a Generally I think it makes sense to be able to do something like For my use case a list could be convenient, since I would want to iterate over all of the "sub-emoji" to create permutations of modified emojis.
Thanks for your input! I agree a list of the "sub-emoji" might be nice. Possibly a list could be available in the callback of
In my case I am processing emoji one at a time using https://github.com/explosion/spacymoji Note that they are currently creating separate tokens for
I will add a new function to the module (probably call it My progress so far:
FYI these changes will remove support for Python 2.7 and probably 3.5. Ref #243
The logic from demojize() is moved to two separate functions, tokenize and filter_tokens, in a new file emoji/tokenizer.py. The logic for the search tree is also moved to that file.
A new public function analyze() is available that supports the multi-person skin tones.
The handling of the multi-person skin tones can be controlled by the new `emoji.config` class, which is a static class that works as a module-wide configuration.
Multi-Person Skin Tones on unicode.org
Edit: here's a tool to create these: https://codepen.io/cvzi/full/RwQNJBK
These are currently not RGI by unicode (Recommended for General Interchange), which means they should not be generated with emojize(). However they work in some phones and browsers. For example a family of 4 persons with 4 different skin tones: 👨🏽👩🏿👧🏻👦🏾
This emoji consists of:
:man_medium_skin_tone:
:woman_dark_skin_tone:
:girl_light_skin_tone:
:boy_medium-dark_skin_tone:
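As an illustration of how such a sequence is put together, the sketch below builds the emoji from the four parts listed above. It assumes the parts are joined with U+200D ZERO WIDTH JOINER; `build_family` and `PARTS` are names made up for this example.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER

# (base person, skin tone modifier) pairs, in the order listed above
PARTS = [
    ("\U0001F468", "\U0001F3FD"),  # man, medium skin tone
    ("\U0001F469", "\U0001F3FF"),  # woman, dark skin tone
    ("\U0001F467", "\U0001F3FB"),  # girl, light skin tone
    ("\U0001F466", "\U0001F3FE"),  # boy, medium-dark skin tone
]


def build_family(parts):
    """Join each person+modifier pair with a ZWJ (illustrative helper)."""
    return ZWJ.join(base + tone for base, tone in parts)


family_emoji = build_family(PARTS)

# Splitting on the joiner recovers the four person+modifier parts.
components = family_emoji.split(ZWJ)

print(len(family_emoji))  # 11 code points: 4 people, 4 modifiers, 3 joiners
print(len(components))    # 4
print([f"U+{ord(ch):04X}" for ch in family_emoji])
```

Whether this renders as a single glyph depends on the platform's font; the string itself is the same either way.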
demojize() currently converts that emoji to:
'👨:medium_skin_tone:\u200d👩:dark_skin_tone:\u200d:girl_light_skin_tone:\u200d:boy_medium-dark_skin_tone:'
Possible solutions:
Convert the man and woman as well to (minimal solution):
:man::medium_skin_tone:\u200d:woman::dark_skin_tone:\u200d:girl_light_skin_tone:\u200d:boy_medium-dark_skin_tone:
or combine the skin tones into man and woman as well:
:man_medium_skin_tone:\u200d:woman_dark_skin_tone:\u200d:girl_light_skin_tone:\u200d:boy_medium-dark_skin_tone:
remove the skin tones
:family_man_woman_girl_boy:'
Or with the skin tones:
:family_man_woman_girl_boy_medium_dark_light_skin_medium-dark_tone:'
Edit:
Probably the easiest one is this:
:man_medium_skin_tone:\u200d:woman_dark_skin_tone:\u200d:girl_light_skin_tone:\u200d:boy_medium-dark_skin_tone:
Have to decide if we want to remove the \u200d or not.
If we keep the \u200d, emojize() can revert the string correctly, i.e. emojize(demojize(str)) == str.
I don't know what's the effect of having them though, :\u200d: might be displayed strangely.
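The round-trip argument can be checked with a small self-contained sketch. It does not call the emoji package: `demojize_sketch`, `emojize_sketch` and the four-entry `NAMES` table are hypothetical stand-ins that cover only the example from this issue.

```python
import re

ZWJ = "\u200d"  # ZERO WIDTH JOINER

# Tiny hypothetical name table, covering only the example in this issue.
NAMES = {
    "\U0001F468\U0001F3FD": ":man_medium_skin_tone:",
    "\U0001F469\U0001F3FF": ":woman_dark_skin_tone:",
    "\U0001F467\U0001F3FB": ":girl_light_skin_tone:",
    "\U0001F466\U0001F3FE": ":boy_medium-dark_skin_tone:",
}
EMOJI = {name: char for char, name in NAMES.items()}


def demojize_sketch(text, keep_zwj=True):
    """Replace each known person+modifier pair with its name."""
    for char, name in NAMES.items():
        text = text.replace(char, name)
    # Simplification: dropping joiners removes every U+200D in the text.
    return text if keep_zwj else text.replace(ZWJ, "")


def emojize_sketch(text):
    """Replace each known :name: with its emoji; leave unknown names alone."""
    return re.sub(r":[\w-]+:", lambda m: EMOJI.get(m.group(0), m.group(0)), text)


family_emoji = ZWJ.join(NAMES)  # the four parts in insertion order

with_zwj = demojize_sketch(family_emoji, keep_zwj=True)
without_zwj = demojize_sketch(family_emoji, keep_zwj=False)

print(emojize_sketch(with_zwj) == family_emoji)     # True
print(emojize_sketch(without_zwj) == family_emoji)  # False
```

Once the joiners are removed, the names alone carry no information about where they were, so the original sequence cannot be restored from the text.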