feat: normalize emoji characters for consistency #37
Summary - replace literal emojis with Pro:
Possible Cons:
Big Worry:
We don't have skintone modifiers in our emojicode completion widget so those can only be entered by people typing them as literal emojis. I usually type emojis as literals on OSX (with ctrl-cmd-space) and people on phones probably do also. I'm not set up to build and test this code right now - can someone see what happens with compound emojis? Here's one to test with which is a sequence of [man, skintone-modifier, microphone]: Dark Skinned Male Singer On recent platforms it renders as a single character with rockstar hair. It's ok enough if it renders as 2 emojis [dark-skinned-man, microphone]. |
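For anyone wanting to check what is inside the compound emoji mentioned above, here is a minimal sketch that lists its code points. It assumes the standard Unicode sequence for "man singer: medium-dark skin tone", which also contains a zero-width joiner (U+200D) between the skin-tone modifier and the microphone; the helper name `codePoints` is made up for this illustration.

```javascript
// Illustration only: the test emoji spelled out with explicit escapes.
// Sequence: man, skin-tone modifier, zero-width joiner, microphone.
const singer = '\u{1F468}\u{1F3FE}\u200D\u{1F3A4}';

// Hypothetical helper: Array.from iterates a string by code point,
// so surrogate pairs are not split in half.
function codePoints(str) {
  return Array.from(str, ch => ch.codePointAt(0).toString(16).toUpperCase());
}

console.log(codePoints(singer));
// → [ '1F468', '1F3FE', '200D', '1F3A4' ]
```

A platform that does not know the joined form typically draws the parts on either side of the joiner separately, which is the two-emoji fallback described above.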
Sorry, I could've been more specific in my description. This replaces literal emojis with
This is a huge concern. Thanks for pointing this out! I was able to test and I can confirm that it does not erase compound emoji; it just leaves them as-is. Before
After
|
Just pushed a commit that now supports compound emoji!
Input: Dark Skinned Male Singer 👨🏾🎤
ssbMd.block(input, { emoji: e => `<span class="Emoji">${e}</span>` })
Output: <p>Dark Skinned Male Singer <span class="Emoji">👨🏾🎤</span></p>
🎉 🎉 🎉 |
🎉 Awesome, yay! Which OS did you test on? I'm worried that this might break on one of the OSs for some reason but I don't understand the layers very well. Besides that, 👍 |
I'm confused by this. I'm admittedly pretty tired today. Is "This replaces literal emojis with :emojicodes: on input, that way they get handled the same way that emojicodes do." anything more than an implementation detail? And I don't like that they're all converted to images (instead of emoji+noto) but that's what the patchwork PR is "fixing", if I understand correctly? |
I tested on Linux, but since the tests are happening on string literals I don't think those should change between operating systems. It's possible that there's some operating system that will generate a funky byte sequence that we don't read as emoji, but in the worst-case scenario we should fall back to the system emoji font.
I think that sentence had info about both. Doing a transform on input is an implementation detail, but unicode characters being handled the same way as shortcodes is a behavior change.
Yep! This PR gives the client the option to handle unicode characters as emoji, which they could use to reference an image (boo) or they could just make sure the emoji font is available for those characters (woo). |
Thanks for being so thoughtful about this change! This library only renders Markdown posts as HTML, it doesn't change anything on text input. When I mentioned "on input" above I only meant "on input into this library", although I totally understand how that sounds like "on input into an SSB composer". Using your graph, I think this library only handles I agree that it would be best to normalize as actual emoji characters in the composer before publishing, but I don't think there's a way to do that in this module. |
Aha, I was confused! Sorry for derailing. I've read the code now. So it's about rendering Markdown to HTML. First it converts unicode emoji to shortcodes...
Renders to HTML...
Then it looks for emojis using emojiRegex, and runs the custom emoji handler function (from
It would be nice to document in README what kind of input the I'm afraid I'm being a drag on the process here, please ignore me if you want ❤️ |
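The last step described above (scan the rendered HTML for emoji and hand each match to a custom handler) can be sketched roughly as follows. This is not the library's code: the thread says it uses emojiRegex, while this sketch swaps in a simplified built-in pattern based on Unicode property escapes so that it runs without dependencies. The names `emojiPattern` and `wrapEmoji` are hypothetical.

```javascript
// Simplified stand-in for an emoji regex (an assumption, not the library's
// pattern). It matches one pictographic character, an optional skin-tone
// modifier and variation selector, then any number of ZWJ-joined parts.
const emojiPattern =
  /\p{Extended_Pictographic}\p{Emoji_Modifier}?\uFE0F?(?:\u200D\p{Extended_Pictographic}\p{Emoji_Modifier}?\uFE0F?)*/gu;

// Hypothetical helper: pass every emoji found in `html` to `handler`
// and splice the handler's return value back into the string.
function wrapEmoji(html, handler) {
  return html.replace(emojiPattern, match => handler(match));
}

// Usage with a handler shaped like the one shown earlier in this thread.
const rendered = wrapEmoji(
  '<p>Dark Skinned Male Singer \u{1F468}\u{1F3FE}\u200D\u{1F3A4}</p>',
  e => `<span class="Emoji">${e}</span>`
);
console.log(rendered);
```

Because the pattern consumes the whole joined sequence in one match, the compound emoji ends up inside a single span instead of being split into two. The simplified pattern is looser than a dedicated emoji regex (it also matches characters such as © and ™), so treat it as an illustration of the flow only.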
This wasn't actually being used like I thought it was, huge thanks to @cinnamon-bun for taking the time to point this out! Yay, fewer moving parts. Warning: super similar code in lib/block and lib/inline that should probably be refactored in the future. Not similar enough that it's easy to refactor right now. (Read: energy dwindling at the expense of best practices.)
You're right! That part wasn't being used at all, I've just removed it and pushed a commit with comments. There was previously an emoji handler that worked the same way, but it would be really nice to have more of that info in the readme. Previously |
Major version bump sounds like a good idea. There are a lot of things depending on this package |
Cool! I'll merge and publish as a major version. Thanks for all of your feedback on this PR! I know it takes loads of time to discuss code and I really appreciate how much time you've spent thinking about this. ❤️ |
Previously when you selected a suggestion it would add the shortcode to the composer, which was *fine* but means that we all have to agree on shortcodes. This changes it so that the emoji suggestions put the actual emoji into the composer. Note that this doesn't stop someone from manually typing in `:ghost:` or something, but maybe in the future we could replace those shortcodes with actual emoji so that we're never publishing shortcodes to our feeds.

Originally brought up by @cinnamon-bun here:

> If we're changing the input (affecting the data that gets into the SSB
> message) it seems safer to use unicode chars as the canonical format because
>
> 1. The shortcode libraries generally lag behind the unicode emoji standard
> 2. The unicode emoji standard is more well-defined than shortcode names,
>    which might not match across different clients
>
> -- ssbc/ssb-markdown#37 (comment)
This means that both `:coffee:` and ☕ will render the same way! Previously `:coffee:` was an image and ☕ wasn't, which caused some funky bugs.
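
One way to picture "both render the same way" is to resolve shortcodes to their unicode characters first and then send everything through a single emoji handler. The sketch below is a guess at that idea, not the code from this change: the two-entry `shortcodes` table and the function names are hypothetical, and a real client would rely on a full shortcode library.

```javascript
// Hypothetical, tiny shortcode table for illustration only.
const shortcodes = { coffee: '\u2615', ghost: '\u{1F47B}' };

// Replace known :shortcodes: with their unicode character; leave
// unknown ones untouched so nothing is silently dropped.
function normalizeShortcodes(text) {
  return text.replace(/:([a-z0-9_+-]+):/g, (whole, name) =>
    Object.prototype.hasOwnProperty.call(shortcodes, name) ? shortcodes[name] : whole
  );
}

// After normalizing, every emoji goes through the same handler.
// Single-character pattern only; compound emoji are out of scope here.
function renderEmoji(text, handler) {
  return normalizeShortcodes(text).replace(/\p{Extended_Pictographic}/gu, m => handler(m));
}

const wrap = e => `<span class="Emoji">${e}</span>`;
console.log(renderEmoji(':coffee:', wrap) === renderEmoji('\u2615', wrap));
// → true
```

With a single path there is no case where the shortcode becomes an image while the literal character falls back to plain text.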