Fix unicode Regex miscounting emoji length #2942

Merged
merged 1 commit into from
Aug 15, 2023

Conversation

calculuschild
Contributor

@calculuschild calculuschild commented Aug 14, 2023

Description

Many emoji are two or more UTF-16 code units long. The `u` regex flag, which allows searching for punctuation, also makes the regex count each emoji as a single character, so its character counts no longer line up with string indices when slicing strings. Spreading the strings into an array restores the correct character count. This probably adds some overhead, but none that I can detect.

Eventually the `v` regex flag (in Node 20) can replace `u` to get an accurate character count natively.
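
The mismatch described above can be illustrated with a minimal sketch. This is not the code from this PR: the sample string and the `sliceByCodePoints` helper are hypothetical and only show how string indices (UTF-16 code units) and spread arrays (code points) differ.

```javascript
// Hypothetical sample input: one emoji followed by six ASCII characters.
const text = '😀*bold*';

// String length and String.prototype.slice count UTF-16 code units,
// so the emoji (a surrogate pair) counts as two.
console.log(text.length); // 8

// Spreading a string iterates by code point, so the emoji is one element.
const codePoints = [...text];
console.log(codePoints.length); // 7

// With the `u` flag, `.` matches the whole emoji as one character,
// even though the matched substring is two code units long.
const match = /^./u.exec(text);
console.log(match[0].length); // 2

// Hypothetical helper: slice by code-point position instead of code units.
function sliceByCodePoints(str, start, end) {
  return [...str].slice(start, end).join('');
}

// A plain slice(1) would cut the surrogate pair in half;
// the code-point slice skips the emoji cleanly.
console.log(sliceByCodePoints(text, 1)); // '*bold*'
```

Spreading allocates an array per string, which is presumably the overhead mentioned in the description.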

Contributor

  • Test(s) exist to ensure functionality and minimize regression (if no tests added, list tests covering this PR); or,
  • no tests required for this PR.
  • If submitting new feature, it has been documented in the appropriate places.

Committer

In most cases, this should be a different person than the contributor.

Many emoji are 2+ UTF-16 code units long. The `u` regex flag, which allows searching for punctuation, also counts emoji as single characters. Spreading the strings into an array restores the correct character count.
@vercel

vercel bot commented Aug 14, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

| Name | Status | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| marked-website | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Aug 14, 2023 6:35pm |

@UziTech UziTech linked an issue Aug 14, 2023 that may be closed by this pull request
@UziTech UziTech merged commit f3af23e into markedjs:master Aug 15, 2023
github-actions bot pushed a commit that referenced this pull request Aug 15, 2023
## [7.0.3](v7.0.2...v7.0.3) (2023-08-15)

### Bug Fixes

* Fix unicode Regex miscounting emoji length ([#2942](#2942)) ([f3af23e](f3af23e))
Development

Successfully merging this pull request may close these issues.

Improper emoji rendering with v5.1.0
4 participants