Unicode combining characters treated individually when editing #2942

Closed
natevw opened this issue Feb 22, 2020 · 3 comments

natevw commented Feb 22, 2020

Starting at the end of a piece of text like

diacriticism

In a native macOS text control, it takes me 12 backspace presses to get rid of all the letters of the word. With Quill, it takes tons more, as it deletes each underlying code point one at a time.

To be fair… :-)

a) This is an extreme example (taken from http://demo.danielmclaren.com/2015/diacriticism/ via https://stackoverflow.com/a/51004127/179583) but could come up at a smaller scale when editing languages that use e.g. diacritical marks above letters for which there are no pre-composed characters. [Haven't tested if Quill normalizes characters to pre-composed for at least the cases where there are such characters.]

b) I also noticed that neither Chrome nor Firefox handles this well (!!), so I suspect they implemented their own text handling in a way similar to Quill's.

Flagging this for potential future enhancement at some point, or if it's indicative of related Unicode handling issues (xref #1230).
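
To illustrate the gap between code points and what a user perceives as one character, here is a minimal sketch using the standard `Intl.Segmenter` API. The helper names `graphemes` and `deleteLastGrapheme` are hypothetical and not part of Quill; this only shows what backspacing by grapheme cluster could look like, not how Quill or any browser implements it.

```javascript
// Hypothetical helpers for illustration only; not part of Quill's API.

// Split text into user-perceived characters (grapheme clusters).
function graphemes(text, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  return Array.from(segmenter.segment(text), (s) => s.segment);
}

// Remove the last grapheme cluster, i.e. a base letter plus all of its
// combining marks, in one step.
function deleteLastGrapheme(text) {
  const clusters = graphemes(text);
  clusters.pop();
  return clusters.join("");
}

// "c" + combining caron, then "a" + two combining marks.
const sample = "c\u030Ca\u0301\u0302";

console.log(Array.from(sample).length); // 5 code points
console.log(graphemes(sample).length); // 2 grapheme clusters
console.log(deleteLastGrapheme(sample) === "c\u030C"); // true
```

An editor that deletes by code point would need five backspaces for `sample`, while one that deletes by grapheme cluster would need two. `Intl.Segmenter` is only available in recent browsers and Node.js versions, so older environments would need a fallback.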

david-jezek commented Nov 1, 2023

These combining characters break formatting in edited text. When text containing a combining character is loaded into the Quill editor, any formatting or links after those characters are shifted.
For example, this HTML:

<p>c&#780;c&#780;</p>
<p><a href="http://page"><strong>link to issues</strong></a></p>

is displayed incorrectly: the link starts on the letter i, not l.
It seems that the position of a displayed character differs from the position of the corresponding character in the string.

The character č (c followed by the combining caron, U+030C) has a normal representation č (U+010D, a single code point), but for some reason someone copied text from an external source and inserted it into a page with the Quill editor. The external editor used this decomposed form, which broke the formatting of the text.
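
As a rough illustration of why offsets can drift, the sketch below compares the decomposed and pre-composed forms of the text from the HTML example. It only demonstrates that the two forms have different string lengths, so an index computed against one form points elsewhere in the other. Whether this is the actual cause inside Quill is an assumption, not something confirmed in this thread.

```javascript
// Decomposed form from the example: "c" + U+030C (combining caron), twice.
const decomposed = "c\u030Cc\u030C";
// NFC prefers pre-composed characters where they exist (here U+010D).
const composed = decomposed.normalize("NFC");

console.log(decomposed.length); // 4 UTF-16 code units
console.log(composed.length); // 2 UTF-16 code units
console.log(composed === "\u010D\u010D"); // true

// The same document text, with the link text on the second line.
const text = decomposed + "\nlink to issues";
const offsetDecomposed = text.indexOf("link");
const offsetComposed = text.normalize("NFC").indexOf("link");

console.log(offsetDecomposed); // 5
console.log(offsetComposed); // 3
```

If one layer measures positions in the decomposed string and another in a normalized one, a format range meant to start at "l" would land on a different character.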

natevw commented Nov 2, 2023

@david-jezek Sorry, it's been a while since I was actively working around Quill stuff like this, but re:

The character č (c followed by the combining caron, U+030C) has a normal representation č (U+010D, a single code point), but for some reason someone copied text from an external source and inserted it into a page with the Quill editor. The external editor used this decomposed form, which broke the formatting of the text.

You might be able to hook into Quill's paste event handlers and intercept the pasted string so you can call .normalize() on it before it actually gets inserted, if you don't have to support older browsers. Given the situation in Quill you probably want either the default "NFC" or perhaps "NFKC"; either way, you'd want to prefer pre-composed characters whenever they're available, which should reduce how often you run into this.
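
A minimal sketch of the normalization step suggested above, assuming pasted content arrives as Delta-style ops (objects with an `insert` field and optional `attributes`). The helper `normalizeOps` is hypothetical, and the commented-out wiring through `quill.clipboard.addMatcher` is an untested assumption about how it might be hooked up, not a verified recipe.

```javascript
// Hypothetical helper: normalize the text of Delta-style ops.
// Attributes (bold, link, ...) are preserved unchanged.
function normalizeOps(ops, form = "NFC") {
  return ops.map((op) =>
    typeof op.insert === "string"
      ? { ...op, insert: op.insert.normalize(form) }
      : op // embeds (non-string inserts) are left untouched
  );
}

// Ops resembling the HTML example from earlier in the thread.
const pasted = [
  { insert: "c\u030Cc\u030C\n" },
  { insert: "link to issues", attributes: { bold: true, link: "http://page" } },
  { insert: "\n" },
];

const cleaned = normalizeOps(pasted);
console.log(cleaned[0].insert === "\u010D\u010D\n"); // true
console.log(cleaned[1].attributes.link); // "http://page"

// Possible wiring (assumption, untested; requires a `quill` instance):
// quill.clipboard.addMatcher(Node.TEXT_NODE, (node, delta) => {
//   delta.ops = normalizeOps(delta.ops);
//   return delta;
// });
```

Normalizing each op separately would miss a combining mark that starts a new op right after its base letter, so this is a partial mitigation at best. Note also that "NFKC" rewrites compatibility characters (for example the "ﬁ" ligature becomes "fi"), which changes the text more than "NFC" does.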

@quill-bot

Quill 2.0 has been released (announcement post) with many changes and fixes. If this is still an issue please create a new issue after reviewing our updated Contributing guide 🙏
