unicode character seems to swallow other characters during a round-trip conversion from HTML to RTF and back #8264
Comments
Simple repro:
The c disappears.
Relevant part of tokenization:
Relevant parts of the spec:
In this case
So it's eating the
Maybe it's an RTF writer issue? The parameter (160) is supposed to have a delimiter, which is a space or a nonalphabetic, nonnumeric character. Here that's going to be the '?', which I think is actually meant to stand in for the character if the reader can't render the Unicode character. Putting a space before the
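To make the delimiter question concrete, here is a minimal Python sketch of how a reader could handle `\uN` followed by a fallback character. It is not pandoc's implementation: `decode_rtf_unicode` and `fallback_len` are hypothetical names, and the sketch assumes the fallback is a single plain character (the RTF default of one fallback character, i.e. no `\ucN` override, no `\'xx` escapes, no escaped backslashes). The point is that only an optional space is consumed as the control word's delimiter; a '?' ends the control word but still counts as the fallback character to skip.

```python
import re

# Hypothetical helper for illustration only; not pandoc's tokenizer.
# Matches \uN (N may be negative) plus one optional delimiting space.
_U_WORD = re.compile(r"\\u(-?\d+) ?")


def decode_rtf_unicode(text: str, fallback_len: int = 1) -> str:
    """Replace \\uN control words with the character they encode.

    Assumes each \\uN is followed by `fallback_len` plain fallback
    characters, which are skipped rather than emitted.
    """
    out = []
    i = 0
    while i < len(text):
        m = _U_WORD.match(text, i)
        if m is None:
            # Ordinary character: copy it through unchanged.
            out.append(text[i])
            i += 1
            continue
        code = int(m.group(1))
        if code < 0:
            # RTF stores N as a signed 16-bit value.
            code += 65536
        out.append(chr(code))
        # Skip the fallback character(s); a '?' here is the fallback,
        # not a delimiter that has already been consumed.
        i = m.end() + fallback_len
    return "".join(out)
```

With this handling, both `\u160?curious` and `\u160 ?curious` keep the leading c. A reader that consumes the '?' as the delimiter and then skips one more character as the fallback would drop it, which matches the symptom reported here.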
Wow, that was quick. Thanks!
Explain the problem.
When I have a non-breaking space in HTML and convert to RTF and then back to HTML, this causes adjoining characters to be swallowed:
I feel that the c of curious shouldn't be destroyed somehow during this round-trip.
Pandoc version?
pandoc 2.19.2
macOS 12.5.1 (Monterey) Apple M1 Max (ARM)