Pasting from word results in unintended soft line breaks #13497
Comments
Is there a way to distinguish a paste from Word from other pastes with \n characters?
The raw paste has heaps of metadata you could trawl through to prove it came from Word:
<html xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns:m="http://schemas.microsoft.com/office/2004/12/omml"
xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv=Content-Type content="text/html; charset=utf-8">
<meta name=ProgId content=Word.Document>
<meta name=Generator content="Microsoft Word 15">
<meta name=Originator content="Microsoft Word 15">
<link rel=File-List
href="file:///C:/Users/phixx/AppData/Local/Temp/msohtmlclip1/01/clip_filelist.xml">
<!--[if gte mso 9]><xml>
<o:OfficeDocumentSettings>
<o:AllowPNG/>
</o:OfficeDocumentSettings>
</xml><![endif]-->
<link rel=themeData
href="file:///C:/Users/phixx/AppData/Local/Temp/msohtmlclip1/01/clip_themedata.thmx">
<link rel=colorSchemeMapping
href="file:///C:/Users/phixx/AppData/Local/Temp/msohtmlclip1/01/clip_colorschememapping.xml">
<!--[if gte mso 9]><xml>
<w:WordDocument>
...
...
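A minimal sketch of how that metadata could be checked, in Python for illustration only. The marker strings come from the clipboard HTML quoted above; the function name `looks_like_word_html` and the rule that any single marker is sufficient are assumptions of this sketch, not Gutenberg's actual raw-handling code.

```python
# Markers taken from the clipboard HTML quoted above, lowercased and with
# quotes removed so quoted and unquoted attributes compare the same way.
# Assumption: any one marker is enough to treat the paste as coming from Word.
WORD_MARKERS = (
    "xmlns:w=urn:schemas-microsoft-com:office:word",
    "name=progid content=word.document",
    "name=generator content=microsoft word",
)


def looks_like_word_html(html: str) -> bool:
    """Return True if the pasted HTML carries any of the Word markers."""
    # Normalize case, drop attribute quotes, and collapse runs of whitespace.
    normalized = html.lower().replace('"', "").replace("'", "")
    normalized = " ".join(normalized.split())
    return any(marker in normalized for marker in WORD_MARKERS)
```

This is plain substring matching, so it would miss markup whose attributes appear in a different order; a real check would more likely inspect the parsed document.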
Is there a way to distinguish between the generated line breaks and explicit ones added by the author?
Thanks for the report. Could you also provide the data that the browser's console outputs when you paste that content?
Is there any more context here? This doesn't seem actionable at the moment.
There was a dedicated button for this for a long while in the Classic editor. Someone on WP Core should know more about this.
Previously, this was functionality offered by the TinyMCE paste plugin (the toolbar button which prompted you to paste text from Word). In Gutenberg, it seems like something which would fit appropriately as part of the raw handling processing. Given prior comments #13497 (comment) and #13497 (comment), it seems a reasonable action item could be: As part of raw handling, consider whether the pasted fragment is from Microsoft Word (using metadata) and, if it is, collapse newlines within produced paragraphs. For what it's worth, it doesn't seem obvious to me that this has a negative impact in its current form (note "appearing fine in the editor" from the original post) due to the standard behavior of HTML in collapsing whitespace/newlines (reference). cc @ellatrix
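To make the proposed action item concrete, here is a rough Python sketch of "collapse newlines within produced paragraphs". The names `collapse_newlines_in_paragraphs` and `clean_paste` are hypothetical, and the `from_word` flag stands in for whatever metadata check raw handling would use. The regex over `<p>` elements is a simplification chosen for this sketch; an actual fix would presumably work on the parsed DOM.

```python
import re

# Matches a <p ...> element and captures its opening tag, body and closing tag.
# Assumption: paragraphs are not nested, which holds for simple pasted text.
PARAGRAPH = re.compile(r"(<p\b[^>]*>)(.*?)(</p>)", re.IGNORECASE | re.DOTALL)


def collapse_newlines_in_paragraphs(html: str) -> str:
    """Replace literal newlines inside paragraphs with a single space."""

    def _collapse(match: "re.Match[str]") -> str:
        open_tag, body, close_tag = match.groups()
        # A newline plus any surrounding whitespace becomes one space.
        # Explicit <br /> tags are left alone, so soft returns survive.
        body = re.sub(r"[ \t]*\r?\n\s*", " ", body)
        return open_tag + body + close_tag

    return PARAGRAPH.sub(_collapse, html)


def clean_paste(html: str, from_word: bool) -> str:
    """Collapse newlines only when the paste was identified as Word content."""
    return collapse_newlines_in_paragraphs(html) if from_word else html
```

For example, `clean_paste("<p>so that we\ncan test.<br />\nNext line</p>", True)` joins the wrapped text with a space and keeps the `<br />`, while passing `from_word=False` returns the input unchanged.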
Original bug report: "While not visible in the editor, when the final content is passed through 'the_content' filter,
Hmm, I'd not expect that to be the case, and could not reproduce it in my own environment, since the behavior which adds
It's been a while since I encountered this issue (kind of gave up on Gutenberg), but it may have been while switching between types of blocks.
On a client project, I recently ran into this issue myself. For this particular client project, there is an external frontend renderer that uses the raw post content, queried off the REST API. These stray
@tfrommen: could you provide debugging data? For instance, the full console log, per #13497 (comment), as well as clipboard data. Ideally, the way to make this actionable would be to write a test that covers this expectation. See https://github.com/WordPress/gutenberg/tree/master/test/integration/fixtures |
Hey @mcsf - this is what I got: Received HTML:
See the newline after "so that we"? This is the Word file I copied the text from:
Later this week, I will try to add some tests, and quite possibly a fix for this. 👍
@mcsf @aduth @danicholls @mattbolt - I just created a PR for this. I needed to fix some (false) expectations in the fixtures first, then target MS Word content and have Gutenberg handle it correctly. It'd be awesome if you could have a go with this. 👍
Describe the bug
An unfortunate consequence of how Word copies content to the clipboard is that it auto-wraps text at the 80-character column. When the content is then pasted into the editor, those line breaks are included in the imported content. While not visible in the editor, when the final content is passed through the "the_content" filter, <br /> tags are added, causing unintended breaks throughout the content on the rendered page.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
When pasting from Microsoft Word, the editor should ignore \n characters. As you can see in the screenshots below, when a user explicitly provides a soft return in Word, Word includes it as a <br /> tag. Hence ignoring all \n should not present an issue.
Screenshots
Microsoft Word Document
Excerpt of the copied content in the clipboard (notice the explicit <br /> in the first paragraph)
Content appearing fine in the editor
Excerpt of content as it appears in code editor (note the line breaks within the paragraph)
Excerpt of content rendering in post preview (notice the unwanted <br /> in effect)
Desktop Specs
Additional context