Docx reader: smartTags omitted #3412
The problem is probably the
mb21 changed the title from "Auto-capitalised words in .docx omitted on conversion" to "Docx reader: smartTags omitted" on Feb 3, 2017
ah, this is a duplicate of #2242 then...
Thank you very much for fixing this, @jgm! Random letters in my docx (beyond my control) are in "SmartTag" tags (for no discernible reason), and their disappearance was driving me crazy.
Deep apologies if this is known - I did search but it's tricky to search for.
I was converting a .docx to markdown (tried other formats after finding the bug and they all seem affected) and on conversion, certain words, all proper nouns, were being omitted.
I can't be sure, since I don't have access to MS Word, but I suspect that Word is auto-capitalising these proper nouns and then marking them in some way, perhaps with a hidden character, and that pandoc is reading the result as garbage.
If I use Google Docs as a filter, i.e. upload the .docx to Google Docs and then download it again (still as .docx), the problem is resolved.
I will attach the afflicted .docx file. It's under a CC-ND-SA license.
An example passage where the bug occurs is:
"but it’s worth remembering that King James, to whom Bacon dedicated the book—and who was at the time one of the finest scholars in Europe—was completely baffled"
On conversion, the word "Europe" is omitted.
sh.docx
Thanks!
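For anyone wanting to check whether a given .docx is affected, here is a minimal Python sketch that lists the text wrapped in `w:smartTag` elements. It is not part of pandoc; the function name `smart_tag_texts` is made up for illustration, and it assumes the main body lives in `word/document.xml` under the standard WordprocessingML namespace.

```python
import zipfile
import xml.etree.ElementTree as ET

# Standard WordprocessingML namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def smart_tag_texts(docx):
    """Return the text inside each w:smartTag element of a .docx.

    `docx` may be a file path or a binary file-like object.
    Hypothetical helper for diagnosis only, not pandoc code.
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    texts = []
    for tag in root.iter(f"{{{W_NS}}}smartTag"):
        # Join every w:t text node nested under this smart tag.
        # Nested smart tags are reported once per level.
        texts.append("".join(t.text or "" for t in tag.iter(f"{{{W_NS}}}t")))
    return texts
```

Calling `smart_tag_texts("sh.docx")` on the attached file should show which words sit inside smart tags; if the omitted proper nouns appear in that list, that would be consistent with the diagnosis in the issue title.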