Cannot read and immediately write back out Word file with invalid XML entities in content #720

Open
tjarrett opened this issue Feb 2, 2016 · 1 comment

Comments

@tjarrett
Contributor

tjarrett commented Feb 2, 2016

If you load a Word 2007 file that has invalid XML characters in its content (for example, xml-characters-in.docx) and then immediately save it back out, like this:

// Read the document, then write it straight back out without modifying the content.
$test = \PhpOffice\PhpWord\IOFactory::load('xml-characters-in.docx');
$test->save('xml-characters-out.docx');

You get a corrupted Word file back (see xml-characters-out.docx).

This is probably related to #671, #401, #514, and other similar issues. However, I don't have a chance to scrub the content because I am reading and then immediately trying to save the result.

Maybe the Reader should scrub the incoming $textContent?
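To make the suggestion concrete, here is a minimal sketch of what "scrubbing" the text could look like. The function name `scrubForXml` is hypothetical and not part of PHPWord, and it assumes the problem is a mix of markup characters (`&`, `<`, `>`) and characters that XML 1.0 does not allow at all. Where the Reader would call it is left open.

```php
<?php
/**
 * Hypothetical helper (not part of PHPWord): make a string safe to embed
 * as text inside an XML 1.0 document.
 */
function scrubForXml(string $text): string
{
    // Remove code points that XML 1.0 forbids, such as most C0 control
    // characters. Tab, line feed and carriage return are allowed and kept.
    $clean = preg_replace(
        '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u',
        '',
        $text
    );

    // preg_replace() returns null if the input is not valid UTF-8.
    if ($clean === null) {
        $clean = '';
    }

    // Escape the characters that have special meaning in XML markup.
    return htmlspecialchars($clean, ENT_XML1, 'UTF-8');
}

// Example: the ampersand and angle bracket are escaped, the vertical tab is dropped.
echo scrubForXml("Tom & Jerry <3\x0B"), PHP_EOL;
```

If the Writer also escapes its output, text scrubbed this way at read time would be escaped twice, so escaping in only one place is probably the safer design.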



@EasySoftwarePoland

EasySoftwarePoland commented Nov 8, 2016

Hello, any news on this? I've got the same problem. A simple (but not complete) solution was to change line 241 of src/PhpWord/Reader/Word2007/AbstractPart.php to:
$parent->addText(htmlspecialchars($textContent), $fontStyle, $paragraphStyle);
(added htmlspecialchars)
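As a rough illustration of what this workaround does and does not cover, the sketch below applies the same `htmlspecialchars` call to a sample string. The wrapper name `escapeLikeWorkaround` and the sample text are made up for this example. That leftover control characters are the reason the fix is "not full" is an assumption, since the comment does not say.

```php
<?php
/**
 * Hypothetical wrapper around the same call used in the workaround above.
 */
function escapeLikeWorkaround(string $textContent): string
{
    // Converts &, < and > (and double quotes) to entities.
    return htmlspecialchars($textContent);
}

// Sample text with markup characters plus a vertical tab (0x0B),
// which is not a legal character in XML 1.0.
$escaped = escapeLikeWorkaround("R&D <draft>\x0B");

// The markup characters are now entities...
var_dump(strpos($escaped, '&amp;') !== false);

// ...but the control character passes through unchanged.
var_dump(strpos($escaped, "\x0B") !== false);
```

So the call handles text such as `R&D` or `<draft>`, while content containing characters that XML forbids outright would still need to be stripped separately.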
