Latest version (2.5.1) of sanitize-html cuts extra word #542
Comments
Maybe, for the new version, we have to use it slightly differently to achieve the same result. Or maybe it's just a bug in their lib.
It's probably a symptom of a larger issue that we were lucky to catch in tests. Good to have a look so that recipients don't see truncated messages.
@tomholub I will look into it, but please note that for now we continue to use a previous version, so the problem does not show up.
Understood, good to know. It would be good to find out why this is happening (and find a way to fix it, either in the lib or ourselves), because this is the kind of library that we should keep up to date: newer versions should have more patches against exploits.
Interesting: the word is missing only in the plain text extracted from the HTML.
const { contentBlock, text } = fmtContentBlock(msgContentBlocks);
console.log(`>>>> contentBlock:\n${JSON.stringify(contentBlock)}\n\ntext:\n${text}`);
prints:
HTML part above contains:
This one:
text = dereq_html_sanitize(text, {
  allowedTags: ['img', 'span'],
  allowedAttributes: { img: ['src'] },
  allowedSchemes: Xss.ALLOWED_SCHEMES,
  transformTags: {
    // replace each image with a textual placeholder
    'img': (tagName, attribs) => {
      return { tagName: 'span', attribs: {}, text: `[image: ${attribs.alt || attribs.title || 'no name'}]` };
    },
  }
});
removes "Above" from this input:
Below
<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAABHNCSVQICAgIfAhkiAAAAMFJREFUOE+lU9sRg0AIZDNpym9rSAumJm0hNfidsgic5w1wGJ1kZ3zgwvI4AQtIAHrq4zKY5uJ715sGP7C44BdPnZj1gaRVERBPpYJfUSpoGLeyir2Glg64mxMQg9f6xQbU94zrBDBWgVCBBmecbyGWbcrLgpX+OkR+L4ShPw3bdtdCnMmZfSig2a+gtcD1R0LyA1mh6OdmsJNnmW0Sfwp75LYevQ5AsUI3g0aKI+llEe3KQbcx28SsnZi9LNO/6/wBmhVJ7HDmOd4AAAAASUVORK5CYII=" alt="image.png" />
Above
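One way to guard against this kind of regression when upgrading the library is a test that checks that every visible word of the input still appears in the sanitized plain text. The sketch below is only an illustration, not code from this project: `droppedWords`, `visibleWords`, and both stand-in sanitizers are hypothetical names, and the base64 payload is shortened to a placeholder. In a real test, the `dereq_html_sanitize` call shown above would presumably be wrapped and passed in place of the stand-ins.

```typescript
// Hypothetical helper: lists words that are visible in the HTML input
// but absent from the plain text produced by a sanitizer.
type SanitizeFn = (html: string) => string;

function visibleWords(html: string): string[] {
  // Crude tag stripping, good enough for a small fixture; not a real HTML parser.
  const textOnly = html.replace(/<[^>]*>/g, ' ');
  return textOnly.split(/\s+/).filter((w) => w.length > 0);
}

function droppedWords(html: string, sanitize: SanitizeFn): string[] {
  const output = sanitize(html);
  return visibleWords(html).filter((word) => output.indexOf(word) === -1);
}

// Stand-in sanitizers for illustration only (assumptions, not the library):
// one keeps the surrounding text, the other mimics the reported symptom
// by losing everything after the image.
const keepsText: SanitizeFn = (html) =>
  html.replace(/<img[^>]*>/g, '[image: image.png]');
const losesTrailingText: SanitizeFn = (html) =>
  html.replace(/<img[^>]*>[\s\S]*$/, '[image: image.png]');

// Same shape as the failing input, with a placeholder base64 payload.
const sample = 'Below\n<img src="data:image/png;base64,AAAA" alt="image.png" />\nAbove';

console.log(droppedWords(sample, keepsText));         // → []
console.log(droppedWords(sample, losesTrailingText)); // → ['Above']
```

Passing the sanitizer in as a function keeps the check independent of the library version, so the same assertion can be run before and after an upgrade.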
@tomholub I think it's a bug in the
Sounds like it. Please give them a code sample they can run.
@tomholub Should I try to understand their sanitizer logic and fix it myself (which will take time), or should we wait for their response?
Let's wait a few weeks first. It may really be difficult.
Un-assigning while we're waiting.
There is a new version, 2.6.1, that solves the issue. I will try to integrate it.
* issue #542 Upgrade to sanitize-html 2.5.2
* wip
* wip
* wip
* wip
* wip
* wip
* wip
* wip
* wip
* wip
* wip
* wip
* recreate package-lock.json
* Recreate package-lock.json
* Upgrade to santize-html 2.6.1
* remove .only
* comment out debug stuff
* Update prod bundle
* remove debug stuff
* package order
* remove extra empty line
* remove debugPrintArray
* remove extra newline
* Update prod bundle
* #542 update fix-bundles.js
* update semaphore node version
* update fix-bundles.js
Co-authored-by: Ivan Pizhenko <Ivan.Pizhenko@users.noreply.github.com>
Co-authored-by: Roma Sosnovsky <roma.sosnovsky@gmail.com>
UPDATE: version 2.5.2 is already out, but the problem still happens.
Latest version (2.5.1) of sanitize-html cuts extra word:
Above
is cut away.
So I am going to revert and restore the fix.
It can be a separate issue to upgrade it and fix this.
Originally posted by @IvanPizhenko in #541 (comment)