Send data URI as base64 #22

mrfrase3 · 2019-07-16T07:03:14Z

this prevents invalid URI characters from messing with page rendering, as seen in #21

this prevents invalid URI characters from messing with page rendering, as seen in westmonroe#21

cr7boulos · 2019-07-23T03:50:23Z

src/convertHTMLToPDF.js

@@ -19,7 +19,7 @@ let convertHTMLToPDF = async (html, callback, options = null, puppeteerArgs=null
    }

    if (remoteContent === true) {
-        await page.goto(`data:text/html,${html}`, {
+        await page.goto(`data:text/html;base64,${Buffer.from(html).toString('base64')}`, {


@mrfrase3 This would be great functionality to include in pdf-puppetteer. Two comments on the PR. First, base64-encoding should be an opt-in feature so that users can explicitly determine how their html should be intepreted. Base64-encoded strings can get rather long and some users may not be willing to accept that tradeoff. Secondly, adding a unit test to this feature would be valuable to ensure that these two methods for creating a pdf return the same results. I wrote up a sample unit test below. Feel free to tweak if you have another way to test this functionality and let me know if you have any comments or questions regarding my feedback.

Thanks

Updated function signature:
convertHTMLToPDF = async (html, callback, puppeteerOptions = null, puppeteerArgs=null, remoteContent=true, options = {})

then in the body, something like this could be done:

if (remoteContent === true) { if(options.base64Encode){ console.log('base 64 encoded'); await page.goto(`data:text/html;base64,${Buffer.from(html).toString('base64')}`, { waitUntil: 'networkidle0' }); } else { console.log('regular'); await page.goto(`data:text/html,${html}`); } }

Sample unit test:

test('Converts remote Base64 encoded HTML To PDF with Puppeteer Arguments', async done => { let base64 = null; let reg = null; let options = { 'base64Encode': true }; let t1 = convertHTMLToPDF(remoteTemplate, pdf => { expect(Object.prototype.toString.call(pdf)).toBe('[object Uint8Array]'); base64 = pdf; }, {format: 'A4'}, { args: [ '--no-sandbox', '--disable-setuid-sandbox' ] }, true, options); let t2 = convertHTMLToPDF(remoteTemplate, pdf => { expect(Object.prototype.toString.call(pdf)).toBe('[object Uint8Array]'); reg = pdf; }, {format: 'A4'}, { args: [ '--no-sandbox', '--disable-setuid-sandbox' ] }, true. options); await Promise.all([t1, t2]); expect(base64.length).toBe(reg.length); let result = true; for(let i = 0; i < base64.length; i++){ if(base64[i] !== reg[i]){ result = false; } } expect(result).toBe(true); done(); });

Hey @cr7boulos,

When you say base64 is long, do you mean time-wise or storage space-wise? Storage space is effectively irrelevant for sending to a local process, and the increase in size is ~1.37x the original.

The # error is an invisible problem, this took me hours of debugging to track down. At the end of the day, HTML can contain illegal characters for a URI, so your default should always operate on the assumption that it does contain those characters. One could implement something that scans and sends a warning, but that would be costly. If someone wants to take on the risk for a speed up, then they should knowingly enable that.

I'm pushed on a deadline at the moment, so I can't devote more time to this PR.

@mrfrase3,

Sounds good. I am a new maintainer of the package so still learning how the code works. Please disregard my previous comments. I will approve and push out an updated version to npm. Thanks for submitting.

cr7boulos

Please see the comment on the PR.

cr7boulos · 2019-07-24T00:19:42Z

src/convertHTMLToPDF.js

@@ -19,7 +19,7 @@ let convertHTMLToPDF = async (html, callback, options = null, puppeteerArgs=null
    }

    if (remoteContent === true) {
-        await page.goto(`data:text/html,${html}`, {
+        await page.goto(`data:text/html;base64,${Buffer.from(html).toString('base64')}`, {


@mrfrase3,

Sounds good. I am a new maintainer of the package so still learning how the code works. Please disregard my previous comments. I will approve and push out an updated version to npm. Thanks for submitting.

Send data URI as base64

32e3270

this prevents invalid URI characters from messing with page rendering, as seen in westmonroe#21

cr7boulos self-requested a review July 23, 2019 03:36

cr7boulos reviewed Jul 23, 2019

View reviewed changes

cr7boulos requested changes Jul 23, 2019

View reviewed changes

cr7boulos approved these changes Jul 24, 2019

View reviewed changes

cr7boulos merged commit e83e0cd into westmonroe:master Jul 24, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Send data URI as base64 #22

Send data URI as base64 #22

mrfrase3 commented Jul 16, 2019

cr7boulos Jul 23, 2019

mrfrase3 Jul 23, 2019

cr7boulos Jul 24, 2019

cr7boulos left a comment

cr7boulos Jul 24, 2019

Send data URI as base64 #22

Send data URI as base64 #22

Conversation

mrfrase3 commented Jul 16, 2019

cr7boulos Jul 23, 2019

Choose a reason for hiding this comment

mrfrase3 Jul 23, 2019

Choose a reason for hiding this comment

cr7boulos Jul 24, 2019

Choose a reason for hiding this comment

cr7boulos left a comment

Choose a reason for hiding this comment

cr7boulos Jul 24, 2019

Choose a reason for hiding this comment