Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Send data URI as base64 #22

Merged
merged 1 commit into from
Jul 24, 2019
Merged

Send data URI as base64 #22

merged 1 commit into from
Jul 24, 2019

Conversation

mrfrase3
Copy link
Contributor

this prevents invalid URI characters from messing with page rendering, as seen in #21

this prevents invalid URI characters from messing with page rendering, as seen in westmonroe#21
@cr7boulos cr7boulos self-requested a review July 23, 2019 03:36
@@ -19,7 +19,7 @@ let convertHTMLToPDF = async (html, callback, options = null, puppeteerArgs=null
}

if (remoteContent === true) {
await page.goto(`data:text/html,${html}`, {
await page.goto(`data:text/html;base64,${Buffer.from(html).toString('base64')}`, {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrfrase3 This would be great functionality to include in pdf-puppetteer. Two comments on the PR. First, base64-encoding should be an opt-in feature so that users can explicitly determine how their html should be intepreted. Base64-encoded strings can get rather long and some users may not be willing to accept that tradeoff. Secondly, adding a unit test to this feature would be valuable to ensure that these two methods for creating a pdf return the same results. I wrote up a sample unit test below. Feel free to tweak if you have another way to test this functionality and let me know if you have any comments or questions regarding my feedback.

Thanks

  1. Updated function signature:
    convertHTMLToPDF = async (html, callback, puppeteerOptions = null, puppeteerArgs=null, remoteContent=true, options = {})

then in the body, something like this could be done:

if (remoteContent === true) {
        if(options.base64Encode){
            console.log('base 64 encoded');
            await page.goto(`data:text/html;base64,${Buffer.from(html).toString('base64')}`, {
                waitUntil: 'networkidle0'
            });
        }
        else {
            console.log('regular');
            await page.goto(`data:text/html,${html}`);
        }
        
    }
  1. Sample unit test:
test('Converts remote Base64 encoded HTML To PDF with Puppeteer Arguments', async done => {
        let base64 = null;
        let reg = null;
        let options = { 'base64Encode': true };
        let t1 = convertHTMLToPDF(remoteTemplate, pdf => {
            expect(Object.prototype.toString.call(pdf)).toBe('[object Uint8Array]');
            base64 = pdf;
        }, {format: 'A4'}, {
            args: [
                '--no-sandbox',
                '--disable-setuid-sandbox'
            ]
        }, true, options);

        let t2 =  convertHTMLToPDF(remoteTemplate, pdf => {
            expect(Object.prototype.toString.call(pdf)).toBe('[object Uint8Array]');
            reg = pdf;
        }, {format: 'A4'}, {
            args: [
                '--no-sandbox',
                '--disable-setuid-sandbox'
            ]
        }, true. options);

        await Promise.all([t1, t2]);

        expect(base64.length).toBe(reg.length);

        let result = true;

        for(let i = 0; i < base64.length; i++){
            if(base64[i] !== reg[i]){
                result = false;
            }
        }

        expect(result).toBe(true);

        done();
    });

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @cr7boulos,

When you say base64 is long, do you mean time-wise or storage space-wise? Storage space is effectively irrelevant for sending to a local process, and the increase in size is ~1.37x the original.

The # error is an invisible problem, this took me hours of debugging to track down. At the end of the day, HTML can contain illegal characters for a URI, so your default should always operate on the assumption that it does contain those characters. One could implement something that scans and sends a warning, but that would be costly. If someone wants to take on the risk for a speed up, then they should knowingly enable that.

I'm pushed on a deadline at the moment, so I can't devote more time to this PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrfrase3,

Sounds good. I am a new maintainer of the package so still learning how the code works. Please disregard my previous comments. I will approve and push out an updated version to npm. Thanks for submitting.

Copy link
Collaborator

@cr7boulos cr7boulos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see the comment on the PR.

@@ -19,7 +19,7 @@ let convertHTMLToPDF = async (html, callback, options = null, puppeteerArgs=null
}

if (remoteContent === true) {
await page.goto(`data:text/html,${html}`, {
await page.goto(`data:text/html;base64,${Buffer.from(html).toString('base64')}`, {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrfrase3,

Sounds good. I am a new maintainer of the package so still learning how the code works. Please disregard my previous comments. I will approve and push out an updated version to npm. Thanks for submitting.

@cr7boulos cr7boulos merged commit e83e0cd into westmonroe:master Jul 24, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants