Skip to content

Remove inefficient/failing base64 check and add Buffer support to Parse.File() #1532

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
4 tasks done
FransGH opened this issue Aug 18, 2022 · 6 comments
Closed
4 tasks done
Labels
type:bug Impaired feature or lacking behavior that is likely assumed

Comments

@FransGH
Copy link

FransGH commented Aug 18, 2022

New Issue Checklist

Issue Description

new Parse.File()fails with large base64 string caused by complex regex test.

Steps to reproduce

Review the source code of ParseFile.js 3.3.4, line 210.

In 3.3.4, this code is executed in Parse.File() to verify the base64 encoded file data:

        // Check if data URI or base64 string is valid
        const validationRegex = new RegExp(base64Regex.source + '|' + dataUriRegex.source, 'i');

And here the regex:

const base64Regex = new RegExp('([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))', 'i');
const dataUriRegex = new RegExp(`^data:([a-zA-Z]+\\/[-a-zA-Z0-9+.]+(;[a-z-]+=[a-zA-Z0-9+.-]+)?)?(;base64)?,(${base64Regex.source})*$`, 'i');

The passed base64 is created from Uint8Array.toString("base64");

The problem has meanwhile been worked around by passing a byte array. This introduces memory overhead as a new byte array has to be created with Array.from(Uint8Array) as an UInt8Array is not accepted by Parse.File. Parse.File than uses its own base64 encoding algorithm to create a base64 string instead of the Internal javascript runtime version => Uint8Array.toString("base64");

In short:

  • Parse.File does a highly inefficient, unnecessary and failing base64 check, breaking core functionality
  • Parse.File does not accept a JS buffer making it necessary to convert it to a plain byte array causing memory overhead
  • base64 encoding in Parse.File does not use the JS runtime function but it's own (much less efficient) implementation

The base64 check should simply be removed. Worse case the base64 data is invalid and either:

  • save/upload fails (?)
  • the storage adapter fails (?)
  • a client reading it fails (most definately)

The 1st two can be caught by the client-app while uploading.
The last one can also happen with pure binary data and is the client-app responsibility to catch, not the Parse SDK.

If I find some time I'm happy to create a PR with native Buffer support, something along the line of:

if(Buffer.isBuffer(data) {
        this._data = Uint8Array.from(data).toString("base64");
        this._source = {
          format: 'base64',
          base64: this._data,
          type: specifiedType
        };
}

Actual Outcome

With a ~2MB base64 string (jpeg image) it took around 6 minutes (!!!) with 100% cpu (=full single core) until the regex failed. The base64 was however correct as any image saved with Parse JS-SDK 3.3.0 and earlier could be downloaded/viewed without error.

Expected Outcome

Seamless passing of base64 string, leaving it to the client to make sure the base64 is valid.
Alternative, a much more efficient base64 check than the current complex, failing regex and/or Buffer support.

Environment

MaxOS, node 14
Linux Node 12

Server
n/a, client issue (Parse-SDK-JS)

Database
n/a, client issue (Parse-SDK-JS)

Client

  • Parse JS SDK version: 3.3.4

Logs

n/a (Exception with "Cannot create a Parse.File without valid data URIs or base64 encoded data.")

@parse-github-assistant
Copy link

parse-github-assistant bot commented Aug 18, 2022

Thanks for opening this issue!

  • 🚀 You can help us to fix this issue faster by opening a pull request with a failing test. See our Contribution Guide for how to make a pull request, or read our New Contributor's Guide if this is your first time contributing.

@FransGH FransGH changed the title Seemingly unnecessary base64 check in Parse.File() Remove inefficient/failing base64 check and add Buffer support to Parse.File() Aug 18, 2022
@mtrezza mtrezza added the type:bug Impaired feature or lacking behavior that is likely assumed label Aug 18, 2022
@mtrezza
Copy link
Member

mtrezza commented Aug 18, 2022

I think you are right, I'm unsure whether a client-side base64 check provides much value. If this is related to a security aspect it would need to be done server-side anyway. It seems to me that the validation client-side is just a simple check for the developer, because even in case of a user initiated file upload, what kind of file would a user have to pick for a base64 validation to fail?

Did you take a look at other Parse SDKs to see what kind of validation they run - if any?

You could split this issue into 2: remove the base64 validation (if we agree on doing that) and add buffer support.

@FransGH
Copy link
Author

FransGH commented Aug 19, 2022

Checked php, swift and objectiveC, these all do binary upload. JS-SDK always does base64 uploads (assume for historic reasons, supporting a broad range of browsers).

Not sure how to best move forward. Think #1467 just needs to be rolled back as this basically broke things and doesn't add any significant value to a robust interface.

The root cause of related #1466 is not that Parse.File misinterpreted the data (wrong regex) but that the client app passed non-base64 data (an RFC2397 string) into the base64 data parameter. Adding a new complex/expensive/failing check to catch this specific error just makes things worse. The Parse-JS-SDK doc clearly states "an Object like { base64: "..." } with a base64-encoded String".

Seems that at some point a minimum support for RFC2397 strings was added but instead of creating a new Object type like { rfc2397: "..." }, it was hacked into the base64 object. And now it ending up with hack over hack...

I leave the suggestion for Buffer support for the moment for what it is. Will keep it in mind and maybe create a PR at some point.

@mtrezza
Copy link
Member

mtrezza commented Aug 19, 2022

Think #1467 just needs to be rolled back as this basically broke things and doesn't add any significant value to a robust interface.

@RazvanCristian What do you think? I'm unsure if we should for example add an option to disable any regex validation for the passed string, or if that would just be an unnecessary complication. Or maybe just remove the validation all together.

What matters most is probably what the server expects and how it handles (validates) the file data transmitted. It seems that there has been no change there in regards to the base64 validation, so any JS SDK validation seems to be just a check for the developer whether they correctly created a base64 string?

@mtrezza
Copy link
Member

mtrezza commented Sep 2, 2022

@FransGH Do you want to open a PR that completely removes the base64 validation?

I think your suggestion makes sense. I would also say that since it's a client-side validation, it doesn't seem to be a validation for security reasons, as such a validation would only make sense server-side. I'll wait a few days with the beta release so we can settle this issue.

cc @parse-community/js-sdk please comment if you see any concerns regarding removing validation.

@mtrezza
Copy link
Member

mtrezza commented Sep 12, 2022

Closing via #1543

@mtrezza mtrezza closed this as completed Sep 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:bug Impaired feature or lacking behavior that is likely assumed
Projects
None yet
Development

No branches or pull requests

2 participants