
perf(fromArrayBuffer): use less memory for large buffers #242

Merged (1 commit, Oct 16, 2021)

Conversation

raphinesse (Contributor)

Platforms affected

All

Motivation and Context

This improves performance when converting large files to Base64 strings (see apache/cordova-plugin-file#364 for more info, especially the extensive performance analysis by @LightMind).

Another win is that we reduce the code size considerably. A downside is that we lose a bit of performance for small inputs, but I do not think that will be noticeable.

I explored different approaches, like converting the original algorithm to use ArrayBuffers. The problem with that is we still need to convert the end result to a string, which becomes a performance bottleneck if you do not have support for TextDecoder (which we cannot assume with our current ES5 target).

Fixes #241

Description

base64.fromArrayBuffer now uses btoa to convert bytes to a base64 encoded string. We already use its counterpart atob in base64.toArrayBuffer. Since btoa unfortunately operates on binary strings instead of buffers, we first need to convert the raw bytes to a binary string. This is the main performance bottleneck here, but applying String.fromCharCode to large chunks of data works reasonably well.
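A minimal sketch of the approach described above, assuming a chunked conversion; the chunk size and helper name are illustrative, not necessarily those in the actual patch:

```javascript
// Convert raw bytes to a binary string in chunks, then let btoa
// produce the Base64 string. CHUNK_SIZE is an illustrative assumption.
const CHUNK_SIZE = 0x8000; // 32K bytes per String.fromCharCode call,
                           // staying well below argument-count limits

function bytesToBinaryString (bytes) {
    let binaryString = '';
    for (let i = 0; i < bytes.length; i += CHUNK_SIZE) {
        // Spread one chunk of bytes as char codes onto the result string
        binaryString += String.fromCharCode.apply(null, bytes.subarray(i, i + CHUNK_SIZE));
    }
    return binaryString;
}

function fromArrayBuffer (arrayBuffer) {
    return btoa(bytesToBinaryString(new Uint8Array(arrayBuffer)));
}
```

Chunking matters because calling `String.fromCharCode.apply` with millions of arguments overflows the call stack; per-chunk concatenation trades a little speed on small inputs for bounded stack usage.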

Testing

I added a test that should hopefully prevent people from making stupid changes in the future. However, its reliance on an absolute expected runtime might lead to problems, since the time taken depends on the machine running the suite.
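One possible shape for such a runtime-bounded guard; the helper name and budget here are hypothetical, not the actual test:

```javascript
// Hypothetical sketch: encode a large buffer and fail if encoding
// exceeds an absolute time budget.
function assertEncodesWithinBudget (fromArrayBuffer, byteLength, budgetMillis) {
    const buf = new Uint8Array(byteLength).buffer;
    const start = Date.now();
    fromArrayBuffer(buf);
    const elapsed = Date.now() - start;
    if (elapsed > budgetMillis) {
        throw new Error('encoding ' + byteLength + ' bytes took ' + elapsed +
            ' ms (budget: ' + budgetMillis + ' ms)');
    }
}
```

This shape also illustrates the flakiness concern: the same budget that passes comfortably on a fast CI runner may fail on a slow one.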

I also did some performance comparisons (ops/s) between the new and old version in my local Chrome browser:

| bytes | old (ops/s) | new (ops/s) | new/old |
| ----- | ----------- | ----------- | ------- |
| 1K    | 61409       | 33029       | 54%     |
| 32K   | 1569        | 913         | 58%     |
| 1M    | 11          | 24          | 218%    |
| 32M   | 0.3         | 0.9         | 300%    |
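An ops/s comparison like the one above can be measured with a simple throughput loop along these lines (a rough sketch; the harness actually used is not shown in this thread):

```javascript
// Run fn(arg) repeatedly for at least minMillis and report
// iterations per second. Names are illustrative.
function opsPerSecond (fn, arg, minMillis) {
    const start = Date.now();
    let ops = 0;
    do {
        fn(arg);
        ops++;
    } while (Date.now() - start < minMillis);
    return ops / ((Date.now() - start) / 1000);
}

// e.g. opsPerSecond(base64.fromArrayBuffer, new Uint8Array(1 << 20).buffer, 1000)
```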

@codecov-commenter commented Oct 16, 2021

Codecov Report

Merging #242 (691ccc1) into master (af904b7) will decrease coverage by 0.57%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #242      +/-   ##
==========================================
- Coverage   84.23%   83.65%   -0.58%     
==========================================
  Files          14       14              
  Lines         539      520      -19     
==========================================
- Hits          454      435      -19     
  Misses         85       85              
Impacted Files          Coverage Δ
src/common/base64.js    100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@erisu (Member) left a comment

Looks OK to me

@raphinesse (Contributor, Author)

For future reference: replacing bytesToBinaryString(array) with new TextDecoder('utf-16').decode(Uint16Array.from(array)) doubles the throughput for large files in my experiments. In ten years we might be able to use it 😉
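For illustration, the suggested alternative could look like this (assumes TextDecoder is available, which the ES5 target mentioned above rules out):

```javascript
// Widen each byte to a 16-bit code unit and let TextDecoder turn the
// whole buffer into a string in one native call. This works because
// every byte value (0-255) is a valid UTF-16LE code unit, so each byte
// maps to the same character String.fromCharCode would produce.
function bytesToBinaryStringViaTextDecoder (bytes) {
    return new TextDecoder('utf-16').decode(Uint16Array.from(bytes));
}
```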

@ath0mas commented Jun 14, 2023

Hello, would now be a good time to switch to new TextDecoder('utf-16').decode(Uint16Array.from(array)), as you suggested?

Shipping it in the next major release seems right given:

@ath0mas mentioned this pull request on Jun 14, 2023
Development

Successfully merging this pull request may close these issues.

base64.fromArrayBuffer requires too much memory and can crash apps
4 participants