-
Notifications
You must be signed in to change notification settings - Fork 166
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TextDecoder().decode() pollyfill performance issue #316
Comments
Good analysis, thanks! How about this: implement our own TextDecoder polyfill using the C functions you created under the hood. Did the original one support streaming? I didn't think it did... |
The original and UTF-8-only pollyfills do support streaming. I'm relying on it right now to decode while reading single bytes from raw mode stdin. I think to correctly handle streaming, ignoreBOM, and fatal, it'd have to go deeper and use QuickJS's |
Oh I see. So another approach is to go the patch-package route. We can patch a Node module at install time (see the patches directory) and either fix the current dep or revert back and slash other encodings? |
I never noticed the patches directory. That's perfect. I did start working on implementing the TextDecoder API in C, only using To do it right, I think I'd need to expose |
The UTF-8-only encoder/decoder was performing poorly due to inefficient array manipulation. Applied changes from the original text-encoding package that reduce the amount of copying. Relates to saghul#316
The UTF-8-only encoder/decoder was performing poorly due to inefficient array manipulation. Applied changes from the original text-encoding package that reduce the amount of copying. Relates to #316
Fixed in #321 |
Since changing from the
text-encoding
pollyfill totext-encoding-utf-8
(in #311), the performance of text decoding has dropped significantly.In my testing, it was taking over 3 seconds to read a 175kb file and decode it into a JS string. It does depends on the size of the chunk you're giving to the text decoder.
Part of the problem is that the decoding routine in
text-encoding-utf-8
examines each byte in the input buffer, it shifts the byte from the start (see the Stream class here) .The original
text-encoding
does this better (see the Stream class here) by reversing the input buffer and operating on the end of the buffer rather than the start.I've applied the reversing changes to
text-encoding-utf-8
(in samuelbrian@99b895c) and it's seems on-par with the original version. But I'm not sure the best way to manage patching the node_modules dependencies, short of forking and maintaining a copy oftext-encoding-utf-8
.Even better is using text encoding and decoding from QuickJS which I have done here samuelbrian@f308a6d, and that makes a file read that was taking >3 seconds into a <1 millisecond operation.
I've exposed these as tjs functions rather than implement them as TextEncoder/TextDecoder and remove the pollyfill, mainly because I didn't want to have to deal with the streaming option. But it might be viable to do.
The text was updated successfully, but these errors were encountered: