
Conversation

wlgh1553
Contributor

Replace state.buffer.splice(0, idx) with state.buffer = ArrayPrototypeSlice(state.buffer, idx) when compacting the Readable buffer.

Rationale:

Benchmark (local):

                                      confidence improvement accuracy (*)   (**)  (***)
streams/readable-bigread.js n=1000           *      0.77 %       ±0.73% ±0.97% ±1.26%
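The two compaction strategies under discussion can be sketched in plain JavaScript. This is an illustrative sketch only: the PR itself uses the primordial ArrayPrototypeSlice inside lib/internal/streams/readable.js, and the helper names below are hypothetical.

```javascript
// Current approach: splice shifts the remaining elements down in place,
// mutating the original array.
function compactWithSplice(buffer, idx) {
  buffer.splice(0, idx);
  return buffer;
}

// Proposed approach: slice allocates a new array holding only the
// remaining elements; the old array becomes garbage.
function compactWithSlice(buffer, idx) {
  return buffer.slice(idx);
}
```

Both produce the same logical result; the question debated below is which one is cheaper for V8 in practice.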

@nodejs-github-bot
Collaborator

Review requested:

  • @nodejs/streams

@nodejs-github-bot nodejs-github-bot added needs-ci PRs that need a full CI run. stream Issues and PRs related to the stream subsystem. labels Aug 29, 2025

codecov bot commented Aug 29, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.89%. Comparing base (95bef5a) to head (384d35e).
⚠️ Report is 23 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff            @@
##             main   #59676    +/-   ##
========================================
  Coverage   89.89%   89.89%            
========================================
  Files         667      667            
  Lines      196675   196852   +177     
  Branches    38612    38652    +40     
========================================
+ Hits       176796   176966   +170     
+ Misses      12332    12312    -20     
- Partials     7547     7574    +27     
Files with missing lines Coverage Δ
lib/internal/streams/readable.js 96.20% <100.00%> (+<0.01%) ⬆️

... and 45 files with indirect coverage changes


@daeyeon
Member

daeyeon commented Sep 2, 2025

Benchmark CI: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/1726/

11:15:07                                                                                          confidence improvement accuracy (*)    (**)   (***)
11:15:07 streams/compose.js n=1000                                                                                0.26 %       ±0.69%  ±0.91%  ±1.19%
11:15:07 streams/creation.js kind='duplex' n=50000000                                                            -0.60 %       ±1.59%  ±2.13%  ±2.80%
11:15:07 streams/creation.js kind='readable' n=50000000                                                           1.53 %       ±2.07%  ±2.78%  ±3.68%
11:15:07 streams/creation.js kind='transform' n=50000000                                                          0.68 %       ±0.82%  ±1.10%  ±1.45%
11:15:07 streams/creation.js kind='writable' n=50000000                                                           0.17 %       ±0.53%  ±0.70%  ±0.91%
11:15:07 streams/destroy.js kind='duplex' n=1000000                                                               0.22 %       ±0.35%  ±0.47%  ±0.62%
11:15:07 streams/destroy.js kind='readable' n=1000000                                                            -0.00 %       ±0.57%  ±0.76%  ±0.99%
11:15:07 streams/destroy.js kind='transform' n=1000000                                                           -0.17 %       ±0.51%  ±0.68%  ±0.88%
11:15:07 streams/destroy.js kind='writable' n=1000000                                                             0.08 %       ±0.56%  ±0.75%  ±0.98%
11:15:07 streams/pipe-object-mode.js n=5000000                                                                   -0.41 %       ±0.89%  ±1.19%  ±1.57%
11:15:07 streams/pipe.js n=5000000                                                                       ***     -2.28 %       ±0.39%  ±0.52%  ±0.68%
11:15:07 streams/readable-async-iterator.js sync='no' n=100000                                                   -0.72 %       ±1.52%  ±2.02%  ±2.63%
11:15:07 streams/readable-async-iterator.js sync='yes' n=100000                                                  -0.13 %       ±4.36%  ±5.80%  ±7.54%
11:15:07 streams/readable-bigread.js n=1000                                                                       0.36 %       ±2.97%  ±3.95%  ±5.16%
11:15:07 streams/readable-bigunevenread.js n=1000                                                        ***      3.41 %       ±0.66%  ±0.89%  ±1.16%
11:15:07 streams/readable-boundaryread.js type='buffer' n=2000                                           ***     -0.71 %       ±0.35%  ±0.47%  ±0.61%
11:15:07 streams/readable-boundaryread.js type='string' n=2000                                           ***     -4.09 %       ±1.24%  ±1.65%  ±2.15%
11:15:07 streams/readable-from.js type='array' n=10000000                                                 **     -1.82 %       ±1.13%  ±1.51%  ±1.97%
11:15:07 streams/readable-from.js type='async-generator' n=10000000                                               0.01 %       ±0.50%  ±0.66%  ±0.87%
11:15:07 streams/readable-from.js type='sync-generator-with-async-values' n=10000000                             -0.28 %       ±1.80%  ±2.40%  ±3.12%
11:15:07 streams/readable-from.js type='sync-generator-with-sync-values' n=10000000                      ***      5.80 %       ±2.21%  ±2.94%  ±3.85%
11:15:07 streams/readable-readall.js n=5000                                                                       1.28 %       ±2.26%  ±3.01%  ±3.92%
11:15:07 streams/readable-uint8array.js kind='encoding' n=1000000                                                -0.35 %       ±0.76%  ±1.01%  ±1.32%
11:15:07 streams/readable-uint8array.js kind='read' n=1000000                                                    -1.00 %       ±1.14%  ±1.53%  ±1.99%
11:15:07 streams/readable-unevenread.js n=1000                                                                    0.11 %       ±0.41%  ±0.55%  ±0.72%
11:15:07 streams/writable-manywrites.js len=1024 callback='no' writev='no' sync='no' n=100000                     0.70 %       ±4.52%  ±6.01%  ±7.82%
11:15:07 streams/writable-manywrites.js len=1024 callback='no' writev='no' sync='yes' n=100000                    6.39 %      ±10.34% ±13.76% ±17.93%
11:15:07 streams/writable-manywrites.js len=1024 callback='no' writev='yes' sync='no' n=100000                    3.19 %       ±6.21%  ±8.27% ±10.76%
11:15:07 streams/writable-manywrites.js len=1024 callback='no' writev='yes' sync='yes' n=100000                  -0.45 %      ±11.68% ±15.54% ±20.24%
11:15:07 streams/writable-manywrites.js len=1024 callback='yes' writev='no' sync='no' n=100000                   -0.59 %       ±3.02%  ±4.01%  ±5.23%
11:15:07 streams/writable-manywrites.js len=1024 callback='yes' writev='no' sync='yes' n=100000                   6.19 %       ±8.00% ±10.64% ±13.85%
11:15:07 streams/writable-manywrites.js len=1024 callback='yes' writev='yes' sync='no' n=100000                   3.78 %       ±5.77%  ±7.68% ±10.00%
11:15:07 streams/writable-manywrites.js len=1024 callback='yes' writev='yes' sync='yes' n=100000                  7.83 %      ±11.86% ±15.78% ±20.54%
11:15:07 streams/writable-manywrites.js len=32768 callback='no' writev='no' sync='no' n=100000                    3.88 %       ±5.70%  ±7.58%  ±9.87%
11:15:07 streams/writable-manywrites.js len=32768 callback='no' writev='no' sync='yes' n=100000                   4.24 %      ±11.26% ±14.99% ±19.51%
11:15:07 streams/writable-manywrites.js len=32768 callback='no' writev='yes' sync='no' n=100000                   0.87 %       ±5.49%  ±7.30%  ±9.50%
11:15:07 streams/writable-manywrites.js len=32768 callback='no' writev='yes' sync='yes' n=100000                  8.32 %      ±11.50% ±15.30% ±19.93%
11:15:07 streams/writable-manywrites.js len=32768 callback='yes' writev='no' sync='no' n=100000                   2.53 %       ±5.20%  ±6.92%  ±9.01%
11:15:07 streams/writable-manywrites.js len=32768 callback='yes' writev='no' sync='yes' n=100000                 -3.56 %       ±8.32% ±11.07% ±14.41%
11:15:07 streams/writable-manywrites.js len=32768 callback='yes' writev='yes' sync='no' n=100000                 -1.20 %       ±4.37%  ±5.82%  ±7.57%
11:15:07 streams/writable-manywrites.js len=32768 callback='yes' writev='yes' sync='yes' n=100000                -5.65 %       ±9.32% ±12.41% ±16.15%
11:15:07 streams/writable-uint8array.js kind='object-mode' n=50000000                                             0.01 %       ±0.14%  ±0.19%  ±0.25%
11:15:07 streams/writable-uint8array.js kind='write' n=50000000                                                  -0.26 %       ±0.33%  ±0.44%  ±0.58%
11:15:07 streams/writable-uint8array.js kind='writev' n=50000000                                                  0.20 %       ±0.33%  ±0.44%  ±0.57%
11:15:07 
11:15:07 Be aware that when doing many comparisons the risk of a false-positive
11:15:07 result increases. In this case, there are 44 comparisons, you can thus
11:15:07 expect the following amount of false-positive results:
11:15:07   2.20 false positives, when considering a   5% risk acceptance (*, **, ***),
11:15:07   0.44 false positives, when considering a   1% risk acceptance (**, ***),
11:15:07   0.04 false positives, when considering a 0.1% risk acceptance (***)
11:15:09 Finished: SUCCESS

Member

@ronag ronag left a comment


This will cause a copy and an allocation. I don't see how this is an improvement. Slicing in JavaScript is not like in Go: it copies the elements rather than just creating a view.

@wlgh1553
Contributor Author

wlgh1553 commented Sep 2, 2025

@ronag Thanks for the review! 😀

My primary intention was to avoid the O(N) element-shifting cost incurred by splice. As you correctly pointed out, slice introduces a new allocation. However, my reasoning was that the low-level implementation of slice would be more efficient than the element shifting required by splice.

This approach is also a proven optimization pattern used in PR #59406 to solve the same issue in Writable streams.

I see the Jenkins CI benchmark results as data that empirically validates this hypothesis in a real-world scenario.

@ronag
Member

ronag commented Sep 2, 2025

However, my reasoning was that the low-level implementation of slice would be a more efficient operation than the element shifting required by splice.

Why is it more efficient? slice still needs to copy the elements just like splice does in this case.

@ronag
Member

ronag commented Sep 2, 2025

Here is an illustration of what needs to happen in both cases:

function slice(arr, idx) {
  const ret = []
  let n = 0
  for (let i = idx; i < arr.length; i++) {
    ret[n++] = arr[i]
  }
  return ret
}

function splice(arr, idx) {
  let n = 0
  for (let i = idx; i < arr.length; i++) {
    arr[n++] = arr[i]
  }
  arr.length = arr.length - idx
}

splice does the exact same thing as slice, just without the extra allocation, unless there is some V8 magic involved that I am unaware of. I think the benchmark results of both this PR and the other one are within the margin of error.

@wlgh1553
Contributor Author

wlgh1553 commented Sep 2, 2025

Thank you for the detailed illustration with the code examples. It was very helpful for clarifying the logical operations. You are correct, and I agree that splice uses a smaller temporary allocation since slice must allocate for the larger, remaining portion of the array.

I believe the 'V8 magic' we're discussing is the most likely explanation for the consistent benchmark improvements we see for slice in both the CI for this PR and the previous one (#59406).

To isolate this performance difference more clearly, I ran a more targeted local microbenchmark with a larger array.

"use strict";

function bench(label, fn) {
  const t0 = performance.now();
  fn();
  const t1 = performance.now();
  console.log(`${label}:`.padEnd(25), `${(t1 - t0).toFixed(2)} ms`);
}

function makeArray(size) {
  const arr = new Array(size);
  for (let i = 0; i < size; i++) {
    arr[i] = i;
  }
  return arr;
}

const ARRAY_SIZE = 20_000_000;
const REMOVE_COUNT = 100;
const RUNS = 100;

bench("Array.prototype.splice()", () => {
  for (let i = 0; i < RUNS; i++) {
    const arr = makeArray(ARRAY_SIZE);
    arr.splice(0, REMOVE_COUNT);
  }
});

bench("Array.prototype.slice()", () => {
  const arr = makeArray(ARRAY_SIZE);
  for (let i = 0; i < RUNS; i++) {
    const newArr = arr.slice(REMOVE_COUNT);
  }
});

The results are consistently and significantly in favor of slice:

ubuntu@ip-172-31-38-159:~$ node test.js
Array.prototype.splice(): 22854.06 ms
Array.prototype.slice():  19305.41 ms
ubuntu@ip-172-31-38-159:~$ node test.js
Array.prototype.splice(): 22501.54 ms
Array.prototype.slice():  19330.00 ms
ubuntu@ip-172-31-38-159:~$ node test.js
Array.prototype.splice(): 22613.90 ms
Array.prototype.slice():  19190.69 ms

Test Environment:

  • OS: Ubuntu 24.04.3 LTS
  • CPU: Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
  • RAM: 1GB
  • Node.js: v22.19.0

This data strongly supports the idea that the CPU cost of shifting millions of elements with splice is a far more impactful bottleneck than slice's single allocation and optimized bulk-copy operation.

This is the core rationale for my proposal: I believe trading a one-time memory allocation for a significant gain in CPU performance is a worthwhile improvement for this hot path in the streams implementation.

@ronag
Member

ronag commented Sep 2, 2025

I believe your benchmark is flawed. V8 is probably optimizing away newArr and the slice call entirely, since they have no side effects.

clk: ~3.43 GHz
cpu: Apple M3 Pro
runtime: node 24.5.0 (arm64-darwin)

benchmark                   avg (min  max) p75 / p99    (min  top 1%)
------------------------------------------- -------------------------------
Array.prototype.splice()      42.12 µs/iter  36.21 µs                     
                       (24.92 µs  1.39 ms) 192.87 µs                     
                    ( 14.66 kb  285.54 kb) 157.71 kb ▂█▃▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

Array.prototype.slice()       56.49 µs/iter  56.07 µs   ██                
                      (48.54 µs  73.75 µs)  71.57 µs █▅▅██             
                    ( 77.85  b    6.09 kb) 701.80  b █████▁▁█▁▁▁▁▁▁▁▁▁█▁▁█
import { bench, run } from 'mitata'

function makeArray(size) {
  const arr = new Array(size);
  for (let i = 0; i < size; i++) {
    arr[i] = i;
  }
  return arr;
}

const ARRAY_SIZE = 20_000;
const REMOVE_COUNT = 100;

let counter = 0;

bench("Array.prototype.splice()", () => {
  const arr = makeArray(ARRAY_SIZE);
  counter += arr.splice(0, REMOVE_COUNT).length;
});

bench("Array.prototype.slice()", () => {
  const arr = makeArray(ARRAY_SIZE);
  counter += arr.slice(REMOVE_COUNT).length;
});

await run();

console.log(counter);
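The fix applied in the benchmark above, accumulating the result into counter, generalizes to any microbenchmark: keep the result observable so the engine cannot prove the call dead and elide it. A minimal sketch of that sink pattern (the names here are illustrative, not from the PR):

```javascript
// A module-level sink turns the benchmarked result into a side effect the
// optimizer must preserve, so the copy performed by slice cannot be elided.
let sink = 0;

function consume(value) {
  sink += value;
}

const arr = Array.from({ length: 1000 }, (_, i) => i);
// The slice result is read (its length), so the copy must actually happen.
consume(arr.slice(100).length);
```

Without the consume call, an engine is free to treat the slice as dead code, which is exactly why the earlier wall-clock benchmark could show slice "winning".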

@wlgh1553
Contributor Author

wlgh1553 commented Sep 3, 2025

Thank you for all the helpful feedback. You've helped me clear up my misunderstanding of the performance characteristics.

Based on our discussion, I'll go ahead and close this PR. Thanks again!
