Align fs.ReadStream buffer pool writes to 8-byte boundary #24838
LGTM. It might be infinitesimally nicer to move it into a function so you can simply write pool.used = align(pool.used), with align() being:
function align(n) {
  return (n + 7) & ~7;  // Align to 8 byte boundary.
}
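For readers following along, here is a small sketch of what that bit trick does to a few sample offsets. The sample values are made up for illustration; only align() itself comes from the comment above.

```javascript
// Round n up to the next multiple of 8.
// Adding 7 pushes any non-multiple past the next boundary, and
// masking with ~7 clears the low three bits.
function align(n) {
  return (n + 7) & ~7;
}

// Illustrative offsets only (not taken from the PR).
const samples = [0, 1, 7, 8, 9, 4093];
const alignedSamples = samples.map(align);

console.log(alignedSamples); // [ 0, 8, 8, 8, 16, 4096 ]
```

Because JavaScript bitwise operators work on 32-bit integers, this form is only appropriate for offsets that fit in an int32, which should hold for pool offsets.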
e44cc01
to
7aaae68
Compare
👍 thanks @bnoordhuis, I moved the logic into a
3c5ec58
to
f530dc5
Compare
lib/internal/fs/streams.js
Outdated
  thisPool.used = roundUpToMultipleOf8(alignedOffset);
} else if (toRead - bytesRead > kMinPoolSpace) {
  alignedOffset = roundUpToMultipleOf8(start + bytesRead);
  poolFragments.push(thisPool.slice(alignedOffset, start + toRead));
If you're only using alignedOffset inside the if/else arms, can you make it const and scope it to inside the arms?
Is start + toRead correct when alignedOffset may not be start + bytesRead? What I mean is, shouldn't start + toRead be rounded up too?
Rounding up is unsafe here, since toRead is clamped within the size of the pool on initialization. I had the same thought initially, but rounding it up caused unrelated tests to fail.
But you agree it's slicing less than before? Note that slice() takes start and end arguments (the end is exclusive), not start and length.
After thinking about this some more, I think that the rules/constraints should be:
1. thisPool.used is always a multiple of 8 (the pool itself is already suitably aligned), and
2. this particular slice() call should only slice multiples of 8 at offsets that are multiples of 8.
(1) implies (2) because (2) caches slices for reuse by (1).
(2) can be enforced by rounding start + toRead down.
Potential pitfall: rounding up or down can result in slices smaller than kMinPoolSpace. It should check that the slice's length >= kMinPoolSpace before pushing it into poolFragments.
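The constraints above can be sketched roughly as follows. This is a standalone illustration, not the code that landed: alignedFragment and roundDownToMultipleOf8 are hypothetical names, the kMinPoolSpace value is assumed for the example, and subarray() stands in for slice() (both take a start and an exclusive end).

```javascript
// Assumed value for illustration; the real constant lives in
// lib/internal/fs/streams.js.
const kMinPoolSpace = 128;

const roundUpToMultipleOf8 = (n) => (n + 7) & ~7;
// Hypothetical helper: clear the low three bits to round down.
const roundDownToMultipleOf8 = (n) => n & ~7;

// Hypothetical helper: returns the unused tail of a read as an aligned
// fragment, or null when rounding leaves too little space to be worth caching.
function alignedFragment(thisPool, start, bytesRead, toRead) {
  const alignedStart = roundUpToMultipleOf8(start + bytesRead);
  const alignedEnd = roundDownToMultipleOf8(start + toRead);
  if (alignedEnd - alignedStart < kMinPoolSpace) return null;
  // The end index is exclusive, so the fragment length is end - start.
  return thisPool.subarray(alignedStart, alignedEnd);
}

// Example: 509 bytes were reserved at offset 0, but only 3 were read.
const pool = Buffer.alloc(1024);
const fragment = alignedFragment(pool, 0, 3, 509);
const tooSmall = alignedFragment(pool, 0, 3, 100);

console.log(fragment.length, tooSmall);
```

In the first call the fragment spans offsets 8 to 504, so both its offset within the pool and its length are multiples of 8. In the second call only 88 aligned bytes remain, which is below the assumed minimum, so nothing is cached.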
Yes it is slicing less than before. By rounding up the start position, we advance 8 - n_remainder bytes past where we would have sliced. In this PR, pool.used (and thisPool.used) should always be a multiple of 8, which I believe addresses point (1).
With regards to (2), the byteLength of each slice as it is now isn't guaranteed to be a multiple of 8, though I think that's also a good idea. Performance-wise, it's best to align the toRead value to the processor's cache line size, but I left it as-is to keep the surface area of this PR to a minimum. I would be happy to add this though.
Good catch on the adjusted size causing the pool fragment size to dip below kMinPoolSpace, I'll update the PR with a fix here in a bit.
e13a20c
to
0d29ef3
Compare
LGTM % nits. CI: https://ci.nodejs.org/job/node-test-pull-request/19282/
lib/internal/fs/streams.js
Outdated
else if (toRead - bytesRead > kMinPoolSpace)
  poolFragments.push(thisPool.slice(start + bytesRead, start + toRead));
if (start + toRead === thisPool.used && thisPool === pool) {
  const alignedOffset = thisPool.used + bytesRead - toRead;
Tiniest of nits: alignedOffset is a misnomer since it's not actually aligned (yet.)
lib/internal/fs/streams.js
Outdated
  const alignedOffset = thisPool.used + bytesRead - toRead;
  thisPool.used = roundUpToMultipleOf8(alignedOffset);
} else {
  const alignedEnd = (start + toRead) & ~7; // round down to multiple of 8
One more micro-nit: comments should be capitalized and punctuated (and normally have two spaces before the // - I expect the linter will complain.)
899e641
to
73d1ca2
Compare
@bnoordhuis what else needs to be done here? Are those three failing checks related to this change? I tried parsing through the console output, but can't see any indication of why it's failing.
Let's run the CI again and see what happens: https://ci.nodejs.org/job/node-test-pull-request/19989/
@trxcllnt Do you think you could rebase this? Sorry for the merge conflicts!
@addaleax just resolved the conflict via the github UI, let me know if I should do anything further. Thanks!
@trxcllnt One problem is that merge conflicts break our CI and make landing changes harder – can you use (We can probably figure this out another way if not, but this would definitely make things easier)
01b5fd8
to
82ec8d3
Compare
@addaleax sure thing -- rebased from
@bnoordhuis @addaleax anything left to do here? The node folks in the Apache Arrow project would appreciate if we could merge this soon 🙏 since we have a workaround in place that has to resort to copying chunks if the offset isn't aligned. It can be a nasty/hidden performance cliff for users if they're doing a lot of file I/O. Let me know if there's anything else I can do ❤️
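For context, a copy-on-misalignment workaround of the kind described might look roughly like the sketch below. This is an assumption-laden illustration, not Apache Arrow's actual code: toFloat64Array is a hypothetical helper, and Float64Array is just one example of an element type that needs 8-byte alignment.

```javascript
// Hypothetical helper: view a chunk as Float64Array, copying only when
// the chunk's byteOffset is not a multiple of the element size.
function toFloat64Array(chunk) {
  const size = Float64Array.BYTES_PER_ELEMENT; // 8
  const length = Math.floor(chunk.byteLength / size);
  if (chunk.byteOffset % size === 0) {
    // Aligned: zero-copy view over the same memory.
    return new Float64Array(chunk.buffer, chunk.byteOffset, length);
  }
  // Misaligned: copy the bytes into a fresh ArrayBuffer, which starts at
  // offset 0. This copy is the hidden cost mentioned above.
  const copy = new Uint8Array(length * size);
  copy.set(chunk.subarray(0, length * size));
  return new Float64Array(copy.buffer);
}

// Simulate chunks at an aligned and a misaligned offset of one ArrayBuffer.
const backingStore = new ArrayBuffer(32);
const alignedChunk = Buffer.from(backingStore, 8, 16);
const misalignedChunk = Buffer.from(backingStore, 4, 16);

const zeroCopyView = toFloat64Array(alignedChunk);
const copiedView = toFloat64Array(misalignedChunk);

console.log(zeroCopyView.buffer === backingStore); // shares memory
console.log(copiedView.buffer === backingStore);   // had to copy
```

With stream chunks guaranteed to start on 8-byte boundaries, the copying branch would no longer be hit for chunks coming straight from the read pool.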
@trxcllnt It looks like the rebase went wrong somehow? There’s 22000 added lines in this PR now… Generally, there’s nothing in the way of landing it, and bumping the thread is the exact right thing to do :)
d6d9c3f
to
bdc2ba4
Compare
@addaleax thanks for the heads up! should be fixed now :-)
Prevents alignment issues when creating a typed array from a buffer. Fixes: nodejs#24817
@addaleax bump :-]
Landed in b884ceb 🎉 Thanks for the PR!
fs: align fs.ReadStream buffer pool writes to 8-byte boundary
Prevents alignment issues when people create a typed array from a chunk buffer, similar to 285d8c6.
Fixes #24817
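As a minimal illustration of the alignment issue this refers to, the sketch below builds typed arrays over buffers at an aligned and a misaligned offset. The offsets and sizes are invented for the example; it does not reproduce the exact scenario in the linked issue.

```javascript
// Two views into one ArrayBuffer, standing in for pooled stream chunks.
const backing = new ArrayBuffer(32);
const alignedBuf = Buffer.from(backing, 8, 16);    // byteOffset 8
const misalignedBuf = Buffer.from(backing, 4, 16); // byteOffset 4

// Works: the byteOffset is a multiple of Float64Array.BYTES_PER_ELEMENT.
const doubles = new Float64Array(
  alignedBuf.buffer, alignedBuf.byteOffset, alignedBuf.length / 8);

// Throws: typed array constructors reject offsets that are not a
// multiple of their element size.
let alignmentError = null;
try {
  new Float64Array(
    misalignedBuf.buffer, misalignedBuf.byteOffset, misalignedBuf.length / 8);
} catch (err) {
  alignmentError = err;
}

console.log(doubles.length, alignmentError instanceof RangeError);
```

Keeping every pool offset on an 8-byte boundary covers the largest element size among the built-in typed arrays, so a chunk can be viewed as any of them without a copy.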
Checklist
make -j4 test (UNIX), or vcbuild test (Windows) passes