-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize copying large ranges of undefmask blocks #58556
Conversation
r? @pnkfelix (rust_highfive has picked a reviewer for you, use r? to override) |
@bors try |
Optimize copying large ranges of undefmask blocks Hopefully fixes #58523
@rust-timer build ab1e694 |
Success: Queued ab1e694 with parent d215d95, comparison URL. |
☀️ Test successful - checks-travis |
Finished benchmarking try commit ab1e694 |
Local tests have shown that my stage 1 compiler needs 8 seconds for fn main() {
(&[0; 1 << 28]);
} while my stage 0 compiler needs 23 seconds. I'm not sure how to add such a test to the perf test suite without causing significant slowdown of the entire suite. For smaller arrays ( |
@bors try |
⌛ Trying commit 5aacd4c5c019169c0d8c77e39e1f78ac0931909d with merge 5f394731e058952a24cf21f80ba4b1bfef28f30f... |
@rust-timer build 5f394731e058952a24cf21f80ba4b1bfef28f30f |
Success: Queued 5f394731e058952a24cf21f80ba4b1bfef28f30f with parent f66e469, comparison URL. |
☀️ Test successful - checks-travis |
Finished benchmarking try commit 5f394731e058952a24cf21f80ba4b1bfef28f30f |
perf shows a small (less than 1.5%) but across the board improvement Once rust-lang/rustc-perf#349 is merged, we should see a big improvement, but I don't see a reason to wait for that. |
With rust-lang/rustc-perf#349 merged, is this ready to move forward? |
@bors try |
@bors ping |
😪 I'm awake I'm awake |
@bors try |
☀️ Try build successful - checks-travis |
Finished benchmarking try commit af4ce587f67d7256b7a647f1ad5b14e266bdb69a |
@pnkfelix this is ready for review and perf looks green |
pub fn new(size: Size) -> Self { | ||
pub const BLOCK_SIZE: u64 = 64; | ||
|
||
pub fn new(size: Size, state: bool) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I infer that state
here is interpreted as { true
=> defined, false
=> undefined }, (right?)
You might consider adding a comment above the header saying so. (My initial interpretation of "undef mask" was that if the bit is true, then it is undefined)
// across block boundaries | ||
if new_state { | ||
// set bita..64 to 1 | ||
self.blocks[blocka] |= u64::max_value() << bita; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(aside: i'm sort of amazed libstd doesn't have named methods for these operations; I would think turning big ranges of bits on or off within a uN
(or better still, arbitrary arrays or vectors of [uN]
), would be so common as to have higher-level methods than shifts and bitwise-or-masking.)
Discussed some implementation details with @oli-obk on zulip, namely about the motivation for the run-length encoded form of the undef-mask used in I'm satisfied that the optimization tends to help more than it hurts. |
@bors r+ |
📌 Commit 2a1eb1c has been approved by |
…felix Optimize copying large ranges of undefmask blocks Hopefully fixes rust-lang#58523
⌛ Testing commit 2a1eb1c with merge 8a042506b6c9abd9785883d508244a668250b4cb... |
💔 Test failed - checks-travis |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
@bors retry 3 hour timeout? |
Optimize copying large ranges of undefmask blocks Hopefully fixes #58523
☀️ Test successful - checks-travis, status-appveyor |
Hopefully fixes #58523