Redirect __rust_dealloc to sdallocx #122329

Closed
wants to merge 1 commit into from

Conversation

Zoxc
Contributor

@Zoxc Zoxc commented Mar 11, 2024

This could use a perf run.

@rustbot
Collaborator

rustbot commented Mar 11, 2024

r? @petrochenkov

rustbot has assigned @petrochenkov.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Mar 11, 2024
@Kobzol
Contributor

Kobzol commented Mar 11, 2024

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 11, 2024
@bors
Contributor

bors commented Mar 11, 2024

⌛ Trying commit 0f657ac with merge ecb3315...

bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 11, 2024
Redirect `__rust_dealloc` to `sdallocx`

This could use a perf run.
@bors
Contributor

bors commented Mar 11, 2024

☀️ Try build successful - checks-actions
Build commit: ecb3315 (ecb331519998b3890fd9db9720e01106318d4103)

@rust-timer
Collaborator

Finished benchmarking commit (ecb3315): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.6% | [-1.2%, -0.2%] | 148 |
| Improvements ✅ (secondary) | -0.6% | [-3.8%, -0.3%] | 80 |
| All ❌✅ (primary) | -0.6% | [-1.2%, -0.2%] | 148 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 3.2% | [2.2%, 4.6%] | 8 |
| Regressions ❌ (secondary) | 3.5% | [1.8%, 5.0%] | 27 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 3.2% | [2.2%, 4.6%] | 8 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -1.8% | [-1.9%, -1.7%] | 2 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 645.581s -> 646.145s (0.09%)
Artifact size: 309.97 MiB -> 309.93 MiB (-0.01%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 11, 2024
@petrochenkov petrochenkov added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 11, 2024
@Kobzol
Contributor

Kobzol commented Mar 11, 2024

This was previously tried here, although seemingly in a different way. CC @nnethercote

@petrochenkov
Contributor

The PR needs a proper description - what exactly it changes and why it's different from jemalloc overrides below, why it helps, what __rust_dealloc and sdallocx are, etc.

@workingjubilee
Member

sdallocx is the jemalloc function that allows passing a size when deallocating, as an optimization, allowing the allocator to skip the relevant metadata lookup it would otherwise have to do.
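As a rough illustration of why a sized free fits Rust well: `GlobalAlloc::dealloc` already receives the `Layout` of the block being freed, so the size is on hand without any lookup. The sketch below does not call jemalloc at all. It wraps `std::alloc::System` in a hypothetical `SizeAware` type (a name invented here) that keeps a running byte count purely from the layout passed to `dealloc`, which is the same information `sdallocx` would be given.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper around the system allocator (not part of the PR).
/// It tracks live bytes using only the size the caller hands back.
pub struct SizeAware {
    live_bytes: AtomicUsize,
}

impl SizeAware {
    pub const fn new() -> Self {
        Self { live_bytes: AtomicUsize::new(0) }
    }

    /// Bytes currently allocated through this wrapper.
    pub fn live_bytes(&self) -> usize {
        self.live_bytes.load(Ordering::Relaxed)
    }
}

unsafe impl GlobalAlloc for SizeAware {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            self.live_bytes.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // The caller supplies the original layout, so the size is known here
        // without asking the allocator to look it up from the pointer.
        self.live_bytes.fetch_sub(layout.size(), Ordering::Relaxed);
        unsafe { System.dealloc(ptr, layout) };
    }
}
```

A jemalloc-backed allocator could, under the same assumption, pass `layout.size()` on to `sdallocx` where this sketch only updates a counter.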

@Zoxc
Contributor Author

Zoxc commented Mar 12, 2024

If #122362 works out we could probably just do this change with a #[global_allocator].

@oskgo
Contributor

oskgo commented Jul 26, 2024

I'm going to mark this as blocked on #122362. @Zoxc, feel free to relabel if you want to make progress on this directly.

@rustbot label -S-waiting-on-author +S-blocked

@rustbot rustbot added S-blocked Status: Blocked on something else such as an RFC or other implementation work. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jul 26, 2024
@Zoxc Zoxc closed this Aug 11, 2024
#[cfg(feature = "jemalloc-sys")]
#[no_mangle]
pub unsafe extern "C" fn __rust_dealloc(ptr: *mut u8, size: usize, _align: usize) {
    // Forward to jemalloc's sized deallocation; flags = 0 ignores `_align`.
    unsafe { jemalloc_sys::sdallocx(ptr.cast(), size, 0) }
}
Member

@Zoxc I think you closed this PR to try using a #[global_allocator] here with the statically linked std, but just as a heads up:

  • I had quickly tried using jemalloc as a global allocator in the rustc driver and it segfaulted while building std
  • otherwise, don't we also need to take alignment into account in the __rust_dealloc code above in the general case, so that it matches what System.alloc does for big alignments and for allocations with a size smaller than the alignment? That said, in the context of rustc it shouldn't matter: these values should come from types that wouldn't need non-zero jemalloc flags. I did a perf run both with flags set to 0 and with flags computed by checking size/alignment, and neither seemed to improve on the master branch. Weird, after seeing wins in this PR.
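For readers wondering what "flags computed by checking size/alignment" could look like, here is a minimal sketch. It is not the code from the perf run. `MIN_ALIGN`, `mallocx_align`, and `layout_to_flags` are names chosen for illustration, and the value 16 assumes a typical x86_64 target. The encoding mirrors jemalloc's `MALLOCX_ALIGN(a)` macro, which stores log2 of the alignment.

```rust
/// Assumption: alignment jemalloc provides without any flags on the target
/// (16 is typical for x86_64; other targets may differ).
const MIN_ALIGN: usize = 16;

/// Mirrors jemalloc's `MALLOCX_ALIGN(a)`: log2 of a power-of-two alignment.
fn mallocx_align(align: usize) -> i32 {
    align.trailing_zeros() as i32
}

/// Hypothetical helper: returns 0 when the default alignment already covers
/// the request, otherwise an explicit alignment flag.
pub fn layout_to_flags(size: usize, align: usize) -> i32 {
    if align <= MIN_ALIGN && align <= size {
        0
    } else {
        mallocx_align(align)
    }
}
```

Under these assumptions only over-aligned types, or allocations smaller than their alignment, produce non-zero flags, which matches the remark that rustc's own types would rarely need them.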

Contributor Author

  • I had quickly tried using jemalloc as a global allocator in the rustc driver and it segfaulted while building std

Do you have a branch for that?

  • otherwise, don't we also need to take alignment into account in the __rust_dealloc code above

My understanding is that the alignment argument is just a performance hint for sdallocx.

Contributor

I don't think that it's just a performance hint, because it modifies the actual size (jemalloc/jemalloc-cmake@4cfe551#diff-a4cb09e38cfec8141b07c291f731a8e01a17412568a852884fd921e8e521766bR1850 - this is some old code, the new one is more complicated, but does the same thing). Sadly, it also seems like sdallocx takes a slow path when flags != 0, although that might not happen that often.

Member

Do you have a branch for that?

yes, some wip work here lqd@6819e3c

The rustc-main symbol overrides are still there, as their presence or absence didn't seem to affect the segfault; but maybe it does, and LLVM would need to be set up differently.

Labels
S-blocked Status: Blocked on something else such as an RFC or other implementation work. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
9 participants