Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify core::hint::spin_loop #115547

Merged
merged 1 commit into from
Sep 18, 2023
Merged

Conversation

WaffleLapkin
Copy link
Member

The grouping was inconsistent and not really helpful.

r? t-libs

@rustbot
Copy link
Collaborator

rustbot commented Sep 4, 2023

Failed to set assignee to t-libs: invalid assignee

Note: Only org members, users with write permissions, or people who have commented on the PR may be assigned.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Sep 4, 2023
@WaffleLapkin
Copy link
Member Author

r? libs

{
// SAFETY: the `cfg` attr ensures that we only execute this on arm targets
// with support for the v6 feature.
unsafe { crate::arch::arm::__yield() };
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a side-note, I've noticed that hexagon target has a pause instruction tailored for spin loops, but asm! macro doesn't seem to support hexagon yet, so I'm not sure if we can do anything here...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was told that we actually do support hexagon ASM. It's just unstable and not documented (besides the tracking issue: #93335). I could try adding a pause here for it then.

{
// SAFETY: the `cfg` attr ensures that we only execute this on arm targets
// with support for the v6 feature.
unsafe { crate::arch::arm::__yield() };
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This answer on SO seems to suggest that __isb(SY) is better than __yield() (for some definition of "better" I guess). It seems like both of those are available for both aarch64 and arm. Should we use __isb(SY) for arm too?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SO answer is mistaken. At least, somewhat mistaken. ISB, the "Instruction Synchronization Barrier," flushes the instruction pipeline on the calling processor. It isn't explicitly for use in something as general-purpose as a spin-loop:

Instruction Synchronization Barrier flushes the pipeline in the PE and is a context synchronization event.

When used with the SY option it can cause a "full system barrier operation," which could only be more expensive than plain ISB.

I worry that __isb(SY) may only appear to improve performance, and only on select implementations.

The instruction that best fits the intent of a spin-loop hint is YIELD:

YIELD is a hint instruction. Software with a multithreading capability can use a YIELD instruction to indicate to the PE that it is performing a task, for example a spin-lock, that could be swapped out to improve overall system performance. The PE can use this hint to suspend and resume multiple software threads if it supports the capability.

Expanding the use of __isb(SY) from aarch64 to both targets seems misguided. If anything, the change from __yield() to __isb(SY) on aarch64 should be reverted.

Comment on lines +196 to +197
// SAFETY: the `cfg` attr ensures that we only execute this on arm targets
// with support for the v6 feature.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure those (or any of the above) comments make sense. As far as I can tell, all the functions in core::arch::... are cfg-ed anyway, so we would not be able to call them on a wrong arch. Should I delete them?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functions in core::arch::... can be called just fine if the relevant cfg(target_feature = "...") is not true at compile time. In that case you need to do runtime detection to check if it is supported before calling however.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how this could be true.

__yield is defined in arm_shared. arm_shared is marked as #[cfg(any(target_arch = "arm", target_arch = "aarch64", doc))]. __yield itself is marked as #[cfg(any(target_feature = "v6", target_arch = "aarch64", doc))].

So this results in __yield being available if cfg(all(any(target_arch = "arm", target_arch = "aarch64", doc), any(target_feature = "v6", target_arch = "aarch64", doc))). doc and target_arch = "aarch64" are not true in this context, so we are left with cfg(all(any(target_arch = "arm"), any(target_feature = "v6"))), which can be simplified to cfg(all(target_arch = "arm", target_feature = "v6")).

Unless I'm missing something, this is exactly the same as the cfg on this block?...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you are right in this case. For x86 intrinsics this is definitively not the case though. Those are never cfged on target features.

@Mark-Simulacrum
Copy link
Member

@bors r+ rollup=iffy

I think this seems fine -- there's some comments here on possible changes to the code, but I don't think we need to do that in this PR.

@bors
Copy link
Contributor

bors commented Sep 17, 2023

📌 Commit 1811fe6 has been approved by Mark-Simulacrum

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 17, 2023
@bors
Copy link
Contributor

bors commented Sep 18, 2023

⌛ Testing commit 1811fe6 with merge 8a7cab8...

@bors
Copy link
Contributor

bors commented Sep 18, 2023

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing 8a7cab8 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 18, 2023
@bors bors merged commit 8a7cab8 into rust-lang:master Sep 18, 2023
@rustbot rustbot added this to the 1.74.0 milestone Sep 18, 2023
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (8a7cab8): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.2% [-0.2%, -0.2%] 1
Improvements ✅
(secondary)
-0.4% [-0.5%, -0.3%] 4
All ❌✅ (primary) -0.2% [-0.2%, -0.2%] 1

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
2.4% [2.4%, 2.4%] 1
Regressions ❌
(secondary)
3.5% [3.3%, 3.7%] 2
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 2.4% [2.4%, 2.4%] 1

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 634.064s -> 634.575s (0.08%)
Artifact size: 318.47 MiB -> 318.43 MiB (-0.01%)

@WaffleLapkin WaffleLapkin deleted the spin_looping branch November 19, 2023 00:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants