Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove late specifiers in __cpuid_count #1329

Merged
merged 2 commits into from
Sep 8, 2022
Merged

Conversation

newpavlov
Copy link
Contributor

@newpavlov newpavlov commented Sep 4, 2022

The late specifiers are useless in the case of __cpuid_count, there are no opportunities for correct register reuse.

Together with llvm/llvm-project#57550 it causes rust-lang/rust#101346. I strongly suggest checking use of late specifiers in other asm blocks as well.

This PR also changes the __cpuid_count asm to Intel syntax for consistency with asm in has_cpuid.

@rust-highfive
Copy link

r? @Amanieu

(rust-highfive has picked a reviewer for you, use r? to override)

@nbdd0121
Copy link

nbdd0121 commented Sep 6, 2022

I think the current code is correct without considering the LLVM bug.

@Amanieu
Copy link
Member

Amanieu commented Sep 6, 2022

Yes, the existing code is correct. I would prefer to see this fixed in LLVM directly.

@dotdash
Copy link

dotdash commented Sep 6, 2022

Is it really correct? The LLVM Language Reference says (emphasis theirs):

Normally, it is expected that no output locations are written to by the assembly expression until all of the inputs have been read. As such, LLVM may assign the same register to an output and an input. If this is not safe (e.g. if the assembly contains two instructions, where the first writes to one output, and the second reads an input and writes to a second output), then the “&” modifier must be used (e.g. “=&r”) to specify that the output is an “early-clobber” output. Marking an output as “early-clobber” ensures that LLVM will not use the same register for any inputs (other than an input tied to this output).

And the ebx lateout register is written to before all inputs have been read.

@nbdd0121
Copy link

nbdd0121 commented Sep 6, 2022

Well, I wouldn't consider that sentence as normative (especially since it says "normally" which means this isn't always the case), given that LLVM wouldn't peek into the assembly. Logically the code is correct if the register assigned isn't ebx (if it is, then whether it's out or lateout doesn't really matter).

I think the provided constraint is sufficient to prevent eax, ecx and edx to be assigned register to "ebx", modulo LLVM bugs.

@newpavlov
Copy link
Contributor Author

@Amanieu
While I agree that the code is correct, I will repeat my point that late specifiers in this case are completely useless. They do not enable any correct optimizations. Removing them will not regress codegen in any way. And as far as workarounds go, I think it is a much better option than marking the intrinsic as #[inline(never)]

@Amanieu
Copy link
Member

Amanieu commented Sep 8, 2022

Fair enough.

@Amanieu Amanieu merged commit 3049a31 into rust-lang:master Sep 8, 2022
@newpavlov newpavlov deleted the patch-1 branch September 8, 2022 11:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants