Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lint against named asm labels #87324

Merged
merged 8 commits into from
Aug 14, 2021
Merged

Conversation

asquared31415
Copy link
Contributor

This adds a deny-by-default lint to prevent the use of named labels in inline asm!. Without a solution to #81088 about whether the compiler should rewrite named labels or a special syntax for labels, a lint against them should prevent users from writing assembly that could break for internal compiler reasons, such as inlining or anything else that could change the number of actual inline assembly blocks emitted.

This does not resolve the issue with rewriting labels, that still needs a decision if the compiler should do any more work to try to make them work.

@rust-highfive
Copy link
Collaborator

r? @davidtwco

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 20, 2021
@asquared31415
Copy link
Contributor Author

r? @Amanieu

@rust-highfive rust-highfive assigned Amanieu and unassigned davidtwco Jul 20, 2021
@asquared31415
Copy link
Contributor Author

I decided to initially make this a deny-by-default lint because the use of named labels is almost always wrong. The RFC even says that only GAS style labels should be used. However I think it could be important to be able to be disabled if a user really understands what that means. This could be made warn by default (which I don't think is a good idea, named labels are objectively wrong) or a hard error though.

@rust-log-analyzer

This comment has been minimized.

@asquared31415
Copy link
Contributor Author

Whoops, had some git issues and seemingly didn't rebase that file right when trying to fix things

compiler/rustc_lint_defs/src/lib.rs Outdated Show resolved Hide resolved
compiler/rustc_lint_defs/src/builtin.rs Outdated Show resolved Hide resolved
compiler/rustc_builtin_macros/src/asm.rs Show resolved Hide resolved
compiler/rustc_builtin_macros/src/asm.rs Outdated Show resolved Hide resolved
compiler/rustc_builtin_macros/src/asm.rs Show resolved Hide resolved
Comment on lines 523 to 524
"do not use named labels in inline assembly",
BuiltinLintDiagnostics::NamedAsmLabel("Only GAS local labels of the form `N:` where N is a number may be used in inline asm".to_string()),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would word this slightly differently:

Suggested change
"do not use named labels in inline assembly",
BuiltinLintDiagnostics::NamedAsmLabel("Only GAS local labels of the form `N:` where N is a number may be used in inline asm".to_string()),
"avoid using named labels in inline assembly",
BuiltinLintDiagnostics::NamedAsmLabel("Only GAS local labels of the form `N:` where N is a number should be used in inline asm".to_string()),

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if you missed my comment or if you prefer sticking to the existing wording.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've incorporated this and also made the help a little more concise

@rust-log-analyzer

This comment has been minimized.

@asquared31415
Copy link
Contributor Author

I think it's right that that fails, technically, but that makes that test hard to do. What should I do, allow the lint there?

@Amanieu
Copy link
Member

Amanieu commented Jul 22, 2021

Yes, just allow the lint there.

@bors
Copy link
Contributor

bors commented Jul 24, 2021

☔ The latest upstream changes (presumably #87296) made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Copy link
Contributor

bors commented Jul 25, 2021

☔ The latest upstream changes (presumably #87381) made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Copy link
Contributor

bors commented Aug 4, 2021

☔ The latest upstream changes (presumably #87026) made this pull request unmergeable. Please resolve the merge conflicts.

for span in spans.into_iter() {
ecx.parse_sess().buffer_lint_with_diagnostic(
lint::builtin::NAMED_ASM_LABELS,
span,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

buffer_lint_with_diagnostic can take a Vec<Span> so that we can emit a single lint for all the labels in an asm!.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh neat I didn't realize that was what MultiSpan meant, though it seems pretty obvious in hindsight and I should have checked the impls for it. That looks a lot cleaner.

ecx.current_expansion.lint_node_id,
"avoid using named labels in inline assembly",
BuiltinLintDiagnostics::NamedAsmLabel("only local labels of the form `<number>:` should be used in inline asm".to_string()),
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid the code duplication if the span is the only difference between the two calls.

@Amanieu
Copy link
Member

Amanieu commented Aug 4, 2021

(sorry about the late review, I thought that I submitted it last week but apparently something went wrong and it just got buffered)

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Aug 14, 2021
@bors bors merged commit e55c13e into rust-lang:master Aug 14, 2021
@rustbot rustbot added this to the 1.56.0 milestone Aug 14, 2021
@npmccallum
Copy link
Contributor

@Amanieu @bjorn3 While I definitely think we want a lint similar to this, I wish it had gone through some broader review before merging. We awoke this morning to find our CI/CD broken because of this lint erroring on local symbolic labels (i.e. .Lfoo).

https://docs.oracle.com/cd/E19120-01/open.solaris/817-5477/esqaq/index.html
https://sourceware.org/binutils/docs-2.25/as/Symbol-Names.html

The motivation for this patch is to prevent emitting conflicting entries to the object file's symbol table. Numeric labels don't cause this problem and are therefore allowed. But this patch additionally disables the use of local symbolic labels which also don't cause this problem and these should be allowed too.

This lint breaks the compilation of existing, working and valid code. I propose that this patch be reverted until it allows local symbolic labels.

cc @haraldh @joshtriplett

@Amanieu
Copy link
Member

Amanieu commented Aug 16, 2021

Actually local labels (.Lfoo) can still cause conflicts if there are multiple definitions within the same object file/compilation unit. This can happen if a function containing inline assembly is inlined or duplicated within the same compilation unit. So the lint is correct as it is right now.

@Amanieu
Copy link
Member

Amanieu commented Aug 16, 2021

@npmccallum
Copy link
Contributor

@Amanieu This still seems to me like hammering a nail with a sledge hammer.

  1. Before this lint, the compiler produced a completely readable error message regarding the unsupported configuration. Therefore, this lint is not an improvement of an existing defect.
error: invalid symbol redefinition
  --> src/lib.rs:11:15
   |
11 |         asm!(".Lfoo: nop");
   |               ^
   |
note: instantiated into assembly here
  --> <inline asm>:2:2
   |
2  |     .Lfoo: nop
   |     ^

error: could not compile `playground` due to previous error
  1. If we want this lint (I'm not sure we do), it should only be active in the cases where this collision can occur (i.e. during inlining).

  2. It should definitely not be active on naked functions where the pain is most acute. Naked functions will never be inlined or duplicated. Naked functions also benefit the most from local named labels.

@Amanieu
Copy link
Member

Amanieu commented Aug 16, 2021

  1. Do you have any suggestions on how the lint message can be improved?
  2. Duplication can happen anywhere. Even if a function is marked #[inline(never)], compiler optimizations can still duplicate parts of the function body, including asm! blocks.
  3. This is a valid point, perhaps we can disable this lint for naked functions. This seems a bit difficult with the current implementation since it emits the lint during macro expansion. Could #[naked] implicitly add #[allow(named_asm_label)]?

@npmccallum
Copy link
Contributor

@Amanieu

  1. Do you have any suggestions on how the lint message can be improved?

The lint should specify that local named labels are unsupported outside naked functions.

  1. Duplication can happen anywhere. Even if a function is marked #[inline(never)], compiler optimizations can still duplicate parts of the function body, including asm! blocks.

I concede this point.

  1. This is a valid point, perhaps we can disable this lint for naked functions. This seems a bit difficult with the current implementation since it emits the lint during macro expansion. Could #[naked] implicitly add #[allow(named_asm_label)]?

I'd have to think through the implications of that. Would it mean that developers couldn't change this to an error for naked functions?

However, this PR still breaks existing, valid code and should be reverted until the implications of this change are properly understood and the semantics are agreed upon.

@bjorn3
Copy link
Member

bjorn3 commented Aug 16, 2021

However, this PR still breaks existing, valid code

Apart from naked functions, it has never been valid to use named labels, even before this PR. A future LLVM version or other backend may cause this to break at any point. The point of this PR is exactly to prevent people from writing seemingly valid, but not actually valid inline assembly.

I agree that it makes sense to disable this lint for naked functions though, but I don't think it warrants a revert of this PR, thus an extension to disable the lint for naked functions.

and should be reverted until the implications of this change are properly understood and the semantics are agreed upon.

The inline asm RFC has already defined it as not allowed:

https://rust-lang.github.io/rfcs/2873-inline-asm.html

  • You cannot assume that an asm! block will appear exactly once in the output binary. The compiler is allowed to instantiate multiple copies of the asm! block, for example when the function containing it is inlined in multiple places.
    • As a consequence, you should only use local labels inside inline assembly code. Defining symbols in assembly code may lead to assembler and/or linker errors due to duplicate symbol definitions.

Apart from naked functions I don't think there are any edge cases where it would be fine to allow named asm labels. If there are, an issue can be opened and a solution can be discussed. As asm!() is still nightly-only, it is allowed to break it. That doesn't mean we should randomly do so, but it is acceptable if there is an alternative that can be used. In this case that is using numbered labels. There have been much bigger breaking changes to nightly-only features in the past, including renaming asm!() to llvm_asm!() to make place for the much better asm!() we currently have.

@npmccallum
Copy link
Contributor

npmccallum commented Aug 16, 2021

@bjorn3 Which still means that this PR breaks existing, valid code in the case of naked functions. That is, a change in one feature has broken valid code in another feature. The implementor of the (now broken) feature wasn't consulted and is being asked to scramble to fix the breakage or live with it. This isn't contributor friendly.

It should be the responsibility of the PR author to fix breakage in other features caused by the PR. Ideally, this should be caught before merge. But when it isn't, how should we handle it? I would argue that it is still the responsibility of the PR author to fix the breakage. And the only way, procedurally, that we can enforce this is to revert the PR and reopen it.

@asquared31415
Copy link
Contributor Author

asquared31415 commented Aug 16, 2021

I agree with bjorn on all 3 points. Rust makes no stability guarantees about unstable features, and it is allowed and there is precedent to entirely change unstable features. If someone must rely on a nightly feature never changing, they should pin the nightly version to a specific, known working version, and update with caution, until the feature is stabilized.

I was not aware that #[naked] functions prevent code duplication or inlining, and because of that implemented this lint unconditionally to match the the RFC. If it is true that #[naked] functions are exempt from the potential issues, they should be granted an exception in this lint, and not be the rule.

@npmccallum
Copy link
Contributor

I agree with bjorn on all 3 points. Rust makes no stability guarantees about unstable features, and it is allowed and there is precedent to entirely change unstable features. If someone must rely on a nightly feature never changing, they should pin the nightly version to a specific, known working version, and update with caution, until the feature is stabilized.

Agreed completely.

I was not aware that #[naked] functions prevent code duplication or inlining, and because of that implemented this lint unconditionally to match the the RFC. If it is true that #[naked] functions are exempt from the potential issues, they should be granted an exception in this lint, and not be the rule.

I think the main question is who does this work and when. Reverting the PR gives us breathing room to figure this out and coordinate across features.

@asquared31415
Copy link
Contributor Author

asquared31415 commented Aug 16, 2021

#[naked] applies the same flag as #[inline(never)] for LLVM. So unless there's additional undocumented behavior to naked in LLVM, it is still possible for naked functions to be inlined. The noinline attribute says

This attribute indicates that the inliner should never inline this function in any situation. This attribute may not be used together with the alwaysinline attribute.

I believe that should never is not a guarantee, just a very strong recommendation, so I don't think this can be relied on.

Even if there is that undocumented behavior to the naked attribute that would force never inlining always, how much should we rely on that undocumented behavior.

@asquared31415
Copy link
Contributor Author

If it does turn out that we can rely on this behavior of #[naked] I will fix the lint to add an exception

@npmccallum
Copy link
Contributor

Naked functions MUST NOT ever be inlined. If there is some undefined behavior that causes one to be inlined, it is a bug that should be fixed.

When I disabled inlining for naked functions, I did not implement it as desugaring to #[inline(never)]. In fact, we lint so that mentioning inlining at all on a naked function is disallowed.

@asquared31415
Copy link
Contributor Author

asquared31415 commented Aug 16, 2021

Then this is a shortcoming of LLVM, it doesn't seem to have a way to specify "no never inline this ever, if you do this is wrong". The best is noinline which as stated above seems to be a very strong hint, and is currently what we pass to LLVM for naked functions, and is the same attribute used for #[inline(never)]

@bjorn3
Copy link
Member

bjorn3 commented Aug 16, 2021

I think naked functions can still be inlined if and only if you ensure that you setup all registers and the stack anyway as if you would actually do a real call. This is a silly thing to do, but indistinguishable from a real call except that it would duplicate the inline asm and possibly avoid a single call instruction in favor of falling through to the naked function.

@Amanieu
Copy link
Member

Amanieu commented Aug 16, 2021

At this point I'm fairly confident that we can guarantee that asm! in a naked function is never duplicated, so we should just go ahead and disable this lint for asm! in naked functions. If LLVM's behavior changes then we can always switch to lowering naked functions to global_asm! instead.

@npmccallum
Copy link
Contributor

@bjorn3 There are possibly other invariants as well (depending on platform). But while it is theoretically possible to inline a naked function, the complexity of this problem is too difficult and with too little gain. Therefore, in the Constrained Naked RFC, we commit the compiler to never inlining a naked function. This may be a promise the compiler cannot (currently) keep. But that might change in the future. And in the meantime, we clearly communicate that no one should ever depend on this behavior and can reliably depend on the opposite behavior.

@asquared31415
Copy link
Contributor Author

If we are advising users to not depend on naked functions being never inlined, shouldn't this lint still activate then, since the compiler cannot currently promise that?

@npmccallum
Copy link
Contributor

@asquared31415 Per the Constrained Naked Functions RFC we are advising users to depend on naked functions never being inlined.

@asquared31415
Copy link
Contributor Author

asquared31415 commented Aug 16, 2021

Oh I misinterpreted your previous statement then. I thought this behavior meant that they aren't inlined.

I still think it is possible that LLVM changes in the future to actually inline things that are noinline if it really has a good reason to, but since it currently doesn't inline these functions, we could make this exception and handle that scenario if it happens. I'm not sure whether there's any additional rules about duplicating code blocks, but I assume they're similar and that it does not currently happen.

@npmccallum
Copy link
Contributor

@asquared31415 I'm sure that if LLVM adds functionality to inline even those functions, the next thing people will ask for is a way to disable it.

@haraldh
Copy link
Contributor

haraldh commented Aug 17, 2021

Also the linter is triggered on comments:

#[naked]
pub unsafe extern "sysv64" fn _start() -> ! {
    asm!("
    nop          # here is a comment about syscall
    nop          // and another syscall comment
    jmp syscall
    hlt
.global syscall
syscall:
    hlt
    ",
            options(noreturn)
    );
}

gives:

error: avoid using named labels in inline assembly
  --> src/lib.rs:7:44
   |
7  |       nop          # here is a comment about syscall
   |  ____________________________________________^
8  | |     nop          // and another syscall comment
9  | |     jmp syscall
10 | |     hlt
11 | | .global syscall
12 | | syscall:
   | |_______^
   |
   = note: `#[deny(named_asm_labels)]` on by default
   = help: only local labels of the form `<number>:` should be used in inline asm
   = note: see the asm section of the unstable book <https://doc.rust-lang.org/nightly/unstable-book/library-features/asm.html#labels> for more information

Playground

@bjorn3
Copy link
Member

bjorn3 commented Aug 17, 2021

The lint triggers on the label definition (syscall:) and then tries to find uses of it using a heuristic. This heuristic skips // comments, but not platform dependent comments. It does not trigger on comments except if the comment contains a ;. It seems that ; is used as statement separator before // comments are stripped. It should have been the opposite way around. First stripping comments and then splitting at ;.

@asquared31415
Copy link
Contributor Author

@haraldh please file a separate issue about that, it's a lot easier to track the issues that way

I was not aware that platform specific comments existed, I tried to make sure that I wasn't missing platform specific behavior, but must have missed that anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants