-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Experimentally add ffi_const
and ffi_pure
extern fn attributes
#58327
Conversation
(rust_highfive has picked a reviewer for you, use r? to override) |
e8ccb14
to
bf149c0
Compare
By curiosity, why not allow those attributes for rust code too (not only ffi)? |
Because that would tie rustc to using LLVM as a backend more than necessary. The plan is to eventually allow using other backends like cranelift. |
A couple of reasons. First, by restricting them to C FFI we can get away with just saying "they do whatever they do in C". This adds instant value and would result in a pretty uncontroversial RFC. Other languages can interpret C header files and use these attributes (e.g. C++, D, ...) but Rust C FFI wrappers need a way to express them. Second, we can always allow these in Rust in the future in a backwards compatible way. We would then have to specify what these attributes "mean" for Rust code (Rust is not C), what Rust functions using these attributes are exactly allowed to do, etc. We might also need to give these some other names, since
We don't want to tie Rust to LLVM by exposing "LLVM-isms" ( If there are some C attributes that you want to use in C FFI, PRs welcome I guess. This PR shows pretty much how to add them. Libraries like |
While the reasons by @jonas-schievink and @gnzlbg are convincing already, I’d like to point that this attribute is simply not necessary for Rust code, because LLVM is capable of deriving these attributes as part of optimisation process on its own. |
Right. I imagine that, in some cases, it might be useful for Rust code to guarantee some of these properties to their callers. But that would mean that for Rust code, the attribute / language feature goal would be to express this guarantee, e.g., failing to compile if the implementation of the API changes in such a way that these properties don't hold anymore. Since we already have For FFI, Rust can't derive these properties, so the purpose of these attributes is more clear. |
ba57d67
to
945cbe4
Compare
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
The implementation side of things looks good to me, so r+ on that. Somebody from T-lang should sign off on this as well, I think. |
Sorry, my answer is a little late. You all raised interesting valid points that I didn't consider first! "I’d like to point that this attribute is simply not necessary for Rust code, because LLVM is capable of deriving these attributes as part of optimisation process on its own." To make the question "should rust support more llvm attributes (as it already support a few) ?" more actionable, I've found sorting attributes in categories is a great help (ideally one would visualize through Venn diagram). There is an uncontroversial, low hanging fruit category of attributes to support in rust code. And there is the last category, code behavior changing llvmisms not supported by GCC, while I don't consider realistic from rust to switch from llvm to GCC (because of human ressource management efficiency (that would unnecessarily split optimisations work and increase limitations for what advantage other than ironically supporting the expressiveness of GCC only attributes?)) I agree this category is a controversial high hanging fruit. Note : I guess that would be nice to open a meta issue for increasing rust attributes support, prioritizing inconsequential new compiler space/time optimisations hints. |
That would be better, discussing it here is turning a bit offtopic. |
I agree with @nagisa -- the implementation looks good but this should be OK'ed by the lang team first. |
Co-Authored-By: gnzlbg <gnzlbg@users.noreply.github.com>
I'm wary of the potential for unbounded increases in complexity for the purposes of FFI, especially when it comes to optimization rather than things which are impossible otherwise. I also wonder what role @rfcbot merge |
Team member @Centril has proposed to merge this. The next step is review by the rest of the tagged team members: No concerns currently listed. Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
I'm not quite clear on what the status of this is. Did @rust-lang/lang sign off on it? I think some boxes still need checking. |
I don't think so, no. cc @eddyb taking a look at these, I briefly thought about whether we could use them to implement the #[c_ffi_const]
fn bench_input<T>(x: T) -> T {
asm!("" : "=*m"(&x) : "*m"(&x) : : :);
x
} but I would be surprised if telling LLVM that a function that it can prove as "not |
☔ The latest upstream changes (presumably #58315) made this pull request unmergeable. Please resolve the merge conflicts. |
Ping from triage, @aturon @withoutboats @pnkfelix @nikomatsakis |
🔔 This is now entering its final comment period, as per the review above. 🔔 |
The final comment period, with a disposition to merge, as per the review above, is now complete. As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed. The RFC will be merged soon. |
r? @rkruppe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code and tests look good, but I have concerns about the (documented) semantics of these attributes.
not affected by changes to the observable state of the program. | ||
|
||
The behavior of calling a `#[c_ffi_const]` function that violates these | ||
requirements is undefined. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be UB to even have a wrongly annotated declaration, saying it's UB if the function is called doesn't justify speculatively executing it (e.g., hoisting out of branches) .
depend on the values of the function parameters and/or global memory. | ||
|
||
The behavior of calling a `#[c_ffi_pure]` function that violates these | ||
requirements is undefined. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for const: it should be UB to have a declaration wrongly annotated c_ffi_pure
, even if it's not called.
|
||
That is, `#[c_ffi_pure]` functions shall have no effects except for its return | ||
value, which shall not change across two consecutive function calls and can only | ||
depend on the values of the function parameters and/or global memory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't sound right. GCC docs specifically state pure functions can read arbitrary non-volatile memory. I don't know for sure what you mean by "global memory" is but certainly pure functions can be passed pointers to a caller's local variables and read them through those pointers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A pure
function can read static
s, including static
s that are pointers, and read memory through these pointers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it can do all that and even more. And I don't think this paragraph properly conveys that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have any suggestion?
Being able to read caller local variables via pointers passed to them is AFAICT already covered by "depend on the values of the function parameters", and by global memory I mean any memory that is globally accessible, e.g., via an static, a magic address, etc. (not necessarily something that's passed to the function as a parameter).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Sorry for the delay.)
I think language like in the GCC docs (arbitrary memory) is better. The phrase "values of the function parameters" is not clear for pointer arguments because it can be read as "only depend on the pointer itself, not the contents of the memory it points to" and indeed that's sort of the case for c_ffi_const
functions. For example, IIUC uintptr_t tag_pointer(void *, int tag)
that does arithmetic on the address can be c_ffi_const
and work with pointers to non-constant memory, but of course reading from the pointed-to memory would generally make it non-c_ffi_const
.
since any other function could modify global memory read by `#[c_ffi_pure]` | ||
functions, altering their result. The `#[c_ffi_const]` attribute allows | ||
sub-expression elimination regardless of any operations in between the function | ||
calls. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I see what you're getting at here but this wording is misleading. It is harder for the optimizer to apply CSE because it has to show that no relevant state changed between two calls it wants to CSE, but "only apply across successive invocations" only communicates that if it means "directly adjacent without anything/much in between", which is too strong as stated (e.g., it might be known that the pure function doesn't read certain memory, or that another function called in between doesn't write memory).
I think I'd prefer to weaken this note to just briefly describe the difficulties imposed on the optimizer by the weaker rules of pure vs const and not get into the hairy business of predicting what that actually means for when the optimization fires or not.
A `pure` function that returns unit has no effect on the abstract machine's | ||
state. | ||
|
||
A diverging and `pure` C or C++ function is unreachable. Diverging via a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same point about termination as with const
.
without side-effects is undefined behavior in C++ and not possible in C. In C++, | ||
the behavior of infinite loops without side-effects is undefined, while in C | ||
these loops can be assumed to terminate. This would cause a diverging function | ||
to return, invoking undefined behavior. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This paragraph confuses me, I think it might incidentally mislead users too, and I don't even think it is technically correct
- C's "assumed to terminate" is indistinguishable from "UB if non-terminating", so that last sentence is superfluous (and communicates a wrong picture of UB)
- C doesn't actually say this about all infinite loops, it has an exception carved out for loops controlled by constant expressions, such as
while (1) { ... }
I think we should just say as a separate axiom that functions marked with this attribute must not diverge. GCC's docs don't say this as far as I can see (at least explicitly -- arguably non-termination is an observable effect), but it's kind of the safer assumption because who knows what LLVM's implicit assumptions around readnone
/readonly
functions are at the moment. It also seems useful to assume termination of const/pure functions to be able to execute them speculatively.
Hello from triage - @gnzlbg This PR has not seen any activity in a month. |
@gnzlbg Triage: This PR looks like it's almost done. Giving it one more week before closing for inactivity. |
Ping from triage: @gnzlbg |
I'm interested in this feature as well. I've started to rebase it onto master and intend to take care of the docs and open a new PR soon. |
Experimentally add `ffi_const` and `ffi_pure` extern fn attributes Add FFI function attributes corresponding to clang/gcc/... `const` and `pure`. Rebased version of rust-lang#58327 by @gnzlbg with the following changes: - Switched back from the `c_ffi_const` and `c_ffi_pure` naming to `ffi_const` and `ffi_pure`, as I agree with rust-lang#58327 (comment) and this nicely aligns with `ffi_returns_twice` - (Hopefully) took care of all of @hanna-kruppe's change requests in the original PR r? @hanna-kruppe
Experimentally add `ffi_const` and `ffi_pure` extern fn attributes Add FFI function attributes corresponding to clang/gcc/... `const` and `pure`. Rebased version of rust-lang#58327 by @gnzlbg with the following changes: - Switched back from the `c_ffi_const` and `c_ffi_pure` naming to `ffi_const` and `ffi_pure`, as I agree with rust-lang#58327 (comment) and this nicely aligns with `ffi_returns_twice` - (Hopefully) took care of all of @hanna-kruppe's change requests in the original PR r? @hanna-kruppe
These attributes correspond to the
const
andpure
C attributes.Closes #46046 (the attributes can then be added to jemalloc-sys).