
Does our UB have "time travel" semantics? #407

Open

RalfJung opened this issue May 27, 2023 · 8 comments

Comments

@RalfJung
Member

If a program performs an externally observable action (such as printing to stderr or volatile memory accesses) followed by UB, do we guarantee that the action occurs?

The fact that C does not guarantee this is known as "time-traveling UB", and it can be quite perplexing. It arises when the compiler reorders potentially-UB operations up across must-return functions:

eprintln!("I'm here!"); // let's say the compiler deduced that this must return
ptr::null::<i32>().read();

If the read is moved up across the print, we get a page fault before printing, i.e. we have time travel.
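
To make the hazard concrete, here is a sketch of what the program effectively becomes after such a hoist (an illustration of the transformation, not a claim about what rustc emits today):

ptr::null::<i32>().read(); // hoisted above the print; faults first
eprintln!("I'm here!"); // never reached, so the message never appears

A user who inserted the print to narrow down the crash site would wrongly conclude that execution never got that far.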

From the theory perspective, this is a reasonably clear situation: either we can move potentially-UB code up across externally observable operations, or we guarantee that UB is properly sequenced wrt externally observable operations. We cannot have both. (In principle we can decide this on a per-operation basis, e.g. we could require proper sequencing for stdout/stderr but allow reordering for volatile accesses.)

So how bad is the cost of not doing such optimizations? Is that even something we can currently evaluate, or does LLVM just always perform such reorderings?

(I am assuming here a general consensus that potentially-UB operations can be reordered wrt operations that are not externally observable. This can only be observed with a debugger, and IMO it clearly falls on the side of the optimizations having too many benefits to do anything else.)

@saethlin
Member

(I'm aware that I am going to ignore the questions you directly posed and generally address the broader issue)

I'm very wary of this question. I commented in the meeting that I'm concerned that the actual problem here is not so much about the suitability of some semantics, but about the user's ability to debug programs.

So I'm concerned that we have a sort of fundamental teaching/tooling issue. Many users learn to debug programs compiled in a debugging-friendly mode by inserting println! or such. They then develop an incorrect intuition for how to debug their programs, eventually extend that approach to a program which executes UB, and run into trouble.

I suggested in the meeting that what users probably need is a UB-proof debugging facility. Miri is one such facility, but it can't execute all programs so we definitely need more. Sanitizers are another such partial facility. If we are going to have reorderings over externally observable effects, I think we should be able to explain how users should debug programs in the face of such reorderings.

@digama0

digama0 commented May 27, 2023

I suggested in the meeting that what users probably need is a UB-proof debugging facility. Miri is one such facility, but it can't execute all programs so we definitely need more. Sanitizers are another such partial facility. If we are going to have reorderings over externally observable effects, I think we should be able to explain how users should debug programs in the face of such reorderings.

I want to bring up a related point that likely needs a separate issue but I fear will get shot down too quickly out of context. C compilers often come with flags to control what standard they conform to. The obvious example of this is --std=c99 but I am thinking here more about flags like -fwrapv, which can be thought of as crossing out the line of the standard that says "signed overflow is UB" and replacing it with "signed overflow is defined to wrap".

Historically, we have shied away from such flags for fear of fragmenting the ecosystem, but perhaps this is a good place to use the technique. Turning off optimizations is obviously not the right way to get predictable behavior in the face of UB, but a flag that actually makes more things defined behavior (at the expense of performance) might be a much more effective way to get many of Miri's benefits in the regular compiler. In this case, we could have a flag that makes all IO actions block propagation of UB. Another example along the same lines would be a flag that makes all reads and writes volatile.
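
As a rough illustration of the "all reads and writes volatile" idea (a hypothetical flag; nothing like it exists in rustc today), ordinary accesses would be lowered as if they had been written with the volatile pointer functions, which the optimizer may not elide, duplicate, or reorder relative to other volatile operations:

use std::ptr;

fn main() {
    let mut x = 0i32;
    // Under the hypothetical flag, plain `x = 1;` would compile as if it were:
    unsafe { ptr::write_volatile(&mut x, 1) };
    // ...and plain reads would become volatile reads:
    let v = unsafe { ptr::read_volatile(&x) };
    println!("{v}");
}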

@RalfJung
Member Author

I'm very wary of this question. I commented in the meeting that I'm concerned that the actual problem here is not so much about the suitability of some semantics, but about the user's ability to debug programs.

That sounds like an important axis for "suitability of the semantics" to me. Making printf-debugging work is a good argument against time-traveling UB.

@comex

comex commented May 27, 2023

So how bad is the cost of not doing such optimizations? Is that even something we can currently evaluate, or does LLVM just always perform such reorderings?

I don't think there's a way to specifically tell LLVM not to perform such reorderings. But as far as I can tell, the allowed optimizations without time travel are exactly equivalent to the allowed optimizations if you assume that I/O functions might loop forever instead of returning. So in theory, it should just be a matter of removing the willreturn attribute from all functions that perform externally observable actions (as well as any copies of the attribute that are placed directly on call instructions).

In practice, though, I don't think I/O functions or calls ever get annotated with willreturn in the first place. rustc never applies willreturn annotations except in the case of asm! invocations marked as pure. Even if you use LTO to add in Clang-generated IR, Clang only applies willreturn to functions marked __attribute__((const)) or __attribute__((pure)) (plus the runtime functions used to implement atomics on some platforms).
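
For reference, a minimal x86-64 sketch of the one case mentioned above, an asm! invocation marked pure (the willreturn emission is per the statement above; this snippet does not demonstrate it by itself):

#[cfg(target_arch = "x86_64")]
fn copy_via_asm(x: u64) -> u64 {
    use std::arch::asm;
    let y: u64;
    // `pure` + `nomem` promises no side effects and no memory access;
    // per the comment above, this is the one place rustc attaches `willreturn`.
    unsafe {
        asm!("mov {0}, {1}", out(reg) y, in(reg) x, options(pure, nomem, nostack));
    }
    y
}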

Nor can LLVM deduce willreturn on its own. Even if the implementations of I/O functions are visible due to LTO, and even if LLVM can follow the chain of calls all the way down to the underlying syscalls, it has no way of knowing the syscalls must return without manual annotations. (Though things might differ if you're in some minimal embedded environment where I/O functions are in-process and just write to some buffer.)

As for volatile accesses, LLVM already treats volatile stores as potentially not returning, though not so for volatile loads, per LangRef.

In sum, with the exception of volatile loads, rustc already doesn't do time travel in practice. If anything, the question is how much performance could be gained by adding willreturn to I/O functions. Probably not much.

That said, if we wanted to officially guarantee the absence of time travel, we'd have to consider the possibility that Clang someday decides to add an attribute for willreturn, and OS vendors start using it for I/O functions, potentially infecting Rust code compiled with LTO. (If the C standard proposal to ban time travel is passed, that might become a non-issue.)

@deltragon

a flag that actually makes more things defined behavior (at the expense of performance) might be a much more effective way to get many of Miri's benefits in the regular compiler.

Isn't this what debug assertions are already doing with e.g. rust-lang/rust#51713? I'm specifically thinking of rust-lang/rust#98112 here, which added assertions at the language level, not just in the library.
I assume that these assertions would also inhibit LLVM from reordering code around them.
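
For illustration, the checks from rust-lang/rust#98112 behave roughly as if the compiler inserted something like the following before a raw-pointer read (a sketch; the real check is compiler-generated and its exact form here is an assumption):

fn read_checked(p: *const u32) -> u32 {
    // Roughly what a debug-assertions build verifies before the read:
    debug_assert!(
        !p.is_null() && p.is_aligned(),
        "null or misaligned pointer dereference"
    );
    unsafe { p.read() }
}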

@saethlin
Member

I do not think so. In @digama0's terms, what I'm doing is not like -fwrapv; it's like -fsanitize=undefined. I'm interested in making programs halt and report UB, not continue.
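
The distinction, sketched in Rust terms (wrapping_add standing in for -fwrapv, an explicit check-and-panic standing in for -fsanitize=undefined):

fn main() {
    let (a, b) = (i32::MAX, 1);
    // -fwrapv style: overflow becomes *defined*; execution continues.
    let _wrapped = a.wrapping_add(b); // == i32::MIN
    // -fsanitize=undefined style: the bad operation is *detected* and
    // the program halts with a report instead of silently continuing.
    if a.checked_add(b).is_none() {
        panic!("detected: signed integer overflow");
    }
}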

@Lokathor
Contributor

@comex, from the LLVM LangRef:

The compiler may assume execution will continue after a volatile operation, so operations which modify memory or may have undefined behavior can be hoisted past a volatile operation.

That reads to me as saying that all volatile operations are semantically "will return".

@JakobDegen
Contributor

@Lokathor the very next paragraph is:

As an exception to the preceding rule, the compiler may not assume execution will continue after a volatile store operation. This restriction is necessary to support the somewhat common pattern in C of intentionally storing to an invalid pointer to crash the program. In the future, it might make sense to allow frontends to control this behavior.
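
In Rust terms, the "somewhat common pattern" that exception protects looks like this (deliberately invalid, and still UB at the Rust level; it is shown only to illustrate why LLVM must not assume the store returns):

fn die() -> ! {
    // Intentionally store through an invalid pointer to crash the process.
    // Because LLVM may not assume execution continues past a volatile
    // store, code after this line cannot be hoisted above it.
    unsafe { std::ptr::null_mut::<u8>().write_volatile(0) };
    loop {}
}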
