-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoiding integer overflow #45
Conversation
One idea I had for making using the pub struct Checked<T>(pub T);
impl<T> Checked<T> {
/// Retrieves the stored value
pub fn get(self) -> T {
match self { Checked(x) => x }
}
}
impl<T: CheckedAdd> Add<Checked<T>, Option<T>> for Checked<T> {
fn add(&self, other: &Checked<T>) -> Option<T> {
match (self, other) { (&Checked(ref x), &Checked(ref y)) => x.checked_add(y) }
}
}
impl<T: CheckedSub> Sub<Checked<T>, Option<T>> for Checked<T> {
fn sub(&self, other: &Checked<T>) -> Option<T> {
match (self, other) { (&Checked(ref x), &Checked(ref y)) => x.checked_sub(y) }
}
}
impl<T: CheckedMul> Mul<Checked<T>, Option<T>> for Checked<T> {
fn mul(&self, other: &Checked<T>) -> Option<T> {
match (self, other) { (&Checked(ref x), &Checked(ref y)) => x.checked_mul(y) }
}
}
impl<T: CheckedDiv> Div<Checked<T>, Option<T>> for Checked<T> {
fn div(&self, other: &Checked<T>) -> Option<T> {
match (self, other) { (&Checked(ref x), &Checked(ref y)) => x.checked_div(y) }
}
} Could this be a simpler solution? Anyway, despite the above, I think both range and mod types would be extremely valuable for a safety conscious language like Rust. I think it might be worth looking at how other languages do range and mod types. For example Ada: type Score is range 1 .. 10;
type Hours is mod 24; I asked a question on stack overflow a while back regarding how these are implemented. This is probably already known to you though. |
Don't forget about arithmetic with saturation, there's very basic |
Nimrod has 2 types on integers. The default ones are not allowed to overflow. If they do in a debug build, there's warranted exception. In the release mode behaviour is undefined. I like this dependable crash in debug with the default integer type, but no check in release. Very fast when it matters and much more secure than simple wrapping. Also, I fully agree with bjz that both range and mod types would be very valuable. |
Relevant: the isl library used in GCC's graphite optimizations. I believe it's only used for optimizations because the generic case is exactly the Halting Problem. As for alternatives, an |
The paper Dependent Types in Practical Programming (http://www.cs.cmu.edu/~fp/papers/popl99.pdf) is a good example of this (and some other things it would be useful for), although it doesn't really deal with the size of the integer types (especially the issues with runtime-closed range types, which feel rather unpleasant). |
@akdjka: Rust is never going to have checks that are left out at runtime and lead to undefined behaviour. It doesn't belong in the language outside of |
@thestinger |
Rust isn't going to allow catching undefined behaviour only in debug builds and allowing it in release builds. Adding sanitizer-like flags to |
Could rust get another build mode with equivalent (dangerous) functionality to c++ release, but in rust terminology it would be an 'unsafe-performance' or something. I get that rusts focus is safe, you might add yet more checks for a 'debug' build' .. but for me it would be interesting to consider rust's default semantics as basically a "debug build on steroids" with lots of extra "assertions" built into the language semantics.. but for "unsafe-performance" matching C++ you dont want a single branch out of place. .. changes would be things like unwrap not testing for failiure, no bounds-check on array acess, probably some different numeric assumptions .. IMO Rust has a lot to offer besides safety - I looked for this language mostly because I dont like the way classes and headers work in c++, and was interested in 'a more FP style', whilst not being able to tolerate a GC language. Bear in mind the language shootouts... perhaps it would be nice to show that rust can match C++ 1:1, if you want it to. It shouldn't involve having to write uglier code than C++ (unwraps, not getting nice array sugar, extra nesting for unsafe or making your code incompatible with mainstream rust code, having to use nonstandard types like 'real' instead of float, etc) |
@dobkeratops , an "unsafe-performance" compiler flag is not acceptable. Thoughtlessly disabling safety features for the sake of performance is contrary to the language's philosophy. |
I would love some data on the performance impact of checked arithmetic. I don't think using the Clang flag is close enough. I would like to have fail on overflow implemented in Rust and profile Servo (or some other large code base) to see the performance impact. The downside to this is we might need some non-trivial optimisation in Rust code to make checking performant. |
Nothing short of hardware support is going to make it perform well. LLVM can't even deal with totally obvious value propagation within a local scope with only local pointers. It struggles just to optimize out the iterators built out of pairs of raw pointers because it's unable to optimize out a null check right next to code proving it's not null by dereferencing it. Doing something almost reasonable with code that looks like what a C programmer would write is about all you can expect from LLVM. |
Here's a trivial test case using the most efficient implementation of checked arithmetic that's available (LLVM intrinsics explicitly performing it and compiling down to near optimal machine code): extern crate test;
use std::intrinsics::{abort, u32_add_with_overflow};
use test::Bencher;
#[inline(never)]
pub fn no_overflow_check_inner(xs: &mut [u32]) {
for x in xs.mut_iter() {
*x += 1
}
}
#[inline(never)]
pub fn overflow_check_inner(xs: &mut [u32]) {
for x in xs.mut_iter() {
let i = *x;
let (result, overflow) = unsafe { u32_add_with_overflow(i, 10) };
if overflow { unsafe { abort() } }
*x = result;
}
}
#[bench]
pub fn no_overflow_check(b: &mut Bencher) {
let mut xs = [0, ..300000];
b.iter(|| {
no_overflow_check_inner(xs);
})
}
#[bench]
pub fn overflow_check(b: &mut Bencher) {
let mut xs = [0, ..300000];
b.iter(|| {
overflow_check_inner(xs);
})
}
This is not atypical of what the hot loops in an application look like. Servo will likely have many loops like this (loops or series of scalar integer operations), and performing these checks makes it far slower than Java or even JavaScript. It's not possible to do this and claim Rust is a systems language - it will be fighting for the performance crown with PyPy, not C++. It's all well and good to pretend there's a sufficiently smart compiler but there is not; hardware support is required to make this perform well and even if it became available on modern x86 CPUs it wouldn't be used by the x86 binaries If someone fixes this test case, and the next one, and the next one, and many far more complicated cases with all kinds of pointer aliasing then perhaps it would start making more sense. This case should be trivial to optimize, because it's clear that every one of these is going to overflow at the same time from some interprocedural. Real world cases are far more complex than this, and LLVM isn't even able to begin tackling the Hello World of interprocedural optimization. |
I don't think that a compiler will help us, not easily at least (the kind of optimisation I imagine are not the kind that LLVM would do, but that we would do in rustc, e.g., hoist the overflow check out of the loop in I know that adding an overflow check for this kind of micro-benchmark will be ruinous to performance, but the question is really how many tight loops like this exist in Servo and how often they get executed. Even if performance in such a loop degrades 100% or more in the worst case, we have no idea how this will affect real performance in real software. |
Optimizing in
You first have to prove that they're all equal via interprocedural analysis, which is trivial in this case but impossible in most cases.
This is far from the worst case. I can write a bigger program where the code bloat from all the branches on nearly every primitive operation causes far pipeline stalls. |
I don't see a point in Rust existing if it doesn't take performance seriously. In my opinion, Rust is already missing the performance and compile-time speed it needs to be a competitor to C and C++. At the moment, the abstractions hit too many missed optimizations and optimal code ends up requiring lots of custom The array bounds checks are a killer performance issue and working around it is very ugly. The extensive use of unwinding is leading to ridiculously long compile-time and causing missed optimizations all over the place too. There's a very good reason for performance-conscious projects like LLVM disabling unwinding and Rust isn't viable in that niche right now. As far as I can tell, Swift completely avoids exceptions and unwinding and as a language designed by the lead developer of LLVM that says a lot. Rust can bury its head in the sand and ignore the current compile-time and performance issues, but C++ programmers won't care about the excuses. They won't take a second look at the language if it branches on every single integer arithmetic operation. It's already pretty hard to sell the sacrifices the languages is making. I'm not sure how viable a C++ replacement the language is when anything doing array indexing is going to perform poorly without lots of |
@thestinger if you are right (about the performance impact), than I would not support this, but I want to see evidence. Currently we have no data to indicate any performance hit in real software (and likewise no data to indicate there won't be a performance hit). Dismissing the feature because of poor performance in a micro-benchmark is premature optimisation of every program ever to be written in Rust. Comments on array bounds checks, unwinding, and other performance concerns are off-topic for this RFC, please take them elsewhere. |
You're proposing unwinding via a branch on every arithmetic operation, so comments about the performance and compile-time woes caused by unwinding are very relevant. It's already a 2x compile-time hit, and adding more unwinding and lots of new branches will make it worse.
Rust should stop being advertised as a C competitor / systems language if it doesn't care about the obvious code mapping down to good machine code. If it's easier to write fast numeric code in Java, then the language isn't a systems language. |
In addition to performance, and the fact that it makes operations not associative, overflow checking is bad because it solves the wrong problem. If one adds two integers that don't fit into a 32/64-bit integer, the ideal behavior is generally not to abort the program, but rather to simply store the result in a type that is large enough, even if that means the integer type is larger than 32/64 bits. The RFC does propose that, and applied to @thestinger test case, would allow to easily express a program that at the LLVM bytecode level uses an array of 64-bit to 128-bit integers (depending on settings and whether the 1/10 addend is constant or not) instead of 32-bit integers, making overflow impossible. |
@bill-myers wasting half the registers and cache is not much better than excessive branching. |
I'm in agreement with @thestinger. We've long since decided to eat the cost of bounds checking (which nobody is proposing that we remove), and that eats up the entirety of our imposed-overhead budget. I'm also starting to be convinced that the cost of unwinding will be unacceptable in the long run. Given these, we just don't have space for a decadent feature like this, especially one that doesn't close any memory safety holes. Moving optimization passes into the frontend sounds like a bad idea as well. Having well-defined integer overflow is already a step forward from C and C++. Baby steps. |
Per today's RFC triage we're going to close this. The default unchecked behavior is unlikely to change, and there are several other related proposals whose approach would be preferred over this. |
Make stream::Fuse return None after finishing, not NotReady
Solicit feedback about IE8 and IE9 support in Ember 2.x
This is an RFC for avoiding integer overflow, which first discusses various options, and then focuses on and explains a specific proposal that does not make use of failure, and introduces a new kind of integer type to handle things like loop counters.