Handle overflow properly in core::slice #25300

lilyball · 2015-05-11T08:17:26Z

core::slice was originally written to tolerate overflow (notably, with
slices of zero-sized elements), but it was never updated to use wrapping
arithmetic when overflow traps were added.

Also correctly handle the case of calling .nth() on an Iter with a
zero-sized element type. The iterator was assuming that the pointer
value of the returned reference was meaningful, but that's not true for
zero-sized elements.

Fixes #25016.

core::slice was originally written to tolerate overflow (notably, with slices of zero-sized elements), but it was never updated to use wrapping arithmetic when overflow traps were added. Also correctly handle the case of calling .nth() on an Iter with a zero-sized element type. The iterator was assuming that the pointer value of the returned reference was meaningful, but that's not true for zero-sized elements. Fixes rust-lang#25016.

rust-highfive · 2015-05-11T08:17:30Z

r? @aturon

(rust_highfive has picked a reviewer for you, use r? to override)

lilyball · 2015-05-11T08:19:12Z

src/libcore/slice.rs

@@ -684,13 +666,11 @@ macro_rules! iterator {
            fn next(&mut self) -> Option<$elem> {
                // could be implemented with slices, but this avoids bounds checks
                unsafe {
-                    ::intrinsics::assume(!self.ptr.is_null());
-                    ::intrinsics::assume(!self.end.is_null());


I'm not sure why these assumes were added in the first place, they showed up in f9ef8cd without anything relevant in the commit message, but they aren't valid.

helping llvm to not insert redundant null checks in certain loops. Ask @dotdash and @huonw

Why are they invalid?

I don't have everything in fresh memory, but if these are invalid, we are in big trouble anyway (can't use .offset). Split in cases based on the size of T here too?

They're potentially invalid if the element size is zero, because the iterators are free to wrap around in that case. Slices of zero-sized elements don't really have a meaningful pointer, but the iterators behave as though the element has a size of 1 in order to count the number of yielded elements correctly. This means that a slice with a sufficiently high pointer value and sufficiently large length will result in an iterator that wraps around.

I also question the edge case of a slice of non-zero-sized elements that includes the highest byte in the address space. It should be theoretically possible to have a valid slice that includes an element placed in the absolute highest address, and if so, then the end pointer will have wrapped around to 0 and the ptr value will wrap around when the iterator yields its final element. Although in this particular edge case I don't know if the use of offset() is valid.

I just added the assumptions back in a slightly different form. I believe the new version should be valid in all cases.

I'm of the impression that such an object size and placement would be illegal/impossible. If it isn't, we should make sure it is illegal in safe rustc. We can't safely index/address all bytes of that object otherwise.

GEP is our .offset().

If the GEP has the inbounds keyword, the result value is undefined (a “trap value”) if the GEP overflows (i.e. wraps around the end of the address space).

http://llvm.org/docs/GetElementPtr.html#what-happens-if-a-gep-computation-overflows

@bluss Ok, I wasn't sure what actually happened if it overflowed. I assumed that the resulting value couldn't actually be dereferenced, but I thought it was plausible that it could still be used as a pointer comparison (since one-past-the-end is a valid location, and having an object occupy the last byte in the address space seems like a plausible thing to support). But if the documentation explicitly says it's a poison value then I guess it is.

That said, this only applies to slices of non-zero-sized elements. offset() on a slice of zero-sized elements results in the same pointer (because anything multiplied by 0 is still 0). The iterators use that macro specifically to get around this case, but e.g. indexing into a slice of zero-sized elements uses offset() and that's fine (it just results in the same pointer that it was called on).

In light of this, I can re-add the end assumption to next().

The previous assumptions were not valid for slices of zero-sized elements.

lilyball · 2015-05-11T19:00:57Z

src/libcore/slice.rs

+                    unsafe {
+                        if mem::size_of::<T>() != 0 {
+                            ::intrinsics::assume(!self.ptr.is_null());
+                            ::intrinsics::assume(!self.end.is_null());


I'm assuming here that there's no need to make these assertions in the case where self.ptr == self.end. We don't care about the pointer value in that case.

Optimizing compilers are weird and chaotic. This will simply be revisited until it benches/codegens perfectly (it's the most important iterator after all).

Gankra · 2015-05-11T19:12:56Z

@kballard would you be willing to confirm that none of the benchmarks that the assertions were introduced to add have changed?

lilyball · 2015-05-11T19:16:39Z

@gankro If you can tell me which benchmarks were expected to be affected by the assertions then I can run them locally. But the commit that introduced them didn't make any mention of the assertions so I wasn't sure what the actual effect was.

Gankra · 2015-05-11T19:18:55Z

@kballard Oh whoops, I didn't mean to post that comment until I found which benches to check :)

I thought it was #21886 but that seems to be different assertions. Perhaps @dotdash knows.

lilyball · 2015-05-11T19:21:26Z

@gankro Looking at #22466 (the PR that introduced these assumptions) Kimundi said:

I talked with dotdash about - its not exactly duplication because the assumes are at a different location, and depending how the code will be inlined there might be cases where either of the two changes might not apply. Also, giving llvm multiple optimization hints can't really hurt.

However, currently these hints are unneeded for my code anyway, because the optimized codepaths that depended on them did not end up in the current iteration of this PR, so if you'd rather have them removed for now I can do that too.

So it's not clear that there even were any benchmarks affected by the addition of the assumptions in the first place.

Gankra · 2015-05-11T19:22:41Z

Good enough for me (still looking over the nitty gritty details). 🌴

lilyball · 2015-05-11T19:26:06Z

Looking at #21886 actually reveals another issue: Iter.as_slice() can yield a slice with a ptr of null. I'll add a commit to fix that as well.

core::slice::Iter.ptr can be null when iterating a slice of zero-sized elements, but the pointer value used for the slice itself cannot. Handle this case by always returning a dummy pointer for slices of zero-sized elements.

lilyball · 2015-05-11T19:41:04Z

I've added a commit to handle that case as well. I haven't actually run the tests locally for it yet, I'm going to let Travis CI do that for me.

Gankra · 2015-05-11T20:06:44Z

r=me once tests pass. Don't care a lot about my nits.

lilyball · 2015-05-11T20:10:32Z

Hmm, did Travis CI only run make tidy? I thought it did more than that.

Gankra · 2015-05-11T20:10:49Z

Nah it only does tidy these days

lilyball · 2015-05-11T20:11:09Z

I'll run a make check-stage1 locally first, unless you'd rather just let bors handle it?

Gankra · 2015-05-11T20:12:14Z

@bors r+

what are automation systems for but to test our junk for us?

bors · 2015-05-11T20:12:14Z

📌 Commit f2614f5 has been approved by Gankro

core::slice was originally written to tolerate overflow (notably, with slices of zero-sized elements), but it was never updated to use wrapping arithmetic when overflow traps were added. Also correctly handle the case of calling .nth() on an Iter with a zero-sized element type. The iterator was assuming that the pointer value of the returned reference was meaningful, but that's not true for zero-sized elements. Fixes #25016.

bors · 2015-05-12T14:39:34Z

⌛ Testing commit f2614f5 with merge 2a5a320...

bors · 2015-05-12T16:25:20Z

☀️ Test successful - auto-linux-32-nopt-t, auto-linux-32-opt, auto-linux-64-nopt-t, auto-linux-64-opt, auto-linux-64-x-android-t, auto-mac-32-opt, auto-mac-64-nopt-t, auto-mac-64-opt, auto-win-32-nopt-t, auto-win-32-opt, auto-win-64-nopt-t, auto-win-64-opt

rust-highfive assigned aturon May 11, 2015

lilyball mentioned this pull request May 11, 2015

core::slice code may overflow with slices of zero-sized elements #25016

Closed

lilyball reviewed May 11, 2015
View reviewed changes

Reintroduce non-null assumptions in core::slice iterators

e1e34e9

The previous assumptions were not valid for slices of zero-sized elements.

lilyball force-pushed the core-slice-overflow branch from 9ff95a6 to e1e34e9 Compare May 11, 2015 18:59

lilyball reviewed May 11, 2015
View reviewed changes

frewsxcv mentioned this pull request May 12, 2015

Certain file causes panic upon unpack alexcrichton/tar-rs#23

Closed

bors merged commit f2614f5 into rust-lang:master May 12, 2015

lilyball deleted the core-slice-overflow branch May 12, 2015 19:48

Handle overflow properly in core::slice #25300

Handle overflow properly in core::slice #25300

Uh oh!

Conversation

lilyball commented May 11, 2015

Uh oh!

rust-highfive commented May 11, 2015

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Gankra commented May 11, 2015

Uh oh!

lilyball commented May 11, 2015

Uh oh!

Gankra commented May 11, 2015

Uh oh!

lilyball commented May 11, 2015

Uh oh!

Gankra commented May 11, 2015

Uh oh!

lilyball commented May 11, 2015

Uh oh!

lilyball commented May 11, 2015

Uh oh!

Gankra commented May 11, 2015

Uh oh!

lilyball commented May 11, 2015

Uh oh!

Gankra commented May 11, 2015

Uh oh!

lilyball commented May 11, 2015

Uh oh!

Gankra commented May 11, 2015

Uh oh!

bors commented May 11, 2015

Uh oh!

bors commented May 12, 2015

Uh oh!

bors commented May 12, 2015

Uh oh!

Uh oh!