Skip to content

Conversation

Mark-Simulacrum
Copy link
Member

@Mark-Simulacrum Mark-Simulacrum commented May 12, 2020

On local one-off benchmarking of libcore metadata-only, debug asserts in std are a significant hit (15s to 20s). Provide an option for compiler developers to disable them. A build with a nightly compiler is around 10s, for reference.

@rust-highfive
Copy link
Contributor

@Mark-Simulacrum: no appropriate reviewer found, use r? to override

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 12, 2020
@Mark-Simulacrum
Copy link
Member Author

Happy to let @nnethercote approve (or perhaps @alexcrichton has some time, not sure).

@alexcrichton
Copy link
Member

@bors: r+

@bors
Copy link
Collaborator

bors commented May 12, 2020

📌 Commit 6c41545 has been approved by alexcrichton

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 12, 2020
@bjorn3
Copy link
Member

bjorn3 commented May 13, 2020

Should std debug assertions be enabled at

RUST_CONFIGURE_ARGS="$RUST_CONFIGURE_ARGS --enable-debug-assertions"

@Mark-Simulacrum
Copy link
Member Author

This shouldn't change behavior of any existing configurations, just allows for more control. So I think no.

@RalfJung
Copy link
Member

RalfJung commented May 14, 2020

Given numbers like this, should we stop pursuing issues like #51713 #53871 #54915 ?

@Mark-Simulacrum
Copy link
Member Author

I don't have time right now to look into it, but I think that the answer is probably yes -- or at least we should consider using something even more stringent than just debug asserts. I'd be down for example to add a separate --cfg for std that enabled "slow mode."

Interestingly, looking at a perf profile of a debug asserting std in the rustc libcore case I used for this PR's measurements, I see that a lot of time is spent in core::intrinsics::is_nonoverlapping.

It looks like it has 3 separate panic calls in the assembly -- presumably from subtractions and the checked_mul unwrap? That makes its code size pretty large for what it does.

I think we should be replacing any panics in the debug asserts in std with just aborts -- it doesn't make sense to pay the cost of unwinding and the panic message being formatted etc. At the very least I stop seeing most of the calls to is_nonoverlapping with this patch applied; presumably they get inlined?

It doesn't seem to result in a significant difference in timing though, I'm still at approximately 20 seconds, so maybe this is the wrong approach :/

diff --git a/src/libcore/intrinsics.rs b/src/libcore/intrinsics.rs
index 962bcbe6198..4c8650168c1 100644
--- a/src/libcore/intrinsics.rs
+++ b/src/libcore/intrinsics.rs
@@ -1954,8 +1954,13 @@ pub(crate) fn is_aligned_and_not_null<T>(ptr: *const T) -> bool {
 pub(crate) fn is_nonoverlapping<T>(src: *const T, dst: *const T, count: usize) -> bool {
     let src_usize = src as usize;
     let dst_usize = dst as usize;
-    let size = mem::size_of::<T>().checked_mul(count).unwrap();
-    let diff = if src_usize > dst_usize { src_usize - dst_usize } else { dst_usize - src_usize };
+    let size = mem::size_of::<T>().checked_mul(count).unwrap_or_else(|| unsafe { abort() });
+    // Wrapping subs here are fine because we've already checked the condition
+    let diff = if src_usize > dst_usize {
+        src_usize.wrapping_sub(dst_usize)
+    } else {
+        dst_usize.wrapping_sub(src_usize)
+    };
     // If the absolute distance between the ptrs is at least as big as the size of the buffer,
     // they do not overlap.
     diff >= size

results in this assembly:

core::intrinsics::is_nonoverlapping  /home/mark/Build/rust/build/x86_64-unknown-linux-gnu/stage1/lib/librustc_driver-a1a55821ab603792.so [Percent: local period]
 11.47mov    %rdx,%rax
 12.08mov    $0x8,%ecx
  9.55mul    %rcx
 13.37 │    ↓ jo     24
  7.63mov    %rdi,%rcx
sub    %rsi,%rcx
 12.11neg    %rcx
sub    %rsi,%rdi
  1.92 │      cmovbe %rcx,%rdi
 12.74cmp    %rax,%rdi
 10.23 │      setae  %al
  8.91 │    ← retq                                                                                                                                                                                                 ▒
24:   ud2
ud2

versus the old assembly:

core::intrinsics::is_nonoverlapping  /home/mark/Build/rust/build/x86_64-unknown-linux-gnu/stage1/lib/librustc_driver-a1a55821ab603792.so [Percent: local period]
  9.14push  %rax
 13.83mov   %rdx,%rax
  0.93mov   $0x8,%ecx
  1.88mul   %rcx
  1.88 │    ↓ jo    43
  1.17mov   %rdi,%rcx
  8.18sub   %rsi,%rcx
       │    ↓ jbe   33
  0.94 │    ↓ jae   3b                                                                                                                                                                                             ▒
lea   str.0,%rdi
lea   anon.602fa13ea877b28742aac08a06b3d28a.220.llvm.2151656835915294274+0x31d0,%rdx
mov   $0x21,%esi
       │    → callq *0x25f0617(%rip)        # 8567168 <core::panicking::panic@Base>                                                                                                                                ▒
ud2
  1.8833:   sub   %rdi,%rsi
  1.18mov   %rsi,%rcx
  0.94 │    ↓ jb    5e                                                                                                                                                                                             ▒
  1.88 │3b:   cmp   %rax,%rcx
  0.47 │      setae %al
  3.05pop   %rcx
 52.64 │    ← retq                                                                                                                                                                                                 ▒
43:   lea   str.0+0x21,%rdi
lea   anon.602fa13ea877b28742aac08a06b3d28a.220.llvm.2151656835915294274+0x31b8,%rdx
mov   $0x2b,%esi
       │    → callq *0x25f05ec(%rip)        # 8567168 <core::panicking::panic@Base>                                                                                                                                ▒
ud2                            │5e:   lea   str.0,%rdi
lea   anon.602fa13ea877b28742aac08a06b3d28a.220.llvm.2151656835915294274+0x31e8,%rdx
mov   $0x21,%esi
       │    → callq *0x25f05d1(%rip)        # 8567168 <core::panicking::panic@Base>                                                                                                                                ▒
ud2

@RalfJung
Copy link
Member

Interesting. Aborts don't generate backtraces, which makes them much harder to debug, but that might still be worth it.

@Mark-Simulacrum
Copy link
Member Author

Yeah, but we should only be using them in places where unsafe code would otherwise have UB and I think the tradeoff may be worth it. Ideally we'd replace debug_assert! to call abort rather than panic! and measure, I think.

bors added a commit to rust-lang-ci/rust that referenced this pull request May 15, 2020
Rollup of 7 pull requests

Successful merges:

 - rust-lang#71809 (Use `LocalDefId` in `DumpVisitor::nest_tables`)
 - rust-lang#72062 (Add built in PSP target)
 - rust-lang#72146 (Provide separate option for std debug asserts)
 - rust-lang#72172 (Forbid stage arguments to check)
 - rust-lang#72173 (Make intra links work inside trait impl block)
 - rust-lang#72200 (Add prioritize_on attribute to triagebot)
 - rust-lang#72214 (Minor fixes to comments)

Failed merges:

r? @ghost
@bors bors merged commit 24cd427 into rust-lang:master May 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants