Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rollup of 6 pull requests #84804

Closed
wants to merge 14 commits into from

Conversation

Dylan-DPC-zz
Copy link

Successful merges:

Failed merges:

r? @ghost
@rustbot modify labels: rollup

Create a similar rollup

dario23 and others added 14 commits April 21, 2021 14:09
On arm64 we have seen on several databases that ISB (instruction synchronization
barrier) is better to use than yield in a spin loop.  The yield instruction is a
nop.  The isb instruction puts the processor to sleep for some short time.  isb
is a good equivalent to the pause instruction on x86.

Below is an experiment that shows the effects of yield and isb on Arm64 and the
time of a pause instruction on x86 Intel processors.  The micro-benchmarks use
https://github.com/google/benchmark.git

$ cat a.cc
static void BM_scalar_increment(benchmark::State& state) {
  int i = 0;
  for (auto _ : state)
    benchmark::DoNotOptimize(i++);
}
BENCHMARK(BM_scalar_increment);
static void BM_yield(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("yield"::);
}
BENCHMARK(BM_yield);
static void BM_isb(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("isb"::);
}
BENCHMARK(BM_isb);
BENCHMARK_MAIN();

$ g++ -o run a.cc -O2 -lbenchmark -lpthread
$ ./run

--------------------------------------------------------------
Benchmark                    Time             CPU   Iterations
--------------------------------------------------------------

AWS Graviton2 (Neoverse-N1) processor:
BM_scalar_increment      0.485 ns        0.485 ns   1000000000
BM_yield                 0.400 ns        0.400 ns   1000000000
BM_isb                    13.2 ns         13.2 ns     52993304

AWS Graviton (A-72) processor:
BM_scalar_increment      0.897 ns        0.874 ns    801558633
BM_yield                 0.877 ns        0.875 ns    800002377
BM_isb                    13.0 ns         12.7 ns     55169412

Apple Arm64 M1 processor:
BM_scalar_increment      0.315 ns        0.315 ns   1000000000
BM_yield                 0.313 ns        0.313 ns   1000000000
BM_isb                    9.06 ns         9.06 ns     77259282

static void BM_pause(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("pause"::);
}
BENCHMARK(BM_pause);

Intel Skylake processor:
BM_scalar_increment      0.295 ns        0.295 ns   1000000000
BM_pause                  41.7 ns         41.7 ns     16780553

Tested on Graviton2 aarch64-linux with `./x.py test`.
All fields except the discriminant (including `outer_fields`)
should be put into structures inside the variant part, which gives
an equivalent layout but offers us much better integration with
debuggers.
 - Literally, variants are not artificial. We have `yield` statements,
   upvars and inner variables in the source code.
 - Functionally, we don't want debuggers to suppress the variants. It
   contains the state of the generator, which is useful when debugging.
   So they shouldn't be marked artificial.
 - Debuggers may use artificial flags to find the active variant. In
   this case, marking variants artificial will make debuggers not work
   properly.

Fixes rust-lang#79009.
This should fix linking of other C code (and soon Rust-generated code)
on aarch64 musl.
…or,RalfJung

Make AssertKind::fmt_assert_args public
[Arm64] use isb instruction instead of yield in spin loops

On arm64 we have seen on several databases that ISB (instruction synchronization
barrier) is better to use than yield in a spin loop.  The yield instruction is a
nop.  The isb instruction puts the processor to sleep for some short time.  isb
is a good equivalent to the pause instruction on x86.

Below is an experiment that shows the effects of yield and isb on Arm64 and the
time of a pause instruction on x86 Intel processors.  The micro-benchmarks use
https://github.com/google/benchmark.git

```
$ cat a.cc
static void BM_scalar_increment(benchmark::State& state) {
  int i = 0;
  for (auto _ : state)
    benchmark::DoNotOptimize(i++);
}
BENCHMARK(BM_scalar_increment);
static void BM_yield(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("yield"::);
}
BENCHMARK(BM_yield);
static void BM_isb(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("isb"::);
}
BENCHMARK(BM_isb);
BENCHMARK_MAIN();

$ g++ -o run a.cc -O2 -lbenchmark -lpthread
$ ./run

--------------------------------------------------------------
Benchmark                    Time             CPU   Iterations
--------------------------------------------------------------

AWS Graviton2 (Neoverse-N1) processor:
BM_scalar_increment      0.485 ns        0.485 ns   1000000000
BM_yield                 0.400 ns        0.400 ns   1000000000
BM_isb                    13.2 ns         13.2 ns     52993304

AWS Graviton (A-72) processor:
BM_scalar_increment      0.897 ns        0.874 ns    801558633
BM_yield                 0.877 ns        0.875 ns    800002377
BM_isb                    13.0 ns         12.7 ns     55169412

Apple Arm64 M1 processor:
BM_scalar_increment      0.315 ns        0.315 ns   1000000000
BM_yield                 0.313 ns        0.313 ns   1000000000
BM_isb                    9.06 ns         9.06 ns     77259282
```

```
static void BM_pause(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("pause"::);
}
BENCHMARK(BM_pause);

Intel Skylake processor:
BM_scalar_increment      0.295 ns        0.295 ns   1000000000
BM_pause                  41.7 ns         41.7 ns     16780553
```

Tested on Graviton2 aarch64-linux with `./x.py test`.
Fix debuginfo for generators

First, all fields except the discriminant (including `outer_fields`) should be put into structures inside the variant part, which gives an equivalent layout but offers us much better integration with debuggers.

Second, artificial flags in generator variants should be removed.
 - Literally, variants are not artificial. We have `yield` statements, upvars and inner variables in the source code.
 - Functionally, we don't want debuggers to suppress the variants. It contains the state of the generator, which is useful when debugging. So they shouldn't be marked artificial.
 - Debuggers may use artificial flags to find the active variant. In this case, marking variants artificial will make debuggers not work properly.

Fixes rust-lang#79009.

And refer https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Debuginfo.20for.20generators.
…ns, r=Amanieu

Update compiler-builtins to 0.1.41 to get fix for outlined atomics

This should fix linking of other C code (and soon Rust-generated code)
on aarch64 musl.
@rustbot rustbot added the rollup A PR which is a rollup label May 1, 2021
@Dylan-DPC-zz
Copy link
Author

@bors r+ rollup=never p=5

@bors
Copy link
Contributor

bors commented May 1, 2021

📌 Commit 6beac08 has been approved by Dylan-DPC

@bors bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label May 1, 2021
@bors
Copy link
Contributor

bors commented May 2, 2021

⌛ Testing commit 6beac08 with merge 5cfa3c185aacb49812f4b70e50fd6e9d8d25c03b...

@rust-log-analyzer
Copy link
Collaborator

The job dist-aarch64-msvc failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
   Compiling compiler_builtins v0.1.41
   Compiling libc v0.2.93
The following warnings were emitted during compilation:

warning: <unknown>:0: error: this directive must appear between .cfi_startproc and .cfi_endproc directives
error: failed to run custom build command for `compiler_builtins v0.1.41`

Caused by:
Caused by:
  process didn't exit successfully: `D:\a\rust\rust\build\x86_64-pc-windows-msvc\stage1-std\release\build\compiler_builtins-ba1d5695d3dcf262\build-script-build` (exit code: 1)
  --- stdout
  cargo:rerun-if-changed=build.rs
  cargo:compiler-rt=C:\Users\runneradmin\.cargo\registry\src\github.com-1ecc6299db9ec823\compiler_builtins-0.1.41\compiler-rt
  cargo:rustc-cfg=feature="unstable"
  cargo:rerun-if-changed=D:\a\rust\rust\src/llvm-project/compiler-rt\lib/builtins\aarch64/lse.S
  TARGET = Some("aarch64-pc-windows-msvc")
  OPT_LEVEL = Some("3")
  HOST = Some("x86_64-pc-windows-msvc")
  CC_aarch64-pc-windows-msvc = None
  CC_aarch64_pc_windows_msvc = None
  TARGET_CC = None
  CC = Some("D:/a/rust/rust/citools/clang-rust/bin/clang-cl.exe")
  CFLAGS_aarch64-pc-windows-msvc = None
  CFLAGS_aarch64_pc_windows_msvc = None
  TARGET_CFLAGS = None
  CRATE_CC_NO_DEFAULTS = None
  CRATE_CC_NO_DEFAULTS = None
  CARGO_CFG_TARGET_FEATURE = Some("crt-static,fp,neon")
  DEBUG = Some("true")
  CC_aarch64-pc-windows-msvc = None
  CC_aarch64_pc_windows_msvc = None
  TARGET_CC = None
  CC = Some("D:/a/rust/rust/citools/clang-rust/bin/clang-cl.exe")
  CFLAGS_aarch64-pc-windows-msvc = None
  CFLAGS_aarch64_pc_windows_msvc = None
  TARGET_CFLAGS = None
  CRATE_CC_NO_DEFAULTS = None
  CRATE_CC_NO_DEFAULTS = None
  CARGO_CFG_TARGET_FEATURE = Some("crt-static,fp,neon")
  running: "D:/a/rust/rust/citools/clang-rust/bin/clang-cl.exe" "-nologo" "-MT" "-O2" "-Z7" "-Brepro" "--target=aarch64-pc-windows-msvc" "-I" "D:\\a\\rust\\rust\\src/llvm-project/compiler-rt\\lib/builtins" "/Zl" "-D__func__=__FUNCTION__" "-DL_cas" "-DSIZE=1" "-DMODEL=1" "-FoD:\\a\\rust\\rust\\build\\x86_64-pc-windows-msvc\\stage1-std\\aarch64-pc-windows-msvc\\release\\build\\compiler_builtins-3d58653139d156db\\out\\lse.o" "-c" "D:\\a\\rust\\rust\\src/llvm-project/compiler-rt\\lib/builtins\\aarch64/lse.S"
  cargo:warning=<unknown>:0: error: this directive must appear between .cfi_startproc and .cfi_endproc directives

  --- stderr



  error occurred: Command "D:/a/rust/rust/citools/clang-rust/bin/clang-cl.exe" "-nologo" "-MT" "-O2" "-Z7" "-Brepro" "--target=aarch64-pc-windows-msvc" "-I" "D:\\a\\rust\\rust\\src/llvm-project/compiler-rt\\lib/builtins" "/Zl" "-D__func__=__FUNCTION__" "-DL_cas" "-DSIZE=1" "-DMODEL=1" "-FoD:\\a\\rust\\rust\\build\\x86_64-pc-windows-msvc\\stage1-std\\aarch64-pc-windows-msvc\\release\\build\\compiler_builtins-3d58653139d156db\\out\\lse.o" "-c" "D:\\a\\rust\\rust\\src/llvm-project/compiler-rt\\lib/builtins\\aarch64/lse.S" with args "clang-cl.exe" did not execute successfully (status code exit code: 1).

warning: build failed, waiting for other jobs to finish...
[RUSTC-TIMING] core test:false 21.355
error: build failed
error: build failed
command did not execute successfully: "\\\\?\\D:\\a\\rust\\rust\\build\\x86_64-pc-windows-msvc\\stage0\\bin\\cargo.exe" "build" "--target" "aarch64-pc-windows-msvc" "-Zbinary-dep-depinfo" "-j" "8" "--release" "--locked" "--color" "always" "--features" "panic-unwind backtrace profiler compiler-builtins-c" "--manifest-path" "D:\\a\\rust\\rust\\library/test/Cargo.toml" "--message-format" "json-render-diagnostics"
failed to run: D:\a\rust\rust\build\bootstrap\debug\bootstrap dist
Build completed unsuccessfully in 0:29:15

@bors
Copy link
Contributor

bors commented May 2, 2021

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels May 2, 2021
@Dylan-DPC-zz Dylan-DPC-zz deleted the rollup-eozupe4 branch May 3, 2021 18:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rollup A PR which is a rollup S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants