[AArch64] llvm-bolt crash on aarch64 platform #61075

Closed
rickyzhang012500 opened this issue Mar 1, 2023 · 15 comments
Labels
BOLT crash Prefer [crash-on-valid] or [crash-on-invalid]

Comments

@rickyzhang012500

rickyzhang012500 commented Mar 1, 2023

tztek:/userdata/# llvm-bolt --version
LLVM (http://llvm.org/):
  LLVM version 15.0.7
  Optimized build with assertions.
  Default target: aarch64-unknown-linux-gnu
  Host CPU: (unknown)

BOLT revision <unknown>
  Registered Targets:
    aarch64    - AArch64 (little endian)
    aarch64_32 - AArch64 (little endian ILP32)
    aarch64_be - AArch64 (big endian)
    arm64      - ARM64 (little endian)
    arm64_32   - ARM64 (little endian ILP32)
ubuntu:$ /opt/llvm/bin/clang -v
clang version 13.0.1
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/llvm/bin
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.5.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/8
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.5.0
Candidate multilib: .;@m64
Selected multilib: .;@m64

Compiler Flags:

bazel build -c opt --config=j5 //speed:speed_finder_bm --copt=-O2 --copt=-flto --copt=-fwhole-program-vtables --copt=-gline-tables-only --copt=-funique-internal-linkage-names --copt=-fdebug-info-for-profiling --linkopt="-Wl,--no-rosegment" --copt=-fprofile-sample-use=/qcraft/speed_finder_bm.prof --copt=-emit-llvm --linkopt="-Wl,--emit-relocs,-znow"
tztek:/userdata# llvm-bolt ./speed_finder_bm -instrument -o speed_finder_bm_ins
BOLT-INFO: shared object or position-independent executable detected
BOLT-INFO: Target architecture: aarch64
BOLT-INFO: BOLT version: <unknown>
BOLT-INFO: first alloc address is 0x0
BOLT-INFO: creating new program header table at address 0x1600000, offset 0x1600000
BOLT-INFO: enabling relocation mode
BOLT-INFO: forcing -jump-tables=move for instrumentation
BOLT-INFO: disabling -align-macro-fusion on non-x86 platform
BOLT-INFO: number of removed linker-inserted veneers: 0
BOLT-INFO: 0 out of 42694 functions in the binary (0.0%) have non-empty execution profile
BOLT-INFO: merged 1 duplicate CFG edge
BOLT-INFO: UCE removed 0 blocks and 0 bytes of code.
BOLT-INFO: Starting stub-insertion pass
BOLT-INFO: Inserted 1 stubs in the hot area and 0 stubs in the cold area. Shared 0 times, iterated 2 times.
#0 0x0000aaaace5d1804 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/llvm-project-15.0.7.src/llvm/lib/Support/Unix/Signals.inc:569:0
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: ./llvm-bolt ./speed_finder_bm -instrument -o speed_finder_bm_ins
Segmentation fault (core dumped)
@EugeneZelenko EugeneZelenko added BOLT crash Prefer [crash-on-valid] or [crash-on-invalid] and removed new issue labels Mar 1, 2023
@llvmbot
Member

llvmbot commented Mar 1, 2023

@llvm/issue-subscribers-bolt

@aaupov
Contributor

aaupov commented Mar 1, 2023

BOLT instrumentation is not supported on AArch64.

@aaupov aaupov closed this as completed Mar 1, 2023
@rickyzhang012500
Author

> BOLT instrumentation is not supported on AArch64.

Another question: is BOLT sampling mode supported on AArch64?
I ran into a problem with it. BOLT finishes without errors, but the optimized binary crashes at startup, as shown below:
tztek:/userdata/# llvm-bolt ./speed_finder_bm -o speed_finder_bm.bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats
BOLT-INFO: shared object or position-independent executable detected
BOLT-INFO: Target architecture: aarch64
BOLT-INFO: BOLT version: <unknown>
BOLT-INFO: first alloc address is 0x0
BOLT-INFO: creating new program header table at address 0x1600000, offset 0x1600000
BOLT-INFO: enabling relocation mode
BOLT-INFO: disabling -align-macro-fusion on non-x86 platform
BOLT-INFO: pre-processing profile using branch profile reader
BOLT-INFO: operating with basic samples profiling data (no LBR).
BOLT-INFO: normalizing samples by instruction count.
BOLT-INFO: number of removed linker-inserted veneers: 0
BOLT-INFO: 418 out of 42694 functions in the binary (1.0%) have non-empty execution profile
BOLT-INFO: 45 functions with profile could not be optimized
BOLT-INFO: merged 1 duplicate CFG edge
BOLT-INFO: basic block reordering modified layout of 352 (0.81%) functions
BOLT-INFO: UCE removed 0 blocks and 0 bytes of code.
BOLT-INFO: 5 Functions were reordered by LoopInversionPass
BOLT-INFO: hfsort+ reduced the number of chains from 463 to 328
BOLT-INFO: program-wide dynostats after all optimizations before SCTC and FOP:

         2319073 : executed forward branches
          760905 : taken forward branches
          890341 : executed backward branches
          762996 : taken backward branches
          395877 : executed unconditional branches
          715029 : all function calls
          232563 : indirect calls
          139967 : PLT calls
        31897190 : executed instructions
         3057972 : executed load instructions
               0 : executed store instructions
               0 : taken jump table branches
               0 : taken unknown indirect branches
         3605291 : total branches
         1919778 : taken branches
         1685513 : non-taken conditional branches
         1523901 : taken conditional branches
         3209414 : all conditional branches
               0 : linker-inserted veneer calls

         1880198 : executed forward branches (-18.9%)
           97504 : taken forward branches (-87.2%)
         1329216 : executed backward branches (+49.3%)
          613525 : taken backward branches (-19.6%)
          231035 : executed unconditional branches (-41.6%)
          715029 : all function calls (=)
          232563 : indirect calls (=)
          139967 : PLT calls (=)
        32039179 : executed instructions (+0.4%)
         3057972 : executed load instructions (=)
               0 : executed store instructions (=)
               0 : taken jump table branches (=)
               0 : taken unknown indirect branches (=)
         3440449 : total branches (-4.6%)
          942064 : taken branches (-50.9%)
         2498385 : non-taken conditional branches (+48.2%)
          711029 : taken conditional branches (-53.3%)
         3209414 : all conditional branches (=)
               0 : linker-inserted veneer calls (=)

BOLT-INFO: Starting stub-insertion pass
BOLT-INFO: Inserted 176 stubs in the hot area and 65 stubs in the cold area. Shared 0 times, iterated 3 times.
BOLT-INFO: setting _end to 0x14649a8
BOLT-INFO: setting _end to 0x14649a8
BOLT-INFO: setting __hot_start to 0x1800000
BOLT-INFO: setting __hot_end to 0x1869194
BOLT-INFO: patched build-id (flipped last bit)
tztek:/userdata# ./speed_finder_bm.bolt
Segmentation fault (core dumped)
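
The percentages in the second dyno-stats table read as the relative change of each counter against the first table. A minimal Python sketch that recomputes a few of them from the numbers in the log above; the helper name `dyno_delta`, the `(n/a)` fallback for a zero baseline, and the one-decimal rounding are assumptions made for illustration, not BOLT's actual implementation:

```python
def dyno_delta(before: int, after: int) -> str:
    """Format the relative change between two dyno-stats counters."""
    if after == before:
        return "(=)"
    if before == 0:
        # Assumption: no percentage is meaningful against a zero baseline.
        return "(n/a)"
    change = (after - before) / before * 100.0
    return f"({change:+.1f}%)"


# Counter values copied from the log above as (before, after) pairs.
counters = {
    "taken forward branches": (760905, 97504),
    "executed backward branches": (890341, 1329216),
    "taken branches": (1919778, 942064),
    "all function calls": (715029, 715029),
}

for name, (before, after) in counters.items():
    print(f"{after:>16} : {name} {dyno_delta(before, after)}")
```

For these four counters the recomputed values agree with the log. Note that these numbers only describe the expected branch behavior; they say nothing about why the optimized binary segfaults.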

@aaupov
Contributor

aaupov commented Mar 1, 2023

Yes, sampling is the only supported profiling method on AArch64.
Can you please try to narrow down the runtime crash using the bughunter script?

Invoke it like this:

BOLT=/path/to/llvm-bolt INPUT_BINARY=./speed_finder_bm BOLT_OPTIONS="-data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats" /path/to/llvm-project/bolt/utils/bughunter.sh

cc @yota9

@aaupov aaupov reopened this Mar 1, 2023
@rickyzhang012500
Author

> Yes, sampling is the only supported profiling method on AArch64. Can you please try to narrow down the runtime crash using the bughunter script?
>
> Invoke it like this:
>
> BOLT=/path/to/llvm-bolt INPUT_BINARY=./speed_finder_bm BOLT_OPTIONS="-data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats" /path/to/llvm-project/bolt/utils/bughunter.sh
>
> cc @yota9

OK, thank you for your reply. I will try it.

@rickyzhang012500
Author

rickyzhang012500 commented Mar 1, 2023

> Yes, sampling is the only supported profiling method on AArch64. Can you please try to narrow down the runtime crash using the bughunter script?
>
> Invoke it like this:
>
> BOLT=/path/to/llvm-bolt INPUT_BINARY=./speed_finder_bm BOLT_OPTIONS="-data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats" /path/to/llvm-project/bolt/utils/bughunter.sh
>
> cc @yota9

tztek@J5-210:/userdata/qcraft# BOLT=/userdata/qcraft/llvm-bolt INPUT_BINARY=./speed_finder_bm BOLT_OPTIONS="-data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats" ./bughunter.sh
Verify input binary passes
  INPUT_BINARY: : &&  ./speed_finder_bm  |& cat >& /tmp/speed_finder_bm.zw3.bolt.out
  Input binary passes.
Verify optimized binary fails
  OPTIMIZED_BINARY: : &&  /tmp/speed_finder_bm.zw3.bolt  |& cat >& /tmp/speed_finder_bm.zw3.bolt.out
./bughunter.sh: line 143:  2758 Segmentation fault      $TIMEOUT_OR_CMD $OPTIMIZED_BINARY $COMMAND_LINE 2>&1
      2759 Done                    | $POST_COMMAND 1>&$OUTPUT_FILE
  Optimized binary fails as expected.
Iteration 0, trying /tmp/func-names.8Y9.txtaa / 21349 functions
  BOLT: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.8Y9.txtaa -o /tmp/speed_finder_bm.zw3.bolt >& /tmp/boltasI.log
./bughunter.sh: line 246:  2785 Aborted                 ( $BOLT $BOLT_OPTIONS $INPUT_BINARY $SEARCH_OPT -o $OPTIMIZED_BINARY 1>&$BOLT_LOG )
  BOLT failure=134
Iteration 1, trying /tmp/func-names.nVv.txtaa / 10675 functions
  BOLT: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.nVv.txtaa -o /tmp/speed_finder_bm.zw3.bolt >& /tmp/boltasI.log
./bughunter.sh: line 246:  2797 Aborted                 ( $BOLT $BOLT_OPTIONS $INPUT_BINARY $SEARCH_OPT -o $OPTIMIZED_BINARY 1>&$BOLT_LOG )
  BOLT failure=134
Iteration 2, trying /tmp/func-names.iVI.txtaa / 5338 functions
  BOLT: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.iVI.txtaa -o /tmp/speed_finder_bm.zw3.bolt >& /tmp/boltasI.log
  BOLT failure=0
  OPTIMIZED_BINARY: : &&  /tmp/speed_finder_bm.zw3.bolt  |& cat >& /tmp/speed_finder_bm.zw3.bolt.out
  OPTIMIZED_BINARY failure=0
Iteration 3, trying /tmp/func-names.iVI.txtab / 5338 functions
  BOLT: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.iVI.txtab -o /tmp/speed_finder_bm.zw3.bolt >& /tmp/boltasI.log
./bughunter.sh: line 246:  2847 Aborted                 ( $BOLT $BOLT_OPTIONS $INPUT_BINARY $SEARCH_OPT -o $OPTIMIZED_BINARY 1>&$BOLT_LOG )
  BOLT failure=134
Iteration 4, trying /tmp/func-names.erA.txtaa / 2669 functions
  BOLT: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.erA.txtaa -o /tmp/speed_finder_bm.zw3.bolt >& /tmp/boltasI.log
  BOLT failure=0
  OPTIMIZED_BINARY: : &&  /tmp/speed_finder_bm.zw3.bolt  |& cat >& /tmp/speed_finder_bm.zw3.bolt.out
  OPTIMIZED_BINARY failure=0
Iteration 5, trying /tmp/func-names.erA.txtab / 2669 functions
  BOLT: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.erA.txtab -o /tmp/speed_finder_bm.zw3.bolt >& /tmp/boltasI.log
./bughunter.sh: line 246:  2891 Aborted                 ( $BOLT $BOLT_OPTIONS $INPUT_BINARY $SEARCH_OPT -o $OPTIMIZED_BINARY 1>&$BOLT_LOG )
  BOLT failure=134
Iteration 6, trying /tmp/func-names.LP3.txtaa / 1334 functions
  BOLT: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.LP3.txtaa -o /tmp/speed_finder_bm.zw3.bolt >& /tmp/boltasI.log
  BOLT failure=0
  OPTIMIZED_BINARY: : &&  /tmp/speed_finder_bm.zw3.bolt  |& cat >& /tmp/speed_finder_bm.zw3.bolt.out
  OPTIMIZED_BINARY failure=0
Iteration 7, trying /tmp/func-names.LP3.txtab / 1334 functions
  BOLT: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.LP3.txtab -o /tmp/speed_finder_bm.zw3.bolt >& /tmp/boltasI.log
./bughunter.sh: line 246:  2933 Aborted                 ( $BOLT $BOLT_OPTIONS $INPUT_BINARY $SEARCH_OPT -o $OPTIMIZED_BINARY 1>&$BOLT_LOG )
  BOLT failure=134
Iteration 8, trying /tmp/func-names.ZVi.txtaa / 667 functions
  BOLT: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.ZVi.txtaa -o /tmp/speed_finder_bm.zw3.bolt >& /tmp/boltasI.log
./bughunter.sh: line 246:  2946 Aborted                 ( $BOLT $BOLT_OPTIONS $INPUT_BINARY $SEARCH_OPT -o $OPTIMIZED_BINARY 1>&$BOLT_LOG )
  BOLT failure=134
Iteration 9, trying /tmp/func-names.uC8.txtaa / 334 functions
  BOLT: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.uC8.txtaa -o /tmp/speed_finder_bm.zw3.bolt >& /tmp/boltasI.log
./bughunter.sh: line 246:  2977 Aborted                 ( $BOLT $BOLT_OPTIONS $INPUT_BINARY $SEARCH_OPT -o $OPTIMIZED_BINARY 1>&$BOLT_LOG )
  BOLT failure=134
Iteration 10, trying /tmp/func-names.dEk.txtaa / 167 functions
  BOLT: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.dEk.txtaa -o /tmp/speed_finder_bm.zw3.bolt >& /tmp/boltasI.log
./bughunter.sh: line 246:  2989 Aborted                 ( $BOLT $BOLT_OPTIONS $INPUT_BINARY $SEARCH_OPT -o $OPTIMIZED_BINARY 1>&$BOLT_LOG )
  BOLT failure=134
Iteration 11, trying /tmp/func-names.zrg.txtaa / 84 functions
  BOLT: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.zrg.txtaa -o /tmp/speed_finder_bm.zw3.bolt >& /tmp/boltasI.log
./bughunter.sh: line 246:  3001 Aborted                 ( $BOLT $BOLT_OPTIONS $INPUT_BINARY $SEARCH_OPT -o $OPTIMIZED_BINARY 1>&$BOLT_LOG )
  BOLT failure=134
Iteration 12, trying /tmp/func-names.DiC.txtaa / 42 functions
  BOLT: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.DiC.txtaa -o /tmp/speed_finder_bm.zw3.bolt >& /tmp/boltasI.log
./bughunter.sh: line 246:  3013 Aborted                 ( $BOLT $BOLT_OPTIONS $INPUT_BINARY $SEARCH_OPT -o $OPTIMIZED_BINARY 1>&$BOLT_LOG )
  BOLT failure=134
Iteration 13, trying /tmp/func-names.s3S.txtaa / 21 functions
  BOLT: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.s3S.txtaa -o /tmp/speed_finder_bm.zw3.bolt >& /tmp/boltasI.log
  BOLT failure=0
  OPTIMIZED_BINARY: : &&  /tmp/speed_finder_bm.zw3.bolt  |& cat >& /tmp/speed_finder_bm.zw3.bolt.out
  OPTIMIZED_BINARY failure=0
Iteration 14, trying /tmp/func-names.s3S.txtab / 21 functions
  BOLT: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.s3S.txtab -o /tmp/speed_finder_bm.zw3.bolt >& /tmp/boltasI.log
./bughunter.sh: line 246:  3052 Aborted                 ( $BOLT $BOLT_OPTIONS $INPUT_BINARY $SEARCH_OPT -o $OPTIMIZED_BINARY 1>&$BOLT_LOG )
  BOLT failure=134
Iteration 15, trying /tmp/func-names.OZO.txtaa / 11 functions
  BOLT: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.OZO.txtaa -o /tmp/speed_finder_bm.zw3.bolt >& /tmp/boltasI.log
./bughunter.sh: line 246:  3064 Aborted                 ( $BOLT $BOLT_OPTIONS $INPUT_BINARY $SEARCH_OPT -o $OPTIMIZED_BINARY 1>&$BOLT_LOG )
  BOLT failure=134
Iteration 16, trying /tmp/func-names.77k.txtaa / 6 functions
  BOLT: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.77k.txtaa -o /tmp/speed_finder_bm.zw3.bolt >& /tmp/boltasI.log
  BOLT failure=0
  OPTIMIZED_BINARY: : &&  /tmp/speed_finder_bm.zw3.bolt  |& cat >& /tmp/speed_finder_bm.zw3.bolt.out
  OPTIMIZED_BINARY failure=0
Iteration 17, trying /tmp/func-names.77k.txtab / 6 functions
  BOLT: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.77k.txtab -o /tmp/speed_finder_bm.zw3.bolt >& /tmp/boltasI.log
./bughunter.sh: line 246:  3103 Aborted                 ( $BOLT $BOLT_OPTIONS $INPUT_BINARY $SEARCH_OPT -o $OPTIMIZED_BINARY 1>&$BOLT_LOG )
  BOLT failure=134
Iteration 18, trying /tmp/func-names.1hP.txtaa / 3 functions
  BOLT: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.1hP.txtaa -o /tmp/speed_finder_bm.zw3.bolt >& /tmp/boltasI.log
./bughunter.sh: line 246:  3115 Aborted                 ( $BOLT $BOLT_OPTIONS $INPUT_BINARY $SEARCH_OPT -o $OPTIMIZED_BINARY 1>&$BOLT_LOG )
  BOLT failure=134
Iteration 19, trying /tmp/func-names.AWo.txtaa / 2 functions
  BOLT: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.AWo.txtaa -o /tmp/speed_finder_bm.zw3.bolt >& /tmp/boltasI.log
./bughunter.sh: line 246:  3127 Aborted                 ( $BOLT $BOLT_OPTIONS $INPUT_BINARY $SEARCH_OPT -o $OPTIMIZED_BINARY 1>&$BOLT_LOG )
  BOLT failure=134
Iteration 20, trying /tmp/func-names.w9B.txtaa / 1 functions
  BOLT: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.w9B.txtaa -o /tmp/speed_finder_bm.zw3.bolt >& /tmp/boltasI.log
  BOLT failure=0
  OPTIMIZED_BINARY: : &&  /tmp/speed_finder_bm.zw3.bolt  |& cat >& /tmp/speed_finder_bm.zw3.bolt.out
  OPTIMIZED_BINARY failure=0
Iteration 21, trying /tmp/func-names.w9B.txtab / 1 functions
  BOLT: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.w9B.txtab -o /tmp/speed_finder_bm.zw3.bolt >& /tmp/boltasI.log
./bughunter.sh: line 246:  3167 Aborted                 ( $BOLT $BOLT_OPTIONS $INPUT_BINARY $SEARCH_OPT -o $OPTIMIZED_BINARY 1>&$BOLT_LOG )
  BOLT failure=134
The function(s) that failed are in /tmp/func-names.6ND.txtaa
To reproduce, run: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.w9B.txtab -o /tmp/speed_finder_bm.zw3.bolt
rm: can't remove '/tmp/speed_finder_bm.zw3.bolt': No such file or directory
rm: can't remove '/tmp/speed_finder_bm.zw3.bolt.out': No such file or directory
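
The iterations above follow a halving search: the candidate function list is split in two, the first half (`...txtaa`) is tried, and the second half (`...txtab`) is tried only if the first one passes. A rough Python sketch of that strategy, for readers who want to follow the log; `bisect_failing` and the `fails` predicate are hypothetical names standing in for "run llvm-bolt restricted to these functions, then run the result", and this is not the code of `bughunter.sh` itself:

```python
from typing import Callable, List, Sequence


def bisect_failing(funcs: Sequence[str],
                   fails: Callable[[Sequence[str]], bool]) -> List[str]:
    """Narrow `funcs` to a small subset for which `fails` still returns True.

    `fails` is a hypothetical predicate: it should return True when
    optimizing only the given functions reproduces the failure.
    """
    current = list(funcs)
    while len(current) > 1:
        mid = (len(current) + 1) // 2
        first, second = current[:mid], current[mid:]
        if fails(first):
            current = first
        elif fails(second):
            current = second
        else:
            # Neither half fails alone: the failure needs functions from
            # both halves, so plain halving cannot reduce it any further.
            break
    return current


# Toy usage: pretend exactly one function triggers the failure.
funcs = [f"func_{i}" for i in range(42694)]
culprit = "func_31337"
print(bisect_failing(funcs, lambda subset: culprit in subset))
```

In the log above every failing step is a `BOLT failure=134`, so a real predicate would have to treat a crash of llvm-bolt itself as a failure too, not only a crash of the optimized binary. That matches the search ending on a function that makes BOLT abort.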

tztek@J5-210:/userdata/qcraft# /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.w9B.txtab -o /tmp/speed_finder_bm.zw3.bolt
BOLT-INFO: shared object or position-independent executable detected
BOLT-INFO: Target architecture: aarch64
BOLT-INFO: BOLT version: <unknown>
BOLT-INFO: first alloc address is 0x0
BOLT-INFO: creating new program header table at address 0x1600000, offset 0x1600000
BOLT-INFO: enabling relocation mode
BOLT-INFO: disabling -align-macro-fusion on non-x86 platform
BOLT-INFO: pre-processing profile using branch profile reader
not implemented
UNREACHABLE executed at /home/llvm-project-15.0.7.src/bolt/include/bolt/Core/MCPlusBuilder.h:1581!
#0 0x0000aaaaaeb71804 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/llvm-project-15.0.7.src/llvm/lib/Support/Unix/Signals.inc:569:0
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.w9B.txtab -o /tmp/speed_finder_bm.zw3.bolt
Aborted (core dumped)
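
The `BOLT failure=134` lines in the bughunter output and the `Aborted (core dumped)` here are consistent with each other: shells such as bash report a process killed by signal N as exit status 128 + N, and 134 - 128 = 6 is SIGABRT on Linux, which is what an `UNREACHABLE executed` abort raises. A small Python sketch of that decoding; the function name `describe_exit_status` is made up for illustration:

```python
import signal


def describe_exit_status(status: int) -> str:
    """Describe a shell-style exit status such as the 134 reported above."""
    if status == 0:
        return "success"
    if status > 128:
        # Shell convention: 128 + signal number for a signal-terminated process.
        signum = status - 128
        try:
            return f"killed by {signal.Signals(signum).name}"
        except ValueError:
            return f"killed by unknown signal {signum}"
    return f"exited with code {status}"


print(describe_exit_status(134))
```

The signal numbering is platform-specific, so the mapping from 6 to SIGABRT is an assumption that holds on Linux but should be checked elsewhere.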

@aaupov
Contributor

aaupov commented Mar 1, 2023

It's a bit unexpected that bughunter reduced this to a BOLT crash rather than a runtime crash.
Can you please post the function name (from /tmp/func-names.w9B.txtab), and function disassembly?
You can get it with
llvm-objdump --disassemble-symbols=<funcname> speed_finder_bm

It would also be nice to produce an assembly test with -asm-dump (but I have never tested it on AArch64):

/userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp \
  -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh \
  -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.w9B.txtab \
  -o /tmp/speed_finder_bm.zw3.b -asm-dump

If it doesn't crash, please upload the contents of the produced funcname.s file to pastebin or a GitHub gist.

@rickyzhang012500
Author

rickyzhang012500 commented Mar 1, 2023

> /tmp/func-names.w9B.txtab

tztek@J5-210:/userdata/qcraft# cat /tmp/func-names.w9B.txtab | c++filt
char const* google::protobuf::internal::EpsCopyInputStream::ReadPackedVarint<char const* google::protobuf::internal::PackedEnumParser<google::protobuf::UnknownFieldSet>(void*, char const*, google::protobuf::internal::ParseContext*, bool (*)(int), google::protobuf::internal::InternalMetadata*, int)::'lambda'(unsigned long)>(char const*, char const* google::protobuf::internal::PackedEnumParser<google::protobuf::UnknownFieldSet>(void*, char const*, google::protobuf::internal::ParseContext*, bool (*)(int), google::protobuf::internal::InternalMetadata*, int)::'lambda'(unsigned long))
tztek@J5-210:/userdata/qcraft# objdump --disassemble-symbols=_ZN6google8protobuf8internal18EpsCopyInputStream16ReadPackedVarintIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS7_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES7_S7_T_ speed_finder_bm

speed_finder_bm:	file format elf64-littleaarch64

Disassembly of section .text:

000000000072b9b0 <_ZN6google8protobuf8internal18EpsCopyInputStream16ReadPackedVarintIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS7_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES7_S7_T_>:
  72b9b0: ff 43 03 d1  	sub	sp, sp, #208            // =208
  72b9b4: fd 7b 09 a9  	stp	x29, x30, [sp, #144]
  72b9b8: f7 53 00 f9  	str	x23, [sp, #160]
  72b9bc: f6 57 0b a9  	stp	x22, x21, [sp, #176]
  72b9c0: f4 4f 0c a9  	stp	x20, x19, [sp, #192]
  72b9c4: fd 43 02 91  	add	x29, sp, #144           // =144
  72b9c8: 88 7d 00 d0  	adrp	x8, 0x16dd000 <_ZN6qcraft24ModelStructureConstParam9MergeFromERKS0_+0xa3c>
  72b9cc: 08 95 41 f9  	ldr	x8, [x8, #808]
  72b9d0: f3 03 00 aa  	mov	x19, x0
  72b9d4: e0 03 01 aa  	mov	x0, x1
  72b9d8: f4 03 02 aa  	mov	x20, x2
  72b9dc: 08 01 40 f9  	ldr	x8, [x8]
  72b9e0: a8 83 1f f8  	stur	x8, [x29, #-8]
  72b9e4: 08 14 c0 38  	ldrsb	w8, [x0], #1
  72b9e8: 15 1d 00 12  	and	w21, w8, #0xff
  72b9ec: 08 01 f8 36  	tbz	w8, #31, 0x72ba0c <_ZN6google8protobuf8internal18EpsCopyInputStream16ReadPackedVarintIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS7_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES7_S7_T_+0x5c>
  72b9f0: 08 00 c0 39  	ldrsb	w8, [x0]
  72b9f4: 09 1d 00 12  	and	w9, w8, #0xff
  72b9f8: a9 1e 09 0b  	add	w9, w21, w9, lsl #7
  72b9fc: 35 01 02 51  	sub	w21, w9, #128           // =128
  72ba00: 88 08 f8 37  	tbnz	w8, #31, 0x72bb10 <_ZN6google8protobuf8internal18EpsCopyInputStream16ReadPackedVarintIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS7_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES7_S7_T_+0x160>
  72ba04: 48 00 80 52  	mov	w8, #2
  72ba08: 20 00 08 8b  	add	x0, x1, x8
  72ba0c: 61 06 40 f9  	ldr	x1, [x19, #8]
  72ba10: 36 00 00 4b  	sub	w22, w1, w0
  72ba14: bf 02 16 6b  	cmp	w21, w22
  72ba18: cd 02 00 54  	b.le	0x72ba70 <_ZN6google8protobuf8internal18EpsCopyInputStream16ReadPackedVarintIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS7_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES7_S7_T_+0xc0>
  72ba1c: 80 06 40 ad  	ldp	q0, q1, [x20]
  72ba20: e2 03 01 91  	add	x2, sp, #64             // =64
  72ba24: e0 07 02 ad  	stp	q0, q1, [sp, #64]
  72ba28: 57 00 00 94  	bl	0x72bb84 <_ZN6google8protobuf8internal21ReadPackedVarintArrayIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS6_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES6_S6_S6_T_>
  72ba2c: 20 03 00 b4  	cbz	x0, 0x72ba90 <_ZN6google8protobuf8internal18EpsCopyInputStream16ReadPackedVarintIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS7_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES7_S7_T_+0xe0>
  72ba30: 69 06 40 f9  	ldr	x9, [x19, #8]
  72ba34: a8 02 16 4b  	sub	w8, w21, w22
  72ba38: 1f 41 00 71  	cmp	w8, #16                 // =16
  72ba3c: 17 00 09 cb  	sub	x23, x0, x9
  72ba40: 0d 04 00 54  	b.le	0x72bac0 <_ZN6google8protobuf8internal18EpsCopyInputStream16ReadPackedVarintIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS7_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES7_S7_T_+0x110>
  72ba44: 68 1e 40 b9  	ldr	w8, [x19, #28]
  72ba48: 1f 45 00 71  	cmp	w8, #17                 // =17
  72ba4c: 8b 09 00 54  	b.lt	0x72bb7c <_ZN6google8protobuf8internal18EpsCopyInputStream16ReadPackedVarintIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS7_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES7_S7_T_+0x1cc>
  72ba50: e0 03 13 aa  	mov	x0, x19
  72ba54: a2 86 3b 94  	bl	0x160d4dc <_ZN6google8protobuf8internal18EpsCopyInputStream4NextEv>
  72ba58: c0 01 00 b4  	cbz	x0, 0x72ba90 <_ZN6google8protobuf8internal18EpsCopyInputStream16ReadPackedVarintIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS7_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES7_S7_T_+0xe0>
  72ba5c: 61 06 40 f9  	ldr	x1, [x19, #8]
  72ba60: c8 02 17 0b  	add	w8, w22, w23
  72ba64: b5 02 08 4b  	sub	w21, w21, w8
  72ba68: 00 c0 37 8b  	add	x0, x0, w23, sxtw
  72ba6c: e9 ff ff 17  	b	0x72ba10 <_ZN6google8protobuf8internal18EpsCopyInputStream16ReadPackedVarintIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS7_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES7_S7_T_+0x60>
  72ba70: 80 06 40 ad  	ldp	q0, q1, [x20]
  72ba74: 13 c0 35 8b  	add	x19, x0, w21, sxtw
  72ba78: e2 03 00 91  	mov	x2, sp
  72ba7c: e1 03 13 aa  	mov	x1, x19
  72ba80: e0 07 00 ad  	stp	q0, q1, [sp]
  72ba84: 40 00 00 94  	bl	0x72bb84 <_ZN6google8protobuf8internal21ReadPackedVarintArrayIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS6_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES6_S6_S6_T_>
  72ba88: 7f 02 00 eb  	cmp	x19, x0
  72ba8c: 00 00 9f 9a  	csel	x0, x0, xzr, eq
  72ba90: 89 7d 00 d0  	adrp	x9, 0x16dd000 <_ZN6google8protobuf5Arena18CreateMaybeMessageIN6qcraft21FENConfidenceMinScoreEJEEEPT_PS1_DpOT0_+0x40>
  72ba94: a8 83 5f f8  	ldur	x8, [x29, #-8]
  72ba98: 29 95 41 f9  	ldr	x9, [x9, #808]
  72ba9c: 29 01 40 f9  	ldr	x9, [x9]
  72baa0: 3f 01 08 eb  	cmp	x9, x8
  72baa4: 41 04 00 54  	b.ne	0x72bb2c <_ZN6google8protobuf8internal18EpsCopyInputStream16ReadPackedVarintIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS7_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES7_S7_T_+0x17c>
  72baa8: f4 4f 4c a9  	ldp	x20, x19, [sp, #192]
  72baac: f6 57 4b a9  	ldp	x22, x21, [sp, #176]
  72bab0: f7 53 40 f9  	ldr	x23, [sp, #160]
  72bab4: fd 7b 49 a9  	ldp	x29, x30, [sp, #144]
  72bab8: ff 43 03 91  	add	sp, sp, #208            // =208
  72babc: c0 03 5f d6  	ret
  72bac0: bf 7f 3d a9  	stp	xzr, xzr, [x29, #-48]
  72bac4: bf 83 1e 78  	sturh	wzr, [x29, #-24]
  72bac8: bf 03 1e f8  	stur	xzr, [x29, #-32]
  72bacc: 20 01 c0 3d  	ldr	q0, [x9]
  72bad0: 81 0a 40 ad  	ldp	q1, q2, [x20]
  72bad4: b5 c3 00 d1  	sub	x21, x29, #48           // =48
  72bad8: b4 c2 28 8b  	add	x20, x21, w8, sxtw
  72badc: a0 c2 37 8b  	add	x0, x21, w23, sxtw
  72bae0: e2 83 00 91  	add	x2, sp, #32             // =32
  72bae4: e1 03 14 aa  	mov	x1, x20
  72bae8: a0 03 9d 3c  	stur	q0, [x29, #-48]
  72baec: e1 0b 01 ad  	stp	q1, q2, [sp, #32]
  72baf0: 25 00 00 94  	bl	0x72bb84 <_ZN6google8protobuf8internal21ReadPackedVarintArrayIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS6_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES6_S6_S6_T_>
  72baf4: 68 06 40 f9  	ldr	x8, [x19, #8]
  72baf8: 09 00 15 cb  	sub	x9, x0, x21
  72bafc: 1f 00 14 eb  	cmp	x0, x20
  72bb00: 04 08 40 fa  	ccmp	x0, #0, #4, eq
  72bb04: 08 01 09 8b  	add	x8, x8, x9
  72bb08: 00 11 9f 9a  	csel	x0, x8, xzr, ne
  72bb0c: e1 ff ff 17  	b	0x72ba90 <_ZN6google8protobuf8internal18EpsCopyInputStream16ReadPackedVarintIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS7_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES7_S7_T_+0xe0>
  72bb10: 28 08 c0 39  	ldrsb	w8, [x1, #2]
  72bb14: 09 1d 00 12  	and	w9, w8, #0xff
  72bb18: a9 3a 09 0b  	add	w9, w21, w9, lsl #14
  72bb1c: 35 11 40 51  	sub	w21, w9, #4, lsl #12    // =16384
  72bb20: 88 00 f8 37  	tbnz	w8, #31, 0x72bb30 <_ZN6google8protobuf8internal18EpsCopyInputStream16ReadPackedVarintIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS7_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES7_S7_T_+0x180>
  72bb24: 68 00 80 52  	mov	w8, #3
  72bb28: b8 ff ff 17  	b	0x72ba08 <_ZN6google8protobuf8internal18EpsCopyInputStream16ReadPackedVarintIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS7_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES7_S7_T_+0x58>
  72bb2c: 99 37 3c 94  	bl	0x1639990 <__stack_chk_fail@plt>
  72bb30: 28 0c c0 39  	ldrsb	w8, [x1, #3]
  72bb34: 09 1d 00 12  	and	w9, w8, #0xff
  72bb38: a9 56 09 0b  	add	w9, w21, w9, lsl #21
  72bb3c: 35 01 48 51  	sub	w21, w9, #512, lsl #12  // =2097152
  72bb40: 68 00 f8 37  	tbnz	w8, #31, 0x72bb4c <_ZN6google8protobuf8internal18EpsCopyInputStream16ReadPackedVarintIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS7_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES7_S7_T_+0x19c>
  72bb44: 88 00 80 52  	mov	w8, #4
  72bb48: b0 ff ff 17  	b	0x72ba08 <_ZN6google8protobuf8internal18EpsCopyInputStream16ReadPackedVarintIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS7_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES7_S7_T_+0x58>
  72bb4c: 28 10 40 39  	ldrb	w8, [x1, #4]
  72bb50: 1f 1d 00 71  	cmp	w8, #7                  // =7
  72bb54: 48 01 00 54  	b.hi	0x72bb7c <_ZN6google8protobuf8internal18EpsCopyInputStream16ReadPackedVarintIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS7_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES7_S7_T_+0x1cc>
  72bb58: a8 72 08 0b  	add	w8, w21, w8, lsl #28
  72bb5c: 09 00 be 52  	mov	w9, #-268435456
  72bb60: 15 01 09 0b  	add	w21, w8, w9
  72bb64: e8 fd 9f 52  	mov	w8, #65519
  72bb68: e8 ff af 72  	movk	w8, #32767, lsl #16
  72bb6c: bf 02 08 6b  	cmp	w21, w8
  72bb70: 68 00 00 54  	b.hi	0x72bb7c <_ZN6google8protobuf8internal18EpsCopyInputStream16ReadPackedVarintIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS7_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES7_S7_T_+0x1cc>
  72bb74: a8 00 80 52  	mov	w8, #5
  72bb78: a4 ff ff 17  	b	0x72ba08 <_ZN6google8protobuf8internal18EpsCopyInputStream16ReadPackedVarintIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS7_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES7_S7_T_+0x58>
  72bb7c: e0 03 1f aa  	mov	x0, xzr
  72bb80: c4 ff ff 17  	b	0x72ba90 <_ZN6google8protobuf8internal18EpsCopyInputStream16ReadPackedVarintIZNS1_16PackedEnumParserINS0_15UnknownFieldSetEEEPKcPvS7_PNS1_12ParseContextEPFbiEPNS1_16InternalMetadataEiEUlmE_EES7_S7_T_+0xe0>

@rickyzhang012500
Author

rickyzhang012500 commented Mar 1, 2023

tztek@J5-210:/userdata/qcraft# /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.w9B.txtab -o /tmp/speed_finder_bm.zw3.b -asm-dump
BOLT-INFO: shared object or position-independent executable detected
BOLT-INFO: Target architecture: aarch64
BOLT-INFO: BOLT version: <unknown>
BOLT-INFO: first alloc address is 0x0
BOLT-INFO: creating new program header table at address 0x1800000, offset 0x1800000
BOLT-WARNING: debug info will be stripped from the binary. Use -update-debug-sections to keep it.
BOLT-INFO: enabling relocation mode
BOLT-INFO: disabling -align-macro-fusion on non-x86 platform
BOLT-INFO: pre-processing profile using branch profile reader
not implemented
UNREACHABLE executed at /home/llvm-project-15.0.7.src/bolt/include/bolt/Core/MCPlusBuilder.h:1581!
#0 0x0000aaaad4e41804 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/llvm-project-15.0.7.src/llvm/lib/Support/Unix/Signals.inc:569:0
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /userdata/qcraft/llvm-bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats ./speed_finder_bm -funcs-file-no-regex=/tmp/func-names.w9B.txtab -o /tmp/speed_finder_bm.zw3.b -asm-dump
Aborted (core dumped)

@yota9
Member

yota9 commented Mar 1, 2023

What's at MCPlusBuilder.h:1581? There is nothing there on the master branch.

@yota9
Member

yota9 commented Mar 1, 2023

Anyway, please update BOLT to the latest commits. Could you also run it under GDB and show me where the crash occurred? Which linker do you use to link the application? Is it PIE or EXEC?
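For the PIE-or-EXEC question, one option that needs no extra tooling is to read the `e_type` field of the ELF header. The following is only a minimal sketch: the helper names `elf_type` and `elf_type_of_file` are hypothetical and not part of BOLT, and the field offsets are the standard ELF ones. Note that `ET_DYN` covers both PIE executables and shared objects, so it cannot tell the two apart on its own.

```python
import struct

# Standard ELF e_type values of interest here.
ET_NAMES = {
    2: "EXEC (ET_EXEC, fixed-address executable)",
    3: "DYN (ET_DYN, PIE or shared object)",
}

def elf_type(header: bytes) -> str:
    """Classify an ELF image from (at least) its first 18 bytes."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[EI_DATA] is at offset 5: 1 = little endian, 2 = big endian.
    endian = "<" if header[5] == 1 else ">"
    # e_type is a 16-bit field right after the 16-byte e_ident array.
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return ET_NAMES.get(e_type, f"other (e_type={e_type})")

def elf_type_of_file(path: str) -> str:
    """Hypothetical convenience wrapper: read the header from a file on disk."""
    with open(path, "rb") as f:
        return elf_type(f.read(18))
```

Calling `elf_type_of_file("./speed_finder_bm")` on the binary from this report would be the intended use. The "shared object or position-independent executable detected" line in the BOLT output above suggests the result would be the `DYN` case.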

@rickyzhang012500
Author

> What's at MCPlusBuilder.h:1581? There is nothing there on the master branch.

It is llvm-project-15.0.7.src/bolt/include/bolt/Core/MCPlusBuilder.h:1581, built from the release branch llvm-project-15.0.7.src.

@rickyzhang012500
Author

> Anyway, please update BOLT to the latest commits. Could you also run it under GDB and show me where the crash occurred? Which linker do you use to link the application? Is it PIE or EXEC?

clang version 13.0.1, ld.lld. I'll try the latest version.

@yota9
Member

yota9 commented Mar 1, 2023

I see. I've never tried funcs-file myself, so I never fell through to scanExternalRefs->createRelocation. AFAIR, a long time ago there were some other complications in supporting scanExternalRefs on aarch64, because createRelocation itself looks easy to support. For now I suggest not using this option. Please try the master branch with your normal command "llvm-bolt ./speed_finder_bm -o speed_finder_bm.bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats", and if it fails, let's look under GDB at what is going on there.

@rickyzhang012500
Author

> I see. I've never tried funcs-file myself, so I never fell through to scanExternalRefs->createRelocation. AFAIR, a long time ago there were some other complications in supporting scanExternalRefs on aarch64, because createRelocation itself looks easy to support. For now I suggest not using this option. Please try the master branch with your normal command "llvm-bolt ./speed_finder_bm -o speed_finder_bm.bolt -data=perf.fdata -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -split-all-cold -split-eh -dyno-stats", and if it fails, let's look under GDB at what is going on there.

I updated BOLT to the latest commits and no crash occurred, but performance did not obviously improve. Anyway, thank you all for your replies, sincerely.
