[CGData] Stable Function Map #112662

Merged
merged 4 commits into from
Nov 1, 2024

@kyulee-com (Contributor) commented Oct 17, 2024

This patch defines the main data structures that represent stable functions and group similar functions in a function map.
Serialization is supported in binary or YAML form.

Depends on #112638.
This is a patch for https://discourse.llvm.org/t/rfc-global-function-merging/82608.
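
As a rough illustration of the description above, the following is a minimal sketch of a hash-keyed function map with interned names. It uses standard containers rather than LLVM ADTs, and every type and method name in it (`FunctionRecord`, `FunctionEntry`, `FunctionMap`, `getIdOrCreate`, `groupSize`) is an assumption made for this example, not a declaration from the patch. Serialization is not covered.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified stand-ins for illustration only; these are not
// the declarations from the patch.

// A function described by its stable hash and human-readable names.
struct FunctionRecord {
  std::uint64_t Hash;
  std::string FunctionName;
  std::string ModuleName;
};

// Compact form for look-up: names are replaced by ids into a name table.
struct FunctionEntry {
  std::uint64_t Hash;
  unsigned FunctionNameId;
  unsigned ModuleNameId;
};

class FunctionMap {
public:
  // Return the id for Name, assigning the next free id on first use.
  unsigned getIdOrCreate(const std::string &Name) {
    auto It = NameToId.find(Name);
    if (It != NameToId.end())
      return It->second;
    unsigned Id = static_cast<unsigned>(IdToName.size());
    NameToId.emplace(Name, Id);
    IdToName.push_back(Name);
    return Id;
  }

  // Intern the names, then file the entry under its hash so that functions
  // with the same hash end up in the same group.
  void insert(const FunctionRecord &F) {
    FunctionEntry E{F.Hash, getIdOrCreate(F.FunctionName),
                    getIdOrCreate(F.ModuleName)};
    HashToFuncs[F.Hash].push_back(E);
  }

  // Number of functions that share Hash (0 if the hash is unknown).
  std::size_t groupSize(std::uint64_t Hash) const {
    auto It = HashToFuncs.find(Hash);
    return It == HashToFuncs.end() ? 0 : It->second.size();
  }

  std::size_t numNames() const { return IdToName.size(); }

  const std::string &name(unsigned Id) const {
    assert(Id < IdToName.size() && "unknown name id");
    return IdToName[Id];
  }

private:
  std::unordered_map<std::string, unsigned> NameToId;
  std::vector<std::string> IdToName;
  std::unordered_map<std::uint64_t, std::vector<FunctionEntry>> HashToFuncs;
};
```

In this sketch, inserting two records with the same hash places them in one group, and a module name shared by several functions is stored only once in the name table.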

@kyulee-com kyulee-com force-pushed the users/kyulee-com/funcmap branch from e7272c3 to 060a23e Compare October 17, 2024 18:37
@kyulee-com kyulee-com force-pushed the users/kyulee-com/strhash branch from a61dd8e to d3c35a7 Compare October 26, 2024 19:37
Base automatically changed from users/kyulee-com/strhash to main October 27, 2024 03:02
@kyulee-com kyulee-com force-pushed the users/kyulee-com/funcmap branch from 060a23e to 7d9dc40 Compare October 27, 2024 07:03
@kyulee-com
Contributor Author

cc. @nocchijiang

@kyulee-com kyulee-com marked this pull request as ready for review October 28, 2024 13:44
void StableFunctionMap::finalize() {
  Finalized = true;

  for (auto It = HashToFuncs.begin(); It != HashToFuncs.end(); ++It) {
Contributor

Just want to point out that this iterates over a DenseMap, whose iteration order is not deterministic. Hopefully the iteration order doesn't matter. But if the iterations are truly independent, can we use llvm::parallelFor()? I only see that used in lld, so I'm not sure it makes sense here.

Contributor Author

As an entry can be deleted on the fly, we can't use a parallel loop.
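
To illustrate the reply above, here is a minimal sketch of a sequential loop that erases entries from the map it is iterating over. It uses `std::unordered_map` as a stand-in for `DenseMap`, and the pruning rule (drop groups with fewer than two members) is an assumption chosen only for the example; the thread does not state which entries the patch deletes. `Groups` and `pruneSmallGroups` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a hash-to-functions map.
using Groups = std::unordered_map<std::uint64_t, std::vector<int>>;

// Erase every group with fewer than two members and return how many
// groups were erased. The rule is an assumption for illustration.
inline std::size_t pruneSmallGroups(Groups &HashToFuncs) {
  const std::size_t Before = HashToFuncs.size();
  std::size_t Erased = 0;
  for (auto It = HashToFuncs.begin(); It != HashToFuncs.end();) {
    if (It->second.size() < 2) {
      // erase() returns the iterator that follows the removed element, so
      // the loop must not also increment in this branch.
      It = HashToFuncs.erase(It);
      ++Erased;
    } else {
      ++It;
    }
  }
  assert(HashToFuncs.size() + Erased == Before);
  return Erased;
}
```

Because each step may change the container itself, the iterations are not independent of one another, which is why a loop of this shape does not map directly onto a parallel-for over the same container.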

@kyulee-com
Contributor Author

cc. @nocchijiang

Do you have any comment on this direction?

Contributor

@thevinster thevinster left a comment

Overall approach, lgtm

};

/// An efficient form of StableFunction for fast look-up
struct StableFunctionEntry {
Contributor

Should this be an internal implementation within StableFunctionMap? It doesn't seem like it could be defined without a StableFunctionMap calculating the Ids, right? We could always make this public again if there is a use case for it.

Contributor Author

There is a use case in StableFunctionMapRecord. However, I agree that the insert(FuncEntry) method should remain private, since it should only be used in that context. So I kept the method private, granted access with the friend keyword, and removed the test case that no longer applies.
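
The arrangement described in this reply can be sketched as follows: the entry type stays visible, the direct-insert method is private, and a single record type is befriended so it can call it. The names `Entry`, `EntryMap`, `EntryMapRecord`, and `addDeserialized` are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical entry type; it stays public because the record type uses it.
struct Entry {
  std::uint64_t Hash;
};

class EntryMap {
public:
  std::size_t size() const { return Entries.size(); }

private:
  // Direct insertion of a pre-built entry. Private, so arbitrary callers
  // cannot bypass whatever preparation the public API would perform.
  void insert(std::unique_ptr<Entry> E) {
    assert(E && "entry must not be null");
    Entries.push_back(std::move(E));
  }

  std::vector<std::unique_ptr<Entry>> Entries;

  // Only this type is granted access to the private insert().
  friend struct EntryMapRecord;
};

// Hypothetical record type that rebuilds a map from already-prepared
// entries, for example after reading them back from a file.
struct EntryMapRecord {
  EntryMap Map;

  void addDeserialized(std::uint64_t Hash) {
    Map.insert(std::unique_ptr<Entry>(new Entry{Hash}));
  }
};
```

With this layout, code outside `EntryMapRecord` that calls `EntryMap::insert` fails to compile, so a test exercising that call directly would no longer apply.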

/// Insert a `StableFunctionEntry` into the function map directly. This
/// method assumes that string names have already been uniqued and the
/// `StableFunctionEntry` is ready for insertion.
void insert(std::unique_ptr<StableFunctionEntry> FuncEntry) {
Contributor

If StableFunctionEntry is private, then this method can be private too.

@kyulee-com kyulee-com merged commit 7ec26b2 into main Nov 1, 2024
8 checks passed
@kyulee-com kyulee-com deleted the users/kyulee-com/funcmap branch November 1, 2024 05:23
@llvm-ci
Collaborator

llvm-ci commented Nov 1, 2024

LLVM Buildbot has detected a new failure on builder bolt-x86_64-ubuntu-shared running on bolt-worker while building llvm at step 6 "test-build-bolt-check-bolt".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/151/builds/3027

Here is the relevant piece of the build log for the reference
Step 6 (test-build-bolt-check-bolt) failure: test (failure)
******************** TEST 'BOLT :: perf2bolt/perf_test.test' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 5: /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/bin/clang /home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/Inputs/perf_test.c -fuse-ld=lld -Wl,--script=/home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/Inputs/perf_test.lds -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp
+ /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/bin/clang /home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/Inputs/perf_test.c -fuse-ld=lld -Wl,--script=/home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/Inputs/perf_test.lds -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp
RUN: at line 6: perf record -Fmax -e cycles:u -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp2 -- /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp
+ perf record -Fmax -e cycles:u -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp2 -- /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp
info: Using a maximum frequency rate of 2000 Hz
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp2 (18 samples) ]
RUN: at line 7: /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/bin/perf2bolt /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp -p=/home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp2 -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp3 -nl -ignore-build-id 2>&1 | /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/bin/FileCheck /home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/perf_test.test
+ /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/bin/perf2bolt /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp -p=/home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp2 -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp3 -nl -ignore-build-id
+ /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/bin/FileCheck /home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/perf_test.test
RUN: at line 12: /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/bin/clang /home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/Inputs/perf_test.c -no-pie -fuse-ld=lld -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp4
+ /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/bin/clang /home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/Inputs/perf_test.c -no-pie -fuse-ld=lld -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp4
RUN: at line 13: perf record -Fmax -e cycles:u -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp5 -- /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp4
+ perf record -Fmax -e cycles:u -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp5 -- /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp4
info: Using a maximum frequency rate of 2000 Hz
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp5 (9 samples) ]
RUN: at line 14: /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/bin/perf2bolt /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp4 -p=/home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp5 -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp6 -nl -ignore-build-id 2>&1 | /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/bin/FileCheck /home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/perf_test.test
+ /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/bin/perf2bolt /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp4 -p=/home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp5 -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp6 -nl -ignore-build-id
+ /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/bin/FileCheck /home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/perf_test.test
/home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/perf_test.test:10:12: error: CHECK-NOT: excluded string found in input
CHECK-NOT: !! WARNING !! This high mismatch ratio indicates the input binary is probably not the same binary used during profiling collection.
           ^
<stdin>:26:2: note: found here
 !! WARNING !! This high mismatch ratio indicates the input binary is probably not the same binary used during profiling collection. The generated data may be ineffective for improving performance.
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Input file: <stdin>
Check file: /home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/perf_test.test

-dump-input=help explains the following input dump.

Input was:
<<<<<<
        .
        .
        .
       21: BOLT-WARNING: Running parallel work of 0 estimated cost, will switch to trivial scheduling. 
       22: PERF2BOLT: processing basic events (without LBR)... 
       23: PERF2BOLT: read 9 samples 
       24: PERF2BOLT: out of range samples recorded in unknown regions: 9 (100.0%) 
       25:  
       26:  !! WARNING !! This high mismatch ratio indicates the input binary is probably not the same binary used during profiling collection. The generated data may be ineffective for improving performance. 
not:10      !~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                                                   error: no match expected
       27:  
...

smallp-o-p pushed a commit to smallp-o-p/llvm-project that referenced this pull request Nov 3, 2024
These define the main data structures to represent stable functions and
group similar functions in a function map.
Serialization is supported in a binary or yaml form.

Depends on llvm#112638.
This is a patch for
https://discourse.llvm.org/t/rfc-global-function-merging/82608.
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
These define the main data structures to represent stable functions and
group similar functions in a function map.
Serialization is supported in a binary or yaml form.

Depends on llvm#112638.
This is a patch for
https://discourse.llvm.org/t/rfc-global-function-merging/82608.
kyulee-com added a commit that referenced this pull request Nov 5, 2024
This introduces a new cgdata format for stable function maps. The raw
data is embedded in the __llvm_merge section during compile time. This
data can be read and merged using the llvm-cgdata tool, into an indexed
cgdata file. Consequently, the tool is now capable of handling either
outlined hash trees, stable function maps, or both, as they are
orthogonal.

Depends on #112662.
This is a patch for
https://discourse.llvm.org/t/rfc-global-function-merging/82608.
PhilippRados pushed a commit to PhilippRados/llvm-project that referenced this pull request Nov 6, 2024
This introduces a new cgdata format for stable function maps. The raw
data is embedded in the __llvm_merge section during compile time. This
data can be read and merged using the llvm-cgdata tool, into an indexed
cgdata file. Consequently, the tool is now capable of handling either
outlined hash trees, stable function maps, or both, as they are
orthogonal.

Depends on llvm#112662.
This is a patch for
https://discourse.llvm.org/t/rfc-global-function-merging/82608.