-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CGData][GMF] Skip No Params #116548
[CGData][GMF] Skip No Params #116548
Conversation
cc. @nocchijiang |
SmallSet<stable_hash, 8> UniqueHashVals; | ||
for (auto &SF : SFS) { | ||
UniqueHashVals.clear(); | ||
for (auto &[IndexPair, Hash] : *SF->IndexOperandHashMap) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not aware that the unique values could differ, but it is still nice to count them precisely. This fix alone brings ~1% code size reduction on one of the projects I have tested.
unsigned ParamCount = UniqueHashVals.size(); | ||
if (ParamCount > GlobalMergingMaxParams) | ||
return false; | ||
if (GlobalMergingSkipNoParams && ParamCount == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this pass acts as an IR-level ICF when ParamCount == 0
? If your testing result shows it is beneficial then I have no problems with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also thinking this. To be explicit, does ICF get the same size win as when GlobalMergingMaxParams = 0
? If so I think it would be worth saying this in the flag description to help users understand the pass a bit better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be explicit, does ICF get the same size win as when GlobalMergingMaxParams = 0?
No, this might actually increase the size since it always creates redundant thunks. However, depending on downstream passes or an improved linker's ICF, the outcome could be different. I have now set the flag to true
to skip this ICF case, but an option remains available to enable it if needed.
@@ -14,6 +14,7 @@ | |||
//===----------------------------------------------------------------------===// | |||
|
|||
#include "llvm/CGData/StableFunctionMap.h" | |||
#include "llvm/ADT/SmallSet.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: includes not sorted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually sorted correctly. The main header file (StableFunctionMap.h) should come first, corresponding to this file, StableFunctionMap.cpp. https://llvm.org/docs/CodingStandards.html#header-files
"global-merging-inst-overhead", | ||
cl::desc("The overhead cost associated with each instruction when lowering " | ||
"to machine instruction."), | ||
cl::init(1.0), cl::Hidden); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if we should set the default value slightly higher, maybe like 1.2
? I believe 1.0
is a guaranteed underestimation for AArch64 backend.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just did a test with 1.2
and it delivered another 0.6% code size reduction on the same project I tested the effect of accurate ParamCount
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mind changing the default value as it could eventually change depending on target/app. In theory, it's hard to precisely estimate final size impact for the given IR, anyhow,
unsigned ParamCount = UniqueHashVals.size(); | ||
if (ParamCount > GlobalMergingMaxParams) | ||
return false; | ||
if (GlobalMergingSkipNoParams && ParamCount == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also thinking this. To be explicit, does ICF get the same size win as when GlobalMergingMaxParams = 0
? If so I think it would be worth saying this in the flag description to help users understand the pass a bit better.
if (ParamCount > GlobalMergingMaxParams) | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand if, if a group of mergable functions have too many params, we can't merge any of them. Could we eliminate functions until the params are small enough and only merge a subset of the group?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. To compute the actual parameters, we need to iterate over the set of candidates with the same hash. By reducing these candidates, we might obtain a smaller set of parameters. However, this adds complexity to repeat these steps until we find a smaller set. In addition, since the overall function hash still matches, we may create unprofitable merging instances that the linker does not fold. I would maintain the current approach for simplicity.
I believe all the comments have been addressed. Are there any more?" |
@nocchijiang @ellishg Can you take a look again? Thanks! |
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/144/builds/12484 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/140/builds/11692 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/3/builds/8209 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/27/builds/2529 Here is the relevant piece of the build log for the reference
|
This reverts commit fdf1f69.
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/154/builds/8027 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/46/builds/8479 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/160/builds/9016 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/180/builds/9014 Here is the relevant piece of the build log for the reference
|
This update follows up on change #112671 and is mostly a NFC, with the following exceptions: - Introduced `-global-merging-skip-no-params` to bypass merging when no parameters are required. - Parameter count is now calculated based on the unique hash count. - Added `-global-merging-inst-overhead` to adjust the instruction overhead, reflecting the machine instruction size. - Costs and benefits are now computed using the double data type. Since the finalization process occurs offline, this should not significantly impact build time. - Moved a sorting operation outside of the loop. This is a patch for https://discourse.llvm.org/t/rfc-global-function-merging/82608.
This update follows up on change #112671 and is mostly a NFC, with the following exceptions:
-global-merging-skip-no-params
to bypass merging when no parameters are required.-global-merging-inst-overhead
to adjust the instruction overhead, reflecting the machine instruction size.This is a patch for https://discourse.llvm.org/t/rfc-global-function-merging/82608.