docs(profilecli): clarifying the CLI help message and documentation for aggregate-callees #3638
…update the related prompt messages.
Hi @JansonLv, thanks for the contribution! Indeed, I believe a clarification could improve UX. The However, I have a couple of questions:
I'd like to learn more about the case:
For the context: the |
@kolesnikovae Hi, thank you for your reply. I am using version go1.22.7. |
Thank you @JansonLv! I'll double check that our
aggregate-callees
in go-pgo query by default
@kolesnikovae Thanks. Please check whether this is the file you are looking for. |
Hi @kolesnikovae, sorry, that was generated by another project. You can review the information in these two files; I have modified the trimstrings code so that it processes only part of the information. |
Thank you so much for providing the samples! The profiles look good; however, they contain surprisingly few samples: ~800 and ~1000 respectively (16.64s of CPU time in both). Idea of UPD: Same for the second pair: 756/1223/17.45s. I'd suggest querying a longer period of time. I'm wondering how you've measured the impact of PGO – is it possible to check the compiler logs? UPD2: correct command:
@kolesnikovae I obtained this through stress testing the service and did not use any related Go commands. Thank you for your guidance. |
Thank you for sharing this! I'd like to figure out why aggregation affects PGO results, because this is really helpful feature and I wouldn't like disabling it by default. In the log I see use of CGO and lots of
This will tell us (very roughly) how many optimisations the Go compiler made, and lets us estimate the impact of PGO on the build (not on the app's performance). As for load/stress tests – as far as I understand from the log, they might involve IO (message broker), which would very likely dominate the result. Usually, the expected improvement (reduction in CPU consumption) is within 2-5%. Also, please note that PGO won't help with C code. |
Hi @kolesnikovae, I'm so sorry to keep you waiting for so long. It took me some time to set up the relevant environment locally.
There is indeed a problem, which is why I didn't add the required tag for github.com/confluentinc/confluent-kafka-go/v2/kafka." in the end, the result is still 0, even though I used the one obtained through the I think we can keep the aggregation for now, as it will take some time to investigate this issue. Enjoy your weekend
./main.go:6:13: PGO devirtualize considering call cmd.Execute() |
Hi @JansonLv, no worries at all – I'm here to help. Thank you for sharing this! Please let me know if I can help you in any way. In the meantime, I believe that clarifying the CLI help message and documentation (in |
Hi @kolesnikovae, I'm very honored to contribute to Grafana and improve the user experience. I will make the necessary modifications and commit them as soon as possible. I'm also glad to have your help. |
Hi @kolesnikovae, I have submitted the code, but I am not sure whether the last sentence, ‘Try both options to see which gives better for your PGO’, is suitable. I would appreciate your opinion on this. Wish you a pleasant weekend. |
aggregate-callees in go-pgo query by default
Co-authored-by: Anton Kolesnikov <anton.e.kolesnikov@gmail.com>
Hi @kolesnikovae, I noticed that the hot-callsite-thres-from-CDF value, which is related to PGO (Profile-Guided Optimization), may vary depending on the profile. So, would the value derived from the original profile differ from the one derived from a profile exported through Pyroscope and then used for PGO compilation? It should be related to keep-locations, although there shouldn't be a connection if keep-locations is large enough. Theoretically, it shouldn't have anything to do with aggregate-callees either, right? |
I think that CDF will be affected in any case. This is what is going on (simplified):
Thus, the aggregation and trimming will affect the |
@kolesnikovae, thank you for the explanation. I will spend the next two weeks comparing them in practice in production to decide whether to continue exploring further. |
Thank you for your patience, @JansonLv! I'm very interested in the results of your research, and I would greatly appreciate it if you could share your experiences on how Pyroscope could be enhanced to better support PGO – your insights would be highly valuable! |
LGTM
Thank you for your patience, @kolesnikovae. Based on online practice, it has been demonstrated that using the We have two web services: one of these services involves a lot of I/O processing and has high latency. In this case, PGO did not result in any noticeable improvement. For the other web service, which only has a small number of I/O requests, when we initially deployed it with the This situation aligns with my expectations but seems to differ from yours. It appears that we need to delve deeper into this issue. Given this, should we open a new issue to discuss this problem specifically, since it is not something that can be resolved quickly? This is a comparison before (--aggregate-callees) and after (--no-aggregate-callees) the deployment: |
Hi @JansonLv, of course, please feel free to create a new issue dedicated to the problem. It would be great if we could take a look at the compilation log with |
Thank you for your patience, @kolesnikovae. I have added two compilation logs, but they are too complex for me to provide much insight. Please see whether they are of any help to you. |
Thank you for updating the docs!
In the default configuration of profilecli query go-pgo, the exported PGO file does not optimize or improve performance. Using --no-aggregate-callees results in a performance boost, but the prompt for the aggregate-callees parameter is not user-friendly and may mislead users into thinking they should use --aggregate-callees=false. Therefore, I suggest changing the default value of aggregate-callees to false and adding relevant prompts.