Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(reports): tune flags for reliability #698

Merged
merged 5 commits into from
Oct 7, 2021

Conversation

andrewazores
Copy link
Member

@andrewazores andrewazores commented Oct 1, 2021

Fixes #697

Related cryostatio/cryostat-operator#277

Image available at quay.io/andrewazores/cryostat:serialgc-report

@andrewazores andrewazores force-pushed the subprocess-report-gc branch 2 times, most recently from 40af372 to 132057a Compare October 1, 2021 17:53
@andrewazores
Copy link
Member Author

andrewazores commented Oct 5, 2021

recordings.zip

This .zip contains two recording files I collected from the smoketest sample apps, both using the Profiling template while the applications sat idle for some time - one twice as long as the other, so the recording is twice the size. Unzip it, open a terminal, and navigate to where the two recordings are after unzipping.

$ CRYOSTAT_IMAGE=quay.io/cryostat/cryostat:latest sh smoketest.sh
# in a separate terminal:
$ curl -vk -F recording=@jmx-listener-55d48f7cfc-p9st4_profiling_20211004T222522Z.jfr https://0.0.0.0:8181/api/v1/recordings
$ curl -vk -F recording=@quarkus-test-6bd55b764c-vnq4l_profiling_20211004T222527Z.jfr https://0.0.0.0:8181/api/v1/recordings
# this request should succeed
$ curl -vk https://0.0.0.0:8181/api/v1/reports/jmx-listener-55d48f7cfc-p9st4_profiling_20211004T222522Z.jfr
# this request is likely to fail
$ curl -vk https://0.0.0.0:8181/api/v1/reports/quarkus-test-6bd55b764c-vnq4l_profiling_20211004T222527Z.jfr

Repeat the same test, but with CRYOSTAT_IMAGE=quay.io/andrewazores/cryostat:serialgc-report and both of the report requests should succeed. So, the flag tuning in this PR does seem to improve the ability for the report generation to handle larger recordings, although I have yet to determine the new ceiling with these changes. It may also depend on the event template in use/which events are recorded, since the JMC automated rules analysis memory usage profile probably changes somewhat depending on which rules are being evaluated and how much work they each have to do.

@ebaron
Copy link
Member

ebaron commented Oct 5, 2021

I was able to generate a report with no problems for Cryostat itself, when deployed in OpenShift. 30 seconds, Profiling template.

@andrewazores
Copy link
Member Author

Yea, I tried that yesterday and had no problem as well. Although with the profiling template, the maximum recording size that can be processed is pretty small - it's evidently somewhere between the two recordings I have in my last comment.

FWIW I still haven't been able to observe any noticeable (by "eye") difference in report generation whether the target is Cryostat itself or some external JVM.

@ebaron
Copy link
Member

ebaron commented Oct 5, 2021

I tried the same, 30s Profiling, without this change and it runs out of memory. JFR file size is 615KB.

@andrewazores andrewazores marked this pull request as ready for review October 7, 2021 16:55
@andrewazores andrewazores requested a review from ebaron October 7, 2021 16:55
Copy link
Member

@ebaron ebaron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@andrewazores andrewazores merged commit c4cc64b into cryostatio:main Oct 7, 2021
@andrewazores andrewazores deleted the subprocess-report-gc branch October 7, 2021 21:43
mergify bot pushed a commit that referenced this pull request Nov 3, 2021
* fix(reports): tune flags for reliability

* chore(run.sh): default to 512MB memory

* fix(timeouthandler): increase default timeout to 30s

* fix(reports): improve subprocess timeout handling

* chore(reports): remove unused constants

(cherry picked from commit c4cc64b)

# Conflicts:
#	src/main/java/io/cryostat/net/reports/SubprocessReportGenerator.java
#	src/main/java/io/cryostat/net/web/http/generic/TimeoutHandler.java
andrewazores added a commit that referenced this pull request Nov 3, 2021
* fix(reports): tune flags for reliability

* chore(run.sh): default to 512MB memory

* fix(timeouthandler): increase default timeout to 30s

* fix(reports): improve subprocess timeout handling

* chore(reports): remove unused constants

(cherry picked from commit c4cc64b)
andrewazores added a commit that referenced this pull request Nov 3, 2021
* fix(reports): tune flags for reliability (#698)

* fix(reports): tune flags for reliability

* chore(run.sh): default to 512MB memory

* fix(timeouthandler): increase default timeout to 30s

* fix(reports): improve subprocess timeout handling

* chore(reports): remove unused constants

(cherry picked from commit c4cc64b)

* fixup! fix(reports): tune flags for reliability (#698)

Co-authored-by: Andrew Azores <aazores@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Report generation should not use EpsilonGC
2 participants