
feat(report): add granular CPU/mem configurability #7

Merged: 26 commits, Jan 10, 2022

Conversation

@andrewazores (Member) commented Dec 20, 2021

Fixes #5
Fixes #6

@andrewazores added the feat (New feature or request) and fix labels on Dec 20, 2021
@andrewazores force-pushed the oom branch 3 times, most recently from 8878a19 to 7f188d0, on December 21, 2021 at 19:05
@hareetd left a comment

I packaged this branch and ran a smoketest instance of https://github.com/cryostatio/cryostat/pull/779; report generation worked when uploading the ~10MB file Janelle linked (https://github.com/cryostatio/cryostat/pull/779#discussion_r766944173) that was previously failing for her.

Review thread on src/main/docker/Dockerfile.jvm (resolved)
@hareetd commented Dec 21, 2021

Any chance you could link the ~150MB file you mentioned in the corresponding issue #5, perhaps through Google Drive? I'd like to test out the PR using that as well.

@andrewazores (Member, Author)

> Any chance you could link the ~150MB file you mentioned in the corresponding issue #5, perhaps through Google Drive? I'd like to test out the PR using that as well.

I'll email you a link to it.

@hareetd commented Dec 21, 2021

> If I upload the recording, then click View Report, it hangs.
>
> If I upload the recording, then click the dropdown, the report shows up nicely in the dropdown after 5s, and a subsequent click on View Report shows the report in a new tab.

I spoke too soon: Janelle's 10MB recording and the ~150MB one you sent me are still exhibiting the behaviour above. I originally only tested the dropdown report for the 10MB recording, not View Report in a new tab.

As for the ~480MB file, neither View Report nor the dropdown works, unfortunately.

My workflow was to upload the files to the archives and, for each one, do: View Report --> Dropdown --> View Report, in order of smallest to largest. For the 10MB and 150MB files the result was: Failure --> Success --> Success. For the 480MB file the result was: Failure --> Failure --> Failure.

See:
- Cryostat Logs
- Cryostat Reports Container Logs

@andrewazores (Member, Author) commented Dec 21, 2021

The 'Connection was closed' one looks like it's failing because the POST sending the data from Cryostat to -reports timed out, I think. Maybe we want configuration parameters to control the maximum time that Cryostat will wait for a response, and the maximum time that -reports will spend reading a request's POST body (if it even has such a limit currently). Having a timeout at some point does seem desirable, because we don't want Cryostat API requests that just hang indefinitely if the -reports container(s) or the network link between them are very slow. If the end client is a web browser, it likely also has an HTTP request timeout that will kill the connection on the initial Cryostat API reports request after 30 seconds or so.

> I spoke too soon: Janelle's 10MB recording and the ~150MB one you sent me are still exhibiting the behaviour above. I originally only tested the dropdown report for the 10MB recording, not View Report in a new tab.

I've been trying to figure out where all the memory usage is going because it does seem like the sidecar -reports container should be able to handle larger recording files than it currently does, but I haven't had too much success yet. That cryostatio/cryostat#779 PR is set up so that -reports has JDP and JMX enabled, so you can interact with it as any other target and use Cryostat to collect a Profiling JFR recording from the reports sidecar.

You can try increasing the CPU and memory allocation given to the container by editing the flags smoketest.sh passes to podman and the values it sets in JAVA_OPTIONS. The default 1 CPU/512MB doesn't go as far as I'd like it to, that's for sure.

You can also use the run.sh from this PR to test the capabilities with different resource allocations more quickly and easily, e.g. `CPUS=2 MEMORY=2G sh run.sh`, then `http -f :8080/report file@my.jfr`.

@andrewazores (Member, Author)

I tried opening the 150MB .jfr file in JMC while also running a Profiling recording on JMC itself, and found that heap memory usage there is also dominated by allocations stemming from the report generation. It ends up looking very similar to the -reports allocations and heap usage, so I think the 150MB file, and anything larger, really does take quite a lot of memory to process. Opening it in JMC normally makes it seem pretty quick and easy, but that's because JMC as a desktop application can use all of your workstation's CPU cores and allocate a lot of heap space (probably about 1/4 of your workstation's RAM) by default, and those resource allocations completely dwarf the limits we're testing -reports with.

@andrewazores (Member, Author)

```
$ cd workspace/jmc
$ $EDITOR ./target/products/org.openjdk.jmc/linux/gtk/x86_64/jmc.ini
# add "-Xms512M", "-Xmx512M", "-XX:ActiveProcessorCount=1" to the flags
$ ./target/products/org.openjdk.jmc/linux/gtk/x86_64/jmc
```

I tried this out to launch the JMC desktop application with constraints similar to those we apply to cryostat-reports (512M memory, 1 CPU core), then tried opening the 150MB .jfr with it. JMC also failed to open it, although JMC's failure handling is nice and graceful:

[Screenshot: JMC dialog indicating the recording is too large, with a range selector for choosing a narrower time window]

It seems to recognize that the recording is too large, and then presents a range selector to allow you to choose a narrower time window to try to analyze instead. After a couple of trials it seems the width of the bar corresponds to how much memory is currently available to JMC, so there appears to be some heuristic that JMC applies here to estimate how large of a recording it can handle given its memory constraints.

If there is such a utility in the JMC code somewhere doing this approximation then it would be very neat if we could reuse that and expose it in the cryostat-core report generator. Then we can reject requests for recordings that we estimate to be too large, rather than wasting resources trying to process the recording and ending in an OOM or TimeoutException.

Review thread on run.sh (outdated, resolved)
@hareetd commented Dec 22, 2021

That makes sense, I like the idea of applying a rejection heuristic based on recording size.

I've been playing around with run.sh and, with 8 cores and 8GB of memory allocated, the 10MB, 150MB, and 480MB recordings take about 6.7s, 7.7s, and 87s for report generation, respectively. I edited smoketest.sh to use the same allocation configuration, but the previous failures still occur: only dropdown report generation (not View Report) works for the 10MB and 150MB recordings, and the 480MB recording fails in both cases.

Do you have an idea why the reports container can smoothly process all three recordings over a simple HTTP setup, but clicking View Report with the full Cryostat setup doesn't work? I guess that's more of an issue with the Cryostat backend/frontend, since this container does its job well in isolation.

@andrewazores (Member, Author)

> That makes sense, I like the idea of applying a rejection heuristic based on recording size.
>
> I've been playing around with run.sh and, with 8 cores and 8GB of memory allocated, the 10MB, 150MB, and 480MB recordings take about 6.7s, 7.7s, and 87s for report generation, respectively. I edited smoketest.sh to use the same allocation configuration, but the previous failures still occur: only dropdown report generation (not View Report) works for the 10MB and 150MB recordings, and the 480MB recording fails in both cases.
>
> Do you have an idea why the reports container can smoothly process all three recordings over a simple HTTP setup, but clicking View Report with the full Cryostat setup doesn't work? I guess that's more of an issue with the Cryostat backend/frontend, since this container does its job well in isolation.

That sounds to me like you're running into the HTTP timeout. Cryostat will only wait for up to 29 seconds to get a response from the report sidecar. Any longer and it cancels the request and reports a failure status back to the requesting client.

https://github.com/cryostatio/cryostat/pull/779/files#diff-08d5706f3bed9cd36f901546d703ac1e742d2fb33ff8330315e8e4483809ee25R73

The difference in behaviour between the dropdown and View Report still sounds a bit strange, and I'll have to take a deeper look into that again, but that's going to be on the main Cryostat backend side of things.
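
(For illustration only, not the actual Cryostat code: a minimal sketch of what a hard client-side cap like that 29-second wait looks like, using the JDK's built-in `java.net.http` client. The URL, file name, and raw-body POST are placeholders; the real Cryostat upload is a multipart form, as with `http -f`.)

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;
import java.time.Duration;

public class ReportClientSketch {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create("http://localhost:8080/report"))
                .timeout(Duration.ofSeconds(29)) // give the sidecar at most 29s to respond
                .POST(HttpRequest.BodyPublishers.ofFile(Path.of("my.jfr")))
                .build();
        // Throws java.net.http.HttpTimeoutException if no response arrives within the budget;
        // that is the point where Cryostat reports a failure back to its own requesting client.
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
    }
}
```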

@andrewazores (Member, Author) commented Dec 22, 2022

@andrewazores (Member, Author)

I've added some logic to apply a configurable memory factor heuristic so that the service can bail out early if it thinks the provided recording is too large to process, so I have also updated this PR to close #6. It's still possible to OOM, especially if the memory factor is configured to a low number, but I think this is about the best we can do.
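
(Roughly the idea, as a hypothetical sketch rather than the actual cryostat-reports implementation: compare the recording size, scaled by the configurable factor, against the heap the JVM actually has, and reject oversized uploads before doing any real work.)

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of a memory-factor heuristic; names do not match the real code
public class MemoryFactorCheck {
    private final long memoryFactor; // e.g. sourced from configuration, default 10 per the discussion below

    public MemoryFactorCheck(long memoryFactor) {
        this.memoryFactor = memoryFactor;
    }

    /** Estimate whether processing the recording would exceed the available heap. */
    public boolean rejectAsTooLarge(Path recording) throws IOException {
        long fileSize = Files.size(recording);
        long maxHeap = Runtime.getRuntime().maxMemory();
        // Assume rules evaluation needs roughly memoryFactor bytes of heap per byte of recording
        return fileSize * memoryFactor > maxHeap;
    }
}
```

A request handler would map a `true` result to a 413 Payload Too Large response before spending any CPU on the recording.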

I have also added some logic to check whether the uploaded file is compressed and, if so, to decompress it to disk first before trying to process it. This should keep compressed recordings from exerting additional memory pressure, since the decompression no longer has to be done as an in-memory stream at the same time as the contents of the recording are being processed.
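
(A sketch of that decompress-to-disk step, again with made-up names: check the two-byte gzip magic header and, if present, stream through a GZIPInputStream into a temporary file so the recording is later parsed from disk rather than inflated in memory alongside rules evaluation.)

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not the actual cryostat-reports code
public class UploadPreprocessor {
    /** If the upload is gzip-compressed, decompress it to a temp file and return that path. */
    public static Path maybeDecompress(Path upload) throws IOException {
        try (InputStream in = new BufferedInputStream(Files.newInputStream(upload))) {
            in.mark(2);
            boolean gzipped = in.read() == 0x1f && in.read() == 0x8b; // gzip magic bytes
            in.reset();
            if (!gzipped) {
                return upload;
            }
            Path decompressed = Files.createTempFile("recording", ".jfr");
            try (InputStream gz = new GZIPInputStream(in)) {
                Files.copy(gz, decompressed, StandardCopyOption.REPLACE_EXISTING);
            }
            return decompressed;
        }
    }
}
```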

Finally, the elapsed time taken to decompress the file (if any) is checked, and if that took too long a 504 response is sent. Otherwise we go on to generate the report, but now in a ForkJoinPool thread, and we only wait up to timeout - elapsed (where elapsed is essentially the decompression time) for that generation to complete. If it does not complete in time then we also respond with a 504. I don't know whether the underlying JMC report generation handles thread interruption well and will properly cancel processing here, but at least we can respond nicely to clients.
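
(And a sketch of that "timeout minus elapsed" handling, still with hypothetical names and reusing the maybeDecompress helper sketched above; the real endpoint wiring is different, but the timing logic is the part being illustrated.)

```java
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch of the report-generation budget, not the real request handler
public class ReportTimeoutSketch {
    static final long TIMEOUT_MS = 29_000; // overall budget for the request

    static String generateWithBudget(Path upload) throws Exception {
        long start = System.nanoTime();
        Path recording = UploadPreprocessor.maybeDecompress(upload); // may be slow for large .jfr.gz uploads
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        if (elapsedMs >= TIMEOUT_MS) {
            throw new TimeoutException("decompression exceeded budget"); // handler maps this to a 504
        }
        CompletableFuture<String> report =
                CompletableFuture.supplyAsync(() -> generateReport(recording), ForkJoinPool.commonPool());
        try {
            // only wait for whatever budget remains after decompression
            return report.get(TIMEOUT_MS - elapsedMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            report.cancel(true); // may not actually interrupt JMC's rules evaluation, but unblocks the caller
            throw e;             // handler maps this to a 504 as well
        }
    }

    static String generateReport(Path recording) {
        return "<html>...</html>"; // placeholder for the JMC rules evaluation / report rendering
    }
}
```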

@andrewazores (Member, Author)

To create a compressed recording I simply ran, e.g., `pigz --best --keep foo.jfr`, which creates foo.jfr.gz. Then `http -f :8080/report file@foo.jfr.gz` can be used as usual. I will eventually update the main Cryostat backend PR, or maybe open a new PR, to add configuration to enable compression on that end. I suspect that in a lot of cases the extra time taken to compress the file on Cryostat's side and decompress it on the -reports side will be worth the reduced network transfer time. This also ties in neatly with cryostatio/cryostat#732.

Review thread on run.sh (resolved)
hareetd previously approved these changes Jan 7, 2022

@hareetd left a comment


Tested it out using 8GB of memory; it looks like it's working well for both compressed and uncompressed files. I wasn't able to trigger a 504 response, since the memory heuristic with the default memoryFactor=10 is working well and quickly responds with a 413 for files that are too large to process.

@andrewazores (Member, Author)

You should be able to generate a 504 response pretty easily by doing something like `MEMORY=100M sh run.sh`. This is enough for the microservice to start up, but obviously not enough to generate a result for even a 10MB recording file with the default memoryFactor of 10.

@hareetd commented Jan 7, 2022

I tried that configuration out with a 10MB recording and unfortunately it still isn't enough to trigger a 504. With the memory factor set to 10, the memory heuristic is still triggered (i.e. a 413). However, if I lower the memory factor to 4 it passes the heuristic but causes an OOM (as you mentioned earlier). By setting the timeout to 50ms, though, I can artificially trigger a 504 due to decompression taking "too long".

It seems the heuristic is doing its job well, but I wonder under what conditions a 504 would be triggered during the actual report generation with a more reasonable timeout of 30 seconds.

@andrewazores (Member, Author)

Oh sorry, I read/wrote 504 but I had 413 in my head. Yeah, I did have a combination I tested at some point that produced a timeout response while the timeout was still set to the default 30s, but I don't remember what that combination of memory allocation and recording size was, and it wouldn't necessarily behave exactly the same on your machine anyway, since we have different CPUs etc. and therefore different rules-processing rates.

Another way to cause a 504 could be to allocate a large amount of memory but a very small amount of CPU time. I just tested that out by modifying run.sh to pass --cpus ${CPUS} to Podman but hardcode -XX:ActiveProcessorCount=1 for the JVM. This has the effect of telling the JVM to behave as if it has one full CPU core, while in reality Podman limits it to only a fraction of that CPU time. Then I did `CPUS=0.25 MEMORY=8G sh run.sh` and posted my 149MB test file.
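
(As a side note, a tiny probe like the following, purely illustrative, shows what the JVM believes its CPU budget is: Runtime.availableProcessors() is what sizes the GC threads and the common ForkJoinPool, and it is exactly the value that -XX:ActiveProcessorCount overrides, regardless of the cgroup CPU quota Podman enforces.)

```java
import java.util.concurrent.ForkJoinPool;

// Illustrative probe, not project code; run with e.g. java -XX:ActiveProcessorCount=1 CpuProbe.java
public class CpuProbe {
    public static void main(String[] args) {
        System.out.println("availableProcessors    = " + Runtime.getRuntime().availableProcessors());
        System.out.println("commonPool parallelism = " + ForkJoinPool.commonPool().getParallelism());
    }
}
```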

In real, practical terms I think this is when we would most likely see 504s manifesting: when the -reports deployment has essentially overprovisioned memory relative to the CPU available. It's also possible they could occur when the available memory is just barely enough to process the requested recording, so the JVM is left with very little free allocation room between rules evaluations and ends up spending significant time on GC cycles (GC thrashing). This might happen for particular combinations of memory size and recording size with a single CPU core (even a whole one), particularly if the JVM is running with the Serial GC.

@andrewazores (Member, Author)

Anyway, that last minor unit test addition was the last, last-minute change I thought of for this PR. I won't make any further alterations unless something comes up again in review.

@hareetd self-requested a review on January 7, 2022 at 21:00
@hareetd left a comment

Nice, that CPU configuration gave me a 504 during report generation. I checked over the code and it looks good to go.

@jan-law (Contributor) left a comment

> Nice, that CPU configuration gave me a 504 during report generation.

Same here. The 413 is working well too.

Tests also look good.

Labels: feat (New feature or request), fix
Projects: None yet
3 participants