Stabilize and extend Parca retention #630
Comments
It's also down right now. We need to stabilize this: use object storage or similar to avoid OOMs and keep profiles for longer.
For context, for those who would like to help:

Use cases for cont. profiling

We need continuous profiling, mostly for retrospective profile storage and easy access to it. Without continuous profiling we have to:
This is not super bad, but:
Current implementation

We have a Parca instance running in the cluster. We provide a link to it on each benchmark and it's publicly accessible. Currently it is down (503 Service Unavailable) because @bboreham scaled it down (AFAIK): it was blocking something (is it because it's taking a CPU profile, so we cannot take another one manually?) and it was crashlooping a lot (OOM). I changed replicas to 1 just now for debugging.

TODO

In practice we need a stable solution whose maintenance effort is lower than the fuss of taking profiles manually. We can also scrape profiles at a longer interval for stability. Maybe use GCS for storage? We definitely need some memory stability: it would be better to lose some old data than to crashloop.

cc @metalmatze
I don't think we set any retention, but maybe it OOMs at some point? We see inconsistent Parca retentions, let's improve it.
E.g. running prombench for 4 days, yet only ~1h of data:
cc @bboreham @kakkoyun