Histograms stored in samples take too much memory during long runs #67

Closed
vponomaryov opened this issue Mar 29, 2024 · 3 comments

@vponomaryov
Contributor

[Screenshot from 2024-03-29 14-19-56]
The screenshot above shows the memory utilization of the 2 nodes used for running latte.
Memory utilization grew to 10 GB over 3 hours of uptime on each of the nodes.

I debugged a bit locally and observed that memory leaks at each sampling event.
My observation is that memory growth is directly related to the operations performed during a sampling period.

@pkolaczk
Owner

Latte collects some data (summaries) in memory and processes them afterwards; so some memory growth is expected. However if it is 10 GB, that's a lot. My first guess would be histograms...

@vponomaryov
Contributor Author

vponomaryov commented Jun 20, 2024

> Latte collects some data (summaries) in memory and processes them afterwards; so some memory growth is expected. However if it is 10 GB, that's a lot. My first guess would be histograms...

The root cause is the constantly growing number of stored samples, which are later used for report generation.
And yes, those include histograms.

So the proper solution, I think, would be to process samples on the fly and store only a single processed summary that gets updated at each sampling step.
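
A minimal sketch of that idea, assuming each sample can be reduced to a few counters. `Sample` and `RunningSummary` are hypothetical types made up for illustration; they are not latte's actual data structures.

```rust
/// Hypothetical per-sample measurement (not latte's real sample type).
#[derive(Debug, Clone, Copy)]
pub struct Sample {
    pub requests: u64,
    pub errors: u64,
    pub mean_latency_ms: f64,
}

/// A single summary that is updated in place, so memory use stays
/// constant no matter how many samples are taken.
#[derive(Debug, Default)]
pub struct RunningSummary {
    pub samples: u64,
    pub requests: u64,
    pub errors: u64,
    /// Sum of (mean latency * request count) over all samples.
    latency_weighted_sum_ms: f64,
}

impl RunningSummary {
    /// Folds one sample into the summary; the sample is not retained.
    pub fn update(&mut self, s: Sample) {
        self.samples += 1;
        self.requests += s.requests;
        self.errors += s.errors;
        self.latency_weighted_sum_ms += s.mean_latency_ms * s.requests as f64;
    }

    /// Request-weighted mean latency, or `None` if no requests were seen.
    pub fn mean_latency_ms(&self) -> Option<f64> {
        if self.requests == 0 {
            None
        } else {
            Some(self.latency_weighted_sum_ms / self.requests as f64)
        }
    }
}
```

Counters like these cannot reproduce percentiles, so a real implementation would probably also have to merge each sample's histogram into one aggregate histogram instead of keeping them all.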

@pkolaczk
Owner

pkolaczk commented Jul 6, 2024

The histograms in the samples are compressed and then stored in the report for future use, e.g. for producing HdrHistogram logs (latte hdr command). So I think this needs a user-facing change. Instead of saving the histograms to the report, the option for producing HDR logs should be tied directly to the run command, and the histograms should optionally be streamed to a separate file while running. This would have the additional benefit of making the reports smaller and faster to load, which is now even more important after I added the latte list command.
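
A rough sketch of the streaming idea, assuming each sample's histogram is already available as compressed bytes: it is written out as soon as the sample is taken, so nothing accumulates in memory. `HistogramSink`, its method names and its line format (elapsed seconds, a comma, then the bytes as hex) are all hypothetical; this is neither latte's code nor the HdrHistogram log format.

```rust
use std::io::{self, Write};

/// Hypothetical sink that streams encoded histograms to any writer.
pub struct HistogramSink<W: Write> {
    out: W,
    written: u64,
}

impl<W: Write> HistogramSink<W> {
    pub fn new(out: W) -> Self {
        HistogramSink { out, written: 0 }
    }

    /// Writes one line per sample: "<elapsed_secs>,<hex bytes>".
    /// The caller can drop the histogram right after this returns.
    pub fn write_sample(&mut self, elapsed_secs: u64, encoded_histogram: &[u8]) -> io::Result<()> {
        write!(self.out, "{},", elapsed_secs)?;
        for byte in encoded_histogram {
            write!(self.out, "{:02x}", byte)?;
        }
        writeln!(self.out)?;
        self.written += 1;
        Ok(())
    }

    /// Number of samples written so far.
    pub fn written(&self) -> u64 {
        self.written
    }

    /// Returns the underlying writer, e.g. to flush or inspect it.
    pub fn into_inner(self) -> W {
        self.out
    }
}
```

In a real run the writer would most likely be a buffered file; the sketch accepts anything that implements `Write`, which also makes it easy to test against an in-memory `Vec<u8>`.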

As a temporary workaround, you can control the interval at which latte takes samples. For very long runs there is probably no point in capturing them every second. Fewer samples = less memory overhead.
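
To put rough numbers on this workaround, here is a sketch of how the sample count, and with it the retained memory, scales with the sampling interval. The function names and the per-sample size parameter are illustrative assumptions, not values measured from latte.

```rust
/// Number of samples collected over a run (integer division; a partial
/// trailing interval is ignored).
pub fn sample_count(run_secs: u64, interval_secs: u64) -> u64 {
    assert!(interval_secs > 0, "sampling interval must be positive");
    run_secs / interval_secs
}

/// Rough memory retained if every sample is kept until the end of the run.
/// `bytes_per_sample` is a hypothetical average, not a measured latte figure.
pub fn retained_bytes(run_secs: u64, interval_secs: u64, bytes_per_sample: u64) -> u64 {
    sample_count(run_secs, interval_secs) * bytes_per_sample
}
```

Under these assumptions a 3-hour run (10,800 s) keeps 10,800 samples at a 1-second interval but only 180 at a 60-second interval, so the retained memory drops by the same factor of 60.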

pkolaczk self-assigned this on Jul 6, 2024
pkolaczk changed the title from "latte leaks memory" to "Histograms stored in samples take too much memory during long runs" on Jul 6, 2024
vponomaryov pushed a commit to scylladb/latte that referenced this issue Oct 25, 2024
Fixes pkolaczk#67

(cherry picked from commit 48f3c8e)

Upd:
- Remove the '--drop-sampling-log' new option.
  Its role is played by the existing '--generate-report' option
  which has opposite default value and behavior.
vponomaryov pushed further commits with the same message to scylladb/latte that referenced this issue on Oct 28, 2024 (four times) and Oct 29, 2024