chore: Use our own metrics-rs Handle implementation #7014

Merged
merged 14 commits into from Apr 8, 2021

Conversation

@blt (Contributor) commented Apr 5, 2021

This commit introduces our own metrics-rs Handle implementation, one that takes constant space per metric type. This significantly reduces our happy-path memory use under load and should resolve the memory leaks noted in the closing issues.

The leak culprit is the metrics-rs Histogram combined with the tracing integration. Its implementation relies on an atomic linked list of fixed-size blocks to store samples. The storage required by this method grows until a quiescent period is found; such periods happen regularly when metrics-rs is not integrated with tracing, but at present, for reasons unknown, they do not happen at all. In our testing we've seen a single histogram under heavy load consume approximately 200 MB before being garbage collected, a burden vector doesn't need to bear, since our own needs for histograms are served by a simple counting-bucket arrangement. Our own histogram uses a fixed, boxed slice of atomic integers as buckets. This is the shape of the data vector was already exposing, so we also get to remove a little internal translation.
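
For illustration, here is a minimal sketch of the counting-bucket arrangement described above: a fixed, boxed slice of atomic integers keyed by upper bounds, so storage stays constant no matter how many samples are recorded. The names and exact layout (BucketHistogram, the sum/count fields) are assumptions for this sketch, not necessarily what the PR's src/metrics/handle.rs does.

use std::sync::atomic::{AtomicU64, Ordering};

// Sketch only: fixed-size bucketed histogram with constant memory per metric.
pub struct BucketHistogram {
    // (upper bound, count) pairs, sorted by ascending bound; the last bound is
    // assumed to be f64::INFINITY so every sample lands in some bucket.
    buckets: Box<[(f64, AtomicU64)]>,
    count: AtomicU64,
    sum: AtomicU64, // running sum of samples, stored as f64 bits
}

impl BucketHistogram {
    pub fn new(bounds: &[f64]) -> Self {
        BucketHistogram {
            buckets: bounds
                .iter()
                .map(|b| (*b, AtomicU64::new(0)))
                .collect::<Vec<_>>()
                .into_boxed_slice(),
            count: AtomicU64::new(0),
            sum: AtomicU64::new(0),
        }
    }

    pub fn record(&self, value: f64) {
        // Bump exactly one bucket: the one whose (prev_bound, bound] range
        // contains the sample.
        let mut prev_bound = f64::NEG_INFINITY;
        for (bound, bucket) in self.buckets.iter() {
            if value > prev_bound && value <= *bound {
                bucket.fetch_add(1, Ordering::Relaxed);
            }
            prev_bound = *bound;
        }
        self.count.fetch_add(1, Ordering::Relaxed);
        // Accumulate the sum as f64 bits with a CAS loop.
        let mut prev = self.sum.load(Ordering::Relaxed);
        loop {
            let next = f64::to_bits(f64::from_bits(prev) + value);
            match self
                .sum
                .compare_exchange_weak(prev, next, Ordering::Relaxed, Ordering::Relaxed)
            {
                Ok(_) => break,
                Err(actual) => prev = actual,
            }
        }
    }
}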

I believe this PR will significantly reduce the memory use of a loaded vector, leak notwithstanding. I am as yet unsure of the performance impact. I will update the PR comments with massif results and an all-up benchmark run.

REF #6673
Closes #6461 #6456

Signed-off-by: Brian L. Troutwine brian@troutwine.us

@binarylogic (Contributor) commented:

cc @tobz, if you have a minute, we'd love your input.

@blt changed the title from "Call read_histogram_with_clear" to "chore: Call read_histogram_with_clear" on Apr 5, 2021
@tobz (Contributor) commented Apr 5, 2021

I'm not sure I can entirely speak to the more philosophical questions in the description, at least not without thinking about it more, but: read_histogram_with_clear is indeed the right choice when you collect the values individually and want to, essentially, mark them as read... which in turn, as Brian mentions, will allow crossbeam-epoch to lazily clean up old blocks and keep your memory usage fairly bounded.

The non-clearing variant is meant for when you want a long-running bucket of values to calculate over in full, i.e. you always need to rehydrate some downstream data structure from t=0. Admittedly, this isn't super useful because of the unbounded growth and, honestly, it could probably stand to be removed. :P

In practice, under benchmark scenarios, there are still "quiescent" periods. As soon as you read/clear the histogram, and those blocks are no longer being read, they become eligible for reclamation. If you only have one "task" doing the read/clear, then you have no concurrent readers, and thus no contention, and thus they're almost immediately reclaimed.

@blt (Contributor, Author) commented Apr 5, 2021

> In practice, under benchmark scenarios, there are still "quiescent" periods. As soon as you read/clear the histogram, and those blocks are no longer being read, they become eligible for reclamation. If you only have one "task" doing the read/clear, then you have no concurrent readers, and thus no contention, and thus they're almost immediately reclaimed.

That's what I would have expected, but it's different from what I'm seeing in practice. With this PR as the base I run vector like so:

#!/usr/bin/env bash

set -o errexit
set -o pipefail
set -o nounset
# set -o xtrace

RUSTFLAGS="-g" cargo build --release --no-default-features --features "sources-stdin,sinks-http"

yes "msg" | valgrind --tool=massif ./target/release/vector --config vector-tohttp.toml

with this config:

data_dir = "/tmp"

[sources.stdin]
type = "stdin"

[sinks.http]
type = "http"
inputs = ["stdin"]
compression = "none"
encoding.codec = "json"
healthcheck.enabled = false
batch.max_bytes = 128
uri = "http://127.0.0.1:8080/"

and still see steady memory growth in the same spot. Here's a massif dump: massif.out.1496054.zip

I'm going to upgrade metrics -- essentially pull in #6442 -- and see how that fares, in case you folks have fixed something upstream.

@blt (Contributor, Author) commented Apr 5, 2021

Hmm, yep. Even with metrics-rs fully upgraded in 4a6bee5 we still have a leak in the histogram: massif.out.1531813.zip

@blt (Contributor, Author) commented Apr 5, 2021

DHAT info: dhat.out.1543297.zip

@tobz (Contributor) commented Apr 6, 2021

I use macOS, and I both failed to find a reasonably simple way to compile/install massif-visualizer, and the DHAT online viewer doesn't seem to work for the above DHAT output: it complains about a missing 'mode' field or something. Long story short, I can't view them. :P

Looking at the textual data for the Massif output, though, I don't see any references to Block<T>, which is what I'd intuitively expect to see in there if it were related.

Not sure if you can post a screenshot from the DHAT viewer.

@blt (Contributor, Author) commented Apr 6, 2021

> I use macOS, and I both failed to find a reasonably simple way to compile/install massif-visualizer, and the DHAT online viewer doesn't seem to work for the above DHAT output: it complains about a missing 'mode' field or something. Long story short, I can't view them. :P

D'oh, my bad. Let me get you some screenshots.

[Screenshot from 2021-04-05 17-32-13]
[Screenshot: DHAT Viewer - dhat.out.1543297, at t-end (bytes)]

@blt (Contributor, Author) commented Apr 6, 2021

@tobz FWIW I recognize that vector is a big project to debug in. I'm working on making a smaller reproducible example that mirrors our usage.

blt added a commit to blt/metrics_histo that referenced this pull request Apr 6, 2021
Extract from [vector](https://github.com/timberio/vector/) to help debug
vectordotdev/vector#7014. Notably, the version in this commit
_does not_ display the issue we see in vector.

Signed-off-by: Brian L. Troutwine <brian@troutwine.us>

@blt (Contributor, Author) commented Apr 6, 2021

@tobz alright, I've managed to reproduce the problem in https://github.com/blt/metrics_histo. Notably, the problem does not exist at blt/metrics_histo@85e0a91 but appears as of blt/metrics_histo@170e02d, once the tracing integration is added. Here's the massif dump I've grabbed:
massif.out.1584489.zip

Here's a screenshot:

[Screenshot from 2021-04-05 19-00-45]

I will note that before the tracing integration was added, memory use was still very high but came in spikes, as you'd expect with epoch reclamation: about 160 MB for the one -- very fast -- histogram in this example program.

@tobz (Contributor) commented Apr 6, 2021

Hmm, alright, that is interesting.

I'll have to take a look tomorrow. At first blush, I'm struggling to imagine how enabling tracing integration would break block reclamation, or maybe there's somehow a leak in the tracing integration that shows up incorrectly in Valgrind output? Definitely a head scratcher.

@blt (Contributor, Author) commented Apr 6, 2021

> Hmm, alright, that is interesting.
>
> I'll have to take a look tomorrow. At first blush, I'm struggling to imagine how enabling tracing integration would break block reclamation, or maybe there's somehow a leak in the tracing integration that shows up incorrectly in Valgrind output? Definitely a head scratcher.

It is, I agree. I'm going to read the related code paths in-depth today. I think the signal we have out of valgrind these days is accurate but it can be... indirect sometimes.

@blt (Contributor, Author) commented Apr 7, 2021

Alright, @tobz, an update. As I was working through the metrics-rs histogram code it dawned on me that vector, at least, probably doesn't need perfect sample capture. 0f1a773 pivots away from metrics_util's Handle toward something that's closer to vector's particular use case. I can make a ticket for metrics-rs with my findings -- shared here already -- if you'd like.
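
To make the shape of that change concrete, here is a rough sketch of what a vendored Handle with constant space per metric kind could look like. The enum and method names here are illustrative assumptions, not the actual contents of 0f1a773 or src/metrics/handle.rs.

use std::sync::Arc;
use std::sync::atomic::{AtomicU64, Ordering};

// Stand-in for the fixed-bucket histogram sketched earlier in this thread.
pub struct BucketHistogram;

// Sketch of a handle whose every variant is constant-size, so memory no
// longer scales with the number of histogram samples recorded.
#[derive(Clone)]
pub enum Handle {
    Counter(Arc<AtomicU64>),
    Gauge(Arc<AtomicU64>), // gauge value stored as f64 bits
    Histogram(Arc<BucketHistogram>),
}

impl Handle {
    pub fn increment_counter(&self, by: u64) {
        if let Handle::Counter(c) = self {
            c.fetch_add(by, Ordering::Relaxed);
        }
    }

    pub fn update_gauge(&self, value: f64) {
        if let Handle::Gauge(g) = self {
            g.store(value.to_bits(), Ordering::Relaxed);
        }
    }
}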

@blt changed the title from "chore: Call read_histogram_with_clear" to "chore: Use our own metrics-rs Handle implementation" on Apr 7, 2021
@blt requested review from jszwedko, bruceg, JeanMertz and lukesteensen and removed the request for jszwedko on April 7, 2021 03:44
@blt marked this pull request as ready for review on April 7, 2021 03:45
@blt requested review from a team and pablosichert and removed the request for a team on April 7, 2021 03:45
@blt (Contributor, Author) commented Apr 7, 2021

Alright, good results so far. The massif output suggests that vector's memory use is much lower, at just over 6 MB:
[Screenshot from 2021-04-06 21-05-13]

massif.out.1866254.zip

@blt (Contributor, Author) commented Apr 7, 2021

The throughput of this PR's build versus master is equivalent for our stdin -> remap -> blackhole and file -> remap -> blackhole configurations, though it consumes much less memory. Note that this PR also includes an upgrade to the latest metrics-rs.

@blt (Contributor, Author) commented Apr 7, 2021

Flamegraph: https://gist.github.com/blt/49858036c35f674a9aac7478089952f7

This is basically what we see from master branch, further substantiating the performance parity. Test failures look like they'll be easy to repair and are mostly related to my adjusting the internal data model.

@blt (Contributor, Author) commented Apr 7, 2021

@tsantero ping, relevant to your interests.

Resolved review threads on src/event/metric.rs and src/metrics/handle.rs.
@blt requested a review from a team on April 7, 2021 17:52
blt added 9 commits April 7, 2021 14:51
This commit calls `read_histogram_with_clear` rather than `read_histogram`, as
the latter leaves the histogram's underlying block storage in place between
polls, which leads to steady memory growth in vector. However, this change does
not arrest the growth of the sample storage. The metrics-rs histogram code is
quite complicated and it's believable to me that there's a leak in there, or
that we're misusing metrics-rs histograms.

We might also consider, now that we're on the subject, whether metrics-rs's
histogram is the correct model for us. There are two problems. First, the
histogram stores every sample we send it in an atomic linked list of fixed-size
blocks, implying that storage _will_ grow, bounded only by how often we
poll. Second, the collection of these fixed blocks is performed only when a
quiescent period is available, in the manner of crossbeam-epoch. We should
expect these to be rare in a loaded vector, though the present issue is
something different. I _think_ a fixed array of atomic integers would be
sufficient as a histogram for vector -- bounded buckets are cheap -- but this is
still the germ of a thought.

REF #6673 #6461 #6456

Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
This commit upgrades to the current metrics-rs. We know this causes a
performance degradation in vector -- see
metrics-rs/metrics#179 -- but we want to know whether the
observed leak is part of our older metrics-rs or is still present upstream.

Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
This commit follows on from my previous thoughts and replaces the metrics-rs
handle with something that more closely matches our internal_metrics use
case. Benefits include static memory use, compared to the epoch mechanism in
metrics-rs, though we do lose perfect sample capture.

In my experiments this reduces vector's memory usage from 200Mb to 6Mb,
controlling for the leak.

Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
Resolved review threads on src/metrics/handle.rs.
blt added 3 commits April 7, 2021 15:29
Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
@blt requested review from bruceg and jszwedko on April 8, 2021 01:37
Signed-off-by: Brian L. Troutwine <brian@troutwine.us>

@bruceg (Member) left a comment:

LGTM, though you did suggest one more micro-optimization, and there is a new warning in benches/metrics_snapshot.rs

pub(crate) fn record(&mut self, value: f64) {
let mut prev_bound = f64::NEG_INFINITY;
for (bound, bucket) in self.buckets.iter_mut() {
if value > prev_bound && value <= *bound {

Member commented:

You should be able to get rid of one of these comparisons. I think you were referring to this idea too here: https://github.com/timberio/vector/pull/7014/files#r608864011

@blt (Contributor, Author) replied:

Hmm, well. After poking at this for a while I couldn't come up with any other implementation that makes a measurable difference in runtime, since I want to avoid assuming foreknowledge about bucket spacing. The logic strikes me as reasonably explicable, so I'll leave it alone.

Member replied:

Yes, it's unlikely to make a visible difference except in microbenchmarks.
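
For reference, the single-comparison variant discussed in this thread would look roughly like the sketch below: with buckets sorted by ascending upper bound and an early exit, reaching bucket i already implies the sample exceeded every earlier bound, so the lower-bound check can be dropped. This is a hedged illustration of the idea, not the merged code, and as noted above the measurable difference is negligible outside microbenchmarks.

use std::sync::atomic::{AtomicU64, Ordering};

pub struct Buckets {
    // Sorted by ascending upper bound; the final bound is assumed to be
    // f64::INFINITY so every sample matches some bucket.
    buckets: Box<[(f64, AtomicU64)]>,
}

impl Buckets {
    pub fn record(&self, value: f64) {
        for (bound, bucket) in self.buckets.iter() {
            // One comparison per bucket: earlier buckets have already been
            // ruled out, so the `value > prev_bound` check is implied.
            if value <= *bound {
                bucket.fetch_add(1, Ordering::Relaxed);
                return;
            }
        }
    }
}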

Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
@blt merged commit ddb16b1 into vectordotdev:master on Apr 8, 2021
@blt deleted the blt-metrics_mem_leak branch on April 8, 2021 17:45
@binarylogic (Contributor) commented:

👏 nice work

Successfully merging this pull request may close these issues:

Memory usage keeps increasing with Prometheus sink