[collector][emitter] Split metric payloads bigger than 2MB #3454
Conversation
This is a great solution! It's a bit unclear what's going on in parts of it, though, so some comments would really help.
Otherwise I think it's great, and once there's a bit of added clarity I'm very happy to give it a 👍
emitter.py (Outdated)
for i in range(nb_chunks):
    compressed_payloads.extend(
        serialize_and_compress_metrics_payload({"series": series[i*series_per_chunk:(i+1)*series_per_chunk]}, max_compressed_size, depth+1, log)
Can you add some comments here? There's a lot going on in this line.
Good comment, this definitely needs more explanation.
emitter.py (Outdated)
log.error("Maximum depth of payload splitting reached, dropping the %d metrics in this chunk", len(series))
return compressed_payloads

nb_chunks = len(zipped)/max_compressed_size + 1 + int(compression_ratio/2)  # try to account for the compression
What does nb_chunks mean? I am honestly uncertain, even after thinking about it for a few minutes. We might want to rename it.
nb_chunks is the number of "chunks" (i.e. smaller payloads) we'll split the current metrics_payload into. I can definitely document this more, let me know if you have a better idea for the name of the variable.
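To illustrate the idea for readers of the thread, here is a small hedged sketch (not the PR's exact code) of how nb_chunks drives the slicing of the series list into smaller payloads:

```python
# Illustrative sketch only: series_per_chunk and nb_chunks are computed by the
# surrounding function in the PR; here we use stand-in values.
series = list(range(10))  # stand-in for the list of metric series

nb_chunks = 3
# ceil-style division so every series lands in some chunk
series_per_chunk = len(series) // nb_chunks + 1

chunks = [series[i * series_per_chunk:(i + 1) * series_per_chunk]
          for i in range(nb_chunks)]

# every element ends up in exactly one chunk, in order
assert sum(chunks, []) == series
```

The last chunk is simply shorter when the series count doesn't divide evenly.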
nb is number! Ha, that makes sense. I kept thinking n and b were different words. No, I think this is fine; n_chunks is the only other one I can think of.
I understood what it did, the naming just threw me off
return compressed_payloads


def serialize_and_compress_metrics_payload(metrics_payload, max_compressed_size, depth, log):
This is a really cool, elegant solution.
However, it took me a bit to understand everything that was going on in it, so I think it could use a few comments. It would be very easy to make a mistake when editing this function in the future without some added clarity; I pointed out some of the places where it could be clearer in my other comments.
very valid comment, I'm going to document this more
emitter.py (Outdated)
@@ -29,6 +29,11 @@
control_char_re = re.compile('[%s]' % re.escape(control_chars))

# Only enforced for the metrics API on our end, for now
MAX_COMPRESSED_SIZE = 2 << 20  # 2MB, the backend should accept up to 3MB but let's be conservative here
MAX_SPLIT_DEPTH = 3  # maximum depth of recursive calls to payload splitting function
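As an aside for readers of the diff, the bit-shift in MAX_COMPRESSED_SIZE is just a compact way of writing 2 MiB:

```python
# 1 << 20 is 1 MiB (1,048,576 bytes), so shifting 2 gives 2 MiB
MAX_COMPRESSED_SIZE = 2 << 20
assert MAX_COMPRESSED_SIZE == 2 * 1024 * 1024 == 2097152
```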
How did you arrive at this number? It's cool if it's arbitrary, clearly from the test it can split a giant object.
Arbitrary number, yes.
Thanks @gmmeyer for the review! Just to explain the approach a bit (especially the arbitrary numbers that I chose), in terms of why we need to have a recursive function:
Yea, I think the solution is good. 3 is a fine depth, there's no reason to use a smaller one, which would also likely result in more dropped payloads. The approach makes sense and is really nice, once I understood what it was doing the rationale seemed pretty clear. It just took a little bit of parsing to get there.
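For anyone skimming the thread, the recursive approach can be sketched roughly like this. This is an illustrative sketch only: the function name, the halving strategy, and the zlib/json usage here are assumptions, not the PR's actual code (the PR estimates the number of chunks from the compression ratio rather than simply halving):

```python
import json
import zlib

def split_and_compress(series, max_size, depth=0, max_depth=3):
    """Recursively split a list of metric series until each serialized,
    compressed chunk fits under max_size bytes (illustrative sketch)."""
    payload = zlib.compress(json.dumps({"series": series}).encode("utf-8"))
    if len(payload) <= max_size:
        # Small enough: no split needed, return a single compressed payload
        return [payload]
    if depth >= max_depth:
        # Give up rather than recurse forever: drop this chunk
        return []
    # Too big: split the series in half and recurse on each half
    mid = len(series) // 2
    return (split_and_compress(series[:mid], max_size, depth + 1, max_depth)
            + split_and_compress(series[mid:], max_size, depth + 1, max_depth))
```

The recursion bottoms out either when a chunk fits, or when the depth cap is hit, which bounds the worst case at 2**max_depth chunks.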
Addressed your review @gmmeyer, thanks! I also lowered the max call depth to
@olivielpeau won't it not descend if it's already small enough? The change is fine, though!
for i in range(n_chunks):
    # Create each chunk and make them go through this function recursively; increment the `depth` of the recursive call
    compressed_payloads.extend(
🍰
What does this PR do?
Split metric payloads from the collector that are bigger than 2MB
Motivation
Customer case where a custom check is collecting ~100,000 metrics, which causes the collector payloads to go above the threshold of 3MB (after compression) that's enforced by the metrics API endpoint.
Testing Guidelines
Additional Notes
This is rather limited in scope: it can only split the metrics payload from the collector. That said, I think the serialize_and_compress_metrics_payload function could be extracted out and used in dogstatsd too, if needed.

With some additional work, we could also make this more modular and allow the forwarder to call a "split" function when it receives a 413 response from the API endpoint. That would be more complex to implement though.
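That forwarder-side idea could be sketched roughly as follows. This is purely hypothetical: the function names, the PayloadTooLarge exception, and the halving strategy are all illustrative assumptions, not anything implemented in this PR:

```python
class PayloadTooLarge(Exception):
    """Stand-in for an HTTP 413 response from the metrics endpoint."""

def send_with_split(series, post, depth=0, max_depth=3):
    """Hypothetical forwarder-side retry: `post` is any callable that raises
    PayloadTooLarge when the backend rejects the payload as too big (413)."""
    try:
        post(series)
    except PayloadTooLarge:
        if depth >= max_depth:
            raise  # give up instead of recursing forever
        # Split the series in half and retry each half
        mid = len(series) // 2
        send_with_split(series[:mid], post, depth + 1, max_depth)
        send_with_split(series[mid:], post, depth + 1, max_depth)
```

The appeal is that the split only happens when the backend actually rejects a payload, at the cost of an extra round trip per rejection.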