Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DogStatsD extended aggregation support #558

Merged
merged 7 commits into from
Aug 18, 2024

Conversation

GrgDev
Copy link
Contributor

@GrgDev GrgDev commented May 30, 2024

This PR aims to add the ability for statsd exporter to receive DogStatsD extended aggregation metrics. It is a feature I requested in Issue #557 and I wanted to try to implement it myself following the in-issue commented advice from @glightfoot.

Summary of Implementation

When DogStatsD has extended aggregation enabled, instead of sending:

my_distribution_metric:21|d|#all,my,tags
my_distribution_metric:43.2|d|#all,my,tags
my_distribution_metric:1657|d|#all,my,tags

it sends

my_distribution_metric:21:43.2:1657|d|#all,my,tags

This PR updates the LineToEvents function in the specific case of more than one : left of the initial |. If so, assume it is an extended aggregation line, convert it back into its multi-line non-aggregated form, then pass it onto the rest of the function as the samples. The logic enforces that this form of aggregation metrics must be only of the types of distribution, histogram, or timing.

@GrgDev GrgDev force-pushed the add-datadog-ext-agg-support branch from 8aa2bdd to 4d3ab3d Compare May 30, 2024 18:42
Signed-off-by: Greg Chambers <gregc@unity3d.com>
@@ -403,6 +403,17 @@ func TestLineToEvents(t *testing.T) {
"datadog tag extension with both valid and invalid utf8 tag values": {
in: "foo:100|c|@0.1|#tag1:valid,tag2:\xc3\x28invalid",
},
"datadog timings with extended aggregation values": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be good to add tests with metrics without tags as well, invalid tests where someone submits a line with a different tagging style (e.g. SignalFX) and the dogstatsd extended aggregation, and invalid tests for extended aggregation with counters and gauges

Copy link
Contributor Author

@GrgDev GrgDev Jun 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe I have now added these tests plus some extras as well as the additional logic required in the line.go code to pass them.

One more question. If parsing for DogStatsD tags is disabled, should a line with the extended aggregation format result in:

  1. no tags but otherwise the same output as if DogStatsD parsing was enabled
  2. no output as the line is considered a bad line
    ?

The other tests seemed to suggest that the main side effect of turning off parsing should be no tags, but since this is a DogStatsD specific feature, should an extended aggregation line be considered invalid altogether in such a scenario?

My opinion would be that option 1 is what we should go with because the current code base suggests that the parsing flavor options only affects the tag parsing and no other part of the line. Also, as the tests show, it already enforces that extended aggregation cannot be used with other flavors of StatsD.

GrgDev added 6 commits June 2, 2024 11:54
Signed-off-by: Greg Chambers <gregory.w.chambers@gmail.com>
Signed-off-by: Greg Chambers <gregory.w.chambers@gmail.com>
Signed-off-by: Greg Chambers <gregory.w.chambers@gmail.com>
Signed-off-by: Greg Chambers <gregory.w.chambers@gmail.com>
Signed-off-by: Greg Chambers <gregory.w.chambers@gmail.com>
Signed-off-by: Greg Chambers <gregory.w.chambers@gmail.com>
@GrgDev
Copy link
Contributor Author

GrgDev commented Jun 10, 2024

@glightfoot Friendly nudge

Copy link
Contributor

@matthiasr matthiasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution, and thank you @glightfoot for the initial review as well!

Comment on lines +244 to +251
aggValues := strings.Split(lineParts[0], ":")
aggLines := make([]string, len(aggValues))
_, aggLineSuffix, _ := strings.Cut(elements[1], "|")

for i, aggValue := range aggValues {
aggLines[i] = strings.Join([]string{aggValue, aggLineSuffix}, "|")
}
samples = aggLines
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll accept this as is, but this amount of string manipulation is making me nervous – both for the potential to mess up a line, and because of the amount of intermediate strings we're leaving for the garbage collector to deal with.

I would prefer if we could delay dealing with the multiple samples in a line until line 280, when we turn the value into a float. There, we could turn the single value float64 into a values []float64, looping over the : separated values to parse them in turn. We should be able to do this without allocating any additional strings. Then, further down on line 331, where we're emitting 1/samplingFactor events, we could also loop over that.

That being said, I've left this PR hanging for too long, so I'll take working over perfect. If anyone runs into actual issues from the current approach, or feels otherwise motivated, maybe this comment can help guide them 😃

@matthiasr matthiasr merged commit 8aec6fd into prometheus:master Aug 18, 2024
3 checks passed
matthiasr added a commit that referenced this pull request Aug 18, 2024
And add a mention of #558 extended aggregations to the README.

Signed-off-by: Matthias Rampke <matthias@prometheus.io>
if strings.Contains(lineParts[0], ":") {
// handle DogStatsD extended aggregation
isValidAggType := false
switch lineParts[1] {
Copy link

@m-barthelemy m-barthelemy Aug 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After updating to 0.27.1, this line is triggering a panic: runtime error: index out of range [1] with length 1 for us.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants