[awsemfexporter] Group exported metrics by labels #2317
Merged
This PR is the 2nd part of splitting #1891 which was originally done by @kohrapha.
Currently, each incoming metric is pushed to CloudWatch Logs as a separate log event. However, many metrics share the same labels, so this results in a lot of duplicate data. To solve this, this PR batches metrics by their labels so that metrics with the same set of labels are exported together.
Specifically, metrics are batched together if they have the same labels, namespace, timestamp, log group, and log stream.
The batched metrics are further split up if metric_declarations are defined. Currently, the filtered metrics are split up by the metric declaration rules they match. Since they have the same labels, they will have the same dimensions if they match the same metric declaration rules.

Caveat: two groups of filtered metrics can still share the same dimension sets if their metric declarations result in the same dimension set. We currently don't check for this, so the two groups are not merged.
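To make the batching concrete, here is a minimal sketch in Go of grouping data points by a key built from their labels, namespace, timestamp, log group, and log stream. It is not the exporter's actual code: the dataPoint type and the groupKey and groupByLabels functions are hypothetical names chosen for illustration, and the key format is an assumption.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// dataPoint is a hypothetical, simplified stand-in for an incoming metric
// data point; the real exporter works with OpenTelemetry data types.
type dataPoint struct {
	Name      string
	Value     float64
	Labels    map[string]string
	Namespace string
	Timestamp int64
	LogGroup  string
	LogStream string
}

// groupKey builds a deterministic string key from the fields that decide
// which batch a data point belongs to. Label names are sorted so that map
// iteration order does not affect the key.
func groupKey(dp dataPoint) string {
	names := make([]string, 0, len(dp.Labels))
	for k := range dp.Labels {
		names = append(names, k)
	}
	sort.Strings(names)

	var b strings.Builder
	fmt.Fprintf(&b, "%q|%d|%q|%q", dp.Namespace, dp.Timestamp, dp.LogGroup, dp.LogStream)
	for _, k := range names {
		fmt.Fprintf(&b, "|%q=%q", k, dp.Labels[k])
	}
	return b.String()
}

// groupByLabels batches data points that share the same group key.
func groupByLabels(dps []dataPoint) map[string][]dataPoint {
	groups := make(map[string][]dataPoint)
	for _, dp := range dps {
		key := groupKey(dp)
		groups[key] = append(groups[key], dp)
	}
	return groups
}

func main() {
	dps := []dataPoint{
		{Name: "requests", Value: 10, Labels: map[string]string{"pod": "a"}, Namespace: "nginx", Timestamp: 1, LogGroup: "g", LogStream: "s"},
		{Name: "errors", Value: 1, Labels: map[string]string{"pod": "a"}, Namespace: "nginx", Timestamp: 1, LogGroup: "g", LogStream: "s"},
		{Name: "requests", Value: 7, Labels: map[string]string{"pod": "b"}, Namespace: "nginx", Timestamp: 1, LogGroup: "g", LogStream: "s"},
	}
	// The first two data points share every key field, so they land in one group.
	fmt.Println(len(groupByLabels(dps))) // prints 2
}
```

In this sketch the metric name is deliberately left out of the key, so different metrics with identical labels end up in the same batch and could be written as a single log.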
Implementation Details
Since this PR includes a lot of refactoring, I will give an overview of how the new metric translation logic works. Given a list of ResourceMetrics via emfExporter.pushMetricsData:
1. For each ResourceMetrics in the list, we add its metrics into groupedMetrics (a map consisting of batched metrics).
2. For each metric in a ResourceMetrics, we create a CWMetricMetadata, which consists of metadata (i.e. namespace, timestamp, log group, log stream, instrumentation library name) associated with the given metric. This is added to groupedMetrics for future processing.
3. We extract the DataPoints from each metric. For each DataPoint, we define its "group key" using its labels, namespace, timestamp, log group, and log stream. We use this group key to add the metric to its corresponding group in groupedMetrics.
4. Once all metrics have been added to groupedMetrics, we iterate through each group and translate it into a CWMetric. In this stage, we filter out metrics if there are metric declarations defined and set the dimensions for exported metrics (with rolled-up dimensions).
5. We translate each CWMetric into an EMF log and push it to CloudWatch using the appropriate log group and log stream found in the group's CWMetricMetadata.

Testing:
Tests were added for new functions and tests for modified functions were updated. Additionally, this PR was tested in a sample environment using an NGINX server on EKS. Given the following config (same as in #2):
we get the following cases: