feat: add memory limiter to drop data when a soft limit is reached (#1827)

## Problem

At the moment, if there is pressure in the pipeline for any reason and batches fail to export, they start building up in the queues of the collector exporter and memory grows unboundedly. Since we don't set any memory request or limit on the node collectors DaemonSet, they will simply consume more and more of the available memory on the node:

1. This shows up as a spike in resource consumption in the cluster metrics.
2. It starves other pods on the same node, which now have less spare memory to grow into.
3. If the issue is not transient, memory just keeps increasing over time.
4. The data sitting in the retry buffers keeps the CPU busy attempting to retry the rejected or unsuccessful batches.

## Levels of Protection

To prevent the above issues, we apply a few levels of protection, listed from first line of defense to last resort (see the sketch after this message for how these thresholds relate):

1. Setting GOMEMLIMIT to a (currently hardcoded) `352MiB`. At this point, the Go runtime GC should kick in and start reclaiming memory aggressively.
2. Setting the otel collector soft limit to a (currently hardcoded) `384MiB`. When heap allocations reach this amount, the collector starts dropping batches of data after they leave the `batch` processor, instead of streaming them down the pipeline.
3. Setting the otel collector hard limit to `512MiB`. When the heap reaches this number, a forced GC is performed.
4. Setting the memory request to `256MiB`. This ensures we have at least this amount of memory to handle normal traffic, plus some slack for spikes, without running into OOM. The rest of the memory is taken from available memory on the node, which comes in handy for extra buffering but may also cause OOM if the node has no spare resources.

## Future Work

- Add configuration options to set these values, preferably as a spectrum of trade-offs: "resource-stability", "resource-spikecapacity"
- Drop the data as it is received, not after it is batched - open-telemetry/opentelemetry-collector#11726
- Drop data at the receiver when it is implemented in the collector - open-telemetry/opentelemetry-collector#9591
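To make the relationship between these thresholds concrete, here is a minimal Go sketch. The constant names are illustrative, and the mapping assumes the standard OpenTelemetry Collector `memory_limiter` semantics, where `limit_mib` is the hard limit and the soft limit equals `limit_mib - spike_limit_mib`; the actual operator code may wire these values differently.

```go
package main

import "fmt"

// Illustrative constants mirroring the hardcoded values described above.
const (
	memoryRequestMiB = 256 // k8s memory request: baseline reserved for normal traffic
	goMemLimitMiB    = 352 // GOMEMLIMIT: Go runtime GC starts reclaiming aggressively
	softLimitMiB     = 384 // collector soft limit: batches are dropped past this point
	hardLimitMiB     = 512 // collector hard limit: a forced GC is triggered
)

func main() {
	// With memory_limiter semantics, the soft limit is derived as
	// limit_mib - spike_limit_mib, so the spike limit is the headroom
	// between the hard and soft limits.
	spikeLimitMiB := hardLimitMiB - softLimitMiB

	fmt.Printf("GOMEMLIMIT=%dMiB\n", goMemLimitMiB)
	fmt.Printf("memory_limiter: limit_mib=%d spike_limit_mib=%d\n", hardLimitMiB, spikeLimitMiB)
	fmt.Printf("memory request: %dMiB\n", memoryRequestMiB)
}
```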
Showing 7 changed files with 103 additions and 23 deletions.
@@ -0,0 +1,17 @@

```go
package common

import (
	odigosv1 "github.com/odigos-io/odigos/api/odigos/v1alpha1"
	"github.com/odigos-io/odigos/common/config"
)

func GetMemoryLimiterConfig(memorySettings odigosv1.CollectorsGroupResourcesSettings) config.GenericMap {
	// check_interval is currently hardcoded to 1s.
	// This seems to be a reasonable value for the memory limiter and is what the processor docs use.
	// Performing memory checks is expensive, so we trade off performance for fast reaction time to memory pressure.
	return config.GenericMap{
		"check_interval":  "1s",
		"limit_mib":       memorySettings.MemoryLimiterLimitMiB,
		"spike_limit_mib": memorySettings.MemoryLimiterSpikeLimitMiB,
	}
}
```
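For context, here is a hypothetical sketch of how this helper could be wired into the processors section of a collector configuration. The `buildProcessorsConfig` function and its contents are assumptions for illustration and are not part of this diff.

```go
package common

import (
	odigosv1 "github.com/odigos-io/odigos/api/odigos/v1alpha1"
	"github.com/odigos-io/odigos/common/config"
)

// buildProcessorsConfig is a hypothetical helper (not part of this commit),
// placed in the same package for illustration. It shows how the memory
// limiter settings could plug into the collector's processors map.
func buildProcessorsConfig(memorySettings odigosv1.CollectorsGroupResourcesSettings) config.GenericMap {
	return config.GenericMap{
		// "memory_limiter" is the standard processor name in OpenTelemetry Collector configs.
		"memory_limiter": GetMemoryLimiterConfig(memorySettings),
		// The batch processor (and any others) would be added here by the real config builder.
		"batch": config.GenericMap{},
	}
}
```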