Target Allocator does not operate outside of Kubernetes but is essential for scaling OTEL (prometheus "receivers" are essential) #3317

@vape-spryker

Component(s)

target allocator

Is your feature request related to a problem? Please describe.

I am an ECS user, but what I describe here applies to most non-Kubernetes use cases like ours, such as bare instances.

We use OTEL and rely heavily on the prometheus "receiver" (I put it in quotes as it's really a scraper :) ). These days most of the cloud-native stack exposes a Prometheus-compatible metrics endpoint, so this OTEL component becomes critical for metrics collection. We are now in a situation where we need to scale our collectors for HA and potentially for capacity, and that is where the prometheus "receiver" becomes a pain, since every replica with the same scrape config would scrape the same targets. Currently this is only elegantly solved by the Target Allocator. Any other solution, like randomly splitting the config and somehow feeding it in through separate configuration management, creates complicated dynamics.

But here comes the problem: the TA is written and designed for Kubernetes environments and is currently tightly coupled to the otel-operator codebase, yet it solves a class of problems that goes beyond any single orchestrator.
The current implementation lets me feed it a static list of scrape configurations, which is great, but this is not flexible enough for my use case: discovery of collectors is still hardcoded to K8s. Technically I could try to add AWS CloudMap discovery for ECS, and maybe that would be enough to make it work, but I am not sure whether such a contribution would be accepted in this project.

The TA's use case lies outside the domain of the otel operator (the Single Responsibility Principle), and it would be great if every OTEL citizen, not only K8s users, had access to it :)

Describe the solution you'd like

Extend collector discovery with AWS CloudMap, based on tags and names, so that the collectors can be discovered.
Add service discovery using AWS CloudMap so that scrape endpoints can be created automatically. This is less of an issue, as I can provide a static scraping config.
AWS has a sample at https://github.com/aws-samples/prometheus-for-ecs/ (https://github.com/aws-samples/prometheus-for-ecs/blob/main/pkg/aws/cloudmap.go), and I am aiming to use a similar approach to augment the TA.
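
To make the first point more concrete, here is a rough sketch of what pluggable collector discovery could look like. Everything in it is an assumption for illustration: the `CollectorDiscoverer` interface, the `Collector` type, and the `Select` helper are hypothetical names and not part of the current TA code. A CloudMap-backed implementation would sit behind the same interface; only a static one is shown so the example has no AWS dependency.

```go
package main

import (
	"context"
	"fmt"
)

// Collector is one running collector instance (hypothetical type).
type Collector struct {
	Name       string
	Address    string
	Attributes map[string]string // e.g. tags/attributes from a service registry
}

// CollectorDiscoverer is a hypothetical abstraction over "where do the
// collectors live": Kubernetes, AWS CloudMap, or a static list.
type CollectorDiscoverer interface {
	Discover(ctx context.Context) ([]Collector, error)
}

// StaticDiscoverer returns a fixed, user-provided list of collectors.
type StaticDiscoverer struct {
	Collectors []Collector
}

// Discover returns a copy of the configured list, or the context error
// if the context is already cancelled.
func (s StaticDiscoverer) Discover(ctx context.Context) ([]Collector, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	return append([]Collector(nil), s.Collectors...), nil
}

// Select keeps only collectors whose attributes contain every key/value
// pair in want, mimicking tag-based selection.
func Select(all []Collector, want map[string]string) []Collector {
	var out []Collector
	for _, c := range all {
		match := true
		for k, v := range want {
			if c.Attributes[k] != v {
				match = false
				break
			}
		}
		if match {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	var d CollectorDiscoverer = StaticDiscoverer{Collectors: []Collector{
		{Name: "otel-1", Address: "10.0.0.5:8888", Attributes: map[string]string{"role": "scraper"}},
		{Name: "otel-2", Address: "10.0.0.6:8888", Attributes: map[string]string{"role": "gateway"}},
	}}
	all, err := d.Discover(context.Background())
	if err != nil {
		fmt.Println("discovery failed:", err)
		return
	}
	for _, c := range Select(all, map[string]string{"role": "scraper"}) {
		fmt.Println(c.Name, c.Address)
	}
}
```

With an interface like this, the K8s-specific discovery would become just one implementation among several.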

At the very least, it should be possible to provide a static config of the collectors plus a scraping config to be chunked and distributed among them, which would democratize the TA so that it works in any environment.
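
As a minimal sketch of that "static collectors plus chunked scrape config" idea, the snippet below spreads a list of targets over a fixed list of collectors with a simple hash-modulo scheme. This is an illustration under my own assumptions, not how the TA actually allocates targets: the `Assign` function and the example addresses are made up, and plain hash-modulo reshuffles many targets whenever the number of collectors changes.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// Assign spreads scrape targets across a static list of collectors.
// Each target is hashed and mapped to exactly one collector, so the
// same inputs always produce the same assignment.
// Hypothetical helper, not part of the Target Allocator.
func Assign(collectors, targets []string) map[string][]string {
	out := make(map[string][]string, len(collectors))
	if len(collectors) == 0 {
		return out
	}
	// Sort a copy so the result does not depend on input order.
	sorted := append([]string(nil), collectors...)
	sort.Strings(sorted)
	for _, c := range sorted {
		out[c] = nil // make sure idle collectors still appear in the result
	}
	for _, t := range targets {
		h := fnv.New32a()
		h.Write([]byte(t))
		idx := int(h.Sum32() % uint32(len(sorted)))
		out[sorted[idx]] = append(out[sorted[idx]], t)
	}
	return out
}

func main() {
	// Example addresses are placeholders.
	collectors := []string{"collector-a:8888", "collector-b:8888"}
	targets := []string{"10.0.1.10:9100", "10.0.1.11:9100", "10.0.2.20:9090"}
	for c, ts := range Assign(collectors, targets) {
		fmt.Println(c, ts)
	}
}
```

Each collector would then only receive its own slice of targets, which is the part that is painful to do by hand today.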

Describe alternatives you've considered

Manually randomize the configuration, write it to AWS SSM, and potentially let init containers of the collector Service in ECS determine which config belongs to which collector when they start. This is a flawed approach, but apart from the TA there is no other option.

Additional context

https://github.com/aws-samples/prometheus-for-ecs/
