Description
Component(s)
receiver/receivercreator
Prior related proposal: #17418
Is your feature request related to a problem? Please describe.
This issue proposes to add support for a hints/suggestions based autodiscovery mechanism for the receivercreator. This was initially discussed at https://cloud-native.slack.com/archives/C01N5UCHTEH/p1718893771362289.
Describe the solution you'd like
The goal is to provide support to the users for automatically enabling receivers by annotating their Pods properly:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: annotations-hints-demo
  annotations:
    otel.hints/receiver: redis/1
    otel.hints/endpoint: '`endpoint`:6379'
spec:
  containers:
    - name: redis
      image: redis:latest
      ports:
        - containerPort: 6379
```
Inspiration comes from Filebeat's/Metricbeat's and Fluentbit's respective features.
Describe alternatives you've considered
While such dynamic workload monitoring is already supported by the receivercreator combined with the k8s_observer, it requires users to have predefined the receivercreator's sub-configs. This means that if we later want to target and discover a different workload, we need to extend the Collector's config and restart the Collector instances.
Providing an option to achieve the same result seamlessly would ease the experience both for application developers who deploy workloads on K8s and for the admins of the K8s cluster and/or the OTel-based observability stack.
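For reference, this is roughly what the statically predefined approach looks like today: each workload needs its own sub-config, known ahead of deployment. A minimal sketch (the rule and endpoint values are illustrative):

```yaml
# Static receiver_creator sub-config: the redis entry must exist in the
# Collector's config before the Redis workload is ever deployed.
receivers:
  receiver_creator:
    watch_observers: [k8s_observer]
    receivers:
      redis:
        # Matches port endpoints discovered by the k8s_observer
        rule: type == "port" && port == 6379 && pod.name matches "redis"
        config:
          endpoint: '`endpoint`'
          collection_interval: 10s
```

Adding a second technology (say, nginx) means editing this block and restarting the Collector, which is exactly the friction the hints mechanism aims to remove.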
Additional context
Feature feasibility
I have already tried this out, and it seems that the receivercreator can indeed be extended to support such a feature.
There might be a need to distinguish the implementation for logs and metrics respectively, to best cover the default behaviors (i.e. by default we want to parse all logs from containers in a generic way, unless we are hinted to use a technology-specific "input"). One thing to cover here would be to ensure that the feature plays well with a future templates provider.
A very dirty PoC can be found at 4591ecf.
This is just to illustrate the idea for now. Here is how it works:
values.yaml for installing the Collector's Helm Chart:
```yaml
mode: daemonset
image:
  repository: otelcontribcol-dev
  tag: "latest"
  pullPolicy: Never
clusterRole:
  create: true
  rules:
    - apiGroups:
        - ''
      resources:
        - 'pods'
      verbs:
        - 'get'
        - 'list'
        - 'watch'
config:
  extensions:
    k8s_observer:
      auth_type: serviceAccount
      node: ${env:K8S_NODE_NAME}
      observe_pods: true
  exporters:
    debug:
      verbosity: detailed
  receivers:
    receiver_creator:
      watch_observers: [k8s_observer]
      receivers:
  service:
    extensions: [health_check, k8s_observer]
    pipelines:
      traces:
        processors: [batch]
        exporters: [debug]
      metrics:
        receivers: [receiver_creator]
        processors: [batch]
        exporters: [debug]
      logs:
        processors: [batch]
        exporters: [debug]
```
Target Redis Pod:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: redis
  annotations:
    otel.hints/receiver: redis/1
    otel.hints/endpoint: '`endpoint`:6379'
  labels:
    app: redis
spec:
  containers:
    - image: redis
      imagePullPolicy: IfNotPresent
      name: redis
      ports:
        - name: redis
          containerPort: 6379
          protocol: TCP
```
Collector's output:
```
2024-08-06T13:23:17.509Z info receivercreator@v0.103.0/observerhandler.go:86 handling added endpoint {"kind": "receiver", "name": "receiver_creator", "data_type": "metrics", "env": {"annotations":{"kubectl.kubernetes.io/last-applied-configuration":"{\"apiVersion\":\"v1\",\"kind\":\"Pod\",\"metadata\":{\"annotations\":{\"otel.hints/endpoint\":\"`endpoint`:6379\",\"otel.hints/receiver\":\"redis/1\"},\"labels\":{\"app\":\"redis\",\"k8s-app\":\"redis\"},\"name\":\"redis\",\"namespace\":\"default\"},\"spec\":{\"containers\":[{\"image\":\"redis\",\"imagePullPolicy\":\"IfNotPresent\",\"name\":\"redis\",\"ports\":[{\"containerPort\":6379,\"name\":\"redis\",\"protocol\":\"TCP\"}]}]}}\n","otel.hints/endpoint":"`endpoint`:6379","otel.hints/receiver":"redis/1"},"endpoint":"10.244.0.31","id":"k8s_observer/3176770e-5e14-4f7b-9146-0df6a445bb6c","labels":{"app":"redis","k8s-app":"redis"},"name":"redis","namespace":"default","type":"pod","uid":"3176770e-5e14-4f7b-9146-0df6a445bb6c"}}
2024-08-06T13:23:17.509Z warn receivercreator@v0.103.0/observerhandler.go:100 handling added hinted receiver {"kind": "receiver", "name": "receiver_creator", "data_type": "metrics", "subreceiverKey": "redis/1"}
2024-08-06T13:23:17.509Z info receivercreator@v0.103.0/observerhandler.go:136 starting receiver {"kind": "receiver", "name": "receiver_creator", "data_type": "metrics", "name": "redis/1", "endpoint": "10.244.0.31", "endpoint_id": "k8s_observer/3176770e-5e14-4f7b-9146-0df6a445bb6c", "config": {"collection_interval":"10s","endpoint":"`endpoint`:6379"}}
2024-08-06T13:23:18.572Z info MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "debug", "resource metrics": 1, "metrics": 26, "data points": 31}
2024-08-06T13:23:18.572Z info ResourceMetrics #0
Resource SchemaURL:
Resource attributes:
     -> redis.version: Str(7.4.0)
     -> k8s.namespace.name: Str(default)
     -> k8s.pod.name: Str(redis)
     -> k8s.pod.uid: Str(3176770e-5e14-4f7b-9146-0df6a445bb6c)
ScopeMetrics #0
ScopeMetrics SchemaURL:
InstrumentationScope otelcol/redisreceiver 0.103.0-dev
Metric #0
Descriptor:
     -> Name: redis.clients.blocked
     -> Description: Number of clients pending on a blocking call
     -> Unit: {client}
     -> DataType: Sum
     -> IsMonotonic: false
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
StartTimestamp: 2024-08-06 13:23:16.512548474 +0000 UTC
Timestamp: 2024-08-06 13:23:18.512548474 +0000 UTC
Value: 0
Metric #1
Descriptor:
     -> Name: redis.clients.connected
     -> Description: Number of client connections (excluding connections from replicas)
     -> Unit: {client}
     -> DataType: Sum
     -> IsMonotonic: false
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
....
```
Specification
Prerequisite: #35544
In the context of this feature we can focus on two primary goals:
- Collect metrics from dynamically discovered workloads
- Collect logs from dynamically discovered workloads
For Pods with multiple containers we need to ensure that we collect logs from each one of them and that we can scope metrics collection accordingly per container/port.
Collect Metrics dynamically from workloads
type: port.
Supported hints:
- io.opentelemetry.collector.receiver-creator.metrics/receiver: redis
- io.opentelemetry.collector.receiver-creator.metrics/endpoint: "`endpoint`:6379"
- io.opentelemetry.collector.receiver-creator.metrics/username: "admin"
- io.opentelemetry.collector.receiver-creator.metrics/password: "passpass"
- io.opentelemetry.collector.receiver-creator.metrics/collection_interval: 10s
- io.opentelemetry.collector.receiver-creator.metrics/timeout: 1m
With one more advanced option, for users who want to provide richer configurations:
- io.opentelemetry.collector.receiver-creator.metrics/raw: ''
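Putting these hints together, an annotated Pod could look like the following (a sketch of the proposed annotation scheme; the values are illustrative, and none of this is implemented yet):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: redis
  annotations:
    # Proposed metrics hints; the receiver_creator would translate these
    # into a redis receiver sub-config at discovery time.
    io.opentelemetry.collector.receiver-creator.metrics/receiver: redis
    io.opentelemetry.collector.receiver-creator.metrics/endpoint: '`endpoint`:6379'
    io.opentelemetry.collector.receiver-creator.metrics/collection_interval: 10s
    io.opentelemetry.collector.receiver-creator.metrics/timeout: 1m
spec:
  containers:
    - name: redis
      image: redis:latest
      ports:
        - containerPort: 6379
```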
Extra
Additionally, we can support defining hints per Pod container/port, e.g. io.opentelemetry.collector.receiver-creator.metrics.<port_name>/receiver: redis, to support use cases where a Pod consists of multiple containers and we want to target a specific port.
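For instance, per-port scoping in a multi-container Pod could look like this (a sketch following the proposed naming; the port name "redis-port" is illustrative):

```yaml
metadata:
  annotations:
    # Scope the redis receiver to the container port named "redis-port";
    # other ports in the Pod would not spawn a metrics receiver.
    io.opentelemetry.collector.receiver-creator.metrics.redis-port/receiver: redis
    io.opentelemetry.collector.receiver-creator.metrics.redis-port/endpoint: '`endpoint`:6379'
```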
Collect Logs dynamically from workloads
type: pod.container (see #35544).
Supported hints:
- io.opentelemetry.collector.receiver-creator.logs/enabled: true -> enables a default filelog receiver for the specific path
- io.opentelemetry.collector.receiver-creator.logs/template: redis -> if/when we have support for technology-specific templates (Template provider opentelemetry-collector#8372)
- io.opentelemetry.collector.receiver-creator.logs/raw: ''
We could also expose more filelogreceiver settings through annotations, for example io.opentelemetry.collector.receiver-creator.logs/max_log_size and io.opentelemetry.collector.receiver-creator.logs/operators.
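As a sketch, a Pod hinted for log collection could carry annotations like the following (values are illustrative; max_log_size is one of the extra filelog settings that could be exposed):

```yaml
metadata:
  annotations:
    # Enable the default filelog receiver for this Pod's container logs
    io.opentelemetry.collector.receiver-creator.logs/enabled: "true"
    # Optionally tune an exposed filelog setting (hypothetical exposure)
    io.opentelemetry.collector.receiver-creator.logs/max_log_size: "2MiB"
```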
Extra
Additionally, we can support defining hints per Pod container, e.g. io.opentelemetry.collector.receiver-creator.logs.<container_name>/operators: redis.
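For example, in a two-container Pod, log hints could be scoped per container like so (an illustrative sketch; the container names and the per-container use of the enabled hint are assumptions):

```yaml
metadata:
  annotations:
    # Collect logs only from the "redis" container; the "sidecar"
    # container is explicitly excluded.
    io.opentelemetry.collector.receiver-creator.logs.redis/enabled: "true"
    io.opentelemetry.collector.receiver-creator.logs.sidecar/enabled: "false"
```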
Next steps
I would like to get feedback from the community/SIG on this. Some pieces might be missing from this proposal, so any kind of feedback would be helpful here :).
Eventually, if we find this a good fit for the Collector, I'd be happy to contribute its implementation (and its future support).
We could potentially break this up into two different scopes, one for the metrics use case and one for the logs use case, since the requirements differ somewhat between them.