Define mapping for new k8s resources #16558

Closed

ChrsMark opened this issue Feb 25, 2020 · 11 comments
Assignees: ChrsMark
Labels: autodiscovery, breaking change, Team:Integrations, Team:Platforms, v8.0.0

Comments

@ChrsMark
Member

ChrsMark commented Feb 25, 2020

After the new resources introduced in #14875, the mapping is still behind and needs to be updated accordingly.

The major concern here is where the labels should live. Should they go under kubernetes.labels.* for all resources, or under each resource, like kubernetes.pod.labels.* and kubernetes.service.labels.*?

Another option would be to duplicate them in both places, so that we can search both at the resource level and across resources. For instance:

  1. Filtering all resources with kubernetes.labels.app==node-exporter will return events for pods, services, nodes, etc. This is the equivalent of the kubectl query:
$ kubectl get all -l app=node-exporter --show-labels
NAME                      READY   STATUS    RESTARTS   AGE     LABELS
pod/node-exporter-z2h4r   1/1     Running   0          3h19m   app=node-exporter,controller-revision-hash=5ddc9599dc,name=node-exporter,pod-template-generation=1

NAME                    TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE     LABELS
service/node-exporter   ClusterIP   None         <none>        9100/TCP   3h19m   app=node-exporter,name=node-exporter
  2. Filtering at the resource level, like kubernetes.pod.labels.app==node-exporter, will return events only for pods. kubectl equivalent:
kubectl get pods -l app=node-exporter --show-labels                              
NAME                  READY   STATUS    RESTARTS   AGE     LABELS
node-exporter-z2h4r   1/1     Running   0          3h21m   app=node-exporter,controller-revision-hash=5ddc9599dc,name=node-exporter,pod-template-generation=1

Let's discuss it!

cc: @exekias @jsoriano

Related to: #16554, #16480, #16483

@ChrsMark added the Team:Integrations, autodiscovery, and Team:Platforms labels on Feb 25, 2020
@jsoriano
Member

I think we should have them at least in kubernetes.labels.*, to allow filtering all resources together; this can be useful in overview dashboards with different kinds of resources.

To retrieve resources of a specific kind we can use other fields, such as event.dataset (kubernetes.labels.app: node-exporter AND event.dataset: (kubernetes.pod OR kubernetes.state_pod)).
If common fields are not enough, we could consider adding a specific kubernetes field for the kind of resource that originated the metric/log, so we can do queries like kubernetes.labels.app: node-exporter AND kubernetes.resource.kind: pod.
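
For illustration, a minimal sketch of how that first cross-resource filter could be expressed as an Elasticsearch bool query (assuming kubernetes.labels.app is mapped as a keyword; the field names follow the proposal in this thread, not any shipped mapping):

{
  "query": {
    "bool": {
      "filter": [
        { "term":  { "kubernetes.labels.app": "node-exporter" } },
        { "terms": { "event.dataset": [ "kubernetes.pod", "kubernetes.state_pod" ] } }
      ]
    }
  }
}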

@exekias
Contributor

exekias commented Feb 27, 2020

I see that point, and my thoughts go in the same direction. It's important that we are able to correlate labels, even from different resources.

That said, we have another problem: when we enrich a Pod with namespace metadata, we end up with both pod and namespace labels in the same document, which somehow forces us to put namespace labels under a different key (i.e. namespace.labels). I wonder if we should revisit this decision.

@exekias
Contributor

exekias commented Feb 27, 2020

For reference, this is an example of where it's useful to have namespace labels/annotations around with the Pod: #16321

@jsoriano
Member

Well, namespaces are a bit special, because most resources belong to some namespace. It can be a good option to have all the labels in kubernetes.labels, and all namespace labels in something like kubernetes.namespace.labels.

@ChrsMark
Member Author

ChrsMark commented Mar 11, 2020

After investigating #16834, I'm putting some thoughts together to move it forward:

I see two cases here: pure resource metadata and embedded resource metadata.

For pure resource metadata, like kubernetes.pod.labels.*, kubernetes.service.labels.*, kubernetes.node.labels.*, kubernetes.namespace.labels.*, the options are:

  1. just keep the resource's labels under kubernetes.labels.* (this is what we have right now)
  2. make a breaking change and move them under kubernetes.pod.labels.*, etc.
  3. keep them in both places, resource.labels.* and kubernetes.labels.*

For embedded resource metadata, like when we add node or namespace metadata to a pod:

{
    "pod": {
        "name": "obj",
        "uid":  "uid",
    },
    "namespace": "default",
    "namespace_uid": "uid",
    "namespace_labels": {
            "nskey": "nsvalue",
    },
    "node": {
        "name": "testnode",
        "uid":  "uid",
        "labels": {
            "nodekey": "nodevalue",
        },
    },
    "labels": {
        "foo": "bar",
    },
    "annotations": {
        "app": "production",
    }
}

The options are:

  1. move all resource-specific labels to kubernetes.labels.*, so that we have a common pool of labels for all of the resources, and do not add labels under resource.labels.* at all.
  2. move only the reported resource's labels under kubernetes.labels.* and keep sub-resource labels under subresource.labels.*, as in the example above with pod and node.
  3. move all labels under kubernetes.labels.*, for both the main resource and the sub-resources, and additionally keep labels under each resource at resource.labels.*, for instance pod.labels.* and node.labels.*.

My current opinion is to go with option 3 for both pure and embedded resources. I think it makes the feature easier to understand and more robust. An example would be:

{
    "pod": {
        "name": "obj",
        "uid":  "uid",
        "labels": {
            "foo": "bar",
        },
    },
    "namespace": {
        "name": "default",
        "uid": "uid",
        "labels": {
            "nskey": "nsvalue",
        },
    },
    "node": {
        "name": "testnode",
        "uid":  "uid",
        "labels": {
            "nodekey": "nodevalue",
        },
    },
    "labels": {
        "foo": "bar",
        "nskey": "nsvalue",
        "nodekey": "nodevalue"
    },
    "annotations": {
        "app": "production",
    }
}

The breaking change affects only the namespace field. The only problem I see with namespace is that the experience would not be as close to what kubectl does (e.g. kubectl get pods -n prod), since in our fields the namespace's name would live under namespace.name and no longer at the root level of the fields, as it has until now.

@ChrsMark self-assigned this on Mar 11, 2020
@exekias
Contributor

exekias commented Mar 11, 2020

Thank you for the research and explanation!

For "pure resource" metadata I think option 1 is best. Is what we were doing already for Pods. It would be interesting to double check that we do the same when we enrich info about services/nodes. As I suspect these are going under kubernetes.<kind>.lables

As for embedded metadata, in my opinion having all kinds' labels merged defeats their purpose, as there is no way to differentiate a label's source (except by manual inspection). For example, if I want to find all the kinds with a certain label, I will filter by it under kubernetes.labels. That would match some Pods that don't have the label but whose node/namespace has it, which is not what I want.

Option 2 makes more sense to me, where you report labels for the monitored resource, while keeping the rest for context.

@jsoriano & @blakerouse, any thoughts?

@ChrsMark
Member Author

ChrsMark commented Mar 11, 2020

> Thank you for the research and explanation!
>
> For "pure resource" metadata I think option 1 is best. It is what we were already doing for Pods. It would be interesting to double-check that we do the same when we enrich info about services/nodes, as I suspect these are going under kubernetes.<kind>.labels.

That's right, we don't do this for nodes and services yet; it could be fixed by #16834.

@blakerouse
Contributor

I think option 2 makes the most sense as well. Given that the event is for a specific kind, it makes sense that the labels for that kind go at kubernetes.labels.* and any related resources have their labels nested under kubernetes.<kind>.labels.

@jsoriano
Member

+1 to option 2, kubernetes.labels for "pure resources" labels, and kubernetes.<kind>.labels for labels of nested resources.
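
To make the agreed layout concrete, here is a minimal sketch of a pod event under option 2, reusing the hypothetical field values from the example above: only the monitored pod's labels sit at the top-level kubernetes.labels, while node and namespace labels stay nested under their own kind.

{
    "pod": {
        "name": "obj",
        "uid":  "uid",
    },
    "namespace": {
        "name": "default",
        "uid": "uid",
        "labels": {
            "nskey": "nsvalue",
        },
    },
    "node": {
        "name": "testnode",
        "uid":  "uid",
        "labels": {
            "nodekey": "nodevalue",
        },
    },
    "labels": {
        "foo": "bar",
    },
    "annotations": {
        "app": "production",
    }
}

The nested namespace object shown here corresponds to the namespace → namespace.name rename tracked in #13911.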

@exekias
Contributor

exekias commented Mar 12, 2020

@ChrsMark I guess we can close this now? The only missing bit would be to rename namespace to namespace.name and namespace_* to namespace.*, right? That part is tracked in #13911.

@ChrsMark
Member Author

> @ChrsMark I guess we can close this now? The only missing bit would be to rename namespace to namespace.name and namespace_* to namespace.*, right? That part is tracked in #13911.

👍 I'm closing it. Thank you everyone for your input on this!
