
Support additional custom source configurations #66

Closed
slintes opened this issue Apr 30, 2021 · 13 comments
Assignees
ArangoGutierrez
Labels
  • help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
  • kind/feature: Categorizes issue or PR as related to a new feature.
  • priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
  • triage/accepted: Indicates an issue or PR is ready to be actively worked on.
Milestone
v0.4.0

Comments

@slintes

slintes commented Apr 30, 2021

What would you like to be added:
The configuration options of the NFD worker were recently extended: it now supports a /etc/kubernetes/node-feature-discovery/custom.d directory, which can hold additional custom source configurations besides those in the main worker config. Typically this directory will be populated by mounting ConfigMaps. The operator should handle the required volume and mount configuration.
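For illustration, the volume and mount the operator would need to add to the worker DaemonSet per ConfigMap could look roughly like the sketch below; the "custom-config-" volume naming and the per-ConfigMap subdirectory under custom.d are assumptions made for the example, not existing operator code:

```go
// Sketch: build the volume and mount for one custom-config ConfigMap.
// The "custom-config-" naming and the per-ConfigMap subdirectory are
// illustrative assumptions, not the operator's current behaviour.
package sketch

import corev1 "k8s.io/api/core/v1"

func customConfigVolume(cmName string) (corev1.Volume, corev1.VolumeMount) {
	vol := corev1.Volume{
		Name: "custom-config-" + cmName,
		VolumeSource: corev1.VolumeSource{
			ConfigMap: &corev1.ConfigMapVolumeSource{
				LocalObjectReference: corev1.LocalObjectReference{Name: cmName},
			},
		},
	}
	mount := corev1.VolumeMount{
		Name:      vol.Name,
		ReadOnly:  true,
		MountPath: "/etc/kubernetes/node-feature-discovery/custom.d/" + cmName,
	}
	return vol, mount
}
```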

Why is this needed:
The new configuration option allows dynamic configuration by potentially multiple parties. Each party can maintain its own ConfigMap without the need for cross-party agreements. Ideally the operator can watch for those ConfigMaps and reconfigure the worker on the fly by adding and removing the relevant volumes and mounts.

Design considerations
The big question is how to find the relevant ConfigMaps. Some thoughts:

  • Add the ConfigMap names to the NFD CRD. This makes the implementation easy, but it isn't very dynamic and still needs manual work, which introduces the risk that in case of misconfiguration (name mismatch between config and actual ConfigMap, accidental deletion of ConfigMaps, ...?) the NFD workers won't start because a volume mount fails.
  • Let the operator find relevant ConfigMaps and (re)configure the worker DaemonSet dynamically. However, we still need to know which ConfigMaps are relevant. The first restriction is easy: the ConfigMap needs to be in the same namespace as the NFD CR / worker. And then? A very basic check might be to look into the data and try to parse it as a custom source configuration, or at least look for e.g. the "matchOn" string. An alternative might be to require the ConfigMaps to have a certain name, e.g. a "custom-config-" prefix (maybe configurable in the NFD CR), a label, or an annotation.
  • What worries me a bit: for this dynamic solution the operator would need to watch all ConfigMaps in all namespaces. At least I did not find a way to dynamically restrict the watch to the desired namespace(s). Is this a problem (thinking of huge clusters with many ConfigMaps...)?
  • Side note, talking about namespaces: am I correct that the operator watches ALL namespaces for NFD CRs and potentially installs multiple instances of NFD in multiple namespaces? Is that on purpose?

I think my favorite is using a marker label on the ConfigMaps which should be mounted (see the sketch below).
When we agree on a way forward, I volunteer to implement it, in order to finish the work that was started on the NFD worker :)
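A rough sketch of how the marker-label approach could look with controller-runtime (a recent controller-runtime API is assumed; the label key, the map function, and the NFD API import path are hypothetical, not existing operator code):

```go
// Sketch of the marker-label approach, assuming a recent controller-runtime.
// The label key and the NFD API import path below are hypothetical.
package sketch

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/builder"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/handler"
	"sigs.k8s.io/controller-runtime/pkg/predicate"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"

	nfdv1 "github.com/kubernetes-sigs/node-feature-discovery-operator/api/v1" // hypothetical import path
)

// Hypothetical marker label a ConfigMap must carry to be mounted.
const markerLabel = "nfd.kubernetes.io/custom-config"

func addConfigMapWatch(mgr ctrl.Manager, r reconcile.Reconciler) error {
	// Only ConfigMaps carrying the marker label pass the predicate, so the
	// reconciler never sees unrelated ConfigMaps.
	hasMarker, err := predicate.LabelSelectorPredicate(metav1.LabelSelector{
		MatchLabels: map[string]string{markerLabel: "true"},
	})
	if err != nil {
		return err
	}

	// Map a labelled ConfigMap to the NFD CR(s) in the same namespace.
	mapToNFD := func(ctx context.Context, obj client.Object) []reconcile.Request {
		var nfds nfdv1.NodeFeatureDiscoveryList
		if err := mgr.GetClient().List(ctx, &nfds, client.InNamespace(obj.GetNamespace())); err != nil {
			return nil
		}
		reqs := make([]reconcile.Request, 0, len(nfds.Items))
		for _, nfd := range nfds.Items {
			reqs = append(reqs, reconcile.Request{NamespacedName: types.NamespacedName{
				Namespace: nfd.Namespace, Name: nfd.Name,
			}})
		}
		return reqs
	}

	return ctrl.NewControllerManagedBy(mgr).
		For(&nfdv1.NodeFeatureDiscovery{}).
		Watches(&corev1.ConfigMap{},
			handler.EnqueueRequestsFromMapFunc(mapToNFD),
			builder.WithPredicates(hasMarker)).
		Complete(r)
}
```

If caching every ConfigMap in the cluster is a concern, recent controller-runtime versions also allow restricting the manager's cache per object type with a label selector, which would limit what gets cached in huge clusters with many ConfigMaps.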

Related: #53

@slintes added the kind/feature label Apr 30, 2021
@ArangoGutierrez
Contributor

/assign

@ArangoGutierrez added the help wanted, priority/important-soon, and triage/accepted labels Apr 30, 2021
@mythi

mythi commented May 4, 2021

Side note, talking about namespaces: am I correct that the operator watches ALL namespaces for NFD CRs and potentially installs multiple instances of NFD in multiple namespaces? Is that on purpose?

your observation is correct, see #54

@slintes
Author

slintes commented May 19, 2021

in case of misconfiguration (name mismatch between config and actual ConfigMap, accidental deletion of ConfigMaps, ...?) the NFD workers won't start because a volume mount fails.

I learned there is an optional flag for ConfigMap mounts, so this isn't an issue.
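For reference, that is the Optional field on the ConfigMap volume source; a minimal sketch of how it would be set:

```go
// Sketch: with Optional set, a missing ConfigMap no longer prevents the
// worker pod from starting; the mounted directory is simply empty instead.
package sketch

import corev1 "k8s.io/api/core/v1"

func optionalConfigMapSource(cmName string) corev1.VolumeSource {
	optional := true
	return corev1.VolumeSource{
		ConfigMap: &corev1.ConfigMapVolumeSource{
			LocalObjectReference: corev1.LocalObjectReference{Name: cmName},
			Optional:             &optional,
		},
	}
}
```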

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Aug 17, 2021
@marquiz
Contributor

marquiz commented Aug 18, 2021

Let's not close this yet
/remove-lifecycle stale

@k8s-ci-robot removed the lifecycle/stale label Aug 18, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Nov 16, 2021
@marquiz
Contributor

marquiz commented Nov 16, 2021

/remove-lifecycle stale

@k8s-ci-robot removed the lifecycle/stale label Nov 16, 2021
@marquiz
Contributor

marquiz commented Jan 20, 2022

Hmm, do we still want to keep this open? Especially after NFD v0.10.0 (and #653)? I'd say no.

If we implemented this, it would probably mean a separate CRD for the extra custom configs, which doesn't make much sense after kubernetes-sigs/node-feature-discovery#653: in practice it would be overlapping functionality and more maintenance burden.

Thoughts @slintes @ArangoGutierrez?

@slintes
Author

slintes commented Jan 20, 2022

Thanks for the heads-up.

in practice it would be overlapping functionality and more maintenance burden

Sounds like a good argument to me to close this.

@ArangoGutierrez
Contributor

Let's close this after #119

@marquiz
Contributor

marquiz commented Feb 9, 2022

Let's close this after #119

Agree

@ArangoGutierrez
Contributor

#119 is merged, so we can say this issue has been properly addressed.
/close

@ArangoGutierrez added this to the v0.4.0 milestone Feb 17, 2022
@k8s-ci-robot
Contributor

@ArangoGutierrez: Closing this issue.

In response to this:

#119 is merged, so we can say this issue has been properly addressed.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ArangoGutierrez mentioned this issue Feb 17, 2022