Support CRDs for custom label rules #468

Closed
marquiz opened this issue Mar 9, 2021 · 18 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.
Comments

@marquiz
Contributor

marquiz commented Mar 9, 2021

What would you like to be added:

An idea I've been pondering to make adding new rules for the custom source easier, without the need to add any new mounts or other configuration: NFD would define a new CRD and watch for objects of that kind. I'm not sure what the best way to implement this would be:

  1. NFD worker instances would watch the CRs directly themselves
  2. NFD master would watch the CRs and communicate rules via gRPC
  3. NFD master would watch the CRs and update a predefined ConfigMap that is mounted on the worker containers
  4. Do rule processing on master side: NFD worker sends "raw" features to master. Master watches CRs, performs rule processing/matching against the "raw" features and creates labels.

Option 1 would be the obvious, simplest choice, even though it opens up direct communication from the workers to the control plane.

Why is this needed:

Make dropping in vendor- or user-specific custom label rules easier, as simple as kubectl apply -f <rules.yaml>.
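
To make this concrete, here is a minimal sketch of what such a rules manifest could look like. The API group (nfd.example.com), the kind (NodeLabelRule) and the field names below are hypothetical placeholders, not a final schema:

```yaml
# Hypothetical custom label rule delivered as a CR. The group
# "nfd.example.com", the kind "NodeLabelRule" and the field names
# are illustrative only.
apiVersion: nfd.example.com/v1alpha1
kind: NodeLabelRule
metadata:
  name: my-vendor-rule
spec:
  rules:
    - name: "my vendor device present"
      labels:
        vendor.example.com/device: "true"
      matchOn:
        - pciId:
            vendor: ["1234"]
```

Dropping in a new rule would then be a single kubectl apply -f rules.yaml, and removing it a kubectl delete.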

List of individual PRs:

Adjacent PRs, building on top and extending the functionality:

@marquiz marquiz added the kind/feature label Mar 9, 2021
@zvonkok
Contributor

zvonkok commented Mar 9, 2021

etcd has a 1.5 MiB object size limit, which corresponds to roughly 750 pages of text. I think this should be enough to hold plenty of rules.

@zvonkok
Contributor

zvonkok commented Mar 9, 2021

Regarding 1: does it mean we would have one CR that all nfd-workers would read and report the labels from? If we have several hundred nodes, how will the API server behave if several hundred nfd-workers are reading from the very same CR at the same time? Should we add a random startup delay?

@zvonkok
Contributor

zvonkok commented Mar 9, 2021

Regarding 3: only the master would read the CR, and the load would be distributed to each node via updates of the ConfigMap...

@marquiz
Contributor Author

marquiz commented Mar 9, 2021

Regarding 1: does it mean we would have one CR that all nfd-workers would read and report the labels from ...

Yes, all workers would need to read that config (CR).

If we have several hundred nodes, how will the API server behave if several hundred nfd-workers are reading from the very same CR at the same time? Should we add a random startup delay?

This is a fair concern. Unfortunately I don't have access to such a cluster. It's not just startup time, but any time a new CR is added (or an old one updated).

@marquiz
Contributor Author

marquiz commented Mar 9, 2021

Regarding 3: only the master would read the CR, and the load would be distributed to each node via updates of the ConfigMap...

Yes. The point of stress would move to the ConfigMap.
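
To sketch option 3 (purely illustrative; the ConfigMap name, namespace and key are made up here): nfd-master would render the rules from the CRs into a predefined ConfigMap, which the worker DaemonSet mounts as an extra config directory:

```yaml
# Hypothetical ConfigMap kept in sync with the CRs by nfd-master
# (option 3 sketch); name, namespace and key are illustrative only.
apiVersion: v1
kind: ConfigMap
metadata:
  name: nfd-worker-custom-rules
  namespace: node-feature-discovery
data:
  custom-rules.conf: |
    - name: "my vendor device present"
      matchOn:
        - pciId:
            vendor: ["1234"]
```

Workers would then only watch a local file instead of the API server, at the cost of one more moving part and the object size limit mentioned above.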

@zvonkok
Contributor

zvonkok commented Mar 9, 2021

Regarding 2: this could create some load on the "network". If going this route we would definitely need something like an "OnChange" trigger and not do this periodically.

@marquiz
Contributor Author

marquiz commented Mar 9, 2021

Regarding 2: this could create some load on the "network". If going this route we would definitely need something like an "OnChange" trigger and not do this periodically.

Yeah, I don't like this, as it complicates the worker-master communication and looks like a source of sync issues.

@marquiz
Contributor Author

marquiz commented Mar 15, 2021

Added option 4, which might make sense: do rule processing of the CRs on the master side.

@ArangoGutierrez
Contributor

I know I am biased, but I feel this issue is more suited for the Operator than the Operand side of NFD.
Thoughts?

@marquiz
Contributor Author

marquiz commented Mar 18, 2021

I know I am biased, but I feel this issue is more suited for the Operator than the Operand side of NFD.
Thoughts?

I think this would be orthogonal to / independent of the operator and should be handled by nfd-master, allowing the creation of rules independently of the deployment mechanism, be it the operator, Helm or templates/kustomize.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale label Jun 16, 2021
@marquiz
Contributor Author

marquiz commented Jul 7, 2021

Been working on a prototype building on top of #464 and #550. I'll submit an RFC version soon.
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale label Jul 7, 2021
@marquiz
Contributor Author

marquiz commented Jul 8, 2021

Been working on a prototype building on top of #464 and #550. I'll submit an RFC version soon.

...and we have #553

@ArangoGutierrez
Contributor

We are getting ready for v0.11.0!

@ArangoGutierrez ArangoGutierrez mentioned this issue Nov 29, 2021
@marquiz
Contributor Author

marquiz commented Nov 30, 2021

With #663 all the essential parts are now merged, and I declare this issue implemented.
/close

@k8s-ci-robot
Contributor

@marquiz: Closing this issue.

In response to this:

With #663 all the essential parts are now merged, and I declare this issue implemented.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@eero-t

eero-t commented Nov 30, 2021

With #663 all the essential parts are now merged, and I declare this issue implemented. /close

@marquiz In one of the PRs you mentioned the documentation update being a separate PR, but I do not see one listed above. Could you give a pointer to the associated doc update?

@marquiz
Contributor Author

marquiz commented Nov 30, 2021

@marquiz In one of the PRs you mentioned the documentation update being a separate PR, but I do not see one listed above. Could you give a pointer to the associated doc update?

It's still under construction; there's no PR on that yet.
