CRD-based custom node labeling #553

marquiz · 2021-07-08T15:27:09Z

Enable Custom Resource based label creation in nfd-master. This extends
the previously implemented controller stub for watching NodeFeatureRule
objects. NFD-master watches NodeFeatureRule objects in the cluster and
processes the rules on every incoming labeling request from workers.
The functionality relies on the "raw features" (identical to how
nfd-worker handles custom rules) submitted by nfd-worker, making it
independent of the label source configuration of the worker. This means
that the labeling functions as expected even if all sources in the
worker are disabled.

NOTE: nfd-master is stateless and re-labeling only happens on the
reception of SetLabelsRequest from the worker – i.e. on intervals
specified by the core.sleepInterval configuration option (or
-sleep-interval cmdline flag) of each nfd-worker instance. This means
that modification/creation of NodeFeatureRule objects does not
automatically update the node labels. Instead, the changes only come
visible when workers send their labeling requests.

This PR implements #468, i.e. CRD-based custom node labeling, and builds on top of previous work:

k8s-ci-robot · 2021-07-08T15:27:10Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

k8s-ci-robot · 2021-07-08T15:27:18Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: marquiz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [marquiz]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

kailun-qin · 2021-07-09T03:35:20Z

/cc @kailun-qin

k8s-ci-robot · 2021-07-09T03:35:21Z

@kailun-qin: GitHub didn't allow me to request PR reviews from the following users: kailun-qin.

Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @kailun-qin

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

marquiz · 2021-07-09T16:35:02Z

All feedback is very much appreciated. Especially on the soundness of the whole concept.

Some concerns/questions from the top of my head below, just talking out loud...

Data format

I've been pondering the data representation of "raw" features, i.e. is it good (enough). In this patchset NFD now under the hood uses a unified data model of all the "raw features" (not talking about labels here) it is detecting.

Basically we divide feature types into three categories:

"key-only features": features that don't have any value other than "present" or "not present". Examples are CPUID flags or loaded kernel nodules
"value features": features that have a value associated with it. Examples are kernel config flags or os-release information.
"instance features": features that cannot be represented by a single key-value pair (or key-only) but consist of mulitple attributes (key-value pairs). Examples are pci and usb devices

On the higher level, data representation of all detected features (from all features sources) is fixed into a three level hierarchy:
<source-name>.<feature-name>.<single-feature-instance>.

In go code this is a bit more complicated looks something like this (in a pseudo-ish golang):

// Name of the feature source e.g. "cpu" or "pci"
type FeatureSourceName string

// Name of the feature under the feature source e.g. "cpuid" (cpu) or "device" (pci)
type FeatureName string

// All features from all the feature sources
type AllFeatures map[FeatureSourceName]struct {
	// "key-only" features like cpuid or loaded kernel modules
	Keys map[FeatureName]struct {
		Features map[string]struct{}
	}

	// "key-value" features e.g. kernel version or kernel config flags
	Values map[FeatureName]struct {
		Features map[string]string
	}

	// "instance features" like pci or usb devices
	Instances map[FeatureName]struct {
		Features []struct {
			Attributes map[string]string
		}
	}
}

This data representation trickles down everywhere and to the rule format and CRD, too. The unified representation is nice in the sense that it reduces the lines of code dramatically and also simplifies the CRD schema plus makes it stable. Basically any new features added to NFD is automatically supported (because of the unified data representation and unified rule schema) with essentially no changes required in the CRD spec or rules.

My only concern is that is this flexible enough as we're fixed with the three-level hierarchy and cannot specify arbitrary data structures for the features. For example, Intel Speed Select Technology we now have cpu["sst"].Features["bf.enabled"] even though cpu.sst.bf.enabled (just nested structs and a bool in the end) might be more logical representation. However, the downside is that every feature needs a rule type of its own, similar to what we currently have in master (the custom rules). This also means that any added feature potentially changes the CRD API making is potentially way messier.

MatchExpressions

The key attribute of matchexpressions is implicit os the set of expressions are are presented as a map - the map key is used as the key attribute of the matchexpression. This has the limitation that you cannot specify more than one expression per key. I originally thought that this does not matter but now realized that there is actually one case where you would need this property: if you want to check that a value is between two values - in this case you'd need both Lt and Gt expressions. Talking out loud: maybe this should be changed...

Naming

I'm not happy with all the terms/naming that I've come up with. Need to think through it.

ArangoGutierrez · 2021-08-10T18:44:36Z

Not yet ready for review?
I want to jump into this PR as soon as it moves out of draft !

marquiz · 2021-08-11T06:55:12Z

Not yet ready for review?
I want to jump into this PR as soon as it moves out of draft !

#464 would be the first part. But even that is not ready yet, missing unit tests and documentation. I will work on these in the coming weeks, for sure, as I want to get these finally in shape, too.

ArangoGutierrez · 2021-11-22T16:46:08Z

/assign

ArangoGutierrez · 2021-11-22T16:46:17Z

@marquiz rebase needed

Enable Custom Resource based label creation in nfd-master. This extends the previously implemented controller stub for watching NodeFeatureRule objects. NFD-master watches NodeFeatureRule objects in the cluster and processes the rules on every incoming labeling request from workers. The functionality relies on the "raw features" (identical to how nfd-worker handles custom rules) submitted by nfd-worker, making it independent of the label source configuration of the worker. This means that the labeling functions as expected even if all sources in the worker are disabled. NOTE: nfd-master is stateless and re-labeling only happens on the reception of SetLabelsRequest from the worker – i.e. on intervals specified by the core.sleepInterval configuration option (or -sleep-interval cmdline flag) of each nfd-worker instance. This means that modification/creation of NodeFeatureRule objects does not automatically update the node labels. Instead, the changes only come visible when workers send their labeling requests.

marquiz · 2021-11-23T07:36:49Z

#656 was merged -> rebased and PR description updated
/unhold

marquiz · 2021-11-23T07:38:52Z

/retest

zvonkok · 2021-11-23T12:14:58Z

/lgtm

k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. labels Jul 8, 2021

k8s-ci-robot requested review from ArangoGutierrez and kad July 8, 2021 15:27

k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jul 8, 2021

marquiz force-pushed the devel/custom-crd branch from 8f6803e to 47d93b9 Compare July 8, 2021 15:39

marquiz changed the title ~~Devel/custom crd~~ CRD-based custom node labeling Jul 8, 2021

marquiz mentioned this pull request Jul 8, 2021

Support CRDs for custom label rules #468

Closed

marquiz force-pushed the devel/custom-crd branch 2 times, most recently from bfb8bad to 8a7494c Compare July 9, 2021 13:49

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 4, 2021

marquiz force-pushed the devel/custom-crd branch from 8a7494c to 2367e8c Compare August 10, 2021 17:01

k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 10, 2021

marquiz force-pushed the devel/custom-crd branch from 2367e8c to d968f7a Compare August 10, 2021 17:38

k8s-ci-robot removed the do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. label Aug 10, 2021

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 18, 2021

marquiz force-pushed the devel/custom-crd branch from d968f7a to 6f2287b Compare August 19, 2021 06:41

k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Aug 19, 2021

marquiz force-pushed the devel/custom-crd branch from 6f2287b to e2883b4 Compare August 20, 2021 08:28

k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 20, 2021

marquiz force-pushed the devel/custom-crd branch 6 times, most recently from 6e2c98a to 0df5b14 Compare November 18, 2021 12:49

k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Nov 18, 2021

marquiz force-pushed the devel/custom-crd branch from 0df5b14 to 44af276 Compare November 18, 2021 15:22

marquiz mentioned this pull request Nov 22, 2021

Add variables to feature rule spec and support backrefs #663

Merged

k8s-ci-robot assigned ArangoGutierrez Nov 22, 2021

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 22, 2021

marquiz force-pushed the devel/custom-crd branch from 44af276 to b9a6f50 Compare November 23, 2021 07:17

k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 23, 2021

marquiz force-pushed the devel/custom-crd branch from b9a6f50 to 33fdf75 Compare November 23, 2021 07:19

k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 23, 2021

k8s-ci-robot assigned zvonkok Nov 23, 2021

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 23, 2021

k8s-ci-robot merged commit aa0074f into kubernetes-sigs:master Nov 23, 2021

marquiz deleted the devel/custom-crd branch November 23, 2021 12:33

marquiz mentioned this pull request Nov 25, 2021

Add support to detect whether the node has storage #674

Closed

marquiz mentioned this pull request Dec 22, 2021

Release v0.10.0 #678

Closed

22 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CRD-based custom node labeling #553

CRD-based custom node labeling #553

marquiz commented Jul 8, 2021 •

edited

Loading

k8s-ci-robot commented Jul 8, 2021

k8s-ci-robot commented Jul 8, 2021

kailun-qin commented Jul 9, 2021

k8s-ci-robot commented Jul 9, 2021

marquiz commented Jul 9, 2021

ArangoGutierrez commented Aug 10, 2021

marquiz commented Aug 11, 2021

ArangoGutierrez commented Nov 22, 2021

ArangoGutierrez commented Nov 22, 2021

marquiz commented Nov 23, 2021

marquiz commented Nov 23, 2021

zvonkok commented Nov 23, 2021

CRD-based custom node labeling #553

CRD-based custom node labeling #553

Conversation

marquiz commented Jul 8, 2021 • edited Loading

k8s-ci-robot commented Jul 8, 2021

k8s-ci-robot commented Jul 8, 2021

kailun-qin commented Jul 9, 2021

k8s-ci-robot commented Jul 9, 2021

marquiz commented Jul 9, 2021

Data format

MatchExpressions

Naming

ArangoGutierrez commented Aug 10, 2021

marquiz commented Aug 11, 2021

ArangoGutierrez commented Nov 22, 2021

ArangoGutierrez commented Nov 22, 2021

marquiz commented Nov 23, 2021

marquiz commented Nov 23, 2021

zvonkok commented Nov 23, 2021

marquiz commented Jul 8, 2021 •

edited

Loading