From c8175e5bde46747952f013b560ba9a2b3c3f5dac Mon Sep 17 00:00:00 2001 From: Swati Sehgal Date: Wed, 20 Jan 2021 21:49:41 +0000 Subject: [PATCH] Documentation capturing enablement of NFD-Topology-Updater in NFD Prior to this feature, NFD consisted of only two software components, namely nfd-master and nfd-worker. We have introduced another software component called nfd-topology-updater. NFD-Topology-Updater is a daemon responsible for examining allocated resources on a worker node to account for allocatable resources on a per-zone basis (where a zone can be a NUMA node). It then communicates the information to nfd-master which does the CR creation corresponding to all the nodes in the cluster. One instance of nfd-topology-updater is supposed to be running on each node of the cluster. Signed-off-by: Swati Sehgal --- docs/advanced/developer-guide.md | 96 ++++++++- .../topology-updater-commandline-reference.md | 197 ++++++++++++++++++ docs/get-started/deployment-and-usage.md | 55 ++++- docs/get-started/introduction.md | 62 +++++- docs/get-started/quick-start.md | 117 ++++++++++- 5 files changed, 517 insertions(+), 10 deletions(-) create mode 100644 docs/advanced/topology-updater-commandline-reference.md diff --git a/docs/advanced/developer-guide.md b/docs/advanced/developer-guide.md index 2b24f192c3..24d97c0f25 100644 --- a/docs/advanced/developer-guide.md +++ b/docs/advanced/developer-guide.md @@ -184,6 +184,8 @@ Usage of nfd-master: Comma separated list of labels to be exposed as extended resources. -verify-node-name Verify worker node name against the worker's TLS certificate. Only takes effect when TLS authentication has been enabled. + -nrt-namespace + Namespace in which Node Resource Topology CR are created. Ensure that the namespace specified already exists -version Print version and exit. ``` @@ -242,6 +244,96 @@ stand-alone directly with `docker run`. 
See the [default deployment](https://github.com/kubernetes-sigs/node-feature-discovery/blob/{{site.release}}/deployment/components/common/worker-mounts.yaml) for up-to-date information about the required volume mounts. +### NFD-Topology-Updater + +In order to run nfd-topology-updater as a "stand-alone" container against your +standalone nfd-master you need to run them in the same network namespace: + +```bash +$ docker run --rm --network=container:nfd-test ${NFD_CONTAINER_IMAGE} nfd-topology-updater +2019/02/01 14:48:56 Node Feature Discovery Topology Updater +... +``` + +If you just want to try out feature discovery without connecting to nfd-master, +pass the `-no-publish` flag to nfd-topology-updater. + +Command line flags of nfd-topology-updater: + +```bash +$ docker run --rm ${NFD_CONTAINER_IMAGE} nfd-topology-updater -help +docker run --rm quay.io/swsehgal/node-feature-discovery:v0.10.0-devel-64-g93a0a9f-dirty nfd-topology-updater -help +Usage of nfd-topology-updater: + -add_dir_header + If true, adds the file directory to the header of the log messages + -alsologtostderr + log to standard error as well as files + -ca-file string + Root certificate for verifying connections + -cert-file string + Certificate used for authenticating connections + -key-file string + Private key matching -cert-file + -kubeconfig string + Kube config file. + -kubelet-config-file string + Kubelet config file path. (default "/host-var/lib/kubelet/config.yaml") + -log_backtrace_at value + when logging hits line file:N, emit a stack trace + -log_dir string + If non-empty, write log files in this directory + -log_file string + If non-empty, use this log file + -log_file_max_size uint + Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited. (default 1800) + -logtostderr + log to standard error instead of files (default true) + -no-publish + Do not publish discovered features to the cluster-local Kubernetes API server. 
+ -one_output + If true, only write logs to their native severity level (vs also writing to each lower severity level) + -oneshot + Update once and exit + -podresources-socket string + Pod Resource Socket path to use. (default "/host-var/lib/kubelet/pod-resources/kubelet.sock") + -server string + NFD server address to connecto to. (default "localhost:8080") + -server-name-override string + Hostname expected from server certificate, useful in testing + -skip_headers + If true, avoid header prefixes in the log messages + -skip_log_headers + If true, avoid headers when opening log files + -sleep-interval duration + Time to sleep between CR updates. Non-positive value implies no CR updatation (i.e. infinite sleep). [Default: 60s] (default 1m0s) + -stderrthreshold value + logs at or above this threshold go to stderr (default 2) + -v value + number for the log level verbosity + -version + Print version and exit. + -vmodule value + comma-separated list of pattern=N settings for file-filtered logging + -watch-namespace string + Namespace to watch pods (for testing/debugging purpose). Use * for all namespaces. (default "*") +``` + +NOTE: + +* NFD topology updater needs certain directories and/or files from the +host mounted inside the NFD container. Thus, you need to provide Docker with the +correct `--volume` options in order for them to work correctly when run +stand-alone directly with `docker run`. See the +[template spec](https://github.com/kubernetes-sigs/node-feature-discovery/blob/{{site.release}}/deployment/components/topology-updater/topologyupdater-mounts.yaml) +for up-to-date information about the required volume mounts. + +* [PodResource API][podresource-api] is a prerequisite for nfd-topology-updater. +Preceding Kubernetes v1.23, the `kubelet` must be started with the following flag: + +`--feature-gates=KubeletPodResourcesGetAllocatable=true`. 
+Starting Kubernetes v1.23, the `GetAllocatableResources` is enabled by default +through `KubeletPodResourcesGetAllocatable` [feature gate][feature-gate]. + ## Documentation All documentation resides under the @@ -271,4 +363,6 @@ make site-build This will generate html documentation under `docs/_site/`. -[e2e-config-sample]: https://github.com/kubernetes-sigs/node-feature-discovery/blob/{{site.release}}/test/e2e/e2e-test-config.example.yaml +[e2e-config-sample]: https://github.com/kubernetes-sigs/node-feature-discovery/blob/{{site.release}}/test/e2e/e2e-test-config.example.yaml +[podresource-api]: https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#monitoring-device-plugin-resources +[feature-gate]: https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates diff --git a/docs/advanced/topology-updater-commandline-reference.md b/docs/advanced/topology-updater-commandline-reference.md new file mode 100644 index 0000000000..6f64776308 --- /dev/null +++ b/docs/advanced/topology-updater-commandline-reference.md @@ -0,0 +1,197 @@ +--- +title: "Topology Updater Cmdline Reference" +layout: default +sort: 5 +--- + +# NFD-Topology-Updater Commandline Flags + +{: .no_toc } + +## Table of Contents + +{: .no_toc .text-delta } + +1. TOC +{:toc} + +--- + +To quickly view available command line flags execute `nfd-topology-updater -help`. +In a docker container: + +```bash +docker run gcr.io/k8s-staging-nfd/node-feature-discovery:master nfd-topology-updater -help +``` + +### -h, -help + +Print usage and exit. + +### -version + +Print version and exit. + +### -server + +The `-server` flag specifies the address of the nfd-master endpoint to +connect to. 
+ +Default: localhost:8080 + +Example: + +```bash +nfd-topology-updater -server=nfd-master.nfd.svc.cluster.local:443 +``` + +### -ca-file + +The `-ca-file` is one of the three flags (together with `-cert-file` and +`-key-file`) controlling the mutual TLS authentication on the topology-updater side. +This flag specifies the TLS root certificate that is used for verifying the +authenticity of nfd-master. + +Default: *empty* + +Note: Must be specified together with `-cert-file` and `-key-file` + +Example: + +```bash +nfd-topology-updater -ca-file=/opt/nfd/ca.crt -cert-file=/opt/nfd/updater.crt -key-file=/opt/nfd/updater.key +``` + +### -cert-file + +The `-cert-file` is one of the three flags (together with `-ca-file` and +`-key-file`) controlling the mutual TLS authentication on the topology-updater +side. This flag specifies the TLS certificate presented for authenticating +outgoing requests. + +Default: *empty* + +Note: Must be specified together with `-ca-file` and `-key-file` + +Example: + +```bash +nfd-topology-updater -cert-file=/opt/nfd/updater.crt -key-file=/opt/nfd/updater.key -ca-file=/opt/nfd/ca.crt +``` + +### -key-file + +The `-key-file` is one of the three flags (together with `-ca-file` and +`-cert-file`) controlling the mutual TLS authentication on the topology-updater +side. This flag specifies the private key corresponding to the given certificate file +(`-cert-file`) that is used for authenticating outgoing requests. + +Default: *empty* + +Note: Must be specified together with `-cert-file` and `-ca-file` + +Example: + +```bash +nfd-topology-updater -key-file=/opt/nfd/updater.key -cert-file=/opt/nfd/updater.crt -ca-file=/opt/nfd/ca.crt +``` + +### -server-name-override + +The `-server-name-override` flag specifies the common name (CN) to +expect from the nfd-master TLS certificate. This flag is mostly intended for +development and debugging purposes. 
+ +Default: *empty* + +Example: + +```bash +nfd-topology-updater -server-name-override=localhost +``` + +### -no-publish + +The `-no-publish` flag disables all communication with the nfd-master, making +it a "dry-run" flag for nfd-topology-updater. NFD-Topology-Updater runs +resource hardware topology detection normally, but no CR requests are sent to +nfd-master. + +Default: *false* + +Example: + +```bash +nfd-topology-updater -no-publish +``` + +### -oneshot + +The `-oneshot` flag causes nfd-topology-updater to exit after one pass of +resource hardware topology detection. + +Default: *false* + +Example: + +```bash +nfd-topology-updater -oneshot -no-publish +``` + +### -sleep-interval + +The `-sleep-interval` specifies the interval between resource hardware +topology re-examination (and CR updates). A non-positive value implies +infinite sleep interval, i.e. no re-detection is done. + +Default: 60s + +Example: + +```bash +nfd-topology-updater -sleep-interval=1h +``` + +### -watch-namespace + +The `-watch-namespace` specifies the namespace to ensure that resource +hardware topology examination only happens for the pods running in the +specified namespace. Pods that are not running in the specified namespace +are not considered during resource accounting. This is particularly useful +for testing/debugging purpose. A "*" value would mean that all the pods would +be considered during the accounting process. + +Default: "*" + +Example: + +```bash +nfd-topology-updater -watch-namespace=rte +``` + +### -kubelet-config-file + +The `-kubelet-config-file` specifies the path to the Kubelet's configuration +file. + +Default: /host-var/lib/kubelet/config.yaml + +Example: + +```bash +nfd-topology-updater -kubelet-config-file=/var/lib/kubelet/config.yaml +``` + +### -podresources-socket + +The `-podresources-socket` specifies the path to the Unix socket where kubelet +exports a gRPC service to enable discovery of in-use CPUs and devices, and to +provide metadata for them. 
+ +Default: /host-var/lib/kubelet/pod-resources/kubelet.sock + +Example: + +```bash +nfd-topology-updater -podresources-socket=/var/lib/kubelet/pod-resources/kubelet.sock +``` diff --git a/docs/get-started/deployment-and-usage.md b/docs/get-started/deployment-and-usage.md index 5eaa61b2a5..8682af05db 100644 --- a/docs/get-started/deployment-and-usage.md +++ b/docs/get-started/deployment-and-usage.md @@ -96,7 +96,11 @@ kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deplo ``` This will required RBAC rules and deploy nfd-master (as a deployment) and -nfd-worker (as a daemonset) in the `node-feature-discovery` namespace. +nfd-worker (as daemonset) in the `node-feature-discovery` namespace. + +**NOTE:** nfd-topology-updater is not deployed as part of the `default` overlay. +Please refer to the [Master Worker Topologyupdater](#master-worker-topologyupdater) +and [Topologyupdater](#topologyupdater) below. Alternatively you can clone the repository and customize the deployment by creating your own overlays. 
For example, to deploy the [minimal](#minimal) @@ -115,6 +119,10 @@ scenarios under see [Master-worker pod](#master-worker-pod) below - [`default-job`](https://github.com/kubernetes-sigs/node-feature-discovery/blob/{{site.release}}/deployment/overlays/default-job): see [Worker one-shot](#worker-one-shot) below +- [`master-worker-topologyupdater`](https://github.com/kubernetes-sigs/node-feature-discovery/blob/{{site.release}}/deployment/overlays/master-worker-topologyupdater): + see [Master Worker Topologyupdater](#master-worker-topologyupdater) below +- [`topologyupdater`](https://github.com/kubernetes-sigs/node-feature-discovery/blob/{{site.release}}/deployment/overlays/topologyupdater): + see [Topologyupdater](#topologyupdater) below - [`prune`](https://github.com/kubernetes-sigs/node-feature-discovery/blob/{{site.release}}/deployment/overlays/prune): clean up the cluster after uninstallation, see [Removing feature labels](#removing-feature-labels) @@ -138,10 +146,14 @@ kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deplo ``` -This creates a DaemonSet runs both nfd-worker and nfd-master in the same Pod. +This creates a DaemonSet that runs nfd-worker and nfd-master in the same Pod. In this case no nfd-master is run on the master node(s), but, the worker nodes are able to label themselves which may be desirable e.g. in single-node setups. +**NOTE:** nfd-topology-updater is not deployed by the `default-combined` overlay. +To enable nfd-topology-updater in this scenario, users must customize the +deployment themselves. + #### Worker one-shot Feature discovery can alternatively be configured as a one-shot job. @@ -154,11 +166,33 @@ kubectl kustomize https://github.com/kubernetes-sigs/node-feature-discovery/depl kubectl apply -f - ``` -The example above launces as many jobs as there are non-master nodes. Note that +The example above launches as many jobs as there are non-master nodes. 
Note that this approach does not guarantee running once on every node. For example, tainted, non-ready nodes or some other reasons in Job scheduling may cause some node(s) will run extra job instance(s) to satisfy the request. +#### Master Worker Topologyupdater + +NFD Master, NFD worker and NFD Topologyupdater can be configured to be deployed +as separate pods. The `master-worker-topologyupdater` overlay may be used to +achieve this: + +```bash +kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/master-worker-topologyupdater?ref={{ site.release }} + +``` + +#### Topologyupdater + +NFD master and NFD Topologyupdater can be configured to be deployed +as separate pods. The `topologyupdater` overlay may be used to +achieve this: + +```bash +kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/topologyupdater?ref={{ site.release }} + +``` + ### Deployment with Helm Node Feature Discovery Helm chart allow to easily deploy and manage NFD. @@ -325,6 +359,21 @@ The worker configuration file is watched and re-read on every change which provides a simple mechanism of dynamic run-time reconfiguration. See [worker configuration](#worker-configuration) for more details. +### NFD-Topology-Updater + +NFD-Topology-Updater is preferably run as a Kubernetes DaemonSet. This assures +re-examination (and CR updates) on regular intervals capturing changes in +the allocated resources and hence the allocatable resources on a per zone +basis. It makes sure that more CR instances are created as new nodes get +added to the cluster. Topology-Updater connects to the nfd-master service +to create CR instances corresponding to nodes. + +When run as a daemonset, nodes are re-examined for the allocated resources +(to determine the information of the allocatable resources on a per zone basis +where a zone can be a NUMA node) at an interval specified using the +`-sleep-interval` option. 
The default sleep interval is 60s, which is +the value used when no `-sleep-interval` is specified. + ### Communication security with TLS NFD supports mutual TLS authentication between the nfd-master and nfd-worker diff --git a/docs/get-started/introduction.md b/docs/get-started/introduction.md index a3bde7ab6a..2850d70e4a 100644 --- a/docs/get-started/introduction.md +++ b/docs/get-started/introduction.md @@ -19,10 +19,11 @@ This software enables node feature discovery for Kubernetes. It detects hardware features available on each node in a Kubernetes cluster, and advertises those features using node labels. -NFD consists of two software components: +NFD consists of three software components: 1. nfd-master 1. nfd-worker +1. nfd-topology-updater ## NFD-Master @@ -36,7 +37,17 @@ NFD-Worker is a daemon responsible for feature detection. It then communicates the information to nfd-master which does the actual node labeling. One instance of nfd-worker is supposed to be running on each node of the cluster, -## Feature discovery +## NFD-Topology-Updater + +NFD-Topology-Updater is a daemon responsible for examining allocated +resources on a worker node to account for resources available to be allocated +to new pods on a per-zone basis (where a zone can be a NUMA node). It then +communicates the information to nfd-master which does the +[NodeResourceTopology CR](#noderesourcetopology-cr) creation corresponding +to all the nodes in the cluster. One instance of nfd-topology-updater is +supposed to be running on each node of the cluster. + +## Feature Discovery Feature discovery is divided into domain-specific feature sources: @@ -93,4 +104,49 @@ command line flag affects the annotation names Unapplicable annotations are not created, i.e. for example master.version is only created on nodes running nfd-master. 
- +## NodeResourceTopology CR + +When run with NFD-Topology-Updater, NFD creates CR instances corresponding to +node resource hardware topology such as: + + ```yaml +apiVersion: topology.node.k8s.io/v1alpha1 +kind: NodeResourceTopology +metadata: + name: node1 +topologyPolicies: ["SingleNUMANodeContainerLevel"] +zones: + - name: node-0 + type: Node + resources: + - name: cpu + capacity: 20 + allocatable: 16 + available: 10 + - name: vendor/nic1 + capacity: 3 + allocatable: 3 + available: 3 + - name: node-1 + type: Node + resources: + - name: cpu + capacity: 30 + allocatable: 30 + available: 15 + - name: vendor/nic2 + capacity: 6 + allocatable: 6 + available: 6 + - name: node-2 + type: Node + resources: + - name: cpu + capacity: 30 + allocatable: 30 + available: 15 + - name: vendor/nic1 + capacity: 3 + allocatable: 3 + available: 3 + ``` diff --git a/docs/get-started/quick-start.md b/docs/get-started/quick-start.md index 854bda43f4..b0e0340197 100644 --- a/docs/get-started/quick-start.md +++ b/docs/get-started/quick-start.md @@ -19,14 +19,16 @@ kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deplo ## Verify -Wait until NFD master and worker are running. +Wait until NFD master and NFD worker are running. ```bash $ kubectl -n node-feature-discovery get ds,deploy -NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE -daemonset.apps/nfd-worker 3 3 3 3 3 5s +NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE +daemonset.apps/nfd-worker 2 2 2 2 2 10s + NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/nfd-master 1/1 1 1 17s + ``` Check that NFD feature labels have been created @@ -71,3 +73,112 @@ $ kubectl get po feature-dependent-pod -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES feature-dependent-pod 1/1 Running 0 23s 10.36.0.4 node-2 ``` + +## Additional Optional Installation Steps + +In order to deploy nfd-master and nfd-topology-updater daemons +use the `topologyupdater` overlay. 
+ +Deploy with kustomize -- creates a new namespace, service and required RBAC +rules and nfd-master and nfd-topology-updater daemons. + +```bash +kubectl apply -k https://github.com/kubernetes-sigs/node-feature-discovery/deployment/overlays/topologyupdater?ref={{ site.release }} +``` + +**NOTE:** + +[PodResource API][podresource-api] is a prerequisite for nfd-topology-updater. + +Preceding Kubernetes v1.23, the `kubelet` must be started with the following flag: + +`--feature-gates=KubeletPodResourcesGetAllocatable=true` + +Starting Kubernetes v1.23, the `GetAllocatableResources` is enabled by default +through `KubeletPodResourcesGetAllocatable` [feature gate][feature-gate]. + +## Verify + +Wait until NFD master and NFD topologyupdater are running. + +```bash +$ kubectl -n node-feature-discovery get ds,deploy +NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE +daemonset.apps/nfd-topology-updater 2 2 2 2 2 5s + +NAME READY UP-TO-DATE AVAILABLE AGE +deployment.apps/nfd-master 1/1 1 1 17s + +``` + +Check that the NodeResourceTopology CR instances are created + +```bash +$ kubectl get noderesourcetopologies.topology.node.k8s.io +NAME AGE +kind-control-plane 23s +kind-worker 23s +``` + +## Show the CR instances + +```bash +$ kubectl describe noderesourcetopologies.topology.node.k8s.io kind-control-plane +Name: kind-control-plane +Namespace: default +Labels: +Annotations: +API Version: topology.node.k8s.io/v1alpha1 +Kind: NodeResourceTopology +... 
+Topology Policies: + SingleNUMANodeContainerLevel +Zones: + Name: node-0 + Costs: + node-0: 10 + node-1: 20 + Resources: + Name: Cpu + Allocatable: 3 + Capacity: 3 + Available: 3 + Name: vendor/nic1 + Allocatable: 2 + Capacity: 2 + Available: 2 + Name: vendor/nic2 + Allocatable: 2 + Capacity: 2 + Available: 2 + Type: Node + Name: node-1 + Costs: + node-0: 20 + node-1: 10 + Resources: + Name: Cpu + Allocatable: 4 + Capacity: 4 + Available: 4 + Name: vendor/nic1 + Allocatable: 2 + Capacity: 2 + Available: 2 + Name: vendor/nic2 + Allocatable: 2 + Capacity: 2 + Available: 2 + Type: Node +Events: +``` + +The CR instances created can be used to gain insight into the allocatable +resources along with the granularity of those resources at a per-zone level +(represented by node-0 and node-1 in the above example) or can be used by an +external entity (e.g. topology-aware scheduler plugin) to take an action based +on the gathered information. + + +[podresource-api]: https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#monitoring-device-plugin-resources +[feature-gate]: https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates
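
A consumer of these CR instances would typically compare a request against the per-zone `Available` values. The following is a minimal shell sketch of that comparison and is not part of NFD: `zones_with_available` is a hypothetical helper, and the whitespace-separated `zone resource available` input format is an assumption made for illustration (extracting the triples from the CR is left out). It prints every zone whose available amount of a resource is at least the requested amount.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of NFD): filter zones by available resources.
# Input (stdin): one "zone resource available" triple per line. This flat
# format is an assumption for illustration, not the layout of the CR itself.
# Usage: zones_with_available RESOURCE AMOUNT
zones_with_available() {
  local resource="$1" amount="$2"
  # Print the zone name when the resource matches and enough is available.
  awk -v res="$resource" -v want="$amount" \
    '$2 == res && ($3 + 0) >= (want + 0) { print $1 }'
}

# Sample values taken from the NodeResourceTopology example in introduction.md.
printf '%s\n' \
  'node-0 cpu 10' \
  'node-1 cpu 15' \
  'node-2 cpu 15' \
  'node-0 vendor/nic1 3' |
  zones_with_available cpu 12
```

With the sample values above, only the zones that have at least 12 CPUs available are printed. A real consumer, such as a topology-aware scheduler plugin, would read the CR through the Kubernetes API rather than from a flat text stream.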