PodResources interface enhancements #1884
Force-pushed: daab4b8 to 0433850
Are these changes applicable to both Windows and Linux?
These changes are tightly coupled with TopologyManager on the worker node, and for now only a fake TopologyManager is created on Windows, so they are not needed for Windows hosts yet.
}

// ContainerDevices contains information about the devices assigned to a container
message ContainerDevices {
    string resource_name = 1;
    repeated string device_ids = 2;
    uint32 numaid = 3;
btw, deviceplugin represents numaid as int64. It's not clear why it was done that way, since 64 bits is too much for a NUMA id and the sign isn't necessary either.
Interesting. In that case, uint32 is fine, as long as we handle the conversion correctly.
I took a look at the Linux kernel implementation. Internally it represents both the cpu id and the NUMA id as int (https://github.com/torvalds/linux/blob/6f0d349d922ba44e4348a17a78ea51b7135965b1/arch/x86/mm/numa.c#L62), but on most architectures the NUMA id has a bounded range (https://github.com/torvalds/linux/blob/8dcd175bc3d50b78413c56d5b17d4bddd77412ef/arch/x86/Kconfig#L1590 and https://github.com/torvalds/linux/blob/6f0d349d922ba44e4348a17a78ea51b7135965b1/include/linux/numa.h#L12), so on x86_64 it is currently limited to 2^10. But I'm already afraid, so let's use int64 as in deviceplugin )
Yeah, I know it isn't implemented now. But if possible, we should ensure the same concepts exist across OSs so someone could implement them and get similar functionality in the future if they wanted to. I googled a bit, and it seems like the concept of a NUMA node exists in Windows as well, so this addition should be fine. It might be good to confirm this with sig-windows. cc @PatrickLang
This looks mostly good, assuming the other KEP is accepted.
Thank you. The main problem now is where to land the daemon and the code for the CRD.
@@ -87,12 +94,14 @@ message PodResources {
message ContainerResources {
    string name = 1;
    repeated ContainerDevices devices = 2;
    repeated uint32 cpu_ids = 3;
So on a node with e.g. 96 cores (I have one now), we will see
[0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93]
for any best-effort pod. I think that list is too long (this is not about UX, since it isn't meant for humans), but we can represent it more compactly, the way the kernel does, e.g. 0-93.
I suggest the list format from http://man7.org/linux/man-pages/man7/cpuset.7.html#FORMATS
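To make the suggestion concrete, here is a minimal Python sketch of collapsing a list of CPU ids into the comma-and-range list notation described in cpuset(7). The function name `format_cpu_list` is hypothetical; it is not part of the KEP or of the kubelet and only illustrates the notation.

```python
def format_cpu_list(cpu_ids):
    """Collapse CPU ids into cpuset list notation, e.g. [0, 1, 2, 5] -> "0-2,5".

    Hypothetical helper for illustration only; not a kubelet API.
    """
    ids = sorted(set(cpu_ids))
    ranges = []
    i = 0
    while i < len(ids):
        start = end = ids[i]
        # Extend the current range while the next id is consecutive.
        while i + 1 < len(ids) and ids[i + 1] == end + 1:
            i += 1
            end = ids[i]
        ranges.append(str(start) if start == end else f"{start}-{end}")
        i += 1
    return ",".join(ranges)


# The 94 ids listed above collapse into a single range.
print(format_cpu_list(range(94)))
```

The compact form is shorter on the wire, but every consumer then has to parse it back into integers.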
Force-pushed: 0433850 to 3eca007
}
```
Cpu_ids is a list of cpu ids in [cpuset format](https://man7.org/linux/man-pages/man7/cpuset.7.html#FORMATS). Each cpu id is a thread id or a core id in cadvisor terms.
Using `cpuset` as a format (and especially in string form) is really not the best idea in terms of API engineering. It means that we lose validation at the gRPC/protobuf level, and in addition all clients that are supposed to consume that field will be required to implement a parser for the `cpuset` format (which is also non-trivial in terms of syntax and validation).
Having `repeated uint` allows us to validate the content properly and to handle empty/undefined states of this field cheaply.
OK, no problem. The cpuset format is mostly for human readability (it's not about the unix domain socket). I was just focusing on golang/python clients (where cpuset parsing is publicly available).
To the best of my knowledge, neither the Python nor the Go standard library has a `cpuset` string format parser available.
Passing a string puts additional technical debt on the clients, who must either implement a parser or pull in a non-standard library.
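For a sense of what that client-side burden looks like, below is a rough Python sketch of a parser for the list notation. It is an illustration under stated assumptions: `parse_cpu_list` is a hypothetical name, and the sketch handles only plain numbers and `first-last` ranges, not any other syntax a kernel interface might accept.

```python
def parse_cpu_list(text):
    """Parse a cpuset-style list such as "0-2,7,12-14" into sorted CPU ids.

    Hypothetical helper; accepts only non-negative numbers and
    "first-last" ranges, and raises ValueError on anything else.
    """
    text = text.strip()
    if not text:
        return []
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        first_s, sep, last_s = part.partition("-")
        # Every element must be "N" or "N-M" with decimal digits only.
        if not first_s.isdigit() or (sep and not last_s.isdigit()):
            raise ValueError(f"invalid cpu list element: {part!r}")
        first = int(first_s)
        last = int(last_s) if sep else first
        if last < first:
            raise ValueError(f"descending range: {part!r}")
        cpus.update(range(first, last + 1))
    return sorted(cpus)


print(parse_cpu_list("0-2,7,12-14"))
```

With `repeated` integers in the proto, none of this validation has to live in the client.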
Do we even need to include the cpusets of containers which are using the shared pool? Should we just have a global default cpu set, and omit all pods using shared CPUs? Or just omit containers using the shared pool altogether?
I agree that concrete lists of integers are better than cpuset notation.
}

// ContainerDevices contains information about the devices assigned to a container
message ContainerDevices {
    string resource_name = 1;
    repeated string device_ids = 2;
    int64 numaid = 3;
Instead of reinventing the wheel here, it would be more valuable to simply reuse the `TopologyInfo` type from the device plugin APIs, because that is what you can map to with 100% certainty: device plugins report the device_id/TopologyInfo mapping.
Also, TopologyInfo is device_id specific. If you have repeated device_ids, you should have repeated numaid. Or better, represent device allocation in a manner that reuses the device plugin APIs and structures.
I just tried to keep backward compatibility. An additional field at the end of the structure doesn't break backward compatibility.
When one resource is assigned to different NUMA nodes (e.g. it has 2 devices), this interface shows 2 instances of ContainerDevices, so we know the NUMA node of each device (but for scheduling that's not necessary; it's enough to know about the resource).
If I understand you correctly, if we have TopologyInfo (which is repeated NUMANode nodes) instead of numaid, we still know the NUMA ids of the resource, but we don't know the NUMA id of each device.
I'm OK with both the current version and your suggestion, but I'd like to know @dashpole's opinion.
I'm talking about a different problem here. The resource name is 'vendor.example.com/type'. It has instances 'id0', 'id1', ... 'id47'. IDs from 0 to 17 are from PCI bus 0, IDs from 18 to 35 are from PCI bus 1, and IDs 36-47 are from PCI bus 3. The 'NUMA ID' is not about the whole resource; it is a property of each instance of a particular resource type announced by device plugins.
It's probably best to keep each API self-contained. We may want to introduce fields in the device plugin API that we don't want here, or vice-versa. I don't have any objections to modeling this API after the device plugin API, but we shouldn't use the types directly.
> I'm talking about a different problem here. The resource name is 'vendor.example.com/type'. It has instances 'id0', 'id1', ... 'id47'. IDs from 0 to 17 are from PCI bus 0, IDs from 18 to 35 are from PCI bus 1, and IDs 36-47 are from PCI bus 3. The 'NUMA ID' is not about the whole resource; it is a property of each instance of a particular resource type announced by device plugins.

In this case we will have 2 instances of ContainerDevices. The 2 instances will have the same resource name, different lists of device ids and different numaid values.
Each device instance can have a repeated `numa_id` field. Please re-check the definitions of the structures that device plugins use to announce device instances to the kubelet.
Yes, I know. And this interface change takes it into account: if we have only dev1 and it is attached to NUMA ids 1 and 2, then we will have
ContainerDevices [ {resource_name: "res1", ["dev1"], 1}, {resource_name: "res1", ["dev2"], 2}]
Your example shows "dev1" and "dev2". I'm talking about "dev1" with 1,2 NUMA IDs from TopologyInfo.
Oh, yes, thank you. How about this:
ContainerDevices [ {resource_name: "res1", ["dev1"], 1}, {resource_name: "res1", ["dev1"], 2}]
I'll add such a test case.
@AlexeyPerevalov @kad @fromani Based on the API, a representation of this could be: ContainerDevices [ {resource_name: "res1", device_ids: ["dev1", "dev2"], topology: 0}, {resource_name: "res1", device_ids: ["dev3", "dev4"], topology: 1}, {resource_name: "res2", device_ids: ["dev1"], topology: 0}, {resource_name: "res2", device_ids: ["dev2", "dev3"], topology: 1}].
So essentially we are saying that ContainerDevices would be grouped by resource name and topology info. We need to document this assumption very carefully and provide examples of the intended use of the API, as it is not very intuitive, especially given that this information is obtained from the device plugin API, where a Device data structure looks like the one below, storing the device id, health and topology info:
message Device {
    // A unique ID assigned by the device plugin used
    // to identify devices during the communication
    // Max length of this field is 63 characters
    string ID = 1;
    // Health of the device, can be healthy or unhealthy, see constants.go
    string health = 2;
    // Topology for device
    TopologyInfo topology = 3;
}
The information about all the devices known to the device manager is stored as below:
allDevices map[string]map[string]pluginapi.Device // map[resourceName][ID]pluginapi.Device
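The grouping assumption described above can be sketched in a few lines of Python. This is only an illustration: `group_container_devices` and the tuple layout of its input are hypothetical, the dictionary keys simply mirror the field names used in the example, and devices that report no NUMA node are skipped here, which is an assumption rather than documented behavior.

```python
from collections import defaultdict


def group_container_devices(assigned):
    """Group assigned devices by (resource name, NUMA node).

    `assigned` is a list of (resource_name, device_id, numa_nodes) tuples,
    a hypothetical flattening of the per-device topology info.
    A device attached to several NUMA nodes appears once per node.
    """
    groups = defaultdict(list)
    for resource_name, device_id, numa_nodes in assigned:
        for node in numa_nodes:
            groups[(resource_name, node)].append(device_id)
    return [
        {"resource_name": name, "device_ids": sorted(ids), "topology": node}
        for (name, node), ids in sorted(groups.items())
    ]


assigned = [
    ("res1", "dev1", [0]), ("res1", "dev2", [0]),
    ("res1", "dev3", [1]), ("res1", "dev4", [1]),
    ("res2", "dev1", [0]), ("res2", "dev2", [1]), ("res2", "dev3", [1]),
]
for entry in group_container_devices(assigned):
    print(entry)
```

With this input the sketch yields the four groups from the example: res1 on nodes 0 and 1, and res2 on nodes 0 and 1.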
After this interface has been introduced it was used by CNI plugins like [kuryr-kubernetes](https://review.opendev.org/#/c/651580/) in couple with [intel-sriov-device-plugin](https://github.com/intel/sriov-network-device-plugin) to correctly define which devices were assigned to the pod.

### Topology aware scheduling
This interface can be used to collect allocated resources with information about the NUMA topology of the worker node. This information can then be used in NUMA aware scheduling.
How can information about already allocated resources be used for future scheduling?
Device IDs are dynamic objects; device plugins can add/remove device instances on the fly.
Previous information about the same device ID can't be reused for future decisions with 100% certainty.
We are mostly interested in the resource and its numaid. The device id itself is not so interesting.
> We are mostly interested in the resource and its numaid. The device id itself is not so interesting.

What do you mean by that statement? numa_id can't be assigned to a container. It is a property of another resource instance.
We use the quantity of resources per NUMA node; that quantity is calculated from the number of elements in the device list. We don't use the device id here.

> numa_id can't be assigned to a container. It is a property of another resource instance.

Frankly, I didn't find the word "container" in the thread above, except in your comment.
Yes, it can't be, since the Linux kernel has no container entity; it's not clear why a container is mentioned here.
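As an illustration of the counting described in this reply, the following Python sketch derives per-NUMA-node quantities from a list of ContainerDevices-like entries. The function name `resources_per_numa` and the dictionary shape are hypothetical; the keys follow the proto field names under review (`resource_name`, `device_ids`, `numaid`).

```python
from collections import Counter


def resources_per_numa(container_devices):
    """Count allocated devices per (NUMA node, resource name).

    Only the length of each device list is used; the device ids
    themselves are ignored, as described in the comment above.
    """
    usage = Counter()
    for entry in container_devices:
        key = (entry["numaid"], entry["resource_name"])
        usage[key] += len(entry["device_ids"])
    return dict(usage)


devices = [
    {"resource_name": "res1", "device_ids": ["dev1", "dev2"], "numaid": 0},
    {"resource_name": "res1", "device_ids": ["dev3"], "numaid": 1},
    {"resource_name": "res2", "device_ids": ["dev1"], "numaid": 0},
]
print(resources_per_numa(devices))
```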
The word "container" comes from the changes below in `ContainerDevices`. A `Device` instance can be uniquely mapped to what is announced by device plugins. Device plugins announce device instances with a `TopologyInfo` field. This field contains a repeated `numa_id` per device instance.
Force-pushed: 3eca007 to c7a613e
status: implementable
---
# Kubelet endpoint for device assignment observation details
In this KEP we are enabling support for obtaining information about kubelet's assignment of cpus to containers in addition to devices. I think we should update the title of this KEP to reflect that too. From the current title it seems to be only catering to devices.
I think, yes ) I'll update.
Kubelet exposes an endpoint at `/var/lib/kubelet/pod-resources/kubelet.sock` that provides information about the assignment of devices to containers. It obtains this information from the internal state of the kubelet's Device Manager and returns a single PodResourcesResponse, enabling monitoring applications to poll for resources allocated to pods and containers on the node. This makes PodResource API a reasonable way of obtaining allocated resource information.

However, PodResource API:https://godoc.org/k8s.io/kubernetes/pkg/kubelet/apis/podresources/v1alpha1 currently only exposes devices as the container resources (without topology info) and hence we are proposing KEP:kubernetes/enhancements#1884 to enhance it to expose CPU information along with device topology info.

In order to use pod-resource-api source in Resource Topology Exporter, we need to use this:https://github.com/kubernetes/kubernetes/pull/93243/files patched version of kubelet implementing changes proposed in the aforementioned KEP. This will no longer be needed once the KEP and the PR are merged.

- Added command line argument to specify source, enabling user to specify either cri or pod-resource-api
- Created PodResourceFinder struct and supporting methods to enable support for pod-resource-api as a way of gathering information of allocated resources
- Created NodeResources struct to be used for storing node resource information
- Moved functions (updateNUMAMap(), Aggregate() and makePCI2ResourceMap()) used by both crifinder.go and podresourcefinder.go to finder.go
- Narrowed down volume mounts to the required subtree
- Updated README

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>

## Changes

Add a v1alpha1 Kubelet GRPC service, at `/var/lib/kubelet/pod-resources/kubelet.sock`, which returns information about the kubelet's assignment of devices to containers. It obtains this information from the internal state of the kubelet's Device Manager. The GRPC Service returns a single PodResourcesResponse, which is shown in proto below:
Add a v1alpha1 Kubelet GRPC service, at `/var/lib/kubelet/pod-resources/kubelet.sock`, which returns information about the kubelet's assignment of devices and cpus to containers with NUMA id. It obtains this information from the internal state of the kubelet's Device Manager and CPU Manager respectively. The GRPC Service returns a single PodResourcesResponse, which is shown in proto below:
What does "containers with NUMA id" mean? A NUMA ID, in Linux kernel terms or in hardware terms, is not something that can be linked to a container directly.
"NUMA id" referred to the cpus and devices in that sentence. OK, I'll rephrase it.
This commit updates description according to kubernetes/enhancements#1884 Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
In order to simplify the KEP and make it more understandable, and to comply with the new process, we extract the unit of work still ongoing in this KEP from kubernetes#1884.
Work in this area was done during the 1.20 and 1.21 cycles in kubernetes/kubernetes#95734.
Rationale, discussion and documentation for all the changes, including the one proposed in this KEP, have been described in https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2043-pod-resource-concrete-assigments and reported here where relevant.
Signed-off-by: Francesco Romani <fromani@redhat.com>
This commit updates description according to kubernetes/enhancements#1884 Update content/en/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins.md Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com> Co-authored-by: Tim Bannister <tim@scalefactory.com>
* Actuallize podresources description
  This commit updates description according to kubernetes/enhancements#1884
  Update content/en/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins.md
  Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
  Co-authored-by: Tim Bannister <tim@scalefactory.com>
* podresources: document the new feature gate
  Signed-off-by: Francesco Romani <fromani@redhat.com>
* device plugins: add clarifications after review
  - fix the AllocatableResourcesResponse comment
  - describe the NUMA ID and explain the meaning of the field.
  Signed-off-by: Francesco Romani <fromani@redhat.com>
  Co-authored-by: Alexey Perevalov <alexey.perevalov@huawei.com>
  Co-authored-by: Tim Bannister <tim@scalefactory.com>
This change is necessary for the daemon that exports resources with topology,
which is used in topology-aware scheduling.
This PR proposes the following changes:
Information about CPUs is kept in cpu_ids, since that is enough to
represent both the quantity and the NUMA id. The NUMA id can be obtained
from cadvisor MachineInfo, since each id in cpu_ids is a thread_id.
This API doesn't provide CPU fractions, since they can be obtained from
the Pod's requests/limits; in the case of a non-integer CPU quantity and
non-guaranteed QoS, the assigned CPUs are not exclusive and the NUMA id
is not interesting.
Issue: #2043
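
The lookup described above, from the thread ids in cpu_ids to NUMA nodes, might be sketched as follows in Python. Everything here is an assumption for illustration: `cpus_per_numa` is a hypothetical name, and `node_threads` is a simplified stand-in (NUMA node id mapped to its thread ids) for whatever topology data a client extracts from cadvisor MachineInfo; it is not the actual cadvisor structure.

```python
def cpus_per_numa(cpu_ids, node_threads):
    """Count exclusively assigned CPUs per NUMA node.

    cpu_ids: thread ids reported for a container.
    node_threads: hypothetical mapping {numa_node_id: [thread ids]}.
    Raises KeyError if a cpu id is missing from the topology.
    """
    # Invert the topology so each thread id points at its NUMA node.
    thread_to_node = {
        thread: node
        for node, threads in node_threads.items()
        for thread in threads
    }
    counts = {}
    for cpu in cpu_ids:
        node = thread_to_node[cpu]
        counts[node] = counts.get(node, 0) + 1
    return counts


topology = {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}
print(cpus_per_numa([1, 2, 5], topology))
```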