Support for Hardware Accelerators #192

vishh · 2017-02-28T22:05:04Z

Description

Kubernetes is becoming popular for managing workloads that consume accelerators, TensorFlow for example. The agility that Kubernetes offers makes it easy to consume accelerators across a fleet of machines.
Kubernetes can provide an end-to-end workflow by separating provisioning and configuration of accelerators from consumption.

Progress Tracker

FEATURE_STATUS is used for feature tracking and to be updated by @kubernetes/feature-reviewers.
FEATURE_STATUS: IN_DEVELOPMENT

cc @kubernetes/sig-node-feature-requests @kubernetes/sig-scheduling-feature-requests


vishh · 2017-02-28T22:06:24Z

cc @aronchick for priority

jeremyeder · 2017-03-01T13:22:50Z

s/accelerators/device assignment please? /cc @derekwaynecarr

k82cn · 2017-03-01T14:16:09Z

Regarding accelerators, does it mean some kind of device, e.g. GPU (but not limited to GPU)?

cmluciano · 2017-03-01T15:38:36Z

/subscribe

jeremyeder · 2017-03-01T16:19:26Z

@k82cn yes. Actually per sig meeting yesterday, any PCI device (most tend to be accelerators but I'd personally prefer more generic wording). Note that Intel has "accelerators" inside their CPUs (called CPU extensions). All of these things should become candidates for scheduler match making.

cmluciano · 2017-03-01T18:09:55Z

related kubernetes/community#414

vishh · 2017-03-01T18:35:14Z

@jeremyeder

My understanding is that:

1. There needs to be a way to discover, represent and consume Accelerators as a resource in Kubernetes.
2. As an optimization, node hardware topology needs to be taken into account while provisioning accelerators.

1 does not depend on 2, and 2 can be solved independently of 1.
This feature is meant to focus on 1.
It can benefit from 2 if it is made available in parallel.
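
As a rough illustration of the first point (discover, represent and consume accelerators as a resource), the sketch below models an accelerator as an opaque, countable resource that a node advertises and a pod requests. This is not the Kubernetes API: the resource name `example.com/accelerator` and the `Node` and `pick_node` helpers are hypothetical, chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node that advertises countable resources by name (hypothetical model)."""
    name: str
    capacity: dict = field(default_factory=dict)   # discovered devices per resource name
    allocated: dict = field(default_factory=dict)  # devices already handed to pods

    def free(self, resource: str) -> int:
        """Number of devices of this resource that are still unallocated."""
        return self.capacity.get(resource, 0) - self.allocated.get(resource, 0)

    def fits(self, requests: dict) -> bool:
        """True if every requested resource count is still available."""
        return all(self.free(r) >= n for r, n in requests.items())

    def allocate(self, requests: dict) -> None:
        """Consume the requested devices, or raise if they do not fit."""
        if not self.fits(requests):
            raise ValueError(f"{self.name} cannot satisfy {requests}")
        for r, n in requests.items():
            self.allocated[r] = self.allocated.get(r, 0) + n


def pick_node(nodes, requests):
    """Return the first node that can satisfy the request, or None."""
    return next((n for n in nodes if n.fits(requests)), None)
```

Topology (point 2) is deliberately absent here: the model only counts devices, which is why the two points can be worked on independently.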

ravisantoshgudimetla · 2017-03-01T22:26:00Z

Is the scope limited to accelerators or some co-processors like TPM etc?

> My understanding is that,
>
> 1. There needs to be a way to discover, represent and consume Accelerators as a resource in Kubernetes

If hardware discovery is a functionality that we are targeting, shouldn't the scope be broadened to all types of devices (including accelerators)?

vishh · 2017-03-01T22:30:25Z

This issue is not meant to support arbitrary third-party devices, which I believe warrant an issue of their own. Node Feature Discovery attempts to solve the device discovery problem to an extent.


philips · 2017-03-02T00:57:15Z

Can we use the term "hardware accelerators"? I was really confused by this issue at first.

liyubobj · 2017-03-03T08:33:06Z

Good proposal! I think topology support for devices is a must. For example, NVIDIA GPUs on different PCI bridges cannot talk P2P.
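
To make the topology concern concrete, here is a minimal sketch that picks a set of devices sitting behind the same PCI bridge. It assumes, as a simplification, that sharing a bridge is the only condition for peer-to-peer communication; real hardware has more cases. The function name `pick_p2p_group` and the device and bridge IDs are hypothetical.

```python
from collections import defaultdict


def pick_p2p_group(devices, count):
    """Pick `count` device IDs that share one PCI bridge, or return None.

    `devices` maps a device ID to the ID of the bridge it sits behind.
    """
    # Group device IDs by the bridge they are attached to.
    by_bridge = defaultdict(list)
    for dev, bridge in devices.items():
        by_bridge[bridge].append(dev)

    # Walk bridges in sorted order so the result is deterministic.
    for bridge in sorted(by_bridge):
        members = sorted(by_bridge[bridge])
        if len(members) >= count:
            return members[:count]
    return None
```

A purely count-based allocator could hand out two devices on different bridges, which is the failure mode this comment describes.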

idvoretskyi · 2017-05-09T15:35:42Z

ping @calebamiles to review

vishh · 2017-09-12T18:30:32Z

One of the critical pieces of this problem, hardware device plugins (#368), landed in v1.8.
This feature is broad and requires more work around identifying and defining the matrix of devices, device plugins and workload compatibility. This aspect is expected to be handled outside of core kubernetes, but the specifics are not yet defined. For that reason, I'm leaving this issue open, and moving it to v1.9.

idvoretskyi · 2017-11-13T14:35:05Z

@vishh is it still alpha for 1.9?

Also, can you update the feature template to follow the new format? https://github.com/kubernetes/features/blob/master/ISSUE_TEMPLATE.md

rohitagarwal003 · 2017-11-13T20:11:06Z

It is still alpha for 1.9.

zacharysarah · 2017-11-22T21:26:20Z

@vishh 👋 Please indicate in the 1.9 feature tracking board whether this feature needs documentation. If yes, please open a PR and add a link to the tracking spreadsheet. Thanks in advance!

zacharysarah · 2017-11-29T00:02:22Z

@vishh Bump for docs ☝️

/cc @idvoretskyi

Automatic merge from submit-queue (batch tested with PRs 56681, 57384). If you want to cherry-pick this change to another branch, please follow the instructions at https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

Deprecate the alpha Accelerators feature gate. Encourage people to use DevicePlugins instead.

/kind cleanup

Related to kubernetes/enhancements#192 and kubernetes/enhancements#368

**Release note**:
```release-note
The alpha Accelerators feature gate is deprecated and will be removed in v1.11. Please use device plugins instead. They can be enabled using the DevicePlugins feature gate.
```

/sig node
/sig scheduling
/area hw-accelerators

fejta-bot · 2018-02-27T00:05:58Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot · 2018-03-29T00:52:55Z

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

justaugustus · 2018-04-17T03:18:13Z

@vishh
Any plans for this in 1.11?

If so, can you please ensure the feature is up-to-date with the appropriate:

Description
Milestone
Assignee(s)
Labels:
- stage/{alpha,beta,stable}
- sig/*
- kind/feature

cc @idvoretskyi

fejta-bot · 2018-05-17T03:39:13Z

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close


vishh changed the title ~~Support for Accelerators~~ Support for Hardware Accelerators Mar 2, 2017

idvoretskyi added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Mar 2, 2017

idvoretskyi added this to the next-milestone milestone Mar 2, 2017

cmluciano mentioned this issue Mar 16, 2017

Convert nvidia-gpu manager to CRI kubernetes/kubernetes#43240

Closed

ConnorDoyle mentioned this issue Apr 5, 2017

Extended pod-level resource isolation #246

Closed

23 tasks

ConnorDoyle mentioned this issue Apr 24, 2017

[WIP] Pod and container-level resource isolator interface (plus isolator library, examples, tests) kubernetes/kubernetes#44870

Closed

calebamiles modified the milestones: 1.8, next-milestone Jul 31, 2017

calebamiles added the stage/alpha Denotes an issue tracking an enhancement targeted for Alpha status label Aug 3, 2017

vishh modified the milestones: 1.9, 1.8 Sep 12, 2017

vishh mentioned this issue Nov 3, 2017

Built in support for dedicated nodes with extended compute resources (auto toleration) kubernetes/kubernetes#55080

Closed

idvoretskyi assigned vishh Nov 13, 2017

zacharysarah added the do-not-merge/docs label Nov 29, 2017

rohitagarwal003 mentioned this issue Dec 19, 2017

Deprecate the alpha Accelerators feature gate. kubernetes/kubernetes#57384

Merged

k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 27, 2018

k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 29, 2018

k8s-ci-robot closed this as completed May 17, 2018

justaugustus pushed a commit to justaugustus/enhancements that referenced this issue Sep 3, 2018

Merge pull request kubernetes#192 from xiangpengzhao/fix-node-allocat…

aaa4d5e

…able-md Fix incorrect link

ingvagabund pushed a commit to ingvagabund/enhancements that referenced this issue Apr 2, 2020

Merge pull request kubernetes#192 from dulek/update-ssc

01660fc

Update Kuryr information on SSC doc
