This repository has been archived by the owner on Jun 6, 2024. It is now read-only.

Support nvidia-container-runtime for gpu isolation #2352

Merged
abuccts merged 4 commits into master from xiongyf/nvidia-runtime
Mar 20, 2019


abuccts (Member) commented Mar 19, 2019

Support nvidia-container-runtime for gpu isolation:

  1. Support nvidia-container-runtime for gpu isolation
    Set NVIDIA_VISIBLE_DEVICES to void so that nvidia-container-runtime behaves the same as runc and the two do not conflict.
    Reference: https://github.com/NVIDIA/nvidia-container-runtime#nvidia_visible_devices
    Works both with and without --runtime=nvidia.

  2. Set default runtime to runc
    Set --runtime to runc explicitly to override the default runtime.

  3. Drop MKNOD capability
    Drop the MKNOD capability when starting the container.
    Otherwise nvidia-smi makes all GPU devices show up under /dev in the container; dropping the MKNOD capability prevents this.
    Reference: "Chaotic device name show in container's /dev/ path and with GPU isolation" (NVIDIA/nvidia-docker#170)
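
The three points above can be sketched as a small helper that assembles the corresponding `docker run` arguments. This is only an illustration, not the code changed in this pull request: the function name `build_docker_run_args`, the image name, and the idea of passing assigned GPUs through `--device` are assumptions made for the example. The flags themselves (`--runtime`, `--env`, `--cap-drop`, `--device`) are standard `docker run` options.

```python
import shlex
from typing import List, Sequence


def build_docker_run_args(image: str, gpu_device_paths: Sequence[str]) -> List[str]:
    """Assemble a `docker run` command line following the three points above.

    `image` and `gpu_device_paths` are hypothetical inputs for illustration.
    """
    args = [
        "docker", "run",
        # Point 2: name runc explicitly so a different daemon-wide
        # default runtime is overridden for this container.
        "--runtime=runc",
        # Point 1: with "void", nvidia-container-runtime behaves like runc.
        "--env", "NVIDIA_VISIBLE_DEVICES=void",
        # Point 3: without MKNOD, processes in the container cannot
        # create additional device nodes under /dev.
        "--cap-drop=MKNOD",
    ]
    # Assumption: only the devices assigned to the job are exposed.
    for path in gpu_device_paths:
        args.append(f"--device={path}")
    args.append(image)
    return args


if __name__ == "__main__":
    cmd = build_docker_run_args("example/image:latest", ["/dev/nvidia0"])
    print(shlex.join(cmd))
```

A real GPU job would likely need further device nodes and driver mounts beyond the single `/dev/nvidia0` shown here; the sketch only shows where the three settings from this pull request would appear on the command line.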

abuccts added 3 commits March 19, 2019 15:20
Support nvidia-container-runtime for gpu isolation.
Set NVIDIA_VISIBLE_DEVICES to void to avoid conflict with runc.
Reference: https://github.com/NVIDIA/nvidia-container-runtime#nvidia_visible_devices

Works both with and without --runtime=nvidia.

Closes #1667.

Set runtime to runc explicitly to override the default runtime.

Drop the `MKNOD` capability when starting the container.
`nvidia-smi` will otherwise make all GPU devices show up under /dev;
dropping the [mknod](https://linux.die.net/man/2/mknod) capability avoids this.

Reference: NVIDIA/nvidia-docker#170
abuccts requested review from xudifsd and fanyangCS March 19, 2019 07:37
coveralls commented Mar 19, 2019

Coverage increased (+0.01%) to 52.659% when pulling 40672ac on xiongyf/nvidia-runtime into 3adbcdd on master.

xudifsd (Member) commented Mar 19, 2019

fixed #1667

scarlett2018 added this to the 0.11.0 milestone Mar 20, 2019
scarlett2018 mentioned this pull request Mar 20, 2019
abuccts merged commit 0f80208 into master Mar 20, 2019
abuccts deleted the xiongyf/nvidia-runtime branch March 20, 2019 14:12