Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support for runtime option in service definitions #5405

Merged
merged 1 commit into from
Jan 4, 2018

Conversation

cuckoohello
Copy link

Works with nvidia-docker2

For docker compose

version: '3'
services:
  nvidia:
    image: nvidia/cuda
    runtime: nvidia
    command: nvidia-smi

docker-compose up

$ docker-compose up
Creating network "nvidia_default" with the default driver
Creating nvidia_nvidia_1 ...
Creating nvidia_nvidia_1 ... done
Attaching to nvidia_nvidia_1
nvidia_1  | Mon Nov 27 02:22:54 2017
nvidia_1  | +-----------------------------------------------------------------------------+
nvidia_1  | | NVIDIA-SMI 384.90                 Driver Version: 384.90                    |
nvidia_1  | |-------------------------------+----------------------+----------------------+
nvidia_1  | | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
nvidia_1  | | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
nvidia_1  | |===============================+======================+======================|
nvidia_1  | |   0  GeForce GTX 1080    Off  | 00000000:01:00.0 Off |                  N/A |
nvidia_1  | | 28%   42C    P0    39W / 180W |      0MiB /  8112MiB |      0%      Default |
nvidia_1  | +-------------------------------+----------------------+----------------------+
nvidia_1  | |   1  GeForce GTX 1080    Off  | 00000000:03:00.0 Off |                  N/A |
nvidia_1  | |  0%   41C    P0    36W / 180W |      0MiB /  8114MiB |      0%      Default |
nvidia_1  | +-------------------------------+----------------------+----------------------+
nvidia_1  |
nvidia_1  | +-----------------------------------------------------------------------------+
nvidia_1  | | Processes:                                                       GPU Memory |
nvidia_1  | |  GPU       PID   Type   Process name                             Usage      |
nvidia_1  | |=============================================================================|
nvidia_1  | |  No running processes found                                                 |
nvidia_1  | +-----------------------------------------------------------------------------+
nvidia_nvidia_1 exited with code 0

close #5360

Signed-off-by: Yafeng Shan cuckoo@kokonur.me

Copy link

@shin- shin- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution!

However, there are 2 major problems with this:

  1. The config schema for v3 can not be changed. Additions to the schema may be suggested on the docker/cli repo and then ported back to Compose. For this PR, support for the option should be added in the 2.3 schema instead
  2. There's no tests.

@cuckoohello
Copy link
Author

@shin- Thank you for your kind reply.

The config schema for v3 can not be changed. Additions to the schema may be suggested on the docker/cli repo and then ported back to Compose. For this PR, support for the option should be added in the 2.3 schema instead

I have recommitted the code which only adding runtime option in the 2.3 schema, and will have another pull request to docker/cli repo for v3 schema.

There's no tests.

I have tried to add some tests before, but docker only provides docker-runc runtime by default, and nvidia runtime needs a machine with a nvidia gpu and nivida-docker installed. So if I just add some tests for nvidia runtime, it won't pass on a general computer.

Could you give me some suggestions, thank you.

@shin-
Copy link

shin- commented Nov 30, 2017

Cool, thank you for applying the schema changes!

For the tests, I think there are 3 things we would like to do:

  1. Add unit tests that make sure the option is picked up in the config (look at these tests for example)
  2. Create an integration test that explicitly specifies runc as the runtime and make sure it works as expected
  3. Create a test decorator that checks whether a particular runtime is available and skips the test if it isn't. You can model that one after no_cluster. Then you can write a custom runtime test with a @if_runtime_available('nvidia') decorator that will only run if nvidia is available.

Let me know if any of that is unclear and I'll do my best to assist.

@cuckoohello
Copy link
Author

Thank you. Let me have a try first.

@cuckoohello cuckoohello force-pushed the 5360-add-runtime-support branch 6 times, most recently from 02cfa11 to f700b73 Compare December 2, 2017 04:20
@cuckoohello
Copy link
Author

@shin- I have created both unit and integration tests for runtime option, and pushed the code with a force push.

But jenkins always report error due to not from a trusted source.

Please have a look, thank you.

  Getting remote pull request #5405...

    Checking pull request #5405
    (not from a trusted source)
      ‘Jenkinsfile’ found
    Met criteria
Changes detected: PR-5405 (02cfa11ddf88decdfdf4c4b70d719618108e2153 → f700b73964338fb134165e4242859b06cafb2e1e)
Scheduled build for branch: PR-5405

  1 pull requests were processed (query completed)

  1 pull requests were processed

Finished examining docker/compose

@anibali
Copy link

anibali commented Dec 3, 2017

Sorry to randomly interject, but I've been following this PR in eager anticipation of replacing nvidia-docker-compose for good.

@cuckoohello If you look at the build output at https://jenkins.dockerproject.org/job/docker/job/compose/job/PR-5405/9/console you will see that there are in fact errors/warnings that you can address. Namely:

04:21:10 [all_py27] tests/integration/testcases.py:190:1: E302 expected 2 blank lines, found 1

04:21:10 [all_py27] tests/integration/testcases.py:200:106: E501 line too long (127 > 105 characters)

04:21:10 [all_py27] tests/integration/project_test.py:979:21: F821 undefined name 'V2_3'

04:21:10 [all_py27] tests/integration/project_test.py:999:21: F821 undefined name 'V2_3'

04:21:10 [all_py27] tests/integration/project_test.py:1015:6: F821 undefined name 'if_runtime_available'

04:21:10 [all_py27] tests/integration/project_test.py:1019:21: F821 undefined name 'V2_3'

Thanks for working on this enhancement, it will certainly be very useful for CUDA users in the future!

@cuckoohello cuckoohello force-pushed the 5360-add-runtime-support branch 7 times, most recently from 1f83cf4 to b9beda1 Compare December 4, 2017 04:06
close docker#5360

Signed-off-by: Yafeng Shan <cuckoo@kokonur.me>
@cuckoohello
Copy link
Author

@anibali Thank you. I use gitlab-ci before, and don't use jenkins, not find the console.

@shin- Finally, both the test and integration tests work finnally. And provide if_runtime_available test scripts. Please have a look, thank you.

@andyneff
Copy link

I have recommitted the code which only adding runtime option in the 2.3 schema, and will have another pull request to docker/cli repo for v3 schema.

@cuckoohello, have you made any progress on this?

@cuckoohello
Copy link
Author

@andyneff I have tried to add support for docker/cli, but find there're much more work need to be done. At least there projects need to be adapted.

I will try to manage this next week, and give you feedback

@shin-
Copy link

shin- commented Jan 4, 2018

@cuckoohello Sorry, it was a busy month. Changes LGTM now for the most part, just a few typos but I'll fix those in a separate commit. Thanks a lot!

@shin- shin- merged commit e4e92dd into docker:master Jan 4, 2018
@fredericgermain
Copy link

I'm not sure about this, but is it supposed to work on schema v3?

I just tried the exemple above on master, it only works for me if I change the schema version to 2.3.

@cuckoohello
Copy link
Author

cuckoohello commented Jan 27, 2018

@fredericgermain You can look at the discussion above first.

If you want to use runtime option for schema v3 and above, the only thing you need to do is just modify the config schema for the corresponding json file.

But currently config schema json files for v3 and above in docker compose bump from upstream docker repo, don't accept any modification. So if you want to make these files changed, you need to pull request to upstream repo docker, and docker depends on repo docker-cli, you need to add runtime support for swarm first.... A lot of work....

I don't have enough time to do this now...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Please provide docker-compose alternative for docker run --runtime=some-runtime argument
5 participants