@zhixiongli2011
Contributor

Overview:

When setting up etcd and NATS by following the instructions in https://github.com/ai-dynamo/dynamo?tab=readme-ov-file#1-initial-setup and running the command "docker compose -f deploy/docker-compose.yml up -d", the command failed with the error below, and neither etcd nor NATS was brought up.

✘ Container deploy-dcgm-exporter-1 Error response from daemon: unk... 0.0s
Error response from daemon: unknown or invalid runtime name: nvidia

Details:

The root cause is that the dcgm-exporter service defined in docker-compose.yml explicitly uses the nvidia container runtime, as shown below, and that runtime was not installed in the environment. To deploy the services successfully, the "runtime: nvidia" parameter needs to be commented out. This PR adds a note to the instructions to help others avoid the same issue.

image
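
For anyone who prefers to script the workaround instead of editing the file by hand, here is a minimal sketch of commenting out every active "runtime: nvidia" line in a compose file's text. The function name and the sample fragment are hypothetical and only for illustration; the real deploy/docker-compose.yml contains more than this, and the sketch assumes the value is written unquoted on its own line.

```python
import re

# Matches an active (not already commented) `runtime: nvidia` line.
# Assumption: the value is unquoted and has no trailing comment.
_RUNTIME_LINE = re.compile(r"^(\s*)(runtime:\s*nvidia)\s*$")


def comment_out_nvidia_runtime(compose_text: str) -> str:
    """Return compose_text with each active `runtime: nvidia` line commented out."""
    out = []
    for line in compose_text.splitlines():
        match = _RUNTIME_LINE.match(line)
        if match:
            # Keep the original indentation so the YAML structure is unchanged.
            out.append(f"{match.group(1)}# {match.group(2)}")
        else:
            out.append(line)
    trailing_newline = "\n" if compose_text.endswith("\n") else ""
    return "\n".join(out) + trailing_newline


# Hypothetical fragment for illustration only; not the actual file contents.
SAMPLE = """\
services:
  dcgm-exporter:
    runtime: nvidia
"""

if __name__ == "__main__":
    print(comment_out_nvidia_runtime(SAMPLE), end="")
```

Running the function a second time leaves the text unchanged, because an already commented line no longer matches the pattern.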

Where should the reviewer start?

The "Install etcd and NATS (required)" section of the main README.md file

image

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

Signed-off-by: Paul Li <zhixiong2008@gmail.com>
@copy-pr-bot

copy-pr-bot bot commented Sep 19, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions

👋 Hi zhixiongli2011! Thank you for contributing to ai-dynamo/dynamo.

Just a reminder: The NVIDIA Test Github Validation CI runs an essential subset of the testing framework to quickly catch errors. Your PR reviewers may elect to test the changes comprehensively before approving your changes.

🚀

@github-actions github-actions bot added the external-contribution Pull request is from an external contributor label Sep 19, 2025
@zhixiongli2011 zhixiongli2011 changed the title from "Update README.md to bypass nvidia runtime for dcgm-exporter service when deploying container for etcd and nats" to "docs: Update README.md to bypass nvidia runtime for dcgm-exporter service when deploying container for etcd and nats" Sep 19, 2025
@github-actions github-actions bot added the docs label Sep 19, 2025
Signed-off-by: Paul Li <zhixiong2008@gmail.com>
@PeaBrane
Contributor

@zhixiongli2011 just confirming that you verified that commenting out that line would work and can still launch etcd + nats normally

@PeaBrane
Contributor

/ok to test 5ccb21c

@zhixiongli2011
Contributor Author

zhixiongli2011 commented Sep 19, 2025

@PeaBrane yes, I verified that, and both services are running on my VM now. The frontend is also running.

image image

@PeaBrane PeaBrane merged commit 007b9d6 into ai-dynamo:main Sep 20, 2025
10 checks passed
Labels

docs external-contribution Pull request is from an external contributor size/XS

2 participants