Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[k8s] fix managed job issue on k8s #4357

Merged

Conversation

nkwangleiGIT
Copy link
Contributor

  1. initialize a new dict if the destination is None, and add the source key/value to it
  2. use the cpu/memory to k8s resources limits, or the pod will fail to create if there is LimitRange configured in the namespace, as the default cpu/memory maybe smaller than the requests.

Tested (run the relevant ones):

  • Code formatting: bash format.sh
  • Any manual or new tests for this PR (please specify below)
  • All smoke tests: pytest tests/test_smoke.py
  • Relevant individual smoke tests: pytest tests/test_smoke.py::test_fill_in_the_name
  • Backward compatibility tests: conda deactivate; bash -i tests/backward_compatibility_tests.sh

Copy link
Collaborator

@romilbhardwaj romilbhardwaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @nkwangleiGIT!

Comment on lines +1696 to +1697
if destination is None:
destination = {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for adding this!

Comment on lines 429 to 430
cpu: {{cpus}}
memory: {{memory}}G
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not set limits because we want SkyPilot pods to use any idle CPU cycles instead of being throttled to a specific limit. https://home.robusta.dev/blog/stop-using-cpu-limits

Suggested change
cpu: {{cpus}}
memory: {{memory}}G

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, let me remove it, maybe we should add this notice to docs somewhere.

Signed-off-by: nkwangleiGIT <nkwanglei@126.com>
Copy link
Collaborator

@romilbhardwaj romilbhardwaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @nkwangleiGIT!

@romilbhardwaj romilbhardwaj added this pull request to the merge queue Nov 16, 2024
Merged via the queue into skypilot-org:master with commit e8d18e3 Nov 16, 2024
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants