Pods in lower-weighted queue are not being evicted to make resources for Pods in greater weighted queue #2340

Closed
jchatter123 opened this issue Jul 7, 2022 · 2 comments
Labels
kind/bug: Categorizes issue or PR as related to a bug.
lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@jchatter123

What happened:

Pods in the lower-weighted queue continue running when Pods in the higher-weighted queue need resources.

What you expected to happen:

Pods in the lower-weighted queue would be evicted, and the freed resources would go to the higher-weighted queue so that its Pods can run.

Extra Information:

  • The total amount of resources available:

    • CPU: 1200m
  • There are two queues

    • default and test
  • The default queue has weight 1

    • This queue has a capacity greater than 1200m
  • The test queue has weight 5

    • This queue has a capacity of 2 cpu (2000m)
  • I’m scheduling a job (job-1) in the default queue that creates 5 pods, each taking 200m in cpu resources

    • Job-1 pods are taking 1000m in cpu resources altogether
  • I’m scheduling a second job (job-2) in the default queue that creates 1 pod, which takes 200m in cpu resources.

  • At this point, the default queue is using all cpu resources (1200m) and all pods of the two jobs are running successfully.

  • I now schedule a third job (job-3) in the test queue which creates 5 pods, each taking 200m in cpu resources

  • I expect the test queue (with weight 5) to take resources from the default queue to run the 5 pods of job-3

  • However, this does not occur.

    • Instead, pods for job-3 remain pending until pods in the default queue finish and release resources
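The expectation above can be checked with a little arithmetic. The sketch below is a simplified model, not Volcano's code: it assumes cluster CPU is divided among queues in proportion to their weights, that each queue's share is capped by the smaller of its capability and its current request, and that any leftover is redistributed to queues that still want more. The function name `weighted_shares` is hypothetical, and the default queue's capacity ("greater than 1200m") is modeled as unbounded.

```python
def weighted_shares(total_m, queues):
    """Split total_m millicores across queues in proportion to weight.

    queues maps name -> (weight, capability_m, request_m).
    capability_m may be None, meaning no cap (an assumption for the default queue).
    A queue never receives more than min(capability, request); leftover
    capacity is redistributed among queues that are not yet satisfied.
    """
    deserved = {name: 0 for name in queues}
    active = set(queues)
    remaining = total_m
    while active and remaining > 0:
        total_weight = sum(queues[name][0] for name in active)
        handed_out = 0
        satisfied = set()
        for name in sorted(active):
            weight, cap, request = queues[name]
            limit = request if cap is None else min(cap, request)
            share = remaining * weight // total_weight
            grant = min(share, limit - deserved[name])
            deserved[name] += grant
            handed_out += grant
            if deserved[name] >= limit:
                satisfied.add(name)
        remaining -= handed_out
        active -= satisfied
        if handed_out == 0:  # nothing more can be distributed
            break
    return deserved


# Scenario from this issue, after job-3 is submitted:
# default: weight 1, requests 1200m (job-1 + job-2)
# test:    weight 5, capability 2000m, requests 1000m (job-3)
queues = {
    "default": (1, None, 1200),
    "test": (5, 2000, 1000),
}
deserved = weighted_shares(1200, queues)
allocated = {"default": 1200, "test": 0}
overused = {name: max(0, allocated[name] - deserved[name]) for name in queues}

print(deserved)   # {'default': 200, 'test': 1000}
print(overused)   # {'default': 1000, 'test': 0}
```

Under these assumptions the default queue is entitled to 200m and the test queue to 1000m, so the default queue holds 1000m more than its weighted share, which is exactly what the 5 pods of job-3 need.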

Output:

  • The Pod Listing:

Screen Shot 2022-07-07 at 10 37 33 AM

YAML:

  • queue.yaml (YAML for test queue)

Screen Shot 2022-07-07 at 10 38 43 AM

  • vcjob.yaml (YAML for job-1)

Screen Shot 2022-07-07 at 10 39 45 AM

  • vcjob2.yaml (YAML for job-2)

Screen Shot 2022-07-07 at 10 40 56 AM

  • vcjob3.yaml (YAML for job-3)

Screen Shot 2022-07-07 at 10 41 26 AM

Describing Resources

  • Pod Group for job-1

Screen Shot 2022-07-07 at 10 42 29 AM

  • Pod Group for job-2

Screen Shot 2022-07-07 at 10 43 10 AM

  • Pod group for job-3

Screen Shot 2022-07-07 at 10 43 38 AM

  • Volcano Config Map

Screen Shot 2022-07-07 at 10 44 05 AM

  • The Default Queue

Screen Shot 2022-07-07 at 10 45 05 AM

  • The Test Queue

Screen Shot 2022-07-07 at 10 45 35 AM

Environment:

  • Volcano Version: 1.6.0
  • Kubernetes version (use kubectl version):

Screen Shot 2022-07-07 at 10 46 46 AM

  • Cloud provider or hardware configuration:
    • Not using a cloud provider
    • Running minikube locally
jchatter123 added the kind/bug label on Jul 7, 2022
@stale

stale bot commented Oct 12, 2022

Hello 👋 Looks like there was no activity on this issue for last 90 days.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity for 60 days, this issue will be closed (we can always reopen an issue if we need!).

stale bot added the lifecycle/stale label on Oct 12, 2022
@stale

stale bot commented Dec 31, 2022

Closing for now as there was no activity for last 60 days after marked as stale, let us know if you need this to be reopened! 🤗

stale bot closed this as completed on Dec 31, 2022