Commit fe0c2a0
Force nodes to down and power_save when stopping the cluster
If a cluster is stopped while a node is powering up (alloc#-idle#),
the node is kept in the powering-up state on cluster start.
This makes the node unavailable for the entire ResumeTimeout, which is 60 minutes.
Slurm ignores the transition to power_down unless the node is first set to down.
From @demartinofra
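The ordering described above (set the node to down first, then request power_down) can be sketched as follows. This is a minimal illustration, not the code from this commit: the helper names `build_stop_commands` and `force_nodes_down_and_power_save` and the reason string are hypothetical, and the sketch assumes `scontrol` is on the PATH of the host where it runs.

```python
import subprocess
from typing import Callable, List, Sequence


def build_stop_commands(nodes: Sequence[str], reason: str = "cluster stopped") -> List[List[str]]:
    """Build the scontrol invocations, in the order they must run.

    Hypothetical helper: the node is set to DOWN first because, per the
    commit message, Slurm ignores POWER_DOWN for a node that is powering up.
    """
    if not nodes:
        return []
    nodelist = ",".join(nodes)
    return [
        # Step 1: force the node to DOWN (Slurm requires a Reason for DOWN).
        ["scontrol", "update", f"NodeName={nodelist}", "State=DOWN", f"Reason={reason}"],
        # Step 2: only now request the transition to power saving.
        ["scontrol", "update", f"NodeName={nodelist}", "State=POWER_DOWN"],
    ]


def force_nodes_down_and_power_save(
    nodes: Sequence[str],
    run: Callable[[List[str]], object] = subprocess.check_call,
) -> None:
    """Run the commands sequentially; `run` is injectable for testing."""
    for cmd in build_stop_commands(nodes):
        run(cmd)
```

Passing a recording function as `run` makes it possible to check the command order without a Slurm installation.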
## Manual test
* Created a cluster and submitted a job on it
* While the node was powering up, stopped the cluster and verified the node was correctly marked as power down
* Restarted the cluster and verified the node was back in the power-saving state (after about 2 minutes)
* Job ran correctly in the new node.
Signed-off-by: Enrico Usai <usai@amazon.com>
1 parent 3489e16 commit fe0c2a0
3 files changed (+15, -7) in src/common/schedulers and tests/common/schedulers