What would you like to be added:
Add a label and annotation jobset.sigs.k8s.io/global-job-replicas, which will contain the total number of replicas in the JobSet.
Why is this needed:
Some users want to run multislice training workloads using replicated jobs with different templates. #617 and #649 added important features to make this possible, but the use case is still not supported because the workloads need to know the total number of replicas. The new label and annotation will finally address this.
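To make the request concrete, here is a minimal Python sketch of how the proposed value could be derived. It assumes that "total number of replicas" means the sum of `replicas` across all entries in `spec.replicatedJobs`, and that an unset `replicas` counts as 1; both are assumptions for illustration, not confirmed behavior. The helper names and the example slice names are hypothetical.

```python
# Key proposed in this issue; the value semantics below are an assumption.
GLOBAL_JOB_REPLICAS_KEY = "jobset.sigs.k8s.io/global-job-replicas"


def global_job_replicas(jobset_spec: dict) -> int:
    """Sum `replicas` over every replicated job in a JobSet spec.

    Assumes a missing `replicas` field counts as 1.
    """
    replicated_jobs = jobset_spec.get("replicatedJobs", [])
    return sum(rj.get("replicas", 1) for rj in replicated_jobs)


def proposed_metadata(jobset_spec: dict) -> dict:
    """Build the label/annotation entry; Kubernetes metadata values are strings."""
    return {GLOBAL_JOB_REPLICAS_KEY: str(global_job_replicas(jobset_spec))}


# Hypothetical JobSet spec with two replicated jobs using different templates.
example_spec = {
    "replicatedJobs": [
        {"name": "slice-a", "replicas": 2},
        {"name": "slice-b", "replicas": 3},
    ]
}

print(proposed_metadata(example_spec))
# {'jobset.sigs.k8s.io/global-job-replicas': '5'}
```

With a value like this attached to every child Job, a workload in either replicated job could read the same total without knowing how the other replicated job is configured.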
GiuseppeTT changed the title from "Add global job replicas label/annotation to support multislice training workloads with different templates" to "Add global replica count label/annotation to support multislice training workloads with different templates" on Sep 16, 2024.