I have searched the existing DSIPs and found no similar proposal.
Motivation
Currently, DolphinScheduler (ds) uses the master slot to calculate the command slot, and uses the worker group mapping to select the worker to which a task is dispatched.
The problem is that this code in the master is difficult to maintain, and it has produced many bugs related to the node manager. This PR aims to refactor the ServerNodeManager and split its code into separate components.
Design Detail
The design consists of the following components:
ClusterManager: manages the metadata of all clusters, including the master clusters and the worker clusters.
MasterClusters: manages the metadata of the master clusters.
WorkerCluster: manages the metadata of the worker clusters, including the worker group mapping.
The key point is to separate the business code from the registry, so that the business code does not need to deal with the registry component directly.
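The separation described above could be sketched roughly as follows. This is only an illustration under assumptions: the class names ClusterManager, MasterClusters and WorkerCluster come from the proposal, but every method name (`onMasterAdded`, `onWorkerAdded`, `getWorkers`, and so on) and the use of plain address strings are hypothetical and not taken from the DolphinScheduler code base. The idea is that a registry listener would call the `on...` methods, while business code only reads from the ClusterManager.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch: holds the addresses of the live masters.
class MasterClusters {
    private final CopyOnWriteArrayList<String> masters = new CopyOnWriteArrayList<>();

    // Assumed to be called by a registry listener, never by business code.
    void onMasterAdded(String address) { masters.addIfAbsent(address); }
    void onMasterRemoved(String address) { masters.remove(address); }

    // Business code reads a snapshot and never touches the registry.
    List<String> getMasters() { return new ArrayList<>(masters); }
}

// Hypothetical sketch: holds the worker group mapping (group name -> worker addresses).
class WorkerCluster {
    private final Map<String, CopyOnWriteArrayList<String>> workerGroupMapping = new ConcurrentHashMap<>();

    void onWorkerAdded(String group, String address) {
        workerGroupMapping.computeIfAbsent(group, g -> new CopyOnWriteArrayList<>()).addIfAbsent(address);
    }

    void onWorkerRemoved(String group, String address) {
        List<String> workers = workerGroupMapping.get(group);
        if (workers != null) {
            workers.remove(address);
        }
    }

    // Returns a copy so callers cannot mutate the internal mapping.
    List<String> getWorkers(String group) {
        List<String> workers = workerGroupMapping.get(group);
        return workers == null ? new ArrayList<>() : new ArrayList<>(workers);
    }
}

// Single entry point for cluster metadata used by the business code.
class ClusterManager {
    private final MasterClusters masterClusters = new MasterClusters();
    private final WorkerCluster workerCluster = new WorkerCluster();

    MasterClusters getMasterClusters() { return masterClusters; }
    WorkerCluster getWorkerCluster() { return workerCluster; }
}
```

With a layout like this, task dispatching would only need something like `clusterManager.getWorkerCluster().getWorkers("default")`, and swapping the registry implementation would not affect that call.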
SbloodyS changed the title from "[DSIP-49][Master] Use ClusterManager to manage the cluster in master" to "[DSIP-54][Master] Use ClusterManager to manage the cluster in master" on Jul 3, 2024
Compatibility, Deprecation, and Migration Plan
Compatibility
Test Plan
Covered by unit tests (UT) and end-to-end (E2E) tests.
Code of Conduct