Workflow communication APIs and Simplified ML Algorithms #2250
/build
examples/hello-world/hello-km/jobs/kaplan-meier/app/custom/km_train.py
… into some utils class 2. Refactor the WFController class into a BaseWFController class and a WFController class. BaseWFController holds most of the logic and implements ControllerSpec, but it does not implement Responder, which a server-side controller needs. WFController inherits from both BaseWFController and Controller. This lets BaseWFController be used for client-side controller logic without implementing the control_flow() method.
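A minimal sketch of the class split described in this commit message. The classes below are simplified stand-ins written for illustration, not the actual NVFLARE definitions; the method bodies and the `sent` attribute are assumptions.

```python
from abc import ABC, abstractmethod


class ControllerSpec(ABC):
    """Stand-in for the communication contract (assumed shape)."""

    @abstractmethod
    def broadcast_and_wait(self, data): ...


class Responder(ABC):
    """Stand-in for the server-side interface that requires control_flow()."""

    @abstractmethod
    def control_flow(self): ...


class Controller(Responder, ControllerSpec):
    """Stand-in for the existing server-side controller base class."""


class BaseWFController(ControllerSpec):
    """Holds the shared logic; implements ControllerSpec but not Responder."""

    def __init__(self):
        self.sent = []  # hypothetical record of broadcast payloads

    def broadcast_and_wait(self, data):
        # Toy behaviour: remember the payload and echo it back as one result.
        self.sent.append(data)
        return [data]


class WFController(BaseWFController, Controller):
    """Server-side variant: adds the Responder side via Controller."""

    def control_flow(self):
        return self.broadcast_and_wait("start")


# BaseWFController is usable on its own, with no control_flow() required.
client_side = BaseWFController()
server_side = WFController()
```

In this sketch, `client_side` can be instantiated even though it has no `control_flow()`, while `server_side` satisfies both interfaces.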
There is a bug in the model update (the 2nd round is missing keys).
Cleanup; fix server config JSON parsing with the original controllers. TODO in follow-up PRs:
/build
Let's clarify what each class should do.
examples/hello-world/hello-fedavg/jobs/fedavg/app/custom/net.py
examples/hello-world/hello-fedavg/jobs/fedavg/app/custom/train.py
I'm not sure about many of these changes. It seems to be more complex than I imagined. Does it require the user to configure the communicator in any way? It's also not clear how a workflow can have its own communicator object.
Yes, the communicator object can be configured in rare cases, but this is not necessary because it defaults to WFCommunicator. We are also currently clarifying with the team whether each workflow should share a single Communicator or use its own. config_fed_server.conf:
Changed to the alternative design in #2390.
Description
We would like to make writing a new federated DL/ML algorithm workflow simpler, without introducing NVFLARE-specific constructs. This is similar to the Client API, which makes it easy for data scientists to move from DL/ML to FL by modifying just a few lines of code. We would like data scientists to be able to write different federated algorithms by changing only the few lines needed for communication; the rest of the workflow logic should be independent of NVFLARE communications.
Instead of writing an NVFLARE controller for each algorithm, we can simply write the workflow logic and keep the controller and workflow APIs separate.
Proposal
Introduce a Workflow Communication API that is responsible only for communication, such as broadcast_and_wait(). Users don't need to deal with controller APIs.
For example:
Where scatter_and_gather merely calls broadcast_and_wait().
We also created a new Controller (the old Controller will be renamed to MPCommunicator (multi-part communicator)), which communicates with clients.
We demonstrate that users can easily write different workflows (FedAvg and Kaplan-Meier) using the same Controller class.
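To illustrate the proposal, here is a minimal, self-contained sketch of a FedAvg-style workflow whose only contact with the communication layer is broadcast_and_wait(). `InProcessCommunicator`, `FedAvgWorkflow`, `make_client`, and the toy "training" step are hypothetical names invented for this sketch; they are not the NVFLARE API or the code in this PR.

```python
from typing import Callable, Dict, List

Weights = Dict[str, float]


class InProcessCommunicator:
    """Hypothetical stand-in for the communication layer (not the NVFLARE API)."""

    def __init__(self, clients: Dict[str, Callable[[Weights], Weights]]):
        self.clients = clients

    def broadcast_and_wait(self, data: Weights) -> List[Weights]:
        # Send a copy of the payload to every client and collect all replies.
        return [train(dict(data)) for train in self.clients.values()]


class FedAvgWorkflow:
    """Workflow logic only; communication is delegated to the communicator."""

    def __init__(self, communicator: InProcessCommunicator, num_rounds: int):
        self.communicator = communicator
        self.num_rounds = num_rounds

    def scatter_and_gather(self, model: Weights) -> List[Weights]:
        # Merely calls broadcast_and_wait().
        return self.communicator.broadcast_and_wait(model)

    @staticmethod
    def aggregate(results: List[Weights]) -> Weights:
        # Unweighted mean of each key across the client results.
        keys = results[0].keys()
        return {k: sum(r[k] for r in results) / len(results) for k in keys}

    def run(self, model: Weights) -> Weights:
        for _ in range(self.num_rounds):
            results = self.scatter_and_gather(model)
            model = self.aggregate(results)
        return model


def make_client(step: float) -> Callable[[Weights], Weights]:
    # Toy "local training": shift every weight by a fixed step.
    return lambda weights: {k: v + step for k, v in weights.items()}


communicator = InProcessCommunicator(
    {"site-1": make_client(1.0), "site-2": make_client(3.0)}
)
final_model = FedAvgWorkflow(communicator, num_rounds=2).run({"w": 0.0})
```

Under these assumptions, a different algorithm would replace only `aggregate()` and `run()`, while the communicator object stays the same.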