In process Client API Executor Part 1 #2248
This PR is too big for this stage of 2.4. We could consider it for 2.4.1.
/build
To reduce the PR size, I split the PR into three parts.
2) fix example code formatting
add queue.task_done()
1) add message bus 2) hide task func wrapper class 3) rename executor package 4) clean up some code
update meta info
remove used code
optimize import
fix message_bus import order change
rename the executor from ClientAPIMemExecutor to InProcessClientAPIExecutor
1) remove thread_pool 2) further loosely couple executor and client_api implementation
formatting
add unit tests
avoid duplicated constant TASK_NAME definition
split PR into two parts (besides message bus); this is part 1: only remove the example and job template changes
1. Replace MemPipe (Queues) with callback via EventManager 2. Simplified overall logic 3. Note that the param conversion doesn't quite work (need to fix later) 4. Removed some tests that are now invalid. Will need to add more unit tests later
fix task_name is None bug
add a few unit tests
code format
update to conform with new databus changes
Cleanup; added support for a main function with CLI args. Examples will be added and enhanced in a follow-up PR.
/build
Mostly good, we just need to make changes following recent PRs in main regarding api.py
Mostly good, minor comments
Thanks, LGTM, can think about these logging levels later
/build
Description
The Client API currently supports running the training code only in a sub-process. This works well when third-party code needs to be run from a command line.
Many data scientists, however, also want to run training directly in-process, because it is easier to debug in an IDE debugger. In-process training is also much simpler to configure and set up, and it avoids the communication issues.
The proposal
This PR proposes an alternative in-memory implementation of the executor and Client API.
The user specifies the function to call as a path of the form 'module.function_name'.
For example, the function to call may be "sub_main", located in the cifar10_fl.py module.
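As an illustration of how a 'module.function_name' path can be turned into a callable, here is a minimal sketch using the standard library's importlib. The helper name resolve_task_fn is hypothetical and is not taken from this PR; the actual executor may resolve the path differently. A standard-library function stands in for the training function so the sketch runs anywhere.

```python
import importlib
from typing import Callable


def resolve_task_fn(path: str) -> Callable:
    """Resolve a 'module.function_name' string to a callable.

    Hypothetical helper for illustration only; not the PR's implementation.
    """
    # Split on the last dot: everything before it is the module path.
    module_name, sep, fn_name = path.rpartition(".")
    if not sep or not module_name or not fn_name:
        raise ValueError(f"expected 'module.function_name', got {path!r}")
    module = importlib.import_module(module_name)
    fn = getattr(module, fn_name)
    if not callable(fn):
        raise TypeError(f"{path!r} does not refer to a callable")
    return fn


# Usage: a standard-library function stands in for the user's training function.
sqrt = resolve_task_fn("math.sqrt")
print(sqrt(16.0))
```

Splitting on the last dot lets the module part itself contain dots, so a function inside a package can be addressed the same way.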
User Experience
The user experience is the same as in the previous sub-process case. The code change from ML/DL to FL stays identical.
The configuration changes to specify the in-memory executor.
None of the pipe configurations or their corresponding component configurations are needed.
The implementation
A new InProcessClientAPIExecutor is implemented. It leverages the new DataBus and EventManager. The main thread of the executor subscribes to both the LOCAL_RESULT and LOG_DATA topics to receive callbacks when results are available.
The Client API also registers a Global_RESULT topic callback to receive the global model.
One thread runs the client's training code.
The executor main thread sends the global model once the model shareable is received, then waits for the local model update.
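The threading and callback flow described above can be sketched roughly as follows. This is a simplified stand-in, not the PR's code: TinyBus, run_round, and the "+ 1.0" training step are hypothetical, and the topic strings are plain illustrative labels rather than the real DataBus constants.

```python
import threading
from collections import defaultdict
from typing import Any, Callable, Dict, List


class TinyBus:
    """Hypothetical stand-in for a data bus: maps a topic to callbacks."""

    def __init__(self) -> None:
        self._subs: Dict[str, List[Callable[[str, Any], None]]] = defaultdict(list)
        self._lock = threading.Lock()

    def subscribe(self, topic: str, callback: Callable[[str, Any], None]) -> None:
        with self._lock:
            self._subs[topic].append(callback)

    def publish(self, topic: str, data: Any) -> None:
        # Copy under the lock, then call outside it so callbacks may publish.
        with self._lock:
            callbacks = list(self._subs[topic])
        for cb in callbacks:
            cb(topic, data)


def run_round(global_model: float) -> float:
    """One round: send the global model, wait for the local update."""
    bus = TinyBus()
    box: Dict[str, float] = {}
    global_ready = threading.Event()
    local_ready = threading.Event()

    def on_global(topic: str, data: float) -> None:  # client API side
        box["global"] = data
        global_ready.set()

    def on_local(topic: str, data: float) -> None:  # executor side
        box["local"] = data
        local_ready.set()

    bus.subscribe("GLOBAL_RESULT", on_global)
    bus.subscribe("LOCAL_RESULT", on_local)

    def training_code() -> None:
        # Runs in its own thread: wait for the global model, "train", report.
        global_ready.wait()
        bus.publish("LOCAL_RESULT", box["global"] + 1.0)

    worker = threading.Thread(target=training_code, daemon=True)
    worker.start()

    # Executor main thread: send the global model, then wait for the result.
    bus.publish("GLOBAL_RESULT", global_model)
    if not local_ready.wait(timeout=5.0):
        raise TimeoutError("no local result received")
    worker.join()
    return box["local"]


print(run_round(1.0))
```

Because both sides share one process, the hand-off is an in-memory callback plus an event wait, with no pipe or serialization step in between.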
Types of changes
./runtest.sh