Refactor core for a more streamlined API for aggregators/controllers #459

Merged
merged 221 commits into from
May 30, 2023
e71a488
Add instructions to connect examples to distributed deployment
May 23, 2022
85f517d
Update README.md
May 24, 2022
bfe8dc5
Update README.md
May 24, 2022
989b062
Update README.md
May 24, 2022
2a01be2
Formatting and sorting imports in Python files
May 30, 2022
3c4a736
fix
May 30, 2022
31bbea1
Fix sorting
May 30, 2022
a39fd43
inference rest
May 31, 2022
5855943
add inference to validation channel
Jun 1, 2022
31cbd5e
fixing formatting
Jun 1, 2022
91c5ed7
merge dev and fix conflicts
Jun 1, 2022
ce6b4b5
formatting
Jun 1, 2022
eb7ff12
Merge branch 'docs/deployment' into feature/inference
Jun 1, 2022
4e04650
merge option for entripoints
Jun 1, 2022
edcf74e
add task type
Jun 1, 2022
52f30ff
clean up requirement.txt file/dir
Jun 2, 2022
f228363
Make default compose a development compose
Jun 2, 2022
b87eea2
Add dev mount to pytorch example
Jun 2, 2022
0437afc
Merge branch 'feature/improve-docker' into feature/inference
Jun 2, 2022
80df305
Merge branch 'develop' into feature/inference
Jun 3, 2022
0cd874c
Status type for inference
Jun 3, 2022
77a0dd8
improve entrypoint
Jun 3, 2022
d11fa57
fix sorting
Jun 3, 2022
6438baf
Remove license
Jun 7, 2022
57ab43f
add stuff to test
Jun 7, 2022
20767a4
add python version to matrix
Jun 7, 2022
e3dee68
fix
Jun 7, 2022
ecb7c3c
remove version from name
Jun 7, 2022
d7db35f
Increase sleep for reducers and clients
Jun 7, 2022
3b7d319
Improve logs
Jun 7, 2022
717541b
Always print logs
Jun 7, 2022
23ef432
Fix PyTorch example data mount path in compose file
Jun 7, 2022
82b74c0
Merge branch 'bugfix/torchexample' into feature/ci-matrix
Jun 7, 2022
04b78f3
mets many python versions
Jun 7, 2022
160ee02
quotes
Jun 7, 2022
e4ed2d4
Fix CI sleep time
Jun 8, 2022
d547b94
Add Python versiong
Jun 8, 2022
463d48a
don't fail fast
Jun 8, 2022
db384b4
remove python 3.10
Jun 8, 2022
9df12c3
remove python 3.10
Jun 8, 2022
6355ba8
fix numpy for py 3.7
Jun 8, 2022
239f82f
Merge branch 'feature/ci-matrix' into feature/inference
Jun 8, 2022
30370f8
Merge branch 'bugfix/ci-time' into feature/inference
Jun 8, 2022
cb8b362
Merge branch 'develop' into feature/inference
Jun 8, 2022
61e14d1
Merge branch 'develop' into feature/inference
Jun 9, 2022
1c556c0
Merge branch 'develop' into feature/inference
Jun 10, 2022
976e12d
Inference CI
Jun 10, 2022
18de033
minor
Jun 10, 2022
ca5f809
fix
Jun 10, 2022
3051550
fix
Jun 10, 2022
9fb4b01
fix
Jun 10, 2022
8f84f3c
fix
Jun 10, 2022
9dddc02
fix
Jun 10, 2022
98c5aba
fix
Jun 10, 2022
ed3b3bd
fix
Jun 10, 2022
7a457de
reduce CI time
Jun 13, 2022
f1be8c5
update and fix conflicts
Jun 15, 2022
d1904c1
fix conflict
Jun 15, 2022
04316a6
upstream updates
Jul 4, 2022
a9ea496
Initial implementation toggle ssl for REST service
Jul 6, 2022
5a20932
Removed unused reducer inference interface mockup
Jul 6, 2022
da9d64d
Removed geoip2 dependency
Jul 6, 2022
175e3d9
Dockerfile update, install developer tools
Jul 6, 2022
0b95ebf
Draft implementation
Jul 7, 2022
d2920bb
Update and fix conflicts
Jul 7, 2022
d45bed8
update and fix conflict
Jul 7, 2022
1b4eec8
Merge branch 'develop' into feature/toggle-ssl
Jul 7, 2022
7e954b0
Remove mocked inference endpoint in restservice
Jul 8, 2022
37e522d
Develop (#418)
ahellander Jul 11, 2022
59890e2
fix code-checks
Wrede Jul 11, 2022
97b556b
insecure mode in ci (http)
Wrede Jul 11, 2022
0e24072
secure option to package download and checksum
Wrede Jul 11, 2022
a802c4c
work in progress
Jul 11, 2022
9e3902b
merge conflicts
Jul 11, 2022
c831508
fix flake8 warning
Wrede Jul 12, 2022
3917779
Merge branch 'feature/toggle-ssl' of https://github.com/scaleoutsyste…
Jul 13, 2022
90396e0
Remove Talisman
Jul 13, 2022
3981f83
bugfix, combiner now correctly uses secure flag in connector
Jul 13, 2022
53ef114
Revert accidetal change to compose file
Jul 13, 2022
2c29009
sort import
Jul 13, 2022
3ebf8d9
Changed combiner ssl default config to False
Jul 13, 2022
b771ed2
Fixed code checks
Jul 14, 2022
be0a726
Code checks
Jul 14, 2022
eecc65f
Add docstings in connecy.py
Jul 14, 2022
c4555f1
Add docstings in certificatemanager
Jul 14, 2022
637a7be
Docstrings
Jul 28, 2022
48c8dea
Changed some parameter names in reducer CLI
Jul 28, 2022
884ef41
Default no-ssl for REST, ssl for gRPC
Aug 2, 2022
37b66f9
Fix code check
Aug 3, 2022
0741607
Harmoize option names between combiner and reducer
Aug 3, 2022
9f2983e
Add help text for combiner options
Aug 3, 2022
79c088a
Make --secure option flag
Aug 3, 2022
81ea77e
Works to disable secure grpc
Aug 10, 2022
9304725
Added back use of copy
Aug 12, 2022
0f0a3d7
Remove possibility to generate cert for reducer
Aug 15, 2022
09c568f
Default to insecure gRPC setting
Aug 15, 2022
91adabe
Fix code scanning alerts
Aug 15, 2022
e3252eb
Initial refactor
Aug 16, 2022
d22bbd4
Initial refactor reducer
Aug 18, 2022
5d7abee
Introduce base class for controller
Aug 18, 2022
612dd75
More refactoring and cleaning
Aug 26, 2022
1b8c0ab
refactored look-aside loadbalancer
Aug 29, 2022
2f639a1
Refactored load-balancer
Aug 30, 2022
fe61751
Fixed code checks
Aug 31, 2022
63fb1f1
latest
Aug 31, 2022
b32cd34
work in progress
Sep 1, 2022
33ea7a8
Resolved conflicts
Sep 5, 2022
a472bab
Fixed code checks
Sep 5, 2022
e4be8cb
Update control page
Sep 5, 2022
be12df7
added metadata field to modelupdaterequest
Sep 14, 2022
1c83ac2
Client passes on metadata dict with model update
Sep 15, 2022
b9e4980
Latest
Sep 16, 2022
e2295a8
Latest
Sep 19, 2022
47a0409
Merge branch 'develop' into feature/refactor-control
Oct 3, 2022
0941195
latest
Oct 3, 2022
e60ec65
Resolve conflict
Oct 3, 2022
5ead760
Refactor aggregation
Oct 17, 2022
cd28882
Fix
Oct 17, 2022
f9f4321
Merge branch 'bugfix/430' into feature/429
Oct 17, 2022
1be171a
Add docstring for load_model_update
Oct 17, 2022
b70347b
Extract model update metadata and make available in aggregator
Oct 17, 2022
aabaac3
Added some docstrings
Oct 17, 2022
85d58b3
More docstrings
Oct 17, 2022
e54024d
Renamed aggregator files and base class
Oct 17, 2022
4cba0e6
suppress LOG status messages in stdout
Oct 17, 2022
1bed1aa
Introduce policy for when to trigger aggregation at combiner
Oct 17, 2022
d61f256
Latest
Oct 23, 2022
cc71a7b
Merge branch 'develop' into feature/429
Oct 23, 2022
1e770a3
Added files
Oct 24, 2022
580cb4e
Fixes
Oct 30, 2022
83086d2
Fixed broken congig file generation.
Oct 31, 2022
897ea39
Added option to parse client name from config file
Oct 31, 2022
1036a49
Flattened client config file, generalized so that all settings can be…
Nov 1, 2022
7936c4e
Fixed file generation
Nov 1, 2022
513f010
Resolved conflict
Nov 1, 2022
b66d1d9
Latest
Nov 1, 2022
1fad8d0
Updated config template
Nov 1, 2022
aee5e2a
Merge branch 'feature/438' into feature/429
Nov 2, 2022
8b1a595
Resolved conflict
Jan 25, 2023
1ae52ff
Removed mongotracing in control, will refactor to have all tracing da…
Jan 26, 2023
ab9cec9
Refactored combiner job submit
Jan 26, 2023
3a867a2
Remove psutil tracing
Jan 26, 2023
2b3098c
Refactor tracer
Jan 26, 2023
016275c
cleaning
Jan 26, 2023
758c551
get latest round refactored
Jan 26, 2023
1c6319c
Enable early termination by default
Jan 27, 2023
13379b9
Removed unused round_config object
Jan 29, 2023
e0ff053
Remove printout of sensitive information
Jan 30, 2023
858ccce
Remove old control, make new version default
Jan 30, 2023
61da4ea
Remove unused code
Jan 30, 2023
112de4d
Changed default name for fedn network in config template
Jan 30, 2023
82a8ee8
Cleaning, docstrings
Feb 8, 2023
9130b8e
bugfix
Feb 8, 2023
9f01ca0
Variable name changes
Feb 8, 2023
aee9cef
Removed old combine models implementation
Feb 8, 2023
981703d
bugfix
Feb 8, 2023
85bca23
Add a hook to validate the model update before putting it on the aggr…
Feb 8, 2023
2718797
Validate metadata on model 'update
Feb 8, 2023
88ce477
Validate metadata on model 'update
Feb 8, 2023
c970571
incremental weighted average in new style aggregator
Feb 11, 2023
27362da
small cleaning in control form
Feb 11, 2023
b8058dc
Added instructions in controller form, rearranged menu items
Feb 11, 2023
4733d34
Merge pull request #1 from scaleoutsystems/feature/inference
ahellander Feb 16, 2023
1ce1094
latest
Feb 16, 2023
cb386a8
started mergin
Feb 16, 2023
745a7d8
Resolve merge conflicts
Feb 16, 2023
47c0497
Added back accidentally removed file
Feb 17, 2023
c03146b
Conflict resolution
Feb 17, 2023
678716f
Remove unused readme file
Feb 17, 2023
de43d59
More merging
Feb 17, 2023
f2eaf58
latest
Feb 21, 2023
087fb63
Fixed round_config regression
Feb 23, 2023
e2bd997
Controller polls db instead of combiners
Feb 23, 2023
746475e
More api docs
Feb 27, 2023
b096aff
Add infer_instruct
Feb 27, 2023
b427d5b
Cleaning
Feb 27, 2023
2d1f213
Added training metadata for keras example
Mar 6, 2023
7470f04
work in progress db cleanup
Mar 6, 2023
834a342
Refactor
Mar 7, 2023
4f80eef
More refactoring in db backend
Mar 13, 2023
dd14149
Remove 'control' setting from reducer config file
Mar 13, 2023
148e98c
Flatten combiner config
Mar 13, 2023
2cdf437
Flatten combiner config
Mar 13, 2023
25d3149
Flatten combiner config
Mar 13, 2023
f31ab71
Harmonize CLI option names
Mar 13, 2023
d7eeb62
Refactor helpers
Mar 13, 2023
199de56
Refactor helpers
Mar 14, 2023
5d0f125
Merge branch 'master' into feature/refactor
Mar 14, 2023
81138d7
Refactor helpers
Mar 22, 2023
b3cd90e
Refactor helpers
Mar 22, 2023
3ee493b
Refactor helpers
Mar 23, 2023
a2f0e96
Plugin arch for helpers
Mar 27, 2023
b3fed84
Updated UI config
Mar 27, 2023
a59215d
Raise exception if misconfigured helper
Mar 28, 2023
59fdb38
Added tracing of sessions in the db
Mar 28, 2023
209c4dc
Update version to 0.5-dev
Mar 31, 2023
b6f3879
Merge branch 'develop' into feature/refactor
ahellander Apr 9, 2023
dc265c7
Updated torch version
Apr 11, 2023
287b63c
resolved conflict
Apr 11, 2023
62426d3
Updated torch version
Apr 11, 2023
e020bbf
bugfix
Apr 14, 2023
d475f2f
Skip osx tests
Apr 16, 2023
8416030
latest
May 9, 2023
cf96344
change helper name
Wrede May 9, 2023
f719e83
fix formatting and syntax
Wrede May 9, 2023
25800ab
fix formatting and syntax errors
Wrede May 9, 2023
3db2cca
Resolved conflicts
May 15, 2023
d8a7b16
Merge branch 'develop' of github.com:scaleoutsystems/fedn into featur…
Wrede May 15, 2023
5ea7e74
update ci new db
Wrede May 16, 2023
5521e20
Merge branch 'feature/refactor' of https://github.com/scaleoutsystems…
May 17, 2023
44fc3a3
fix round_id key and equal weight to reduce models
Wrede May 17, 2023
5d86b11
save helper for metrics and metadata
Wrede May 17, 2023
9a96996
merge conflict
May 17, 2023
2c4475b
improve readability and add test for fedavg
Wrede May 17, 2023
7e77ad9
update doc strings for client and combiner
Wrede May 19, 2023
857b80a
Merge branch 'feature/refactor' of https://github.com/scaleoutsystems…
May 23, 2023
8f99e44
Resolve conflict
May 23, 2023
cf1e8bf
formatting
May 25, 2023
b3c6316
add id to logging
Wrede May 29, 2023
3554c1f
Merge branch 'feature/refactor' of github.com:scaleoutsystems/fedn in…
Wrede May 29, 2023
a45a722
extra logging and doc strings
Wrede May 29, 2023
35 changes: 35 additions & 0 deletions .ci/tests/examples/inference_test.py
@@ -0,0 +1,35 @@
import sys
from time import sleep

import pymongo

N_CLIENTS = 2
RETRIES = 18
SLEEP = 10


def _eprint(*args, **kwargs):
    print(*args, file=sys.stderr, **kwargs)


def _wait_n_rounds(collection):
    n = 0
    for _ in range(RETRIES):
        query = {'type': 'INFERENCE'}
        n = collection.count_documents(query)
        if n == N_CLIENTS:
            return n
        _eprint(f'Succeeded clients: {n}. Sleeping for {SLEEP}.')
        sleep(SLEEP)
    _eprint(f'Succeeded clients: {n}. Giving up.')
    return n


if __name__ == '__main__':
    # Connect to mongo
    client = pymongo.MongoClient("mongodb://fedn_admin:password@localhost:6534")

    # Wait for successful rounds
    succeeded = _wait_n_rounds(client['fedn-test-network']['control']['status'])
    assert succeeded == N_CLIENTS  # check that all rounds succeeded
    _eprint(f'Succeeded inference clients: {succeeded}. Test passed.')
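Reviewer note: the test above polls the status collection until the expected number of `INFERENCE` entries appears or the retries run out. A minimal sketch of that poll-until-count pattern follows, with the MongoDB query swapped for an injected callable so it runs without a database. `wait_for_count` and `fake_count` are hypothetical names and not part of this PR.

```python
from time import sleep


def wait_for_count(count_fn, expected, retries=18, delay=10.0):
    """Poll count_fn until it returns `expected` or retries run out.

    count_fn stands in for collection.count_documents(query); that
    substitution is an assumption made so the sketch needs no MongoDB.
    """
    n = 0
    for _ in range(retries):
        n = count_fn()
        if n == expected:
            return n
        sleep(delay)
    return n


# Hypothetical stand-in: reports one more finished client on each call.
calls = {'n': 0}


def fake_count():
    calls['n'] += 1
    return calls['n']


result = wait_for_count(fake_count, expected=2, retries=5, delay=0)
print(result)
```

Returning the last observed count, rather than raising, leaves the pass/fail decision to the caller, which is how the test script uses its final `assert`.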
19 changes: 19 additions & 0 deletions .ci/tests/examples/run_inference.sh
@@ -0,0 +1,19 @@
#!/bin/bash
set -e

# Parse example name
if [ "$#" -lt 1 ]; then
    >&2 echo "Wrong number of arguments (usage: run_inference.sh <example-name>)"
    exit 1
fi
example="$1"

>&2 echo "Run inference"
pushd "examples/$example"
curl -k -X POST https://localhost:8090/infer

>&2 echo "Checking inference success"
".$example/bin/python" ../../.ci/tests/examples/inference_test.py

>&2 echo "Test completed successfully"
popd
2 changes: 1 addition & 1 deletion .ci/tests/examples/wait_for.py
@@ -29,7 +29,7 @@ def _retry(try_func, **func_args):
 def _test_rounds(n_rounds):
     client = pymongo.MongoClient(
         "mongodb://fedn_admin:password@localhost:6534")
-    collection = client['fedn-test-network']['control']['round']
+    collection = client['fedn-network']['control']['rounds']
     query = {'reducer.status': 'Success'}
     n = collection.count_documents(query)
     client.close()
10 changes: 6 additions & 4 deletions .github/workflows/integration-tests.yaml
@@ -15,13 +15,12 @@ jobs:
     strategy:
       matrix:
         to_test:
-          - "mnist-keras keras"
-          - "mnist-pytorch pytorch"
+          - "mnist-keras kerashelper"
+          - "mnist-pytorch pytorchhelper"
         python_version: ["3.8", "3.9","3.10"]
         os:
           - ubuntu-20.04
           - ubuntu-22.04
-          - macos-11
     runs-on: ${{ matrix.os }}
     steps:
       - name: checkout
@@ -38,7 +37,10 @@ jobs:

       - name: run ${{ matrix.to_test }}
         run: .ci/tests/examples/run.sh ${{ matrix.to_test }}
-        if: ${{ matrix.os != 'macos-11' }} # skip Docker part for MacOS
+
+      - name: run ${{ matrix.to_test }} inference
+        run: .ci/tests/examples/run_inference.sh ${{ matrix.to_test }}
+        if: ${{ matrix.os != 'macos-11' && matrix.to_test == 'mnist-keras keras' }} # example available for Keras

       - name: print logs
         if: failure()
1 change: 1 addition & 0 deletions LICENSE
@@ -199,3 +199,4 @@
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

2 changes: 1 addition & 1 deletion config/settings-client.yaml.template
@@ -1,3 +1,3 @@
-network_id: fedn-test-network
+network_id: fedn-network
 discover_host: reducer
 discover_port: 8090
18 changes: 9 additions & 9 deletions config/settings-combiner.yaml.template
@@ -1,10 +1,10 @@
-network_id: fedn-test-network
-controller:
-    discover_host: reducer
-    discover_port: 8090
+network_id: fedn-network
+discover_host: reducer
+discover_port: 8090
+
+name: combiner
+host: combiner
+port: 12080
+max_clients: 30
+

-combiner:
-    name: combiner
-    host: combiner
-    port: 12080
-    max_clients: 30
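Reviewer note: the combiner template moves the keys of the nested `controller` and `combiner` sections to the top level. A rough sketch of that mapping on plain dictionaries follows, for anyone migrating an existing settings file by hand. `flatten_combiner_config` is a hypothetical helper and not something this PR adds.

```python
def flatten_combiner_config(old):
    """Lift keys from the nested 'controller' and 'combiner' sections.

    Hypothetical migration helper: top-level keys are copied as-is and
    the two nested sections are merged into the top level.
    """
    flat = {}
    for key, value in old.items():
        if key in ('controller', 'combiner') and isinstance(value, dict):
            flat.update(value)
        else:
            flat[key] = value
    return flat


old_style = {
    'network_id': 'fedn-network',
    'controller': {'discover_host': 'reducer', 'discover_port': 8090},
    'combiner': {'name': 'combiner', 'host': 'combiner',
                 'port': 12080, 'max_clients': 30},
}
new_style = flatten_combiner_config(old_style)
print(new_style)
```

The sketch assumes the two sections share no key names; if they did, the later section would silently win.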
7 changes: 1 addition & 6 deletions config/settings-reducer.yaml.template
@@ -1,9 +1,4 @@
-network_id: fedn-test-network
-token: fedn_token
-
-control:
-    state: idle
-    helper: keras
+network_id: fedn-network

 statestore:
     type: MongoDB
4 changes: 2 additions & 2 deletions docker-compose.yaml
@@ -93,7 +93,7 @@ services:
       - ${HOST_REPO_DIR:-.}/fedn:/app/fedn
     entrypoint: [ "sh", "-c" ]
     command:
-      - "/venv/bin/pip install --no-cache-dir -e /app/fedn && /venv/bin/fedn run combiner -in config/settings-combiner.yaml"
+      - "/venv/bin/pip install --no-cache-dir -e /app/fedn && /venv/bin/fedn run combiner --init config/settings-combiner.yaml"
     ports:
       - 12080:12080

@@ -110,6 +110,6 @@ services:
       - ${HOST_REPO_DIR:-.}/fedn:/app/fedn
     entrypoint: [ "sh", "-c" ]
     command:
-      - "/venv/bin/pip install --no-cache-dir -e /app/fedn && /venv/bin/fedn run client -in config/settings-client.yaml"
+      - "/venv/bin/pip install --no-cache-dir -e /app/fedn && /venv/bin/fedn run client --init config/settings-client.yaml"
     deploy:
       replicas: 0
19 changes: 19 additions & 0 deletions examples/mnist-keras/README.md
@@ -66,3 +66,22 @@ Finally, you can start the experiment from the "control" tab of the UI.

## Clean up
You can clean up by running `docker-compose down`.

## Connecting to a distributed deployment
To start a client with the dependencies required for this example and connect it to a remote deployment, first download the `client.yaml` file. You can either download it from the reducer UI or run the following command.

```bash
curl -k https://<reducer-fqdn>:<reducer-port>/config/download > client.yaml
```
> **Note** Make sure to replace `<reducer-fqdn>` and `<reducer-port>` with appropriate values.

Now you are ready to start the client via Docker by running the following command.

```bash
docker run -d \
  -v $PWD/client.yaml:/app/client.yaml \
  -v $PWD/data:/var/data \
  -e ENTRYPOINT_OPTS=--data_path=/var/data/mnist.npz \
  ghcr.io/scaleoutsystems/fedn/fedn:develop-mnist-keras run client -in client.yaml
```
> **Note** If the reducer and combiner host names, as specified in the configuration files, are not resolvable in the client host network, use the Docker option `--add-host` to make them resolvable. Please refer to the Docker documentation for more detail.
2 changes: 1 addition & 1 deletion examples/mnist-keras/bin/init_venv.sh
@@ -2,7 +2,7 @@
 set -e

 # Init venv
-python -m venv .mnist-keras
+python3 -m venv .mnist-keras

 # Pip deps
 .mnist-keras/bin/pip install --upgrade pip
49 changes: 39 additions & 10 deletions examples/mnist-keras/client/entrypoint
@@ -7,8 +7,9 @@ import fire
 import numpy as np
 import tensorflow as tf

-from fedn.utils.kerashelper import KerasHelper
+from fedn.utils.helpers import get_helper, save_metadata, save_metrics

+HELPER_MODULE = 'kerashelper'
 NUM_CLASSES = 10


@@ -17,7 +18,6 @@ def _get_data_path():
     client = docker.from_env()
     container = client.containers.get(os.environ['HOSTNAME'])
     number = container.name[-1]
-
     # Return data path
     return f"/var/data/clients/{number}/mnist.npz"

@@ -64,8 +64,8 @@ def _load_data(data_path, is_train=True):

 def init_seed(out_path='seed.npz'):
     weights = _compile_model().get_weights()
-    helper = KerasHelper()
-    helper.save_model(weights, out_path)
+    helper = get_helper(HELPER_MODULE)
+    helper.save(weights, out_path)


 def train(in_model_path, out_model_path, data_path=None, batch_size=32, epochs=1):
@@ -74,16 +74,26 @@ def train(in_model_path, out_model_path, data_path=None, batch_size=32, epochs=1

     # Load model
     model = _compile_model()
-    helper = KerasHelper()
-    weights = helper.load_model(in_model_path)
+    helper = get_helper(HELPER_MODULE)
+    weights = helper.load(in_model_path)
     model.set_weights(weights)

     # Train
     model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs)

     # Save
     weights = model.get_weights()
-    helper.save_model(weights, out_model_path)
+    helper.save(weights, out_model_path)
+
+    # Metadata needed for aggregation server side
+    metadata = {
+        'num_examples': len(x_train),
+        'batch_size': batch_size,
+        'epochs': epochs,
+    }
+
+    # Save JSON metadata file
+    save_metadata(metadata, out_model_path)


 def validate(in_model_path, out_json_path, data_path=None):
@@ -93,8 +103,8 @@ def validate(in_model_path, out_json_path, data_path=None):

     # Load model
     model = _compile_model()
-    helper = KerasHelper()
-    weights = helper.load_model(in_model_path)
+    helper = get_helper(HELPER_MODULE)
+    weights = helper.load(in_model_path)
     model.set_weights(weights)

     # Evaluate
@@ -111,15 +121,34 @@ def validate(in_model_path, out_json_path, data_path=None):
         "test_accuracy": model_score_test[1],
     }

     # Save JSON
+    save_metrics(report, out_json_path)
+
+
+def infer(in_model_path, out_json_path, data_path=None):
+    # Using test data for inference but another dataset could be loaded
+    x_test, _ = _load_data(data_path, is_train=False)
+
+    # Load model
+    model = _compile_model()
+    helper = get_helper(HELPER_MODULE)
+    weights = helper.load(in_model_path)
+    model.set_weights(weights)
+
+    # Infer
+    y_pred = model.predict(x_test)
+    y_pred = np.argmax(y_pred, axis=1)
+
+    # Save JSON
     with open(out_json_path, "w") as fh:
-        fh.write(json.dumps(report))
+        fh.write(json.dumps({'predictions': y_pred.tolist()}))


 if __name__ == '__main__':
     fire.Fire({
         'init_seed': init_seed,
         'train': train,
         'validate': validate,
+        'infer': infer,
         '_get_data_path': _get_data_path,  # for testing
     })
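Reviewer note: the `num_examples` value written through `save_metadata` is described as metadata needed for server-side aggregation, and the commit list mentions an incremental weighted average. A minimal sketch of how such an average could use `num_examples` as the weight follows, on plain Python lists. It illustrates the general technique under that assumption and is not FEDn's aggregator code; `incremental_weighted_average` is a hypothetical name.

```python
def incremental_weighted_average(updates):
    """Fold (weights, num_examples) pairs into a running weighted mean.

    Each update is weighted by its num_examples. Only the running mean
    and the running example count are kept between updates.
    """
    mean = None
    total = 0
    for weights, num_examples in updates:
        total += num_examples
        if mean is None:
            mean = list(weights)
            continue
        ratio = num_examples / total
        mean = [m + ratio * (w - m) for m, w in zip(mean, weights)]
    return mean


# Two hypothetical client updates: 100 and 300 training examples.
updates = [([1.0, 2.0], 100), ([5.0, 6.0], 300)]
avg = incremental_weighted_average(updates)
print(avg)
```

Because only the running mean and count are stored, updates can be folded in one at a time as they arrive instead of being held in memory until the round ends.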
4 changes: 3 additions & 1 deletion examples/mnist-keras/client/fedn.yaml
@@ -2,4 +2,6 @@ entry_points:
   train:
     command: /venv/bin/python entrypoint train $ENTRYPOINT_OPTS
   validate:
-    command: /venv/bin/python entrypoint validate $ENTRYPOINT_OPTS
+    command: /venv/bin/python entrypoint validate $ENTRYPOINT_OPTS
+  infer:
+    command: /venv/bin/python entrypoint infer $ENTRYPOINT_OPTS
30 changes: 20 additions & 10 deletions examples/mnist-pytorch/client/entrypoint
@@ -1,15 +1,15 @@
 #!./.mnist-pytorch/bin/python
 import collections
-import json
 import math
 import os

 import docker
 import fire
 import torch

-from fedn.utils.pytorchhelper import PytorchHelper
+from fedn.utils.helpers import get_helper, save_metadata, save_metrics

+HELPER_MODULE = 'pytorchhelper'
 NUM_CLASSES = 10


@@ -69,13 +69,13 @@ def _save_model(model, out_path):
     weights_np = collections.OrderedDict()
     for w in weights:
         weights_np[w] = weights[w].cpu().detach().numpy()
-    helper = PytorchHelper()
-    helper.save_model(weights, out_path)
+    helper = get_helper(HELPER_MODULE)
+    helper.save(weights, out_path)


 def _load_model(model_path):
-    helper = PytorchHelper()
-    weights_np = helper.load_model(model_path)
+    helper = get_helper(HELPER_MODULE)
+    weights_np = helper.load(model_path)
     weights = collections.OrderedDict()
     for w in weights_np:
         weights[w] = torch.tensor(weights_np[w])
@@ -118,7 +118,18 @@ def train(in_model_path, out_model_path, data_path=None, batch_size=32, epochs=1
                 print(
                     f"Epoch {e}/{epochs-1} | Batch: {b}/{n_batches-1} | Loss: {loss.item()}")

-    # Save
+    # Metadata needed for aggregation server side
+    metadata = {
+        'num_examples': len(x_train),
+        'batch_size': batch_size,
+        'epochs': epochs,
+        'lr': lr
+    }
+
+    # Save JSON metadata file
+    save_metadata(metadata, out_model_path)
+
+    # Save model update
     _save_model(model, out_model_path)


@@ -151,14 +162,13 @@ def validate(in_model_path, out_json_path, data_path=None):
     }

     # Save JSON
-    with open(out_json_path, "w") as fh:
-        fh.write(json.dumps(report))
+    save_metrics(report, out_json_path)


 if __name__ == '__main__':
     fire.Fire({
         'init_seed': init_seed,
         'train': train,
         'validate': validate,
-        '_get_data_path': _get_data_path,  # for testing
+        # '_get_data_path': _get_data_path,  # for testing
     })
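Reviewer note: both entrypoints now obtain their helper by module name through `get_helper(HELPER_MODULE)` instead of importing a concrete class. A sketch of the general load-by-name idea using a small registry follows. The registry, `register_helper`, `load_helper`, and `JsonHelper` are all hypothetical and say nothing about how `fedn.utils.helpers.get_helper` is actually implemented.

```python
import json
import os
import tempfile

# Hypothetical registry mapping helper names to helper classes.
_HELPERS = {}


def register_helper(name):
    """Class decorator that files a helper class under a string name."""
    def decorator(cls):
        _HELPERS[name] = cls
        return cls
    return decorator


def load_helper(name):
    """Return a new helper instance, or raise if the name is unknown."""
    try:
        return _HELPERS[name]()
    except KeyError:
        raise ValueError(f"Unknown helper: {name!r}") from None


@register_helper('jsonhelper')
class JsonHelper:
    """Toy helper exposing the save/load pair used by the entrypoints."""

    def save(self, weights, path):
        with open(path, 'w') as fh:
            json.dump(weights, fh)

    def load(self, path):
        with open(path) as fh:
            return json.load(fh)


helper = load_helper('jsonhelper')
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'weights.json')
    helper.save([0.1, 0.2], path)
    restored = helper.load(path)
print(restored)
```

Raising on an unknown name, rather than returning `None`, surfaces a misconfigured helper at startup instead of at the first `save` or `load` call.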
2 changes: 1 addition & 1 deletion examples/mnist-pytorch/requirements.txt
@@ -1,4 +1,4 @@
 torch==1.13.1
 torchvision==0.14.1
 fire==0.3.1
-docker==6.1.1
+docker==6.1.1
2 changes: 1 addition & 1 deletion fedn/README.md
@@ -1 +1 @@
-# FEDn SDk #
+FEDn