
Merge pull request #1466 from FedML-AI/dev/v0.7.0
fedml-alex authored Oct 19, 2023
2 parents 169a794 + 529b396 commit a9b04b5
Showing 21 changed files with 690 additions and 380 deletions.
48 changes: 45 additions & 3 deletions python/examples/launch/README.md
@@ -48,9 +48,8 @@ computing:
device_type: GPU # options: GPU, CPU, hybrid
resource_type: A100-80G # e.g., A100-80G, please check the resource type list by "fedml show-resource-type" or visiting URL: https://open.fedml.ai/accelerator_resource_type
framework_type: fedml # options: fedml, deepspeed, pytorch, general
job_type: train # options: train, deploy, federate
# train subtype: general_training, single_machine_training, cluster_distributed_training, cross_cloud_training
# federate subtype: cross_silo, simulation, web, smart_phone
@@ -63,6 +62,39 @@ server_job: |
echo "Hello, Here is the server job."
echo "Current directory is as follows."
pwd
# If you want to use the job created by the MLOps platform,
# just uncomment the following three lines, then set job_id and config_id to your desired job id and its related config id.
#job_args:
# job_id: 2070
# config_id: 111
# If you want to create the job with a specific name, just uncomment the following line and set job_name to your desired job name.
#job_name: cv_job
# If you want to pass your API key to your job for calling FEDML APIs, you may uncomment the following line and set your API key here.
# You may use the environment variable FEDML_RUN_API_KEY to get your API key in your job commands or scripts.
#run_api_key: my_api_key
# If you want to use the model created by the MLOps platform, or create your own model card with a specified name,
# just uncomment the following four lines, then set model_name and endpoint_name to your desired values.
#serving_args:
# model_name: "fedml-launch-sample-model" # A model card from the MLOps platform, or your own model card with a specified name
# model_version: "" # Model version from the MLOps platform; an empty string "" uses the latest version.
# endpoint_name: "fedml-launch-endpoint" # The endpoint name to deploy; an empty string "" auto-generates one.
# Dataset related arguments
fedml_data_args:
dataset_name: mnist
dataset_path: ./dataset
dataset_type: csv
# Model related arguments
fedml_model_args:
input_dim: '784'
model_cache_path: /Users/alexliang/fedml_models
model_name: lr
output_dim: '10'
```
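The commented blocks above can be enabled together. A minimal sketch, assuming a hypothetical existing job (the ids are the placeholder values from the comments above, not real resources) and the sample model card names:

```
job_args:
  job_id: 2070
  config_id: 111

job_name: cv_job
run_api_key: my_api_key

serving_args:
  model_name: "fedml-launch-sample-model"
  model_version: ""                        # empty string uses the latest version
  endpoint_name: "fedml-launch-endpoint"
```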

You just need to customize the following config items.
@@ -111,10 +143,20 @@
To query the real-time status of your job, run the following command.
fedml job logs -jid 1696947481910317056
```

## Supported Environment Variables
You may use the following environment variables in your job commands or scripts.
```
$FEDML_CURRENT_JOB_ID, current run id for your job
$FEDML_CURRENT_EDGE_ID, current edge device id for your job
$FEDML_CLIENT_RANK, current device index for your job
$FEDML_CURRENT_VERSION, current fedml config version, options: dev, test or release
$FEDML_RUN_API_KEY, current API key from your job.yaml with the config item run_api_key
```
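These variables can be used directly in job commands or scripts, for example to build per-run output paths. A minimal sketch (the exported values below are placeholders for illustration; on the platform they are injected automatically):

```
#!/bin/bash
# Placeholder values; FedML launch injects the real ones at run time
export FEDML_CURRENT_JOB_ID=1696947481910317056
export FEDML_CLIENT_RANK=0

# Build a per-run, per-device output directory from the injected variables
OUTPUT_DIR="./outputs/run_${FEDML_CURRENT_JOB_ID}/rank_${FEDML_CLIENT_RANK}"
mkdir -p "$OUTPUT_DIR"
echo "Writing artifacts for job $FEDML_CURRENT_JOB_ID to $OUTPUT_DIR"
```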

## Login as the GPU supplier
If you want to log in as a GPU supplier and join the FedML launch payment system, just run the following command.
```
fedml login $YourApiKey -r gpu_supplier
```

Then you can find your GPU device on the FedML launch platform: https://open.fedml.ai/gpu-supplier/gpus/index
112 changes: 112 additions & 0 deletions python/examples/launch/federate_build_package/README.md
@@ -0,0 +1,112 @@

## Build the package for FEDML Federate
```
Usage: fedml federate build [OPTIONS]
Build federate packages for the FedML® Launch platform (open.fedml.ai).
Options:
-h, --help Show this message and exit.
-s, --server build the server package, default is building
client package.
-sf, --source_folder TEXT the source code folder path
-ep, --entry_point TEXT the entry point of the source code
-ea, --entry_args TEXT entry arguments of the entry point program
-cf, --config_folder TEXT the config folder path
-df, --dest_folder TEXT the destination package folder path
-ig, --ignore TEXT the ignore list for copying files, the format
is as follows: *.model,__pycache__,*.data*,
-m, --model_name TEXT model name for training.
-mc, --model_cache_path TEXT model cache path for training.
-mi, --input_dim TEXT input dimensions for training.
-mo, --output_dim TEXT output dimensions for training.
-dn, --dataset_name TEXT dataset name for training.
-dt, --dataset_type TEXT dataset type for training.
-dp, --dataset_path TEXT dataset path for training.
```

First, define your package properties as follows.
If you want to ignore certain folders or files, specify the ignore argument
or add them to the .gitignore file in the source code folder.

### Required arguments:
source code folder, entry file, entry arguments,
config folder, destination package folder

### Optional arguments:
You may define the model and data arguments using the command arguments as follows.
```
model name, model cache path, model input dimension, model output dimension,
dataset name, dataset type, dataset path.
```

Also, you may define the model and data arguments in a file named fedml_config.yaml as follows.
```
fedml_data_args:
dataset_name: mnist
dataset_path: ./dataset
dataset_type: csv
fedml_model_args:
input_dim: '784'
model_cache_path: /Users/alexliang/fedml_models
model_name: lr
output_dim: '10'
```

The above model and data arguments will be mapped to the equivalent environment variables as follows.
```
dataset_name = $FEDML_DATASET_NAME
dataset_path = $FEDML_DATASET_PATH
dataset_type = $FEDML_DATASET_TYPE
model_name = $FEDML_MODEL_NAME
model_cache_path = $FEDML_MODEL_CACHE_PATH
input_dim = $FEDML_MODEL_INPUT_DIM
output_dim = $FEDML_MODEL_OUTPUT_DIM
```

You may pass these environment variables as your entry arguments, e.g.,
```
ENTRY_ARGS_MODEL_DATA='-m $FEDML_MODEL_NAME -mc $FEDML_MODEL_CACHE_PATH -mi $FEDML_MODEL_INPUT_DIM -mo $FEDML_MODEL_OUTPUT_DIM -dn $FEDML_DATASET_NAME -dt $FEDML_DATASET_TYPE -dp $FEDML_DATASET_PATH'
```
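The single quotes matter here: they keep the variables unexpanded when ENTRY_ARGS_MODEL_DATA is defined, so they resolve later, once the launcher has exported the real values. A small sketch of that behavior with placeholder values:

```
#!/bin/bash
# Placeholder values standing in for what the launcher exports at run time
export FEDML_MODEL_NAME=lr
export FEDML_DATASET_NAME=mnist

# Single quotes: nothing expands at definition time
ENTRY_ARGS_MODEL_DATA='-m $FEDML_MODEL_NAME -dn $FEDML_DATASET_NAME'

# At run time the variables resolve against the environment
eval "echo $ENTRY_ARGS_MODEL_DATA"   # prints: -m lr -dn mnist
```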

### Examples
```
# Define the federated package properties
SOURCE_FOLDER=.
ENTRY_FILE=train.py
ENTRY_ARGS='--epochs 1'
ENTRY_ARGS_MODEL_DATA='-m $FEDML_MODEL_NAME -mc $FEDML_MODEL_CACHE_PATH -mi $FEDML_MODEL_INPUT_DIM -mo $FEDML_MODEL_OUTPUT_DIM -dn $FEDML_DATASET_NAME -dt $FEDML_DATASET_TYPE -dp $FEDML_DATASET_PATH'
CONFIG_FOLDER=config
DEST_FOLDER=./mlops
MODEL_NAME=lr
MODEL_CACHE=~/fedml_models
MODEL_INPUT_DIM=784
MODEL_OUTPUT_DIM=10
DATASET_NAME=mnist
DATASET_TYPE=csv
DATASET_PATH=./dataset
# Build the federated client package with the model and data arguments
fedml federate build -sf $SOURCE_FOLDER -ep $ENTRY_FILE -ea "$ENTRY_ARGS" \
-cf $CONFIG_FOLDER -df $DEST_FOLDER \
-m $MODEL_NAME -mc $MODEL_CACHE -mi $MODEL_INPUT_DIM -mo $MODEL_OUTPUT_DIM \
-dn $DATASET_NAME -dt $DATASET_TYPE -dp $DATASET_PATH
# Build the federated client package without the model and data arguments
# fedml federate build -sf $SOURCE_FOLDER -ep $ENTRY_FILE -ea "$ENTRY_ARGS" \
# -cf $CONFIG_FOLDER -df $DEST_FOLDER
# Define the federated server package properties
ENTRY_FILE=torch_server.py
# Build the federated server package with the model and data arguments
fedml federate build -s -sf $SOURCE_FOLDER -ep $ENTRY_FILE -ea "$ENTRY_ARGS" \
-cf $CONFIG_FOLDER -df $DEST_FOLDER \
-m $MODEL_NAME -mc $MODEL_CACHE -mi $MODEL_INPUT_DIM -mo $MODEL_OUTPUT_DIM \
-dn $DATASET_NAME -dt $DATASET_TYPE -dp $DATASET_PATH
# Build the federated server package without the model and data arguments
# fedml federate build -s -sf $SOURCE_FOLDER -ep $ENTRY_FILE -ea "$ENTRY_ARGS" \
# -cf $CONFIG_FOLDER -df $DEST_FOLDER
```
@@ -1,33 +1,36 @@
# Define the federated package properties
SOURCE_FOLDER=.
ENTRY_FILE=torch_client.py
ENTRY_ARGS='-m $FEDML_MODEL_NAME -mc $FEDML_MODEL_CACHE_PATH -mi $FEDML_MODEL_INPUT_DIM -mo $FEDML_MODEL_OUTPUT_DIM -dn $FEDML_DATASET_NAME -dt $FEDML_DATASET_TYPE -dp $FEDML_DATASET_PATH'
CONFIG_FOLDER=config
DEST_FOLDER=./mlops
MODEL_NAME=lr
MODEL_CACHE=~/fedml_models
MODEL_INPUT_DIM=784
MODEL_OUTPUT_DIM=10
DATASET_NAME=mnist
DATASET_TYPE=csv
DATASET_PATH=~/fedml_data

# Build the federated client package with the model and data arguments
fedml federate build -sf $SOURCE_FOLDER -ep $ENTRY_FILE -ea "$ENTRY_ARGS" \
-cf $CONFIG_FOLDER -df $DEST_FOLDER \
-m $MODEL_NAME -mc $MODEL_CACHE -mi $MODEL_INPUT_DIM -mo $MODEL_OUTPUT_DIM \
-dn $DATASET_NAME -dt $DATASET_TYPE -dp $DATASET_PATH

# Build the federated client package without the model and data arguments
#fedml federate build -sf $SOURCE_FOLDER -ep $ENTRY_FILE -ea "$ENTRY_ARGS" \
# -cf $CONFIG_FOLDER -df $DEST_FOLDER

# Define the federated server package properties
ENTRY_FILE=torch_server.py

# Build the federated server package with the model and data arguments
fedml federate build -s -sf $SOURCE_FOLDER -ep $ENTRY_FILE -ea "$ENTRY_ARGS" \
-cf $CONFIG_FOLDER -df $DEST_FOLDER \
-m $MODEL_NAME -mc $MODEL_CACHE -mi $MODEL_INPUT_DIM -mo $MODEL_OUTPUT_DIM \
-dn $DATASET_NAME -dt $DATASET_TYPE -dp $DATASET_PATH

# Build the federated server package without the model and data arguments
# fedml federate build -s -sf $SOURCE_FOLDER -ep $ENTRY_FILE -ea "$ENTRY_ARGS" \
# -cf $CONFIG_FOLDER -df $DEST_FOLDER
@@ -0,0 +1,56 @@
comm_args:
backend: MQTT_S3
mqtt_config_path: config/mqtt_config.yaml
s3_config_path: config/s3_config.yaml
common_args:
random_seed: 0
scenario: horizontal
training_type: cross_silo
using_mlops: false
data_args:
data_cache_dir: ~/fedml_data
dataset: mnist
partition_alpha: 0.5
partition_method: hetero
device_args:
gpu_mapping_file: config/gpu_mapping.yaml
gpu_mapping_key: mapping_default
using_gpu: false
worker_num: 2
environment_args:
bootstrap: config/bootstrap.sh
fedml_data_args:
dataset_name: mnist
dataset_path: /Users/alexliang/fedml_data
dataset_type: csv
fedml_entry_args:
arg_items: -m $FEDML_MODEL_NAME -mc $FEDML_MODEL_CACHE_PATH -mi $FEDML_MODEL_INPUT_DIM
-mo $FEDML_MODEL_OUTPUT_DIM -dn $FEDML_DATASET_NAME -dt $FEDML_DATASET_TYPE -dp
$FEDML_DATASET_PATH
fedml_model_args:
input_dim: '784'
model_cache_path: /Users/alexliang/fedml_models
model_name: lr
output_dim: '10'
model_args:
global_model_file_path: ./model_file_cache/global_model.pt
model: lr
model_file_cache_folder: ./model_file_cache
tracking_args:
enable_wandb: false
wandb_key: ee0b5f53d949c84cee7decbe7a629e63fb2f8408
wandb_name: fedml_torch_fedavg_mnist_lr
wandb_project: fedml
train_args:
batch_size: 10
client_id_list: null
client_num_in_total: 2
client_num_per_round: 2
client_optimizer: sgd
comm_round: 3
epochs: 1
federated_optimizer: FedAvg
learning_rate: 0.03
weight_decay: 0.001
validation_args:
frequency_of_the_test: 1
14 changes: 12 additions & 2 deletions python/examples/launch/hello_job.yaml
@@ -28,7 +28,6 @@ job: |
# If you want to use the job created by the MLOps platform,
# just uncomment the following three lines, then set job_id and config_id to your desired job id and its related config id.
# set job_name to your desired job name
#job_args:
# job_id: 2070
# config_id: 111
@@ -54,4 +53,15 @@ computing:
maximum_cost_per_hour: $3000 # max cost per hour for your job per gpu card
#allow_cross_cloud_resources: true # true, false
#device_type: CPU # options: GPU, CPU, hybrid
resource_type: A100-80G # e.g., A100-80G, please check the resource type list by "fedml show-resource-type" or visiting URL: https://open.fedml.ai/accelerator_resource_type

fedml_data_args:
dataset_name: mnist
dataset_path: ./dataset
dataset_type: csv

fedml_model_args:
input_dim: '784'
model_cache_path: /Users/alexliang/fedml_models
model_name: lr
output_dim: '10'
1 change: 0 additions & 1 deletion python/examples/launch/serve_job_mnist.yaml
@@ -26,7 +26,6 @@ task_type: deploy # options: train, deploy, federate
# If you want to use the model created by the MLOps platform, or create your own model card with a specified name,
# just uncomment the following four lines, then set model_name and endpoint_name to your desired values.
#serving_args:
# model_name: "fedml-launch-sample-model" # A model card from the MLOps platform, or your own model card with a specified name
# model_version: "" # Model version from the MLOps platform; an empty string "" uses the latest version.
# endpoint_name: "fedml-launch-endpoint" # The endpoint name to deploy; an empty string "" auto-generates one.
