Commit

address vdr report

SYangster committed Jan 31, 2024
1 parent 0b4bac8 commit 212cdfb
Showing 30 changed files with 126 additions and 48 deletions.
12 changes: 10 additions & 2 deletions docs/getting_started.rst
@@ -66,6 +66,11 @@ Installation
.. note::
The server and client versions of nvflare must match; we do not support cross-version compatibility.

Supported Operating Systems
---------------------------
- Linux
- OSX (Note: some dependencies are not compatible)

Python Version
--------------

@@ -120,7 +125,6 @@ You may find that the pip and setuptools versions in the venv need updating:
(nvflare-env) $ python3 -m pip install -U pip
(nvflare-env) $ python3 -m pip install -U setuptools
Install Stable Release
----------------------

@@ -130,6 +134,10 @@ Stable releases are available on `NVIDIA FLARE PyPI <https://pypi.org/project/nv
$ python3 -m pip install nvflare
.. note::

In addition to the dependencies included when installing nvflare, many of our example applications require additional packages.
Be on the lookout for the provided requirements.txt files before running the examples.

.. _containerized_deployment:

@@ -213,7 +221,7 @@ Production mode is secure with TLS certificates - depending on the choice, the deployment can be

- HA or non-HA
- Local or remote
- On-premise or on cloud
- On-premise or on cloud (See :ref:`cloud_deployment`)

In non-HA, secure, local mode (all clients and the server running on the same host), production mode is very similar to POC mode, except that it is secure.

12 changes: 7 additions & 5 deletions docs/index.rst
@@ -48,18 +48,20 @@ and simulation to real-world production deployment. Some of the key components
- **Management tools** for secure provisioning and deployment, orchestration, and management
- **Specification-based API** for extensibility

Learn more in the :ref:`FLARE Overview <flare_overview>`, :ref:`What's New <whats_new>`, and the
:ref:`User Guide <user_guide>` and :ref:`Programming Guide <programming_guide>`.
Learn more about FLARE features in the :ref:`FLARE Overview <flare_overview>` and :ref:`What's New <whats_new>`.

Getting Started
===============
For first-time users and FL researchers, FLARE provides the :ref:`FL Simulator <fl_simulator>` that allows you to build, test, and deploy applications locally.
The :ref:`Getting Started <getting_started>` guide covers installation and walks through an example application using the FL Simulator.
Additional examples can be found in the :ref:`Example Applications <example_applications_algorithms>` section, which showcases different federated learning workflows and algorithms on various machine learning and deep learning tasks.

When you are ready to for a secure, distributed deployment, the :ref:`Real World Federated Learning <real_world_fl>` section covers the tools and process
FLARE for Users
===============
When you are ready for a secure, distributed deployment, the :ref:`User Guide <user_guide>` and :ref:`Real World Federated Learning <real_world_fl>` sections cover the tools and process
required to deploy and operate a secure, real-world FLARE project.

FLARE for Developers
====================
When you're ready to build your own application, the :ref:`Programming Best Practices <best_practices>`, :ref:`FAQ<faq>`, and
:ref:`Programming Guide <programming_guide>` give an in depth look at the FLARE platform and APIs.
When you're ready to build your own application, the :ref:`Programming Guide <programming_guide>`, :ref:`Programming Best Practices <best_practices>`, :ref:`FAQ <faq>`, and :ref:`API Reference <apidocs/modules>`
give an in-depth look at the FLARE platform and APIs.
24 changes: 18 additions & 6 deletions docs/programming_guide/experiment_tracking.rst
@@ -37,6 +37,12 @@ provided examples, the Receiver is on the FL server, but it could also be on the
- Server-side experiment tracking can also organize different clients' results into different experiment runs so they can be easily
compared side-by-side.

.. note::

This page covers experiment tracking using LogWriters; if you are using the Client API,
please refer to :ref:`client_api` and :ref:`nvflare.client.tracking` for its experiment tracking APIs.


**************************************
Tools, Sender, LogWriter and Receivers
**************************************
@@ -60,9 +66,9 @@ where the actual experiment logs are recorded. The components that receive
these logs are called Receivers based on :class:`AnalyticsReceiver <nvflare.app_common.widgets.streaming.AnalyticsReceiver>`.
The receiver component leverages the experiment tracking tool and records the logs during the experiment run.

In a normal setting, we would have pairs of sender and receivers, such as:
In a normal setting, we would have pairs of sender and receivers, with some provided implementations in :mod:`nvflare.app_opt.tracking`:

- TBWriter <-> TBReceiver
- TBWriter <-> TBAnalyticsReceiver
- MLflowWriter <-> MLflowReceiver
- WandBWriter <-> WandBReceiver

@@ -95,12 +101,12 @@ Data Type
=========

Currently, the supported data types are metrics, params, and text. If you require other data types, may sure you add
the type to :class:`AnalyticsDataType <nvflare.apis.analytix.AnalyticsDataType>`.

Currently, the supported data types are listed in :class:`AnalyticsDataType <nvflare.apis.analytix.AnalyticsDataType>`, and other data types can be added as needed.

Writer
======

Implement LogWriter interface with the API syntax. For each tool, we mimic the API syntax of the underlying tool,
Implement the :class:`LogWriter <nvflare.app_common.tracking.log_writer.LogWriter>` interface. For each tool, we mimic the API syntax of the underlying tool,
so users can use what they are familiar with without learning a new API.
For example, for Tensorboard, TBWriter uses add_scalar() and add_scalars(); for MLflow, the syntax is
log_metric(), log_metrics(), log_parameter(), and log_parameters(); for W&B, the writer just has log().
@@ -109,7 +115,7 @@ The data collected with these calls will all send to the AnalyticsSender to deli
Receiver
========

Implement AnalyticsReceiver interface and determine how to represent different sites' logs. In all three implementations
Implement the :class:`AnalyticsReceiver <nvflare.app_common.widgets.streaming.AnalyticsReceiver>` interface and determine how to represent different sites' logs. In all three implementations
(Tensorboard, MLflow, WandB), each site's log is represented as one run. Depending on the individual tool, the implementation
can be different. For example, for both Tensorboard and MLflow, we create different runs for each client and map to the
site name. In the WandB implementation, we have to leverage multiprocessing and let each run execute in a different process.
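The writer/receiver pairing described above can be sketched in plain Python. This is a simplified, self-contained illustration of the pattern only — the class and field names below are toy stand-ins, not the real NVFlare components, which stream analytics records between sites via events:

```python
from dataclasses import dataclass


@dataclass
class AnalyticsData:
    """Toy stand-in for an analytics record sent from a client site."""
    tag: str
    value: float
    step: int
    origin: str  # site name


class ToySender:
    """Collects tool-style log calls on the client side (mimics a LogWriter/sender)."""

    def __init__(self, origin):
        self.origin = origin
        self.outbox = []

    def add_scalar(self, tag, value, step):
        # TensorBoard-like syntax, as described above; the record is queued
        # for delivery instead of being written to a local log.
        self.outbox.append(AnalyticsData(tag, value, step, self.origin))


class ToyReceiver:
    """Groups incoming records into one 'run' per site (mimics an AnalyticsReceiver)."""

    def __init__(self):
        self.runs = {}

    def receive(self, data: AnalyticsData):
        self.runs.setdefault(data.origin, []).append((data.tag, data.step, data.value))


receiver = ToyReceiver()
for site in ("site-1", "site-2"):
    sender = ToySender(site)
    for step in range(3):
        sender.add_scalar("train_loss", 1.0 / (step + 1), step)
    for record in sender.outbox:  # in FLARE this crosses the network via events
        receiver.receive(record)

print(sorted(receiver.runs))  # → ['site-1', 'site-2']
```

In the real system, the writer's tool-style calls (e.g. ``add_scalar``) produce records that the receiver on the server organizes into one run per site, which is the grouping the toy receiver reproduces here.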
@@ -121,13 +127,19 @@ Examples Overview
The :github_nvflare_link:`experiment tracking examples <examples/advanced/experiment-tracking>`
illustrate how to leverage different writers and receivers. All examples are based upon the hello-pt example.

TensorBoard
===========
The example in the "tensorboard" directory shows how to use the Tensorboard Tracking Tool (for both the
sender and receiver). See :ref:`tensorboard_streaming` for details.

MLflow
======
Under the "mlflow" directory, the "hello-pt-mlflow" job shows how to use MLflow for tracking with both the MLflow sender
and receiver. The "hello-pt-tb-mlflow" job shows how to use the Tensorboard Sender, while the receiver is MLflow.
See :ref:`experiment_tracking_mlflow` for details.

Weights & Biases
================
Under the :github_nvflare_link:`wandb <examples/advanced/experiment-tracking/wandb>` directory, the
"hello-pt-wandb" job shows how to use Weights and Biases for experiment tracking with
the WandBWriter and WandBReceiver to log metrics.
5 changes: 5 additions & 0 deletions docs/user_guide/nvflare_cli/fl_simulator.rst
@@ -837,6 +837,11 @@ Specifying threads
==================
The simulator ``-t`` option specifies how many threads to run the simulator with.

.. note::

We use the term threads for simplicity; technically, each client actually runs in a separate process.
This difference does not affect the user experience.

When you run the simulator with ``-t 1``, only one client is active and running at a time, and the clients run in
turn. This enables the simulation of a large number of clients on a single machine with limited resources.
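The scheduling behavior described above can be illustrated with a small, self-contained sketch (toy code, not the simulator's actual implementation): each "client" is launched as a separate OS process, and the ``threads`` value caps how many run at once, so with a value of 1 the clients run in turn.

```python
import subprocess
import sys

clients = [f"site-{i}" for i in range(4)]
threads = 1  # analogous to the simulator's -t option: one active client at a time

completed = []
# Launch at most `threads` client processes at a time; with threads=1 the
# clients run strictly in turn, one separate process after another.
for i in range(0, len(clients), threads):
    batch = clients[i:i + threads]
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", f"print('{name} done')"],
            stdout=subprocess.PIPE, text=True,
        )
        for name in batch
    ]
    for p in procs:
        out, _ = p.communicate()
        completed.append(out.strip())

print(completed)
# → ['site-0 done', 'site-1 done', 'site-2 done', 'site-3 done']
```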

4 changes: 3 additions & 1 deletion examples/advanced/experiment-tracking/wandb/README.md
@@ -26,7 +26,9 @@ export PYTHONPATH=${PWD}/..
Import the W&B Python SDK and log in:

```
wandb.login()
python3
>>> import wandb
>>> wandb.login()
```

Provide your API key when prompted.
2 changes: 1 addition & 1 deletion examples/hello-world/step-by-step/README.md
@@ -7,7 +7,7 @@ To run the notebooks in each example, please make sure you first set up a virtua

These step-by-step example series are aimed to help users quickly get started and learn about FLARE.
For consistency, each example in the series uses the same dataset- CIFAR10 for image data and the HIGGS dataset for tabular data.
The examples will build upon previous ones to showcase different features, workflows, or APIs, allowing users to gain a comprehensive understanding of FLARE functionalities. See the README in each directory for more details about each series.
The examples will build upon previous ones to showcase different features, workflows, or APIs, allowing users to gain a comprehensive understanding of FLARE functionalities (Note: each example is self-contained, so going through them in order is not required, but recommended). See the README in each directory for more details about each series.

## Common Questions

4 changes: 2 additions & 2 deletions examples/hello-world/step-by-step/cifar10/cse/cse.ipynb
@@ -180,9 +180,9 @@
"id": "48271064",
"metadata": {},
"source": [
"For additional resources, see other examples for SAG with CSE using the [ModelLearner](../sag_model_learner/sag_model_learner.ipynb), [Executor](../sag_executor/sag_executor.ipynb), and [Hello-Numpy](https://github.com/NVIDIA/NVFlare/tree/main/examples/hello-world/hello-numpy-cross-val).\n",
"For additional resources, see other examples for SAG with CSE using the [ModelLearner](../sag_model_learner/sag_model_learner.ipynb) and [Executor](../sag_executor/sag_executor.ipynb). [Hello-Numpy](https://github.com/NVIDIA/NVFlare/tree/main/examples/hello-world/hello-numpy-cross-val) also demonstrates how to run cross-site evaluation using the previous training results.\n",
"\n",
"Also the ability to run Cross-site Evaluation without having to re-run training will be added in the near future."
"Next we will look at the [cyclic](../cyclic/cyclic.ipynb) example, which shows the cyclic workflow for the Cyclic Weight Transfer algorithm."
]
},
{
@@ -140,7 +140,10 @@
"id": "48271064",
"metadata": {},
"source": [
"As an additional resource, also see the [hello-cyclic](../../../../hello-world/hello-cyclic/README.md) for a Tensorflow Executor implementation using the MNIST dataset."
"As an additional resource, also see the [hello-cyclic](../../../../hello-world/hello-cyclic/README.md) for a Tensorflow Executor implementation using the MNIST dataset.\n",
"\n",
"While this example focused on the server-controlled cyclic workflow, now we will introduce the idea of client-controlled workflows.\n",
"The next [cyclic_ccwf](../cyclic_ccwf/cyclic_ccwf.ipynb) example is a client-controlled version of the cyclic workflow."
]
},
{
@@ -145,7 +145,9 @@
"cell_type": "markdown",
"id": "9bef3134",
"metadata": {},
"source": []
"source": [
"Lastly, we have the [swarm](../swarm/swarm.ipynb) example, which covers swarm learning and client-controlled cross-site evaluation workflows."
]
}
],
"metadata": {
5 changes: 4 additions & 1 deletion examples/hello-world/step-by-step/cifar10/sag/sag.ipynb
@@ -262,7 +262,10 @@
"id": "b055bde7-432d-4e6b-9163-b5ab7ede7b73",
"metadata": {},
"source": [
"The job should be running in the simulator mode. We are done with the training. "
"The job should be running in the simulator mode. We are done with the training. \n",
"\n",
"The next 5 examples will use the same ScatterAndGather workflow, but will demonstrate different execution APIs and features.\n",
"In the next example [sag_deploy_map](../sag_deploy_map/sag_deploy_map.ipynb), we will learn about the deploy_map configuration for deployment of apps to different sites."
]
}
],
@@ -261,7 +261,10 @@
"id": "0af8036f-1f94-426d-8eb7-6e8b9be70a7e",
"metadata": {},
"source": [
"The job should be running in the simulator mode. We are done with the training. "
"The job should be running in the simulator mode. We are done with the training. \n",
"\n",
"In the next example [sag_model_learner](../sag_model_learner/sag_model_learner.ipynb), we will illustrate how to use the Model Learner API instead of the Client API,\n",
"and highlight why and when to use it."
]
}
],
@@ -222,7 +222,12 @@
"id": "48271064",
"metadata": {},
"source": [
"For additional resources, take a look at the various other executors with different use cases in the app_common, app_opt, and examples folder."
"For additional resources, take a look at the various other executors with different use cases in the app_common, app_opt, and examples folder.\n",
"\n",
"In the previous examples we have finished covering each of the Execution API types: the Client API, Model Learner, and Executor.\n",
"Now we will revert back to using the Client API in future examples to highlight other features and workflows.\n",
"\n",
"Next we have the [sag_mlflow](../sag_mlflow/sag_mlflow.ipynb) example, which shows how to enable MLflow experiment tracking logs."
]
},
{
@@ -197,7 +197,10 @@
"id": "b19da336",
"metadata": {},
"source": [
"As an additional resource, see the [CIFAR10 Real World Example](https://github.com/NVIDIA/NVFlare/tree/main/examples/advanced/cifar10/cifar10-real-world) for creating a secure workspace for HE using provisioning instead of POC mode."
"As an additional resource, see the [CIFAR10 Real World Example](https://github.com/NVIDIA/NVFlare/tree/main/examples/advanced/cifar10/cifar10-real-world) for creating a secure workspace for HE using provisioning instead of POC mode.\n",
"\n",
"Now we will begin to take a look at other workflows besides ScatterAndGather.\n",
"First we have the [cse](../cse/cse.ipynb) example, which shows the server-controlled cross-site evaluation workflow."
]
}
],
@@ -183,12 +183,12 @@
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e69c9ed2-359a-4f97-820f-25e9323a4e92",
"cell_type": "markdown",
"id": "58037d1e",
"metadata": {},
"outputs": [],
"source": []
"source": [
"Next we will look at the [sag_he](../sag_he/sag_he.ipynb) example, which demonstrates how to enable homomorphic encryption using the POC -he mode."
]
}
],
"metadata": {
@@ -204,7 +204,9 @@
"id": "48271064",
"metadata": {},
"source": [
"As an additional resource, also see the [CIFAR10 examples](../../../../advanced/cifar10/README.md) for a comprehensive implementation of a PyTorch ModelLearner."
"As an additional resource, also see the [CIFAR10 examples](../../../../advanced/cifar10/README.md) for a comprehensive implementation of a PyTorch ModelLearner.\n",
"\n",
"In the next example [sag_executor](../sag_executor/sag_executor.ipynb), we will illustrate how to use the Executor API for more specific use cases."
]
},
{
@@ -664,9 +664,8 @@
"\n",
"If you would like to see another example of federated statistics calculations and configurations, please checkout [federated_statistics](https://github.com/NVIDIA/NVFlare/tree/main/examples/advanced/federated-statistics) and [fed_stats with spleen_ct_segmentation](https://github.com/NVIDIA/NVFlare/tree/main/integration/monai/examples/spleen_ct_segmentation_sim)\n",
"\n",
"Let's move on to the next example and see how can we train the image classifier using pytorch with CIFAR10 data.\n",
"\n",
"\n"
"Let's move on to the next examples and see how we can train the image classifier using PyTorch with CIFAR10 data.\n",
"First we will look at the [sag](../sag/sag.ipynb) example, which illustrates how to use the ScatterAndGather workflow for FedAvg with the Client API.\n"
]
}
],
5 changes: 4 additions & 1 deletion examples/hello-world/step-by-step/cifar10/swarm/swarm.ipynb
@@ -154,7 +154,10 @@
"id": "48271064",
"metadata": {},
"source": [
"As an additional resource, also see the [Swarm Learning Example](../../../../advanced/swarm_learning/README.md) which utilizes the CIFAR10 ModelLearner instead of the Client API."
"As an additional resource, also see the [Swarm Learning Example](../../../../advanced/swarm_learning/README.md) which utilizes the CIFAR10 ModelLearner instead of the Client API.\n",
"\n",
"Congratulations! You have completed the CIFAR10 step-by-step example series.\n",
"Next take a look at the [higgs](../../higgs/README.md) example series for how to use machine learning methods for federated learning on tabular data."
]
},
{
3 changes: 2 additions & 1 deletion examples/hello-world/step-by-step/higgs/README.md
@@ -1,7 +1,8 @@

# Training traditional ML classifiers with HIGGS dataset

The [HIGGS dataset](https://archive.ics.uci.edu/dataset/280/higgs) contains 11 million instances, each with 28 attributes, for binary classification to predict whether an event corresponds to the decayment of a Higgs boson or not. (Please note that the [UCI's website](https://archive.ics.uci.edu/dataset/280/higgs) may experience occasional downtime)
The [HIGGS dataset](https://archive.ics.uci.edu/dataset/280/higgs) contains 11 million instances, each with 28 attributes, for binary classification to predict whether an event corresponds to the decay of a Higgs boson or not. Follow the [prepare_data.ipynb](prepare_data.ipynb) notebook to download the HIGGS dataset and prepare the data splits.
(Please note that the [UCI's website](https://archive.ics.uci.edu/dataset/280/higgs) may experience occasional downtime)
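The split preparation can be sketched as follows. This is hypothetical logic for illustration only — the actual notebook may partition the data differently and use different file paths, and the rows here are synthetic stand-ins for HIGGS instances:

```python
# Hypothetical sketch of splitting instances across client sites; the
# prepare_data.ipynb notebook may use different logic and file paths.
def split_rows(rows, n_sites):
    # Round-robin assignment keeps the per-site split sizes balanced.
    splits = {f"site-{i + 1}": [] for i in range(n_sites)}
    for idx, row in enumerate(rows):
        splits[f"site-{idx % n_sites + 1}"].append(row)
    return splits


data = [[i, i * 0.5] for i in range(10)]  # synthetic stand-in for HIGGS rows
splits = split_rows(data, n_sites=3)
print({site: len(rows) for site, rows in splits.items()})
# → {'site-1': 4, 'site-2': 3, 'site-3': 3}
```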

The first 21 features (columns 2-22) are kinematic properties measured by the particle detectors in the accelerator.
The data has been produced using Monte Carlo simulations. The first 21 features are kinematic properties measured by the particle detectors in the accelerator. The last 7 features are functions of the first 21 features; these are high-level features derived by physicists to help discriminate between the two classes.
@@ -452,7 +452,10 @@
"HIGGS dataset is challenging for unsupervised clustering, as we can observe from the result. As shown by the local training with same number of iterations, the score is `model homogeneity_score: 0.0049`. As compared with the FL score of `0.0068`, FL in this case still provides some benefit from the collaborative learning.\n",
"\n",
"## We are done !\n",
"Congratulations! you have just completed the federated k-Means clustering for tabular data. "
"Congratulations! You have just completed the federated k-Means clustering for tabular data.\n",
"\n",
"Now we will move on from scikit-learn and take a look at how to use federated XGBoost.\n",
"In the next example [xgboost](../xgboost/xgboost_horizontal.ipynb), we will demonstrate federated horizontal XGBoost learning with bagging collaboration."
]
},
{
@@ -454,12 +454,14 @@
"id": "ea7bbacc-b059-4f82-9785-2b22bf840ef9",
"metadata": {},
"source": [
"In this experiment, all three clients have relatively large amount data wiht homogeneous distribution, we would expect the three numbers align within reasonable variation range. \n",
"In this experiment, all three clients have a relatively large amount of data with a homogeneous distribution, so we would expect the three numbers to align within a reasonable variation range.\n",
"\n",
"The final result for iterative learning is `ending model AUC: 0.6352`, and one-shot learning is `local model AUC: 0.6355`, as compared with FL's `local model AUC: 0.6351`, the numbers do align.\n",
"\n",
"## We are done !\n",
"Congratulations! you have just completed the federated linear model for tabular data. "
"Congratulations! You have just completed the federated linear model for tabular data.\n",
"\n",
"In the next example [sklearn-svm](../sklearn-svm/sklearn_svm.ipynb), we will demonstrate training a federated SVM model."
]
},
{