
Fixed Workflow_Interface_Mnist_Implementation_2.py #1227

Open: wants to merge 2 commits into base: develop
Conversation

agnivac123

Hi @scngupta-dsp, @teoparvanov, and @psfoley, here is the PR that I created for the GlobalDP tutorial. Please let me know your thoughts on whether we can remove the customized __getstate__ and __setstate__ methods.

@agnivac123 agnivac123 marked this pull request as draft December 23, 2024 09:40
@agnivac123 agnivac123 marked this pull request as ready for review December 23, 2024 09:50
…ements_global_dp.txt

Signed-off-by: Agniva Chowdhury <agniva.chowdhury@intel.com>
Signed-off-by: Agniva Chowdhury <agniva.chowdhury@intel.com>
@scngupta-dsp
Contributor

Hi @agnivac123

I spent some time investigating this tutorial and my observations are as follows:

  1. The tutorial fails even with backend = single_process, with the following error:

File "/home/scngupta/openfl_scngupta/openfl/openfl-tutorials/experimental/workflow/Global_DP/Workflow_Interface_Mnist_Implementation_2.py", line 205, in FedAvg
global_model_tools.global_optimizer.step()
File "/home/scngupta/miniforge-pypy3/envs/env_openfl_scngupta_globaldp/lib/python3.10/site-packages/torch/optim/optimizer.py", line 375, in wrapper
for pre_hook in chain(_global_optimizer_pre_hooks.values(), self._optimizer_step_pre_hooks.values()):
AttributeError: 'DPOptimizer' object has no attribute '_optimizer_step_pre_hooks'. Did you mean: '_optimizer_step_code'?

This was quite unexpected, because our previous investigations had indicated that there were no issues with the single-process backend.

  2. To investigate further, I reverted the tutorial's dependencies to their original versions; with this change I was able to run the tutorial with backend = single_process.

  3. With the above change, the tutorial still fails with backend = ray, with the following error:

File "/home/scngupta/openfl_scngupta/openfl/openfl-tutorials/experimental/workflow/Global_DP/Workflow_Interface_Mnist_Implementation_2.py", line 201, in FedAvg
global_model_tools.global_optimizer.zero_grad()
File "/home/scngupta/miniforge-pypy3/envs/env_openfl_scngupta_globaldp/lib/python3.10/site-packages/opacus/optimizers/optimizer.py", line 474, in zero_grad
if not self._is_last_step_skipped:
AttributeError: 'DPOptimizer' object has no attribute '_is_last_step_skipped'

  4. As discussed previously, this appears to be a serialization issue with PrivacyEngine. To overcome it, I modified the tutorial to define global_model_tools as an aggregator private attribute (similar to what has been done in this PR).

With these changes the tutorial seems to work fine (reference: scngupta-dsp#1) with both backends. Refer to the attached logs.
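The serialization issue and the private-attribute workaround can be illustrated without OpenFL or Opacus. The sketch below uses a hypothetical stand-in class (GlobalModelToolsStandIn is not the real object from the tutorial): an attribute holding a closure makes the object unpicklable, which mimics the hooks PrivacyEngine attaches to the optimizer. Objects passed through flow state must be pickled by the ray backend, whereas a private-attribute factory runs inside the participant's own process, so nothing crosses a pickle boundary:

```python
import pickle

# Hypothetical stand-in for GlobalModelTools: Opacus' PrivacyEngine
# attaches hook closures to the optimizer, which the pickling done by
# the ray backend cannot serialize.
class GlobalModelToolsStandIn:
    def __init__(self):
        # A lambda attribute makes the instance unpicklable,
        # mimicking the unpicklable hooks.
        self.hook = lambda grad: grad * 0.5

def serializes(obj):
    """Return True if obj survives a pickle round-trip attempt."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False

# As ordinary flow state, the object would have to be pickled -> fails:
assert not serializes(GlobalModelToolsStandIn())

# As an aggregator private attribute, a factory callable is invoked
# in-process instead, so the object is never pickled:
def aggregator_private_attributes():
    return {"global_model_tools": GlobalModelToolsStandIn()}

private = aggregator_private_attributes()  # runs in-process, no pickling
assert isinstance(private["global_model_tools"], GlobalModelToolsStandIn)
```

This is the same reason the Workflow Interface guideline below recommends private attributes for non-serializable state: the object is constructed where it is used rather than shipped between processes.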

Summary:
My current understanding is that there are two issues with this tutorial:

a) An issue with DPOptimizer under the single-process backend. Since the tutorial works with previous versions of PyTorch and Opacus, the tutorial may need to be adapted to the latest versions.

b) A serialization issue with DPOptimizer under the ray backend. This should be resolved by defining global_model_tools as an aggregator private attribute.

For both issues, I do not see why there would be a need to add __getstate__ and __setstate__ methods to GlobalModelTools(). WDYT?

GlobalDP_ray_logs.txt
GlobalDP_single_process_logs.txt

@scngupta-dsp
Contributor

Hi @agnivac123,

Some more updates addressing the issue observed with the latest Torch versions and backend = single_process:

Error:
File "/home/scngupta/openfl_scngupta/openfl/openfl-tutorials/experimental/workflow/Global_DP/Workflow_Interface_Mnist_Implementation_2.py", line 205, in FedAvg
global_model_tools.global_optimizer.step()
File "/home/scngupta/miniforge-pypy3/envs/env_openfl_scngupta_globaldp/lib/python3.10/site-packages/torch/optim/optimizer.py", line 375, in wrapper
for pre_hook in chain(_global_optimizer_pre_hooks.values(), self._optimizer_step_pre_hooks.values()):
AttributeError: 'DPOptimizer' object has no attribute '_optimizer_step_pre_hooks'. Did you mean: '_optimizer_step_code'?

I updated GlobalModelTools and manually defined _optimizer_step_pre_hooks and _optimizer_step_post_hooks as attributes of global_optimizer. With these changes, the tutorial appears to work fine (reference: scngupta-dsp#2) with both backends and the latest Torch versions.
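The failure mode and the manual patch can be sketched with a minimal mock, without importing torch or opacus (MockDPOptimizer below is a stand-in, not the real Opacus class). Torch's Optimizer.step() wrapper iterates self._optimizer_step_pre_hooks, but an optimizer object built against an older torch never initialized that attribute; assigning empty hook dicts after construction, as done in scngupta-dsp#2, restores the expected interface:

```python
from collections import OrderedDict

# Hypothetical mock of the failure: torch's step() wrapper iterates
# self._optimizer_step_pre_hooks, which DPOptimizer (built against an
# older torch) never defined.
class MockDPOptimizer:
    def step(self):
        # Mimics the loop in torch/optim/optimizer.py's wrapper.
        for hook in self._optimizer_step_pre_hooks.values():
            hook(self)
        return "stepped"

opt = MockDPOptimizer()
try:
    opt.step()
    raise RuntimeError("expected AttributeError")
except AttributeError:
    # 'MockDPOptimizer' object has no attribute '_optimizer_step_pre_hooks'
    pass

# The workaround: manually define the missing hook containers on the
# wrapped optimizer after it is created.
opt._optimizer_step_pre_hooks = OrderedDict()
opt._optimizer_step_post_hooks = OrderedDict()
assert opt.step() == "stepped"
```

The patch works because the wrapper only needs the attributes to exist and be iterable; empty OrderedDicts simply mean "no registered hooks", which matches how newer torch initializes a fresh optimizer.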


Summary:

  1. Issue with DPOptimizer:

    • The DPOptimizer object created by the Opacus library appears to have compatibility issues with the latest Torch versions.
    • Previously, there were serialization issues with this object; now, certain attributes expected by the native Torch optimizer (_optimizer_step_pre_hooks) are also missing.
  2. Serialization issue:

    • Serialization issues with DPOptimizer can be mitigated by defining GlobalModelTools as an aggregator private attribute.
    • There is no need to create custom __setstate__ and __getstate__ methods.
    • Guideline: for the Workflow Interface, objects that are not serializable should be created as private attributes of participants (Aggregator or Collaborator).
  3. Attribute mismatch:

    • The attributes missing from DPOptimizer (_optimizer_step_pre_hooks) indicate a mismatch between the Opacus library and the latest PyTorch versions.

Recommendation:

Due to the compatibility issues between the Opacus library and the latest PyTorch versions observed in this tutorial, I would recommend continuing with the earlier versions of Opacus and PyTorch.
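To make the version recommendation actionable, a tutorial could warn at startup when installed packages are newer than the versions it was validated against. This is a sketch only: the version bounds below are placeholders, not tested claims; the authoritative pins are whatever the tutorial's requirements_global_dp.txt specifies.

```python
from importlib.metadata import version, PackageNotFoundError

def parse(v):
    # "2.1.0+cu118" -> (2, 1, 0); tolerant of local version suffixes.
    return tuple(int(p) for p in v.split("+")[0].split(".")[:3])

# Placeholder bounds for illustration only; substitute the versions
# actually pinned in the tutorial's requirements file.
MAX_VALIDATED = {"torch": (2, 0, 0), "opacus": (1, 4, 0)}

def check_compat():
    """Return a list of warnings for packages newer than validated."""
    warnings = []
    for pkg, bound in MAX_VALIDATED.items():
        try:
            installed = parse(version(pkg))
        except PackageNotFoundError:
            continue  # package not installed; nothing to check
        if installed > bound:
            warnings.append(
                f"{pkg} {installed} is newer than the version this "
                f"tutorial was validated with; expect API mismatches"
            )
    return warnings
```

A check like this would have surfaced the _optimizer_step_pre_hooks mismatch as an explicit warning instead of a mid-training AttributeError.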
