Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fedpara updated #2722

Merged
merged 47 commits into from
Feb 1, 2024
Merged

Fedpara updated #2722

merged 47 commits into from
Feb 1, 2024

Conversation

yehias21
Copy link
Contributor

Issue

#2031

Description

Updated draft

Remaining tasks

[ ] Finalizing personalized FedPara with femnist

omarmoo5 and others added 13 commits December 17, 2023 18:54
* added the baseline folder

* Added Boilerplate code

* Updated config file

* cross finger: first commit on Cifar-10

* dimension fix

* minor fix

* vgg-base with overfit bug

* vgg with overfit bug

* fixed low rank convergence

* output cleaned

* Low rank, cifar iid fixed and added

* fedpara cifar done

---------

Co-authored-by: Omar Mokhtar <o.mokhtar@aic.gov.eg>
Co-authored-by: Roeia Amr <60478505+Roeia99@users.noreply.github.com>
Co-authored-by: Yahia Shaaban <yahia.shaaban@loginnode.imhotep.aic.gov.eg>
Co-authored-by: Yahia Salaheldin Shaaban <62369984+yehias21@users.noreply.github.com>
Co-authored-by: yahia <yehia.salah.ms@alexu.edu.eg>
Co-authored-by: = <=>
@jafermarq jafermarq added new baseline summer-of-reproducibility About a baseline for Summer of Reproducibility and removed new baseline labels Dec 20, 2023
Copy link
Contributor

@jafermarq jafermarq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @yehias21,

This is just a preliminar review. I think before moving into adding FEMNIST experiments (which you can find how other baselines integrated that dataset here: https://github.com/adap/flower/tree/main/baselines/fedmeta -- i.e. cloning from LEAF directly), it's preferable to finalise the CIFAR-10 and CIFAR-100 experiments.

  • Would it be possible to present the line plots in the same format as in Figure 3 in the paper. That means showing both VGG16_ori and VGG16_fedpara curves with the x-axis showing communication costs (instead of rounds).
  • The environment setup section needs to be completed with the instructions to follow in order to construct the python environment. (likely you can borrow from what your colleagues did for FedExp: https://github.com/Roeia99/flower/tree/FedExP/baselines/fedexp#environment-setup)
  • please include all necessary packages in pyproject.toml and detail how i can run the experiments (I see you provide a run.sh but you should mention this in the README.md in the Running the experiments section).

@yehias21
Copy link
Contributor Author

Hi Javier, I've addressed all the comments related to the Cifar baselines as mentioned. Please inform me if there are any additional matters to attend to.

Copy link
Contributor

@jafermarq jafermarq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @yehias21,

Just a small comment below in pyproject.toml. I also made some edits, primarily to the README.md.

I ran your CIFAR-10/100 experiments and all curves resulted very close to the ones you include in the README with the exception of the model.conv_type=standard CIFAR-10 iid setting. The results i get are more in line with those in the paper than what your plots show. For example, for CIFAR-10 non-iid crosses the 60% threshold approx after 180GB of comms (see Fig3d in the paper), I get something similar, but in your plot it seems this happens after just 75GB of communication. For the IID CIFAR-10 setting I observe a similar situation: paper and results after running the code show croshing the 70% threshold after 100GB of communication but your plot shows that can be achieve with 50GB of communication.
Screenshot 2024-01-03 at 22 32 18

I wonder the setting you run for those experiments you made the plots from was slightly different for CIFAR-10 with model.conv_type=standard and that's why the communication costs seem to be about half. Also note that max comm costs shown in the plots is 175GB while when running the code it would be a bit over 350GB (which matches the logic of (2 x 200rounds x 16clients x 15.3M params x 4 / 1024) -- I multiply by 4 to "convert" to MB.

Finally do you have an update about FEMNIST?

baselines/fedpara/pyproject.toml Outdated Show resolved Hide resolved
@jafermarq jafermarq mentioned this pull request Jan 4, 2024
2 tasks
@yehias21 yehias21 force-pushed the fedpara-updated branch 3 times, most recently from 7ea0c2e to f702821 Compare January 21, 2024 16:08
Copy link
Contributor

@jafermarq jafermarq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @yehias21 and team,

Following our last chat, i think we can start wrapping up this baseline and focus on CIFAR-10/100 results. It's fine leaving the personalization with MNIST ones as an extension (can be added as a new section at the end of the readme -- see my comments about that below).

I've pushed a commit that fixing formatting and some typing issues. Most issues are resolved except for a couple (one being passing an argument that's not expected to gen_evaluate_fn() and another in client's _get_keys_state_dict())

Please find my comments below. I applied them all locally, the code runs but I'm not seeing the global model improving. I ran this:

python -m fedpara.main --multirun model.conv_type=standard,lowrank 

It could be that I made a mistake.

Let me know when you have gone through my comments.

baselines/fedpara/fedpara/main.py Outdated Show resolved Hide resolved
baselines/fedpara/fedpara/main.py Outdated Show resolved Hide resolved
baselines/fedpara/fedpara/dataset.py Show resolved Hide resolved
baselines/fedpara/fedpara/main.py Outdated Show resolved Hide resolved
baselines/fedpara/README.md Outdated Show resolved Hide resolved
baselines/fedpara/README.md Show resolved Hide resolved
baselines/fedpara/fedpara/models.py Outdated Show resolved Hide resolved
baselines/fedpara/fedpara/models.py Outdated Show resolved Hide resolved
baselines/fedpara/fedpara/models.py Outdated Show resolved Hide resolved
@jafermarq jafermarq marked this pull request as ready for review January 31, 2024 00:16
Copy link
Contributor

@jafermarq jafermarq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @yehias21 and collaborators! Your implementation of FedPara is ready to be merged! Amazing work!!

@jafermarq jafermarq enabled auto-merge (squash) January 31, 2024 00:18
@jafermarq jafermarq merged commit 038cf75 into adap:main Feb 1, 2024
27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
summer-of-reproducibility About a baseline for Summer of Reproducibility
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants