issue of fetchpgd #14
It looks like in fetchpgd we're trying to add together a full gradient and a sketch. The sketching step should come after the PGD step.
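For readers following along, here is a minimal illustration of the two orderings. The names (`count_sketch`, `pgd_update`) are hypothetical stand-ins, not the repo's actual functions:

```python
import torch

def count_sketch(vec, num_rows, num_cols):
    # Toy stand-in for the repo's sketching op (illustrative only):
    # hash every coordinate into a small table and sum collisions.
    table = torch.zeros(num_rows, num_cols)
    idx = torch.arange(vec.numel()) % num_cols  # toy "hash"
    table[0].index_add_(0, idx, vec)
    return table

grad = torch.randn(1000)      # full, uncompressed gradient
pgd_update = grad.clone()     # ...PGD steps would happen here...

# Buggy order: adding a full gradient to a sketch mixes two
# representations with different sizes and raises a RuntimeError.
# broken = count_sketch(grad, 1, 100) + grad

# Intended order: finish the PGD step first, then sketch the result.
sketched = count_sketch(pgd_update, 1, 100)
```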
Is fetchpgd the implementation of the SparseFed paper? I studied the source code and found that true_topk is actually in line with the SparseFed paper in terms of communication efficiency.
I think fetchpgd is some unfinished code, actually. It's supposed to be evaluating the adaptive attack combining SparseFed and my other paper, Neurotoxin. We actually have results for it, so I might need to check whether the finished implementation is on another server. But the idea is basically that the attacker does multiple steps of PGD according to Neurotoxin, where they project the update onto the bottom-k gradients at each iteration. Then the server implements SparseFed by doing the overall top-k operation. As you noted, the robustness defenses in SparseFed are just top-k and sketching.
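A rough sketch of the attacker side described here, under the assumption that the "bottom-k gradients" are the coordinates with the smallest benign-gradient magnitude (hypothetical names; the finished implementation may differ):

```python
import torch

def neurotoxin_pgd(poison_grad_fn, benign_grad, k_bottom, num_steps, lr):
    # Attack sketch: hide the malicious update in the coordinates the
    # benign clients update least (Neurotoxin), taking several PGD
    # steps and projecting back onto those coordinates each time.
    idx = benign_grad.abs().topk(k_bottom, largest=False).indices
    mask = torch.zeros_like(benign_grad)
    mask[idx] = 1.0

    update = torch.zeros_like(benign_grad)
    for _ in range(num_steps):
        g = poison_grad_fn(update)          # gradient of the poison objective
        update = (update - lr * g) * mask   # PGD step + bottom-k projection
    return update
```

The server-side defense that SparseFed applies on top of this is sketched after the next comment.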
Thanks for the huge help.
Oh, I think that for communication efficiency you should be using the main branch and not the attacks branch. In case you mean communication efficiency with robustness: FetchSGD is in the main branch, and SparseFed is just per-user gradient clipping + either top-k or sketching.
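As a hedged illustration of that "per-user gradient clipping + top-k" description (not the repo's actual implementation):

```python
import torch

def sparsefed_aggregate(user_grads, clip_norm, k):
    # Clip each user's gradient to an L2 norm of at most clip_norm,
    # sum the clipped gradients, then keep only the k largest
    # coordinates of the aggregate (the top-k variant of SparseFed).
    total = torch.zeros_like(user_grads[0])
    for g in user_grads:
        total += g * min(1.0, clip_norm / (g.norm().item() + 1e-12))
    out = torch.zeros_like(total)
    idx = total.abs().topk(k).indices
    out[idx] = total[idx]
    return out
```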
Yeah, I know. In fact, I want to cite true_topk because it provides a novel perspective on server compression (I find that most papers focus on client compression), but I cannot find a paper that actually proposes this method.
Sure, so SparseFed doesn't introduce top-k. Top-k is introduced by some of the papers that we cite in FetchSGD; in particular, the mechanism that we use with memory comes from https://arxiv.org/abs/1809.07599. I would note that the FetchSGD work does compare to true top-k for the server compression.
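For reference, the top-k-with-memory mechanism from that paper (Sparsified SGD with Memory) works roughly like this sketch:

```python
import torch

def topk_with_memory(grad, memory, k):
    # Error feedback: add back mass that was dropped in earlier rounds,
    # transmit only the k largest coordinates, and remember the rest.
    corrected = grad + memory
    out = torch.zeros_like(corrected)
    idx = corrected.abs().topk(k).indices
    out[idx] = corrected[idx]
    return out, corrected - out  # sparse update, new memory
```

Coordinates that keep being dropped accumulate in the memory until they grow large enough to be transmitted, which is what lets aggressive sparsification still converge.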
OK, thank you a lot! How kind of you.
Hi, sorry to bother you again.
When I try to run only fetchpgd on CommEfficient-attacks, to test the accuracy difference between the SparseFed and FetchSGD methods, I encounter a problem.
My hyperparameters are: --dataset_dir data/cifar10 --tensorboard --dataset_name CIFAR10 --model ResNet9 --mode fetchpgd --k 10000 --num_blocks 1 --num_rows 1 --num_cols 325000 --num_clients 200 --num_workers 10 --error_type virtual --local_momentum 0.0 --virtual_momentum 0.9
The k, num_rows, and num_cols values I use are the same as for FetchSGD.
But I got:
File "CommEfficient-attacks\CommEfficient-attacks\CommEfficient\fed_worker.py", line 177, in worker_loop
    sum_g += g
RuntimeError: The size of tensor a (500000) must match the size of tensor b (6568640) at non-singleton dimension 1
Could you please help me fix it? Thanks a lot!
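One observation, hedged since the finished fetchpgd implementation may live elsewhere: the two sizes in the error suggest that one tensor is a full model gradient while the accumulator holds a compressed update, which is consistent with the raw-gradient-plus-sketch mixup described in the first comment. A hypothetical guard at the accumulation site (variable names taken from the traceback above) would make the mismatch explicit:

```python
# Hypothetical guard for the accumulation loop in fed_worker.py.
# It fails fast with a clearer message when a worker returns an
# update in a different representation than the accumulator expects.
if g.shape != sum_g.shape:
    raise RuntimeError(
        f"worker update shape {tuple(g.shape)} != accumulator shape "
        f"{tuple(sum_g.shape)}: fetchpgd may be mixing an uncompressed "
        "gradient with a sketched one (see the first comment above)"
    )
sum_g += g
```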