About gradient descent on the client side #2

Closed
JackFroster opened this issue Sep 16, 2022 · 3 comments
Labels
good first issue Good for newcomers

Comments

@JackFroster

Hi, Jiahao Tan.
Thanks for your work.

I'm confused by line 98 of "per-fedavg /perfedavg.py":
param.data.sub_(self.beta * grad1 - self.beta * self.alpha * grad2)
According to the formula in the paper, the term "self.beta * self.alpha * grad2" seems to be missing a factor of "grad1".

@KarhouTam
Owner

KarhouTam commented Sep 17, 2022

Hi, Jack.
Thanks for your interest in my reproduction.

Actually, the formula for computing $\tilde{\nabla}^2$ comes from another paper by the author of PerFedAvg: https://arxiv.org/abs/1908.10400

[image: the formula for $\tilde{\nabla}^2$ from the paper linked above]

Given the formula shown above and the way grad2 is computed, the update rule should make sense now.

grad2 already approximates the product $\tilde{\nabla}^2 v$, with grad1 playing the role of $v$ in the formula, so there is no need to multiply grad2 by grad1 again. 😏
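
As a worked version of this argument: the formula image is not reproduced in this thread, so the following is reconstructed from the MATLAB excerpt below and should be read as an assumption about what it shows. Here $f$ stands for the local loss, $w$ for the current parameters, $g$ for grad1, and $\delta$ for the finite-difference step (`de` in the excerpt); these symbols are chosen for illustration.

$$
\tilde{\nabla}^2 f(w)\, v \;\approx\; \frac{\nabla f(w + \delta v) - \nabla f(w - \delta v)}{2\delta}
$$

Taking $v = g$, the Per-FedAvg client step becomes

$$
w \;\leftarrow\; w - \beta \left(I - \alpha \tilde{\nabla}^2 f(w)\right) g \;=\; w - \beta\, g + \beta \alpha\, \tilde{\nabla}^2 f(w)\, g .
$$

If grad2 is the approximated product $\tilde{\nabla}^2 f(w)\, g$, this is the same expression as `self.beta * grad1 - self.beta * self.alpha * grad2` being subtracted from the parameters, which is why grad1 does not appear a second time.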

To fully allay your concern, here is part of the source code I got by asking the author of PerFedAvg. I don't know whether the author wants the source code shared, so I'm showing only an excerpt rather than the whole file.

        for t = 1:tau

            B_1 = randperm(user_l(i), D_i); % get data batch

            [lgw12, lgw23, lgw34, lgb12, lgb23, lgb34] = grad_batch (lw12, lw23, lw34, lb12, lb23, lb34, Dat(:, B_1, i), Lab(:, B_1, i), D_i);

            B_2 = randperm(user_l(i), D_o);

            [lgw12, lgw23, lgw34, lgb12, lgb23, lgb34] = grad_batch (lw12 - al * lgw12, lw23 - al * lgw23, lw34 - al * lgw34, lb12 - al * lgb12, lb23 - al * lgb23, lb34 - al * lgb34, Dat(:, B_2, i), Lab(:, B_2, i), D_o);

            B_3 = randperm(user_l(i), D_h);
            % NOTE: batch_3's size is 20, not 40; v is the grads produced by batch_1 and _2, not 1!
            [lh1w12, lh1w23, lh1w34, lh1b12, lh1b23, lh1b34] = grad_batch (lw12 - de * lgw12, lw23 - de * lgw23, lw34 - de * lgw34, lb12 - de * lgb12, lb23 - de * lgb23, lb34 - de * lgb34, Dat(:, B_3, i), Lab(:, B_3, i), D_h);
            [lh2w12, lh2w23, lh2w34, lh2b12, lh2b23, lh2b34] = grad_batch (lw12 + de * lgw12, lw23 + de * lgw23, lw34 + de * lgw34, lb12 + de * lgb12, lb23 + de * lgb23, lb34 + de * lgb34, Dat(:, B_3, i), Lab(:, B_3, i), D_h);

            lw12 = lw12 - be * lgw12 + be * al / (2 * de) * (lh2w12 - lh1w12);
            lw23 = lw23 - be * lgw23 + be * al / (2 * de) * (lh2w23 - lh1w23);
            lw34 = lw34 - be * lgw34 + be * al / (2 * de) * (lh2w34 - lh1w34);
            lb12 = lb12 - be * lgb12 + be * al / (2 * de) * (lh2b12 - lh1b12);
            lb23 = lb23 - be * lgb23 + be * al / (2 * de) * (lh2b23 - lh1b23);
            lb34 = lb34 - be * lgb34 + be * al / (2 * de) * (lh2b34 - lh1b34);

        end
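
For readers who do not use MATLAB, the same step can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code from this repository or from the PerFedAvg author: the toy quadratic loss (`A`, `b`) and the helpers `matvec`, `axpy`, `grad_fn`, `hvp_finite_diff`, and `hf_step` are all hypothetical names, and a single deterministic gradient function stands in for the three sampled batches `B_1`, `B_2`, and `B_3`.

```python
# Hypothetical toy problem: f(w) = 0.5 * w^T A w - b^T w, so grad f(w) = A w - b
# and the Hessian is exactly A. It stands in for a real model and data batches.
A = [[2.0, 0.5], [0.5, 1.0]]
b = [1.0, -1.0]


def matvec(m, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def axpy(a, x, y):
    """Return a * x + y elementwise."""
    return [a * xi + yi for xi, yi in zip(x, y)]


def grad_fn(w):
    """Gradient of the toy quadratic loss at w."""
    return [av - bi for av, bi in zip(matvec(A, w), b)]


def hvp_finite_diff(grad, w, v, delta):
    """Approximate the Hessian-vector product H(w) v by central differences."""
    g_plus = grad(axpy(delta, v, w))    # gradient at w + delta * v
    g_minus = grad(axpy(-delta, v, w))  # gradient at w - delta * v
    return [(p - m) / (2.0 * delta) for p, m in zip(g_plus, g_minus)]


def hf_step(grad, w, alpha, beta, delta):
    """One Hessian-free step mirroring the structure of the MATLAB loop body."""
    g_inner = grad(w)                        # gradient at the current weights
    w_adapted = axpy(-alpha, g_inner, w)     # inner step: w - alpha * g_inner
    g = grad(w_adapted)                      # plays the role of grad1 (the v)
    hv = hvp_finite_diff(grad, w, g, delta)  # plays the role of grad2
    # w - beta * g + beta * alpha * hv, i.e. w - beta * (I - alpha * H) g
    return [wi - beta * gi + beta * alpha * hi for wi, gi, hi in zip(w, g, hv)]


if __name__ == "__main__":
    print(hf_step(grad_fn, [0.3, -0.7], alpha=0.1, beta=0.05, delta=1e-3))
```

Because the toy loss is quadratic, the central difference recovers the Hessian-vector product up to floating-point error, which makes it easy to check that `hv` already contains the factor of `g`.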

@KarhouTam KarhouTam added the good first issue Good for newcomers label Sep 17, 2022
@KarhouTam KarhouTam reopened this Sep 17, 2022
@JackFroster
Author

Thanks for your answer. I understand it now.

@KarhouTam
Owner

Glad I could help. Let's keep this issue open for anyone else who is confused about this. 😏

@KarhouTam KarhouTam pinned this issue Sep 20, 2022