Some doubts and questions #7
Comments
Hi, did you get any answers?
I haven't, unfortunately.
Hi, based on my understanding of this paper, I have reproduced the full code. Although there are still a few small bugs, I can obtain the final result. I will release the complete code in the future; after that, we can discuss it.
Cool! If you ever decide to implement it using FedSim, I can help with setting it up.
Thanks! I have developed the code based on OpenAI's Gym and this repository. I will try my best to achieve that! :)
Hi, thanks for letting us know! Would you mind sharing your OpenAI Gym implementation so far, so that maybe I can help debug it?
Thanks! But there are just a few problems; I think I can handle them :). I will try my best!
Cool! Looking forward to seeing your code!
Hi there, I am wondering: in your implementation, during DDQN training, did you choose only 1 client in each communication round? If so, then only 1 device would report its local weights to the server, and there would be no FedAvg on the server. I have implemented this scheme, but the results were horrible. I doubt the accuracy can improve under this scheme, because in every round the single selected device cannot benefit from observing the weight updates of other devices. Would you mind sharing your experiences? Thank you!
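For what it's worth, here is a minimal sketch (hypothetical helper names, plain NumPy, not anyone's actual code) of FedAvg aggregation; it just illustrates the point above that, with only one selected client, the weighted average degenerates to copying that client's weights:

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Standard FedAvg aggregation: average client weights, weighted by local data size."""
    total = float(sum(client_sizes))
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# With K=1 selected client, the "average" is just that client's weights,
# so there is effectively no aggregation on the server:
w_single = np.array([0.2, -1.3, 0.7])
assert np.allclose(fedavg([w_single], [50]), w_single)

# With K>1 clients, the server actually mixes the updates:
w_a, w_b = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
print(fedavg([w_a, w_b], [100, 300]))  # [0.25, 0.75, 0.0]
```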
According to the paper, they sort the Q-values of all 100 clients and then select the 10 clients with the largest Q-values. When training the Q-network, they use only the largest Q-value. I have tried many approaches, but the performance still does not reach what is reported in the paper. Honestly, I strongly doubt that DDQN can actually work here.
I have given up on continuing to optimize with reinforcement learning, since none of the many methods I tried achieves the results mentioned in the paper. The fact that the authors still have not open-sourced their code also suggests there are many problems. I am very sorry that I really have no way to achieve it. If you have new ideas, feel free to discuss them with me.
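In case it helps the discussion, this is a small sketch of how I read that selection rule (hypothetical function names, NumPy only; the Q-network is replaced by placeholder scores): rank all 100 clients by Q-value, keep the top 10, and build the double-DQN training target from the largest next-state Q-value.

```python
import numpy as np

NUM_CLIENTS, TOP_K, GAMMA = 100, 10, 0.95  # values assumed from the discussion above

def select_top_k(q_values, k=TOP_K):
    """Indices of the k clients with the largest Q-values, best first."""
    return np.argsort(q_values)[::-1][:k]

def ddqn_target(reward, q_next_online, q_next_target, done, gamma=GAMMA):
    """Double-DQN target: the online net picks the arg-max action,
    the target net evaluates it (the 'biggest Q-value' used for training)."""
    a_star = int(np.argmax(q_next_online))
    return reward + (0.0 if done else gamma * float(q_next_target[a_star]))

q_values = np.random.randn(NUM_CLIENTS)   # placeholder for the Q-network's per-client scores
selected = select_top_k(q_values)         # the 10 clients chosen for this round
target = ddqn_target(reward=0.1, q_next_online=q_values,
                     q_next_target=q_values, done=False)
print(selected, target)
```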
@firewood1996 Hi, thanks a lot for sharing your experiences! I have implemented the DQN based on flsim, but I still cannot reproduce the training performance. Based on my experiments, I strongly agree with you that the "FL model will converge with or without DQN". In other words, during DQN training, even when the same device is selected every round, the testing accuracy still improves as more communication rounds pass. I also believe that DQN is not suitable for this type of device-selection problem because of the strong dependency between actions. In case you are interested, you can find our implementation, short presentation, slides, and report here: https://github.com/tian1327/flsim_dqn
I have read your work; it is excellent!!! I strongly agree with your opinion that DQN is not suitable and that the reward setting does not reveal the intrinsic connections between different clients. By the way, would you mind adding my QQ (2497978657) so we can discuss FL further?
@firewood1996
Dear @tian1327 and @firewood1996, best of luck!
Feel free to check our implementation and results (code, report, slides, YouTube) here: https://github.com/tian1327/flsim_dqn
I understand that, for some reasons, you might not have been able to release your complete code, but I would highly appreciate it if you could help me answer some questions about your implementation.
Thank you in advance!