
Question on the convergence of DQN on Pong game #1

Open
HoiM opened this issue Nov 18, 2022 · 0 comments

Hi,
Thank you for your tutorials on Medium and the example code.

I am new to reinforcement learning.

I tried to run the code in DRL_15_16_17_DQN_Pong, but I could not get it to converge during training.

While trying to find the reason, I started to wonder whether the problem comes from the reward mechanism of the game environment. While the game is ongoing, the reward is zero most of the time; only when an episode ends does the environment return a reward of +1 or -1. Therefore, for most transitions the loss is just the MSE between the "current predicted Q value" and the "discounted Q value of the next state", with no reward term involved. From this I deduce that all predicted Q values will eventually collapse to the same value after many training iterations, which would explain the failure to converge.
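
To make sure I am describing the loss correctly, here is a minimal PyTorch-style sketch of the TD target I have in mind (the function and variable names are my own illustration, not taken from the tutorial code):

```python
import torch
import torch.nn.functional as F

def dqn_loss(q_net, target_net, states, actions, rewards, dones, next_states, gamma=0.99):
    # Q(s, a) predicted by the online network for the actions actually taken
    q_pred = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)

    with torch.no_grad():
        # max_a' Q_target(s', a') from the target network
        next_q = target_net(next_states).max(dim=1).values
        # On most Pong steps rewards is 0, so the target reduces to
        # gamma * next_q and the loss compares q_pred against the
        # discounted next-state value only.
        target = rewards + gamma * next_q * (1.0 - dones)

    return F.mse_loss(q_pred, target)
```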

Am I correct?

I would appreciate it if you could offer some help!
