Hi,
Thank you for your tutorials on Medium and the example code.
I am new to reinforcement learning.
I tried to run the DRL_15_16_17_DQN_Pong code, but I could not get it to converge during training.
I am trying to find the cause, and I wonder whether the problem comes from the reward mechanism of the game environment. While the game is in progress, the reward is zero most of the time; only when a point is scored does the environment return +1 or -1. Therefore, for most transitions the loss is just the MSE between the "current predicted Q value" and the "discounted Q value of the next state", with no reward term involved. I therefore suspect that all predicted Q values will eventually collapse to the same value after many training iterations, which would explain the failure to converge. A sketch of the loss I am describing is below.
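For reference, this is a minimal sketch of the DQN loss as I understand it, not the repo's actual code (names like `q_net`, `target_net`, and `dqn_loss` are hypothetical). Whenever `rewards` is zero, the target reduces to `gamma * max_a' Q_target(s', a')`, which is the case I am describing:

```python
import torch
import torch.nn as nn

def dqn_loss(q_net, target_net, states, actions, rewards,
             next_states, dones, gamma=0.99):
    # Q(s, a) for the actions actually taken in the batch
    q_values = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # max_a' Q_target(s', a'); zeroed out when the episode ended
        next_q = target_net(next_states).max(dim=1).values
        # For most Pong transitions rewards == 0, so the target is
        # purely the discounted next-state value
        target = rewards + gamma * next_q * (1.0 - dones)
    return nn.functional.mse_loss(q_values, target)
```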
Is my reasoning correct?
I would appreciate any help you could offer!