Analysis of the Deep Q-Network Algorithm on a Simple Pong Environment
I defined the agent's state as the following six values:
- Position of the left paddle on the y-axis
- Position of the right paddle on the y-axis
- Position of the ball on the y-axis
- Position of the ball on the x-axis
- Velocity of the ball in the x-direction
- Velocity of the ball in the y-direction
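
As a rough illustration, the six state components listed above could be packed into a flat vector before being fed to the Q-network. This is a minimal sketch, not the project's actual code: the class name `PongState`, its field names, and the `to_vector` helper are hypothetical, and no normalization is applied because none is described here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PongState:
    """Hypothetical container for the six state components (names are assumptions)."""
    left_paddle_y: float
    right_paddle_y: float
    ball_y: float
    ball_x: float
    ball_vx: float
    ball_vy: float

    def to_vector(self) -> List[float]:
        # Keep the same order as the list in the text so the network
        # always sees each feature at a fixed index.
        return [
            self.left_paddle_y,
            self.right_paddle_y,
            self.ball_y,
            self.ball_x,
            self.ball_vx,
            self.ball_vy,
        ]


state = PongState(left_paddle_y=120.0, right_paddle_y=90.0,
                  ball_y=100.0, ball_x=200.0, ball_vx=-3.0, ball_vy=1.5)
features = state.to_vector()  # six floats, in the order listed above
```

A fixed ordering matters because the network's input layer has one unit per feature; with this layout the input size would be 6.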
A set of predefined constants defines the rewards and penalties for different in-game events:
- Game End Condition: If the game has ended, the function checks the outcome:
  - If the agent scores, a reward of +10 is granted.
  - If the opponent scores, a penalty of -10 is applied.
- In-Game Rewards/Penalties: While the game is ongoing:
  - The function first checks whether the ball has collided with the agent's paddle. If the ball hits the center of the paddle, a reward of +0.1 is given; otherwise, a penalty of -0.1 is applied.
  - If there is no collision, the function evaluates whether the agent moved towards or away from the ball, using the change in vertical distance between the ball and the paddle. Moving towards the ball earns a reward of +0.5; moving away incurs a penalty of -0.5.
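
The reward rules above can be sketched as a single function. This is only an illustration under stated assumptions: the function name `compute_reward`, its parameters, and the constant names are hypothetical, and returning 0.0 when the vertical distance is unchanged is my assumption, since that case is not specified above.

```python
# Reward constants taken from the rules described above (names are assumptions).
SCORE_REWARD = 10.0
CONCEDE_PENALTY = -10.0
CENTER_HIT_REWARD = 0.1
OFF_CENTER_HIT_PENALTY = -0.1
APPROACH_REWARD = 0.5
RETREAT_PENALTY = -0.5


def compute_reward(game_over: bool, agent_scored: bool,
                   hit_paddle: bool, hit_center: bool,
                   prev_distance: float, curr_distance: float) -> float:
    """Hypothetical reward function following the order of checks in the text.

    prev_distance / curr_distance are the absolute vertical distances between
    the ball and the agent's paddle before and after the agent's move.
    """
    # 1. Terminal outcome takes priority over everything else.
    if game_over:
        return SCORE_REWARD if agent_scored else CONCEDE_PENALTY

    # 2. Collision with the agent's paddle: center hit vs. off-center hit.
    if hit_paddle:
        return CENTER_HIT_REWARD if hit_center else OFF_CENTER_HIT_PENALTY

    # 3. No collision: reward moving towards the ball, penalize moving away.
    if curr_distance < prev_distance:
        return APPROACH_REWARD
    if curr_distance > prev_distance:
        return RETREAT_PENALTY
    return 0.0  # assumption: unchanged distance is neutral (not stated in the text)


# Example: the paddle closed the vertical gap from 40 to 35 pixels.
step_reward = compute_reward(False, False, False, False, 40.0, 35.0)
```

Note that with these constants the per-step movement reward (0.5) is larger in magnitude than the paddle-hit reward (0.1), so ball tracking dominates the shaped signal between scoring events.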
![Screenshot 2023-08-31 at 17 33 19](https://private-user-images.githubusercontent.com/92628109/264679791-bf6f6745-4b29-49f0-8876-51d227cd42e4.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk0ODk0NDQsIm5iZiI6MTczOTQ4OTE0NCwicGF0aCI6Ii85MjYyODEwOS8yNjQ2Nzk3OTEtYmY2ZjY3NDUtNGIyOS00OWYwLTg4NzYtNTFkMjI3Y2Q0MmU0LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTMlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjEzVDIzMjU0NFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTY5MTRkMDNiOWE2NDViY2EyZTYyM2RmZjIzMDEyMTc0MTYwZjA5OTk4YTlkZDIzYmMxODUxOTY1N2E2MjQwZDkmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.xLXtrMTAmTaI_AjFhXCPCB20CzNAZzB_AZqMO0WtKLM)
Test of the DQN agent against the nominal player
![Screenshot 2023-08-31 at 17 30 04](https://private-user-images.githubusercontent.com/92628109/264678844-77d52ef6-c84f-4516-953c-216d4dc12268.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk0ODk0NDQsIm5iZiI6MTczOTQ4OTE0NCwicGF0aCI6Ii85MjYyODEwOS8yNjQ2Nzg4NDQtNzdkNTJlZjYtYzg0Zi00NTE2LTk1M2MtMjE2ZDRkYzEyMjY4LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTMlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjEzVDIzMjU0NFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTNiNGI5ZjEyYjhiZWI0M2QwOThkMzc3Njc2ZjJkYjI1ZWViMDU1ZTI3MGQ4MjQ2NzJiNzE1YmY2Y2U3NjhiMGYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.98R8booDZmoSX05d3V0yxmDVjQfaUWEYpA94LMDjqKY)