Policy_Gradients_to_beat_Pong

This is the code for "How to Beat Pong Using Policy Gradients (LIVE)" by Siraj Raval on YouTube. I only adapted it for Python 3 (Siraj's code works on Python 2).

Overview

This is the code for this video by Siraj Raval on YouTube. We're going to beat the game of Pong using Policy Gradients (PG), a type of reinforcement learning algorithm. PG outperformed DeepMind's Deep Q Network, so it's a worthy algorithm to look into.
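To make the idea concrete, here is a minimal sketch of the reward handling that a policy gradient method typically relies on: each action is weighted by the discounted return that followed it. The function names, the `gamma=0.99` default, and the reset at non-zero rewards (treating every scored point in Pong as a boundary) are assumptions for illustration, not necessarily what demo.py does.

```python
import numpy as np


def discount_rewards(rewards, gamma=0.99):
    """Return the discounted return at every time step.

    Assumption: a non-zero reward marks the end of a Pong point, so the
    running sum is reset there and credit does not leak across points.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    discounted = np.zeros_like(rewards)
    running = 0.0
    # Walk backwards so each step accumulates the rewards that came after it.
    for t in reversed(range(len(rewards))):
        if rewards[t] != 0:
            running = 0.0  # point boundary: start a fresh sum
        running = running * gamma + rewards[t]
        discounted[t] = running
    return discounted


def policy_gradient_weights(rewards, gamma=0.99):
    """Standardize the discounted returns for use as per-step gradient weights."""
    returns = discount_rewards(rewards, gamma)
    returns -= returns.mean()
    std = returns.std()
    if std > 0:  # avoid dividing by zero when all returns are equal
        returns /= std
    return returns
```

Actions followed by a won point end up with positive weights and are made more likely; actions followed by a lost point get negative weights and are made less likely.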

Dependencies

gym (https://gym.openai.com/docs)
numpy
pickle (part of the Python standard library, so no separate install is needed)

Install dependencies with pip

I also added a conda environment file (env.yml). You can create the environment with:

conda env create -f env.yml

I also had to install cmake (on Ubuntu 17.10):

sudo apt-get install cmake
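Once the dependencies are installed, a quick check like the following can confirm that the modules listed above are importable before starting a long training run. This is only a sketch: `missing_modules` and `REQUIRED_MODULES` are hypothetical names and are not part of this repository.

```python
import importlib.util

# Module names taken from the Dependencies list above.
REQUIRED_MODULES = ["gym", "numpy", "pickle"]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```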

Usage

Run python demo.py and the AI will start playing the game.

Credits

Credits go to AndrejK, who wrote the initial code, and to Siraj Raval. I only modified some bits of the code to make it compatible with Python 3.

After a few days training

After 6 days of training on a Dell XPS 9560 15'' with an Nvidia GTX 1050, the trained agent could compete... (see below: positive rewards = wins, negative rewards = losses)
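For reading a training log in these terms, the sketch below tallies points won and lost from a sequence of per-point rewards and smooths per-episode totals so that a trend is easier to see. It assumes the usual Pong reward convention (+1 when the agent scores, -1 when the opponent scores, 0 otherwise). The function names and the `decay=0.99` default are illustrative assumptions, not taken from demo.py.

```python
def summarize_points(point_rewards):
    """Count points won (+) and lost (-) in a sequence of rewards."""
    won = sum(1 for r in point_rewards if r > 0)
    lost = sum(1 for r in point_rewards if r < 0)
    return {"won": won, "lost": lost, "net": won - lost}


def running_mean(episode_totals, decay=0.99):
    """Exponentially smooth per-episode reward totals."""
    smoothed = []
    current = None
    for total in episode_totals:
        if current is None:
            current = total  # the first episode seeds the average
        else:
            current = decay * current + (1 - decay) * total
        smoothed.append(current)
    return smoothed
```

A smoothed total that climbs from strongly negative towards zero and above means the agent is winning a growing share of points.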
