Q-Learning Implementation

A basic implementation of Q-learning on a 4x4 grid.

The agent starts at (0,0).

White cells have a negative reward.

An episode ends when the agent reaches the black cell, which gives a positive reward.
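
The setup described above can be sketched as a small tabular Q-learning loop. This is a minimal illustration and not the contents of `qlearning.py`: the black cell position `(3,3)`, the reward values, the hyperparameters (`alpha`, `gamma`, `epsilon`), and the helper names `step`, `train`, and `greedy_path` are all assumptions made for the example.

```python
import random

# Everything below is an illustrative assumption except the 4x4 grid size
# and the (0,0) start cell, which come from the description above.
SIZE = 4
START = (0, 0)
GOAL = (3, 3)          # assumed position of the black cell
STEP_REWARD = -1.0     # assumed negative reward for entering a white cell
GOAL_REWARD = 10.0     # assumed positive reward for entering the black cell
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right as (d_row, d_col)


def step(state, action):
    """Apply an action; a move off the grid leaves the agent where it is."""
    row = min(max(state[0] + action[0], 0), SIZE - 1)
    col = min(max(state[1] + action[1], 0), SIZE - 1)
    next_state = (row, col)
    if next_state == GOAL:
        return next_state, GOAL_REWARD, True
    return next_state, STEP_REWARD, False


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    # One list of action values per cell, initialised to zero.
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(SIZE) for c in range(SIZE)}
    for _ in range(episodes):
        state, done = START, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])  # exploit
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning target: bootstrap from the best next action,
            # except on the terminal transition into the black cell.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_path(q, max_steps=50):
    """Follow the greedy policy from START until GOAL or max_steps."""
    state, path = START, [START]
    for _ in range(max_steps):
        if state == GOAL:
            break
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _, _ = step(state, ACTIONS[a])
        path.append(state)
    return path


if __name__ == "__main__":
    q_table = train()
    print(greedy_path(q_table))
```

Running the script trains the table and prints the sequence of cells the greedy policy visits from the start cell. The negative per-step reward is what pushes the agent toward short routes: every extra move lowers the discounted return.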
