Inspired by DIAMOND
This is a diffusion model simulating Snake.
The idea is to take some previous frames and the user input to predict the next frame.
It was never explicitly taught any Snake rules.
Note that apples are nondeterministic!
Download diffuzer weights (best.pt
) from here.
Save them at: snake/diffusion/models/best.pt
To run do:
python -m snake
Tweak the SUPPORTED
parameter in __main__.py
to take the median of 5 model outputs and regenerate apples if they disappear. This makes the game smoother.
# Train the RL Agent
python -m snake.agent.train # or use best_agent.pt
# Generate a dataset
python -m snake.dataset.dataset
# Train the diffuzer
python -m snake.diffusion.train
Training best_agent.pt
agent took 14000 (short) epochs with learning rate 0.005.
Training best.pt
diffuzer took 15 epochs with learning rate 0.001 and additional 5 epochs with learning rate 0.0001.