astanic/crafter-ood

Learning to Generalize with Object-centric Agents in the Open World Survival Game Crafter



Paper link: https://arxiv.org/abs/2208.03374

Example of Crafter gameplay and object-centric agent's attention visualizations

Reproducing the experiments from the paper

To install all the requirements, use:

pip install -r requirements.txt

To reproduce the numbers from Table 1 for all the agents, run the following commands:

python3 main.py --profile=ppo_cnn
python3 main.py --profile=ppo_spcnn
python3 main.py --profile=lstm_cnn
python3 main.py --profile=lstm_spcnn
python3 main.py --profile=oc_sa
python3 main.py --profile=oc_ca
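
The six commands above differ only in the profile name, so they can be generated in a loop. The following is a minimal sketch, not part of the repository: `build_commands` is a hypothetical helper, and it only assembles and prints the command lines rather than launching them.

```python
import shlex

# Profile names copied from the commands above.
PROFILES = ["ppo_cnn", "ppo_spcnn", "lstm_cnn", "lstm_spcnn", "oc_sa", "oc_ca"]


def build_commands(profiles=PROFILES, extra_args=()):
    """Return one argv list per profile (hypothetical helper, not in the repo)."""
    return [["python3", "main.py", f"--profile={p}", *extra_args] for p in profiles]


# Print the commands so they can be inspected or pasted into a shell.
for argv in build_commands():
    print(shlex.join(argv))
```

Each argv list can be passed to `subprocess.run(argv, check=True)` from the repository root to launch the runs one after another.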

To run without rendering the scoreboard, use:

python3 main.py --profile=ppo_cnn -crf.render_scoreboard=False

To evaluate a PPO CNN agent (change the profile argument for other agents) in different CrafterOODapp environments (Table 4), run the commands shown below. The comma-separated string contains four numbers, giving the probabilities (in percent) of each of the four object variants appearing in the map. For example, 88,4,4,4 means that the first object variant appears with 88% probability and each of the other three variants with 4% probability. Note that the validation environment always contains only the last three object variants, with approximately equal likelihood of appearance (0,33,33,34).

python3 main.py --profile=ppo_cnn --el_vars=tczsuk --el_freq_train=25,25,25,25 --el_freq_valid=0,33,33,34
python3 main.py --profile=ppo_cnn --el_vars=tczsuk --el_freq_train=52,16,16,16 --el_freq_valid=0,33,33,34
python3 main.py --profile=ppo_cnn --el_vars=tczsuk --el_freq_train=76,8,8,8 --el_freq_valid=0,33,33,34
python3 main.py --profile=ppo_cnn --el_vars=tczsuk --el_freq_train=88,4,4,4 --el_freq_valid=0,33,33,34
python3 main.py --profile=ppo_cnn --el_vars=tczsuk --el_freq_train=94,2,2,2 --el_freq_valid=0,33,33,34
python3 main.py --profile=ppo_cnn --el_vars=tczsuk --el_freq_train=97,1,1,1 --el_freq_valid=0,33,33,34
python3 main.py --profile=ppo_cnn --el_vars=tczsuk --el_freq_train=100,0,0,0 --el_freq_valid=0,33,33,34
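
To make the meaning of the frequency strings concrete, here is a minimal sketch of how a string such as 88,4,4,4 could be turned into variant probabilities and sampled from. It is an illustration only: `parse_freqs` and `sample_variant` are hypothetical names, and the repository's actual parsing and map generation may work differently.

```python
import random


def parse_freqs(spec):
    """Turn a string such as '88,4,4,4' into four probabilities summing to 1."""
    weights = [float(x) for x in spec.split(",")]
    if len(weights) != 4:
        raise ValueError(f"expected four comma-separated numbers, got {spec!r}")
    if any(w < 0 for w in weights) or sum(weights) == 0:
        raise ValueError(f"weights must be non-negative and not all zero: {spec!r}")
    total = sum(weights)
    return [w / total for w in weights]


def sample_variant(spec, rng):
    """Draw the index (0..3) of one object variant according to the spec."""
    probs = parse_freqs(spec)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


rng = random.Random(0)
print(parse_freqs("88,4,4,4"))
print([sample_variant("0,33,33,34", rng) for _ in range(10)])
```

With the validation string 0,33,33,34, the first variant has zero weight, so only the last three variants can ever be drawn.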

To evaluate a PPO CNN agent (change the profile argument for other agents) in different CrafterOODnum environments (Table 6), run the commands shown below.

python3 main.py --profile=ppo_cnn -el_app_freq_train=easyX2 --el_ap_freq_valid=default
python3 main.py --profile=ppo_cnn -el_app_freq_train=easyX4 --el_ap_freq_valid=default
python3 main.py --profile=ppo_cnn -el_app_freq_train=mix --el_ap_freq_valid=default
python3 main.py --profile=ppo_cnn -el_app_freq_train=default --el_ap_freq_valid=mix
python3 main.py --profile=ppo_cnn -el_app_freq_train=default --el_ap_freq_valid=easyX2
python3 main.py --profile=ppo_cnn -el_app_freq_train=default --el_ap_freq_valid=easyX4
python3 main.py --profile=ppo_cnn -el_app_freq_train=easyX2 --el_ap_freq_valid=hardX2
python3 main.py --profile=ppo_cnn -el_app_freq_train=easyX4 --el_ap_freq_valid=hardX4

Additionally, it is possible to have fine-grained control over the generation of individual elements, namely trees, coal, cows, zombies, and skeletons. To achieve this, use a string of length 5, with one character per object in exactly this order: tree, coal, cow, zombie, skeleton. Each character sets the number of that object relative to the default environment: f=4x, d=2x, s=1x, h=1/2x, q=1/4x. That is, f and d increase the count, h and q decrease it, and s leaves it unchanged. For example, using easyX2 is equivalent to using the dddhh string.

python3 main.py --profile=ppo_cnn -el_app_freq_train=dddhh --el_ap_freq_valid=sssss  # easy(x2) -> default
python3 main.py --profile=ppo_cnn -el_app_freq_train=fffqq --el_ap_freq_valid=sssss  # easy(x4) -> default
python3 main.py --profile=ppo_cnn -el_app_freq_train=fffff --el_ap_freq_valid=sssss  # mix(x4)  -> default
python3 main.py --profile=ppo_cnn -el_app_freq_train=sssss --el_ap_freq_valid=fffff  # default  -> mix
python3 main.py --profile=ppo_cnn -el_app_freq_train=sssss --el_ap_freq_valid=dddhh  # default  -> easy(x2)
python3 main.py --profile=ppo_cnn -el_app_freq_train=sssss --el_ap_freq_valid=fffqq  # default  -> easy(x4)
python3 main.py --profile=ppo_cnn -el_app_freq_train=dddhh --el_ap_freq_valid=hhhdd  # easy(x2) -> hard(x2)
python3 main.py --profile=ppo_cnn -el_app_freq_train=fffqq --el_ap_freq_valid=qqqff  # easy(x4) -> hard(x4)
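
The five-character strings used above can be decoded mechanically. The sketch below follows the letter codes and object order described in this section; `decode_spec` is a hypothetical helper written for illustration, and the `PRESETS` table is inferred from the comments on the commands above (only the easyX2 = dddhh equivalence is stated explicitly).

```python
# Letter codes and their multipliers relative to the default environment.
MULTIPLIER = {"f": 4.0, "d": 2.0, "s": 1.0, "h": 0.5, "q": 0.25}

# Object order fixed by the description above.
OBJECTS = ("tree", "coal", "cow", "zombie", "skeleton")

# Assumed preset names, inferred from the command comments.
PRESETS = {
    "default": "sssss",
    "easyX2": "dddhh",
    "easyX4": "fffqq",
    "hardX2": "hhhdd",
    "hardX4": "qqqff",
    "mix": "fffff",
}


def decode_spec(spec):
    """Map a preset name or 5-letter string to per-object count multipliers."""
    spec = PRESETS.get(spec, spec)
    if len(spec) != len(OBJECTS):
        raise ValueError(f"expected {len(OBJECTS)} characters, got {spec!r}")
    try:
        return {obj: MULTIPLIER[ch] for obj, ch in zip(OBJECTS, spec)}
    except KeyError as err:
        raise ValueError(f"unknown letter {err.args[0]!r} in {spec!r}") from err


print(decode_spec("dddhh"))
```

Under these assumptions, `decode_spec("easyX2")` and `decode_spec("dddhh")` give the same result: resources (tree, coal, cow) doubled and enemies (zombie, skeleton) halved.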

Acknowledgements

This repository is based on the following resources:

  1. The CrafterOOD environments were created by forking and adapting the code from https://github.com/danijar/crafter.
  2. The object-centric agents were implemented by forking and adapting the stable-baselines3 codebase: https://github.com/DLR-RM/stable-baselines3.
  3. The recurrent agents were ported and adapted from the contrib repository of stable-baselines3: https://github.com/Stable-Baselines-Team/stable-baselines3-contrib.
  4. The arguments, logger, and training helper were adapted from https://github.com/RobertCsordas/modules.

Citation

Please cite our paper if you use our code or if you re-implement our method:

@article{stanic2022learning,
  title = {Learning to Generalize with Object-centric Agents in the Open World Survival Game Crafter},
  author = {Stani\'{c}, Aleksandar and Tang, Yujin and Ha, David and Schmidhuber, J{\"u}rgen},
  journal = {arXiv preprint arXiv:2208.03374},
  year = {2022},
}
