ICLR 2023 - Neuroevolution is a competitive alternative to reinforcement learning for skill discovery
This repository contains the code for the paper "Neuroevolution is a Competitive Alternative to Reinforcement Learning for Skill Discovery" (Chalumeau, Boige et al., 2023) 💻 ⚡.
This code requires Docker to run. To install Docker, please follow the online instructions here. To enable the code to run on a GPU, please install NVIDIA Docker (as well as the latest NVIDIA driver available for your GPU).
Once Docker and NVIDIA Docker are installed, you can build the Docker image with the following command:
make build
and, once the image is built, start the container with:
make dev_container
Inside the container, you can run the nvidia-smi command to verify that your GPU is found.
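If JAX is available in the container (the experiments rely on JAX), you can also run a quick sanity check from Python:
# Optional sanity check; assumes JAX is installed in the image.
import jax

print(jax.devices())  # should list a GPU device, e.g. [cuda(id=0)], rather than only the CPU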
To train an algorithm on an environment, use the following command:
make train script_name=[SCRIPT_NAME] env_name=[ENV_NAME]
This will load the default config in qdbenchmark/training/config and launch the training. The available scripts and configs are listed in qdbenchmark/training/.
For instance, to train MAP-Elites on the Halfcheetah Uni environment, run:
make train script_name=train_map_elites.py env_name=halfcheetah_uni
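Training produces the repertoire (for QD algorithms) or policy checkpoint (for skill-discovery algorithms) that the adaptation and hierarchical tasks below take as input. As a purely illustrative sketch, assuming a QDax-style repertoire stored as flat NumPy arrays (the output path and file names below are hypothetical; check the training scripts for the actual ones), it could be inspected with:
# Illustrative only: path and file layout are assumptions to verify against the training scripts.
import numpy as np

fitnesses = np.load("output/repertoire/fitnesses.npy")      # hypothetical path
descriptors = np.load("output/repertoire/descriptors.npy")  # hypothetical path

filled = fitnesses != -np.inf  # QDax-style grids mark empty cells with -inf
print(f"filled cells: {filled.sum()} / {fitnesses.shape[0]}")
print(f"best fitness: {fitnesses[filled].max():.2f}")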
To perform adaptation tasks, the user must provide the path to the policy/repertoire resulting from the training of an agent, as well as the path to the config used to train this agent. Three commands are available, one for each of the three adaptation-task families (gravity multiplier, actuator update, or default position change).
For the QD algorithms:
make adaptation_gravity_qd repertoire_path=[PATH_TO_REPERTOIRE] run_config_path=[PATH_TO_CONFIG] env_name=[ENV_NAME] algorithm_name=[ALGORITHM_NAME]
make adaptation_actuator_qd repertoire_path=[PATH_TO_REPERTOIRE] run_config_path=[PATH_TO_CONFIG] env_name=[ENV_NAME] algorithm_name=[ALGORITHM_NAME]
make adaptation_position_qd repertoire_path=[PATH_TO_REPERTOIRE] run_config_path=[PATH_TO_CONFIG] env_name=[ENV_NAME] algorithm_name=[ALGORITHM_NAME]
and for the Skill Discovery algorithms:
make adaptation_gravity_sd policy_path=[PATH_TO_POLICY] run_config_path=[PATH_TO_CONFIG] env_name=[ENV_NAME] algorithm_name=[ALGORITHM_NAME]
make adaptation_actuator_sd policy_path=[PATH_TO_POLICY] run_config_path=[PATH_TO_CONFIG] env_name=[ENV_NAME] algorithm_name=[ALGORITHM_NAME]
make adaptation_position_sd policy_path=[PATH_TO_POLICY] run_config_path=[PATH_TO_CONFIG] env_name=[ENV_NAME] algorithm_name=[ALGORITHM_NAME]
Sample configs and checkpoints are provided as examples; for instance, the user can run the following command:
make adaptation_gravity_sd policy_path=sample/sample_policy/dads-reward-ant-uni-policy-0.npy run_config_path=sample/sample_config/dads_reward_ant_uni.yaml env_name=ant_uni algorithm_name=DADS+REWARD
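At a high level, the adaptation tasks test how well the behaviours learned at training time transfer to a perturbed environment (e.g. a different gravity multiplier). The toy, self-contained sketch below illustrates only the underlying idea of re-evaluating each stored behaviour in the perturbed setting and keeping the best one; the environment and policies are stand-ins, not code from this repository:
# Toy sketch of the selection idea behind few-shot adaptation.
# A real script rolls the policies out in the perturbed simulator instead.
import numpy as np

rng = np.random.default_rng(0)
behaviours = [rng.normal(size=8) for _ in range(32)]  # stand-ins for stored policies

def evaluate(behaviour, gravity_multiplier):
    # Stand-in for an episode rollout in the perturbed environment.
    return -abs(behaviour.sum() - gravity_multiplier)

scores = [evaluate(b, gravity_multiplier=3.0) for b in behaviours]
best = int(np.argmax(scores))
print(f"best behaviour index: {best}, score: {scores[best]:.2f}")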
To perform the Halfcheetah-Hurdle hierarchical task, the user should also provide a policy/repertoire path. The syntax is as follows:
For QD algorithms:
make hierarchical_qd repertoire_path=[PATH_TO_REPERTOIRE] algorithm_name=[ALGORITHM_NAME]
For Skill Discovery algorithms:
make hierarchical_sd policy_path=[PATH_TO_POLICY] algorithm_name=[ALGORITHM_NAME]
Sample checkpoints are provided as examples; for instance, the user can run the following command:
make hierarchical_qd repertoire_path=sample/sample_repertoire/map_elites_halfcheetah_uni/ algorithm_name=MAP-ELITES
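For context, the Halfcheetah-Hurdle task reuses the trained low-level behaviours as primitives that a higher-level controller sequences to cross the hurdles. The toy, self-contained sketch below illustrates that control structure only; none of the names come from this repository:
# Toy sketch of hierarchical control: a high-level rule picks which low-level
# primitive to execute for a fixed number of steps. All names are placeholders.
def run_hierarchy(primitives, select_skill, episode_length=1000, steps_per_skill=50):
    state = 0.0          # stand-in for the simulator state (e.g. x position)
    total_reward = 0.0
    for _ in range(0, episode_length, steps_per_skill):
        skill = select_skill(state)           # high-level decision
        step = primitives[skill]
        for _ in range(steps_per_skill):      # low-level execution
            state, reward = step(state)
            total_reward += reward
    return total_reward

def select_skill(state):
    # Pretend a hurdle must be jumped whenever the position approaches a multiple of 1.0.
    return "jump" if (state % 1.0) > 0.7 else "walk"

primitives = {
    "walk": lambda s: (s + 0.1, 0.1),   # advance slowly, steady reward
    "jump": lambda s: (s + 0.3, 0.05),  # advance fast, smaller reward per step
}
print(run_hierarchy(primitives, select_skill))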
If you use the code or data in this package, please cite:
@inproceedings{
chalumeau2023neuroevolution,
title={Neuroevolution is a Competitive Alternative to Reinforcement Learning for Skill Discovery},
author={Felix Chalumeau and Raphael Boige and Bryan Lim and Valentin Mac{\'e} and Maxime Allard and Arthur Flajolet and Antoine Cully and Thomas Pierrot},
booktitle={International Conference on Learning Representations},
year={2023},
url={https://openreview.net/forum?id=6BHlZgyPOZY}
}