Status: Maintenance (expect bug fixes and minor updates)
16 simple-to-use procedurally-generated gym environments which provide a direct measure of how quickly a reinforcement learning agent learns generalizable skills. The environments run at high speed (thousands of steps per second) on a single core.
These environments are associated with the paper Leveraging Procedural Generation to Benchmark Reinforcement Learning (citation). The code for running some experiments from the paper is in the train-procgen repo. For those familiar with the original CoinRun environment, be sure to read the updated CoinRun description below as there have been subtle changes to the environment.
Compared to Gym Retro, these environments are:
- Faster: Gym Retro environments are already fast, but Procgen environments can run >4x faster.
- Non-deterministic: Gym Retro environments are always the same, so you can memorize a sequence of actions that will get the highest reward. Procgen environments are randomized so this is not possible.
- Customizable: If you install from source, you can perform experiments where you change the environments, or build your own environments. The environment-specific code for each environment is often less than 300 lines. This is almost impossible with Gym Retro.
Supported platforms:
- Windows 10
- macOS 10.14 (Mojave), 10.15 (Catalina)
- Linux (manylinux2010)
Supported Pythons:
- 3.6 64-bit
- 3.7 64-bit
- 3.8 64-bit
Supported CPUs:
- Must have at least AVX
First make sure you have a supported version of python:
# run these commands to check for the correct python version
python -c "import sys; assert (3,6,0) <= sys.version_info <= (3,9,0), 'python is incorrect version'; print('ok')"
python -c "import platform; assert platform.architecture()[0] == '64bit', 'python is not 64-bit'; print('ok')"
To install the wheel:
pip install procgen
If you get an error like "Could not find a version that satisfies the requirement procgen", please upgrade pip: pip install --upgrade pip.
To try an environment out interactively:
python -m procgen.interactive --env-name coinrun
The keys are: left/right/up/down + q, w, e, a, s, d for the different (environment-dependent) actions. Your score is displayed as "episode_return" on the right. At the end of an episode, you can see your final "episode_return" as well as "level_completed", which will be 1 if you successfully completed the level.
To create an instance of the gym environment:
import gym
env = gym.make("procgen:procgen-coinrun-v0")
To create an instance of the vectorized environment:
from procgen import ProcgenEnv
venv = ProcgenEnv(num_envs=1, env_name="coinrun")
The environment uses the VecEnv interface from baselines, but baselines is not a dependency of this library.
A Dockerfile is included to demonstrate a minimal Docker-based setup that works for running a random agent.
docker build docker --tag procgen
docker run --rm -it procgen python3 -m procgen.examples.random_agent
The observation space is a box space with the RGB pixels the agent sees in a numpy array of shape (64, 64, 3). The expected step rate for a human player is 15 Hz.
The action space is Discrete(15) for which button combo to press. The button combos are defined in env.py.
If you are using the vectorized environment, the observation space is a dictionary space where the pixels are under the key "rgb".
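As a rough illustration of the observation and action spaces described above, the following sketch runs one episode with uniformly random actions. The helper names run_random_episode and demo are hypothetical, and the loop assumes the classic gym API in which reset() returns an observation and step() returns a 4-tuple; newer gym releases may differ.

```python
def run_random_episode(env, max_steps=1000):
    """Play one episode with random actions and return the summed reward.

    Assumes the classic gym interface: reset() -> obs and
    step(action) -> (obs, reward, done, info).
    """
    env.reset()
    episode_return = 0.0
    for _ in range(max_steps):
        # For Procgen this samples one of the 15 button combos.
        action = env.action_space.sample()
        _obs, reward, done, _info = env.step(action)
        episode_return += float(reward)
        if done:
            break
    return episode_return


def demo():
    """Hypothetical entry point; requires gym and procgen to be installed."""
    import gym

    env = gym.make("procgen:procgen-coinrun-v0")
    print("episode_return:", run_random_episode(env))
    env.close()
```

Calling demo() plays a single CoinRun level. The same loop does not apply unchanged to the vectorized environment, which takes a batch of actions and returns dictionary observations.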
Here are the 16 environments:
Environment options:
- env_name: Name of environment, or comma-separated list of environment names to instantiate as each env in the VecEnv.
- num_levels: The number of unique levels that can be generated. Set to 0 to use unlimited levels.
- start_level: The lowest seed that will be used to generate levels. 'start_level' and 'num_levels' fully specify the set of possible levels.
- paint_vel_info: Paint player velocity info in the top left corner. Only supported by certain games.
- use_generated_assets: Use randomly generated assets in place of human designed assets.
- debug_mode: A useful flag that's passed through to procgen envs. Use however you want during debugging.
- center_agent: Determines whether observations are centered on the agent or display the full level. Override at your own risk.
- use_sequential_levels: When you reach the end of a level, the episode is ended and a new level is selected. If use_sequential_levels is set to True, reaching the end of a level does not end the episode, and the seed for the new level is derived from the current level seed. If you combine this with start_level=<some seed> and num_levels=1, you can have a single linear series of levels similar to a gym-retro or ALE game.
- distribution_mode: What variant of the levels to use. The options are "easy", "hard", "extreme", "memory", "exploration". All games support "easy" and "hard", while other options are game-specific. The default is "hard". Switching to "easy" will reduce the number of timesteps required to solve each game and is useful for testing or when working with limited compute resources.
Here's how to set the options:
import gym
env = gym.make("procgen:procgen-coinrun-v0", start_level=0, num_levels=1)
For the vectorized environment:
from procgen import ProcgenEnv
venv = ProcgenEnv(num_envs=1, env_name="coinrun", start_level=0, num_levels=1)
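One way to use start_level and num_levels together is to train on a fixed set of levels and evaluate on unlimited levels. The sketch below only builds keyword arguments. The helper level_seeds, the constants TRAIN_KWARGS and TEST_KWARGS, and the value 200 are illustrative assumptions, as is the reading that a finite level set covers the seeds from start_level up to start_level + num_levels - 1.

```python
def level_seeds(start_level, num_levels):
    """Return the assumed finite range of level seeds, or None if unlimited.

    Assumption: seeds run from start_level to start_level + num_levels - 1.
    num_levels=0 means unlimited levels, so no finite range exists.
    """
    if num_levels == 0:
        return None
    return range(start_level, start_level + num_levels)


# Train on a fixed, finite set of levels (200 is an arbitrary example value).
TRAIN_KWARGS = dict(env_name="coinrun", start_level=0, num_levels=200,
                    distribution_mode="easy")

# Evaluate on unlimited levels to measure generalization.
TEST_KWARGS = dict(env_name="coinrun", start_level=0, num_levels=0,
                   distribution_mode="easy")

# With procgen installed, the dictionaries can be passed straight through:
#   from procgen import ProcgenEnv
#   train_venv = ProcgenEnv(num_envs=1, **TRAIN_KWARGS)
#   test_venv = ProcgenEnv(num_envs=1, **TEST_KWARGS)
```

Keeping the options in plain dictionaries makes it easy to log the exact level set used by each experiment.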
- You should depend on a specific version of this library (using ==) for your experiments to ensure they are reproducible. You can get the current installed version with pip show procgen.
- This library does not require or make use of GPUs.
- While the library should be thread safe, each individual environment instance should only be used from a single thread. The library is not fork safe unless you set num_threads=0. Even if you do that, Qt is not guaranteed to be fork safe, so you should probably create the environment after forking or not use fork at all.
- Calling reset() early will not do anything; please re-create the environment if you want to reset it early.
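Because an early reset() has no effect, code that needs to abandon an episode can rebuild the environment instead. The wrapper below is a minimal sketch of that idea; RecreatingEnv and hard_reset are hypothetical names, not part of procgen, and the wrapper assumes the environment exposes the usual gym reset() and close() methods.

```python
class RecreatingEnv:
    """Hypothetical helper that forces an early reset by rebuilding the env."""

    def __init__(self, make_env):
        # make_env is a zero-argument factory, for example
        # lambda: gym.make("procgen:procgen-coinrun-v0")
        self._make_env = make_env
        self.env = make_env()

    def hard_reset(self):
        # Discard the current instance and create a fresh one, because
        # calling reset() in the middle of an episode does nothing.
        self.env.close()
        self.env = self._make_env()
        return self.env.reset()
```

Re-creating an environment is more expensive than a normal reset, so this is best reserved for cases where an episode really has to be cut short.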
If you want to change the environments or create new ones, you should build from source. You can get miniconda from https://docs.conda.io/en/latest/miniconda.html if you don't have it, or install the dependencies from environment.yml manually. On Windows you will also need "Visual Studio 15 2017" installed.
git clone git@github.com:openai/procgen.git
cd procgen
conda env update --name procgen --file environment.yml
conda activate procgen
pip install -e .
# this should say "building procgen...done"
python -c "from procgen import ProcgenEnv; ProcgenEnv(num_envs=1, env_name='coinrun')"
# this should create a window where you can play the coinrun environment
python -m procgen.interactive
The environment code is in C++ and is compiled into a shared library loaded by python using a C interface based on libenv. The C++ code uses Qt for drawing.
Once you have installed from source, you can customize an existing environment or make a new environment of your own. If you want to create a fast C++ 2D environment, you can fork this repo and do the following:
- Copy src/games/bigfish.cpp to src/games/<name>.cpp
- Replace BigFish with <name> and "bigfish" with "<name>" in your cpp file
- Add src/games/<name>.cpp to CMakeLists.txt
- Run python -m procgen.interactive --env-name <name> to test it out
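The first two steps above (copying the file and renaming the identifiers) can be scripted. The following is a sketch only: scaffold_game is a hypothetical helper, and taking the class name and environment name as separate arguments is an assumption made for illustration. Adding the new file to CMakeLists.txt is left as a manual step.

```python
from pathlib import Path


def scaffold_game(repo_root, class_name, env_name):
    """Copy bigfish.cpp to <env_name>.cpp and rename its identifiers.

    Hypothetical helper: replaces BigFish with class_name and "bigfish"
    (including the quotes) with the quoted env_name.
    """
    src = Path(repo_root) / "src" / "games" / "bigfish.cpp"
    dst = src.with_name(f"{env_name}.cpp")
    if dst.exists():
        # Refuse to overwrite an existing game file.
        raise FileExistsError(dst)
    text = src.read_text()
    text = text.replace("BigFish", class_name)
    text = text.replace('"bigfish"', f'"{env_name}"')
    dst.write_text(text)
    return dst
```

For example, scaffold_game(".", "MyGame", "mygame") run from the repository root would create src/games/mygame.cpp, after which the remaining steps can be done by hand.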
This repo includes a Travis configuration that will compile your environment and build python wheels for easy installation. To make this build faster by caching the Qt compilation, configure a GCS bucket in common.py and set up service account credentials.
See CHANGES for changes present in each release.
See CONTRIBUTING for information on contributing.
See ASSET_LICENSES for asset license information.
Please cite using the following bibtex entry:
@article{cobbe2019procgen,
title={Leveraging Procedural Generation to Benchmark Reinforcement Learning},
author={Cobbe, Karl and Hesse, Christopher and Hilton, Jacob and Schulman, John},
journal={arXiv preprint arXiv:1912.01588},
year={2019}
}