forked from Baichenjia/PBRL

Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning

yuxudong20/PBRL

 
 

PBRL

Introduction

This is a PyTorch implementation of our paper:

Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning, ICLR 2022.

Prerequisites

Installation and Usage

Install the rlkit package from the d4rl directory:

cd d4rl
pip install -e .

To run PBRL on the MuJoCo environments:

python examples/pevi_mujoco.py --env walker2d-medium-v2 --gpu 0

To run PBRL-Prior on the MuJoCo environments, add the --prior flag:

python examples/pevi_mujoco.py --env walker2d-medium-v2 --prior --gpu 0

To run PBRL on the Adroit environments:

python examples/pevi_adroit.py --env pen-cloned-v0 --gpu 0

To run PBRL-Prior on the Adroit environments, add the --prior flag:

python examples/pevi_adroit.py --env pen-cloned-v0 --prior --gpu 0
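The four commands above differ only in the script, the environment name, and whether --prior is passed. As a minimal sketch, a small launcher could assemble them from those three pieces. The helper names build_command and run_all are hypothetical and not part of this repository; only the flags shown above (--env, --prior, --gpu) are used, and by default the launcher only prints the commands instead of starting training.

```python
import subprocess

def build_command(script, env, prior=False, gpu=0):
    """Assemble one command line using only the flags documented above."""
    cmd = ["python", script, "--env", env]
    if prior:
        cmd.append("--prior")  # selects the PBRL-Prior variant
    cmd += ["--gpu", str(gpu)]
    return cmd

def run_all(jobs, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    commands = []
    for script, env, prior in jobs:
        cmd = build_command(script, env, prior)
        commands.append(cmd)
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sweep if a run exits with an error
            subprocess.run(cmd, check=True)
    return commands

if __name__ == "__main__":
    # The same four runs listed in this README
    jobs = [
        ("examples/pevi_mujoco.py", "walker2d-medium-v2", False),
        ("examples/pevi_mujoco.py", "walker2d-medium-v2", True),
        ("examples/pevi_adroit.py", "pen-cloned-v0", False),
        ("examples/pevi_adroit.py", "pen-cloned-v0", True),
    ]
    run_all(jobs)
```

Run it from the repository root so that the relative script paths resolve, and pass dry_run=False to actually start the runs one after another.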

The core implementation is in d4rl/rlkit/torch/sac/pevi.py.

Execution

The data for each run is stored on disk under the result directory, in a subdirectory named <env-id>-<timestamp>/<seed>/. Each run directory contains:

  • debug.log: records the epoch, Q-value, uncertainty value, and scores.
  • progress.csv: the same data as debug.log, in CSV format.
  • variant.json: the hyperparameters used for training.
  • models: the final actor-critic network.
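The layout above can be walked with a few lines of standard-library Python. The following is a sketch under stated assumptions: the location of the result directory is supplied by the caller (this README does not give its exact path), and find_runs and load_variant are hypothetical helper names, not functions from this repository.

```python
import json
from pathlib import Path

def find_runs(result_dir):
    """Return (env_and_timestamp, seed, path) for each <env-id>-<timestamp>/<seed>/ directory."""
    runs = []
    for run_path in sorted(Path(result_dir).glob("*/*")):
        # Treat a directory as a run only if it holds variant.json
        if run_path.is_dir() and (run_path / "variant.json").is_file():
            runs.append((run_path.parent.name, run_path.name, run_path))
    return runs

def load_variant(run_path):
    """Load the training hyperparameters saved for one run."""
    with open(Path(run_path) / "variant.json") as f:
        return json.load(f)
```

Calling find_runs on the result directory and then load_variant on each returned path gives a quick way to check which hyperparameters produced which run before comparing scores.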

The evaluation/d4rl score entry in debug.log and progress.csv is the normalized score reported in our paper.
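To pull the normalized score out of a finished run, progress.csv can be read with the standard csv module. This is a minimal sketch that assumes the column header is literally "evaluation/d4rl score" with one row per logged epoch; check the header of your own progress.csv and adjust SCORE_KEY if it differs. The helper names read_scores and mean_of_last are hypothetical.

```python
import csv

# Assumed column header; verify against the first line of progress.csv
SCORE_KEY = "evaluation/d4rl score"

def read_scores(csv_path, key=SCORE_KEY):
    """Return the score column as a list of floats, skipping empty cells."""
    scores = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            value = row.get(key)
            if value:  # skips missing columns and empty strings
                scores.append(float(value))
    return scores

def mean_of_last(scores, last_n=10):
    """Average the final last_n scores; return None if there are no scores."""
    if last_n <= 0:
        raise ValueError("last_n must be positive")
    tail = scores[-last_n:]
    return sum(tail) / len(tail) if tail else None
```

Averaging over the last few evaluations is only one way to summarize a run; it is used here for illustration and is not necessarily the protocol behind the numbers in the paper.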

If you have questions, bug reports, or suggestions for improvement, please feel free to open an issue.
