Technische Universität Darmstadt, Winter Semester 2018/2019
Supervision: Jan Peters, Samuele Tosatto
- Cartpole Stabilization (Further info)
- Cartpole Swing-up (Further info)
- Qube/Furuta Pendulum (Further info)
The following Python packages are required:
- autograd
- baselines - For installation details see: https://github.com/openai/baselines
- dill
- GPy
- gym
- matplotlib
- matplotlib2tikz (optional; only needed to save PILCO plots)
- numpy
- pytorch
- scipy
- tensorboard
- tensorboardX
- tensorflow
- torchvision
- quanser_robots
The following Linux packages are required:
- ffmpeg
The following are required for the PILCO test cases:
- Octave installation
- oct2py (Python package)
Alternatively, all required packages can be installed from our exported Anaconda environment.
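To create a new Anaconda environment from a YML file, use:
conda env create --name my_env_name --file path/to/conda_env.yml
Note that conda env create takes package versions from the YML file and does not accept additional package specifications such as python=3.6.5 on the command line.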
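After installation, a quick sanity check can confirm that the listed requirements are visible from the active environment. The following is a minimal sketch, not part of the repository: the helper names `missing_modules` and `missing_executables` are hypothetical, and the import names are assumed to match the package names listed above (with the exception of PyTorch, which is imported as `torch`).

```python
import importlib.util
import shutil

# Import names for the Python packages listed above (assumed to match the
# package names, except PyTorch, which is imported as "torch").
REQUIRED = [
    "autograd", "baselines", "dill", "GPy", "gym", "matplotlib", "numpy",
    "torch", "scipy", "tensorboard", "tensorboardX", "tensorflow",
    "torchvision", "quanser_robots",
]
# Only needed for saving PILCO plots and for the PILCO test cases.
OPTIONAL = ["matplotlib2tikz", "oct2py"]


def missing_modules(names):
    """Return the names that cannot be located as top-level modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def missing_executables(names):
    """Return the names that are not found on the PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    print("Missing required packages:", missing_modules(REQUIRED))
    print("Missing optional packages:", missing_modules(OPTIONAL))
    # ffmpeg is required; octave is only needed for the PILCO test cases.
    print("Missing executables:", missing_executables(["ffmpeg", "octave"]))
```

The check only locates the modules without importing them, so it stays fast even for heavy packages such as tensorflow; it does not verify versions.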
Please be aware that the Quanser environments are still subject to change, so results and policies may no longer be reproducible or applicable. The latest Quanser version introduced different constraints for the cartpole environment, which can cause issues.
We added a small subset of experiment runs that we found useful for getting a better feeling for the hyper-parameters and the algorithm in general. These runs make it possible to compare different hyper-parameter settings, performance, and sample efficiency.
More details can be found here.
In order to run experiments with A3C or PILCO, please check the corresponding README.
Log files for all runs will be saved to ./experiments/logs/.
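To find the output of the most recent runs, the log directory can be scanned by modification time. This is a minimal sketch under stated assumptions: the helper `latest_logs` is hypothetical and not part of the repository, and because the layout inside the log directory is not described here, it simply treats every file below it as a log file.

```python
from pathlib import Path


def latest_logs(log_dir="./experiments/logs/", limit=5):
    """Return up to `limit` files below `log_dir`, newest first."""
    root = Path(log_dir)
    if not root.is_dir():
        # No runs have been logged yet (or the path is wrong).
        return []
    files = [path for path in root.rglob("*") if path.is_file()]
    files.sort(key=lambda path: path.stat().st_mtime, reverse=True)
    return files[:limit]


if __name__ == "__main__":
    for path in latest_logs():
        print(path)
```

Run from the repository root, this prints the paths of the most recently written files, which is usually enough to locate the run that just finished.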
Our comprehensive report can be found here
@software{otto_czech_2019,
title = {Project Lab Reinforcement Learning, {TU} Darmstadt, {WS}18/19: {ottofabian}/{RL}-Project},
url = {https://github.com/ottofabian/RL-Project},
shorttitle = {Project Lab Reinforcement Learning, {TU} Darmstadt, {WS}18/19},
author = {Otto, Fabian and Czech, Johannes},
urldate = {2019-03-15},
date = {2019-03-15},
}