Mainly follows OpenAI Spinning Up, with these additions:
- PyTorch CUDA support
- D3QN implementation
- Automatic alpha (entropy temperature) tuning for SAC
- Extra arguments and updates for the RL methods
- Better code linting
- Uses ROBEL for method evaluation
- Multiprocess learning for DDPG, SAC, TD3, and D3QN
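
For readers unfamiliar with two of the items above, here is a minimal sketch of the ideas behind D3QN (dueling double DQN) and automatic alpha tuning in SAC. It is not this repository's code: it uses plain Python lists instead of PyTorch tensors so it runs without dependencies, and the function names `dueling_q`, `double_dqn_target`, and `alpha_step` are hypothetical, chosen only for illustration.

```python
def dueling_q(value, advantages):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(reward, done, gamma, q_online_next, q_target_next):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    best_action = max(range(len(q_online_next)), key=lambda i: q_online_next[i])
    # No bootstrapping from the next state when the episode has ended.
    bootstrap = 0.0 if done else gamma * q_target_next[best_action]
    return reward + bootstrap


def alpha_step(log_alpha, log_probs, target_entropy, lr):
    """One gradient-descent step on the common SAC temperature loss
    J = -log_alpha * mean(log_pi + target_entropy),
    with the gradient written out by hand (assumed form, not taken from the repo)."""
    grad = -(sum(log_probs) / len(log_probs) + target_entropy)
    return log_alpha - lr * grad


if __name__ == "__main__":
    q_next_online = dueling_q(1.0, [1.0, 2.0, 3.0])
    q_next_target = dueling_q(0.5, [0.0, 0.0, 3.0])
    print(double_dqn_target(1.0, False, 0.99, q_next_online, q_next_target))
    # Policy entropy below the target pushes log_alpha up.
    print(alpha_step(0.0, [2.0, 2.0], target_entropy=-1.0, lr=0.1))
```

In a PyTorch implementation the same temperature update would typically be done by making `log_alpha` a trainable tensor and letting an optimizer compute the gradient, rather than coding it by hand as in this sketch.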