Action Candidate based Clipped Double Q-learning (Accepted by AAAI 2021)

Jiang-HB/AC_CDQ

Action Candidate Based Clipped Double Q-learning for Discrete and Continuous Actions

PyTorch implementation of our action candidate based clipped double estimator (AC-CDE), action candidate based clipped Double Q-learning (AC-CDQ), action candidate based clipped Double DQN (AC-CDDQN) and action candidate based TD3 (AC-TD3).

The paper is available on arXiv.
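For orientation, here is a minimal sketch of the action candidate based clipped double estimator as the paper's abstract describes it: pick the top-K action candidates under one set of estimates, choose the best of those candidates under the other set, then clip the chosen action's value in the first set by the maximum of the second set. This is not the code in `AC_CDE_code`; the function name and the argument names `mu_a`, `mu_b` and `k` are assumptions made for illustration.

```python
def ac_clipped_double_estimate(mu_a, mu_b, k):
    """Illustrative AC-CDE sketch (hypothetical helper, not the repository's code).

    mu_b: estimates used to select the K action candidates.
    mu_a: estimates used to pick among the candidates and to clip.
    k:    number of action candidates, 1 <= k <= number of actions.
    """
    n = len(mu_a)
    if len(mu_b) != n:
        raise ValueError("mu_a and mu_b must have the same length")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of actions")

    # Step 1: the K actions with the highest values under mu_b.
    candidates = sorted(range(n), key=lambda i: mu_b[i], reverse=True)[:k]

    # Step 2: among the candidates, the action with the highest value under mu_a.
    chosen = max(candidates, key=lambda i: mu_a[i])

    # Step 3: clip the chosen action's mu_b value by the maximum of mu_a.
    return min(mu_b[chosen], max(mu_a))
```

With `k` equal to the number of actions, this sketch reduces to the ordinary clipped double estimator, since the chosen action is then the overall argmax under `mu_a`.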

Usage

  1. For AC-CDE, we evaluate it on the multi-armed bandit problem. The result can be reproduced by running:

    cd AC_CDE_code
    python3 main.py
    
  2. For AC-CDQ, we evaluate it on the grid world game. The result can be reproduced by running:

    cd AC_CDQ_code
    python3 main.py
    
  3. For AC-CDDQN, we evaluate it on the MinAtar benchmark. The result can be reproduced by running:

    cd AC_CDDQN_code
    CUDA_VISIBLE_DEVICES=0 python3 main.py
    
  4. For AC-TD3, we evaluate it on MuJoCo continuous control tasks. The result can be reproduced by running:

    cd AC_TD3_code
    CUDA_VISIBLE_DEVICES=0 python3 main.py
    
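To run the four experiments back to back, the commands above can be wrapped in a small driver script. The following is only a sketch: the helper names `build_commands` and `run_all`, the `gpu_id` parameter, and the assumption that the script sits in the repository root are not part of this repository. It uses the same directories and commands as the list above, and sets `CUDA_VISIBLE_DEVICES` only for the two experiments that set it there.

```python
import os
import subprocess

# (directory, whether the README sets CUDA_VISIBLE_DEVICES for it)
EXPERIMENTS = [
    ("AC_CDE_code", False),    # multi-armed bandits
    ("AC_CDQ_code", False),    # grid world
    ("AC_CDDQN_code", True),   # MinAtar
    ("AC_TD3_code", True),     # MuJoCo
]


def build_commands(gpu_id="0"):
    """Return (directory, extra_env, argv) for each experiment, in README order."""
    commands = []
    for directory, pin_gpu in EXPERIMENTS:
        extra_env = {"CUDA_VISIBLE_DEVICES": gpu_id} if pin_gpu else {}
        commands.append((directory, extra_env, ["python3", "main.py"]))
    return commands


def run_all(repo_root=".", gpu_id="0"):
    """Run every experiment in sequence; stop at the first failing one."""
    for directory, extra_env, argv in build_commands(gpu_id):
        env = dict(os.environ, **extra_env)
        workdir = os.path.join(repo_root, directory)
        subprocess.run(argv, cwd=workdir, env=env, check=True)
```

Calling `run_all()` from the repository root is equivalent to running the four command pairs in order; `check=True` makes `subprocess.run` raise `CalledProcessError` if any `main.py` exits with a non-zero status.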
