Implementation of the phasic policy gradient (PPG) algorithm for stable-baselines3.
The CNN policy with an auxiliary head is currently missing, so you can
only use the AuxMlpPolicy.
To initialize the policy with the paper's initialization values,
uncomment the code for init_weights in ./ppg/aux_ac_policy.py.
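
For orientation, here is a minimal sketch of how a weight-initialization function is commonly applied to a PyTorch module. It is not the code in ./ppg/aux_ac_policy.py: the function name init_weights_sketch, the orthogonal scheme, and the gain value are assumptions used only for illustration, and the repository's init_weights may use different values.

```python
import torch
from torch import nn


def init_weights_sketch(module: nn.Module, gain: float = 2 ** 0.5) -> None:
    """Hypothetical stand-in for an init function (not the repository's code).

    Assumption: orthogonal weights and zero biases for linear layers.
    """
    if isinstance(module, nn.Linear):
        nn.init.orthogonal_(module.weight, gain=gain)
        if module.bias is not None:
            nn.init.zeros_(module.bias)


# Toy network standing in for a policy; layer sizes are arbitrary.
net = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 2))

# nn.Module.apply calls the function on every submodule, then on net itself.
net.apply(init_weights_sketch)
```

Because nn.Module.apply visits every submodule, a function of this shape only has to decide what to do per layer type; layers it does not recognize keep their default initialization.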