## Description
Adds TorchRL training configuration for the Anymal-D velocity environment as a template for training IsaacLab environments with the new torchrl training workflow.
You can try training Anymal with TorchRL (after merging all TorchRL PRs) by running the following from `/workspace/isaaclab/source/standalone/workflows/torchrl`:

```bash
python train.py --task Isaac-Velocity-Flat-Anymal-D-v0 --num_envs 4096
```
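For a sense of what such a training configuration carries, here is an illustrative sketch. The class name, field names, and values are my assumptions for illustration (a real IsaacLab config would likely use the `configclass` decorator rather than a plain dataclass), not the exact contents of this PR:

```python
from dataclasses import dataclass


@dataclass
class AnymalDFlatPPORunnerCfg:
    """Hypothetical TorchRL PPO runner configuration for Anymal-D flat-terrain velocity tracking."""

    # rollout collection
    num_envs: int = 4096            # parallel simulated environments
    num_steps_per_env: int = 24     # rollout horizon per environment before each update
    max_iterations: int = 1500      # total training iterations

    # PPO hyperparameters (illustrative values in the range used by RSL-RL-style trainers)
    learning_rate: float = 1.0e-3
    gamma: float = 0.99             # discount factor
    lam: float = 0.95               # GAE lambda
    clip_param: float = 0.2         # PPO clipping range
    entropy_coef: float = 0.005     # entropy bonus weight
    desired_kl: float = 0.01        # target KL for adaptive learning-rate scheduling
```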
Related PRs:
#1178, #1179
This is the last PR in the group of 3 that adds the TorchRL training pipeline.
Unfortunately, the Anymal-D environment converges more slowly than it does with RSL-RL. This is probably due to policy-architecture and PPO-implementation differences, which require different hyperparameter settings. I have provided the best hyperparameters that have worked for me so far. While it is possible to speed up convergence by increasing the `desired_kl` target and `entropy_coef` to closely match RSL-RL, TorchRL policies seem to crash late in training due to spurious KL/action-noise spikes after the reward has long converged (see the sketch below for how `desired_kl` typically enters the update).
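For reference, `desired_kl` is typically consumed by an RSL-RL-style adaptive learning-rate rule; a minimal sketch of that scheme follows, assuming the TorchRL pipeline mirrors it (the function name and bounds are mine):

```python
def adapt_learning_rate(
    lr: float,
    kl: float,
    desired_kl: float = 0.01,
    lr_min: float = 1.0e-5,
    lr_max: float = 1.0e-2,
) -> float:
    """Adjust the learning rate based on the measured policy KL divergence.

    Shrinks the learning rate when the update overshoots the KL target,
    and grows it when the update is overly conservative.
    """
    if kl > desired_kl * 2.0:
        # update moved the policy too far: slow down
        lr = max(lr_min, lr / 1.5)
    elif 0.0 < kl < desired_kl / 2.0:
        # update barely moved the policy: speed up
        lr = min(lr_max, lr * 1.5)
    return lr
```

Under this scheme, raising `desired_kl` keeps the learning rate higher for longer, which speeds convergence but also permits the larger update steps that can trigger the late-training spikes mentioned above.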
## Training curves and video

Video_3197_7d9452698382ed1512bd.mp4
## Type of change
## Checklist

- I have run the `pre-commit` checks with `./isaaclab.sh --format`
- I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file
- I have added my name to the `CONTRIBUTORS.md` or my name already exists there