
Shape Mismatch Error in VecNormalize during PPO Training #3

Closed
Cozokim opened this issue Sep 18, 2024 · 2 comments
Comments

Cozokim commented Sep 18, 2024

Steps to reproduce: I ran the following command as suggested in the readme:

python train_policy.py ../config/mujuco_BlockedHalfCheetah/train_ppo_HCWithPos-v0.yaml -n 1 -s 123

Beginning training
100%| 200000/200000 [00:00<00:00, 176379478.55it/s]
Traceback (most recent call last):
  File "/home/username/Code/ICRL-benchmarks-public/interface/train_policy.py", line 311, in <module>
    train(args)
  File "/home/username/Code/ICRL-benchmarks-public/interface/train_policy.py", line 211, in train
    policy_agent.learn(total_timesteps=forward_timesteps,
  File "/home/username/Code/ICRL-benchmarks-public/stable_baselines3/ppo/ppo.py", line 255, in learn
    return super(PPO, self).learn(
  File "/home/username/Code/ICRL-benchmarks-public/stable_baselines3/common/on_policy_algorithm.py", line 223, in learn
    total_timesteps, callback = self._setup_learn(
  File "/home/username/Code/ICRL-benchmarks-public/stable_baselines3/common/base_class.py", line 347, in _setup_learn
    self._last_obs = self.env.reset()
  File "/home/username/Code/ICRL-benchmarks-public/stable_baselines3/common/vec_env/vec_normalize.py", line 157, in reset
    return self.normalize_obs(obs)
  File "/home/username/Code/ICRL-benchmarks-public/stable_baselines3/common/vec_env/vec_normalize.py", line 113, in normalize_obs
    obs = np.clip((obs - self.obs_rms.mean) / np.sqrt(self.obs_rms.var + self.epsilon), -self.clip_obs, self.clip_obs)
ValueError: operands could not be broadcast together with shapes (1,18) (17,)
Description: I encountered this error while following the tutorial to train a policy using PPO with train_policy.py. The normalization step in VecNormalize is operating on arrays with incompatible shapes: the vectorized environment returns observations of shape (1, 18), while the running statistics have shape (17,).

It looks like the observation shapes aren't handled correctly when normalizing observations with VecNormalize.
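The broadcasting failure can be reproduced in isolation with plain NumPy, using the array sizes from the traceback (the variable names below are illustrative, not the benchmark's own):

```python
import numpy as np

obs = np.zeros((1, 18))   # batch of observations from the vectorized env (shape per the traceback)
rms_mean = np.zeros(17)   # running mean fitted for a 17-dim observation space
rms_var = np.ones(17)
epsilon = 1e-8
clip_obs = 10.0

try:
    # Same expression VecNormalize.normalize_obs evaluates
    np.clip((obs - rms_mean) / np.sqrt(rms_var + epsilon), -clip_obs, clip_obs)
except ValueError as e:
    print(e)  # operands could not be broadcast together with shapes (1,18) (17,)
```

NumPy can only broadcast arrays whose trailing dimensions match (or are 1), so 18 vs. 17 fails immediately; the real question is why the environment emits one more observation feature than the statistics were built for.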

Request: Could anyone help with this issue? It seems like a problem with the observation vector dimensions during PPO training. Any advice on how to resolve this shape mismatch would be appreciated.

Thank you!

Guiliang (Owner) commented

Hi, this issue is primarily due to a version mismatch with the 'gym' library. Please ensure that you are using version 0.21.0 of 'gym'.
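This fix can be turned into a fail-fast sanity check. A minimal sketch, assuming the benchmark requires exactly gym 0.21.0 as stated above (the helper name is mine, not part of the repo):

```python
def check_gym_version(installed: str, required: str = "0.21.0") -> None:
    """Raise if the installed gym version is not the one the benchmark expects."""
    if installed != required:
        raise RuntimeError(
            f"gym {installed} found, but gym {required} is required; "
            f"run: pip install gym=={required}"
        )
```

Calling this near the top of a training script, e.g. with `importlib.metadata.version("gym")`, would turn the opaque broadcasting error into an actionable message.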

Cozokim (Author) commented Sep 18, 2024

Thanks for the quick response!
I had initially installed a different version because of trouble installing Gym 0.21.0, but once I worked out how to install that version, the problem was resolved.
For anyone having trouble installing Gym 0.21.0, see: openai/gym#3202

Cozokim closed this as completed Sep 18, 2024