Describe the bug
/home/skr/miniconda3/envs/py38_2/bin/python /home/skr/PettingZoo/tutorials/AgileRL/render_agilerl_maddpg.py
Traceback (most recent call last):
File "/home/skr/PettingZoo/tutorials/AgileRL/render_agilerl_maddpg.py", line 118, in
cont_actions, discrete_action = maddpg.getAction(
File "/home/skr/miniconda3/envs/py38_2/lib/python3.8/site-packages/agilerl/algorithms/maddpg.py", line 418, in getAction
action_values = actor(state)
File "/home/skr/miniconda3/envs/py38_2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/skr/miniconda3/envs/py38_2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/home/skr/miniconda3/envs/py38_2/lib/python3.8/site-packages/agilerl/networks/evolvable_mlp.py", line 287, in forward
x = self.feature_net(x)
File "/home/skr/miniconda3/envs/py38_2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/skr/miniconda3/envs/py38_2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/home/skr/miniconda3/envs/py38_2/lib/python3.8/site-packages/torch/nn/modules/container.py", line 219, in forward
input = module(input)
File "/home/skr/miniconda3/envs/py38_2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/skr/miniconda3/envs/py38_2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/home/skr/miniconda3/envs/py38_2/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 117, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (336x84 and 4x64)
Process finished with exit code 1
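For context, the final shapes in the error are consistent with a default MLP head receiving a batched, channels-first image observation. A minimal sketch that reproduces the same message (the nn.Linear(4, 64) layer here is illustrative, standing in for the actor's first feature_net layer):

import torch
import torch.nn as nn

layer = nn.Linear(4, 64)         # expects flat 4-dimensional features
obs = torch.zeros(1, 4, 84, 84)  # batched, channels-first stacked frames
layer(obs)  # RuntimeError: mat1 and mat2 shapes cannot be multiplied (336x84 and 4x64)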
Code example
import os

import imageio
import numpy as np
import supersuit as ss
import torch
from agilerl.algorithms.maddpg import MADDPG
from PIL import Image, ImageDraw

from pettingzoo.atari import space_invaders_v2


# Define function to return image
def _label_with_episode_number(frame, episode_num):
    im = Image.fromarray(frame)
    drawer = ImageDraw.Draw(im)
    if np.mean(frame) < 128:
        text_color = (255, 255, 255)
    else:
        text_color = (0, 0, 0)
    drawer.text(
        (im.size[0] / 20, im.size[1] / 18), f"Episode: {episode_num+1}", fill=text_color
    )
    return im


if __name__ == "__main__":
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Configure the environment
    env = space_invaders_v2.parallel_env(render_mode="rgb_array")
    channels_last = True  # Needed for environments that use images as observations

    if channels_last:
        # Environment processing for image based observations
        env = ss.frame_skip_v0(env, 4)
        env = ss.clip_reward_v0(env, lower_bound=-1, upper_bound=1)
        env = ss.color_reduction_v0(env, mode="B")
        env = ss.resize_v1(env, x_size=84, y_size=84)
        env = ss.frame_stack_v1(env, 4)
    env.reset()

    try:
        state_dim = [env.observation_space(agent).n for agent in env.agents]
        one_hot = True
    except Exception:
        state_dim = [env.observation_space(agent).shape for agent in env.agents]
        one_hot = False
    try:
        action_dim = [env.action_space(agent).n for agent in env.agents]
        discrete_actions = True
        max_action = None
        min_action = None
    except Exception:
        action_dim = [env.action_space(agent).shape[0] for agent in env.agents]
        discrete_actions = False
        max_action = [env.action_space(agent).high for agent in env.agents]
        min_action = [env.action_space(agent).low for agent in env.agents]

    # Pre-process image dimensions for pytorch convolutional layers
    if channels_last:
        state_dim = [
            (state_dim[2], state_dim[0], state_dim[1]) for state_dim in state_dim
        ]

    # Append number of agents and agent IDs to the initial hyperparameter dictionary
    n_agents = env.num_agents
    agent_ids = env.agents

    # Instantiate an MADDPG object
    maddpg = MADDPG(
        state_dim,
        action_dim,
        one_hot,
        n_agents,
        agent_ids,
        max_action,
        min_action,
        discrete_actions,
        device=device,
    )

    # Load the saved algorithm into the MADDPG object
    # path = "./models/MADDPG/MADDPG_trained_agent.pt"
    # maddpg.loadCheckpoint(path)

    # Define test loop parameters
    episodes = 10  # Number of episodes to test agent on
    max_steps = 500  # Max number of steps to take in the environment in each episode

    rewards = []  # List to collect total episodic reward
    frames = []  # List to collect frames
    indi_agent_rewards = {
        agent_id: [] for agent_id in agent_ids
    }  # Dictionary to collect individual agent rewards

    # Test loop for inference
    for ep in range(episodes):
        state, info = env.reset()
        agent_reward = {agent_id: 0 for agent_id in agent_ids}
        score = 0
        for _ in range(max_steps):
            if channels_last:
                state = {
                    agent_id: np.moveaxis(np.expand_dims(s, 0), [3], [1])
                    for agent_id, s in state.items()
                }

            agent_mask = info["agent_mask"] if "agent_mask" in info.keys() else None
            env_defined_actions = (
                info["env_defined_actions"]
                if "env_defined_actions" in info.keys()
                else None
            )

            # Get next action from agent
            cont_actions, discrete_action = maddpg.getAction(
                state,
                epsilon=0,
                agent_mask=agent_mask,
                env_defined_actions=env_defined_actions,
            )
            if maddpg.discrete_actions:
                action = discrete_action
            else:
                action = cont_actions

            # Save the frame for this step and append to frames list
            frame = env.render()
            frames.append(_label_with_episode_number(frame, episode_num=ep))

            # Take action in environment
            state, reward, termination, truncation, info = env.step(action)

            # Save agent's reward for this step in this episode
            for agent_id, r in reward.items():
                agent_reward[agent_id] += r

            # Determine total score for the episode and then append to rewards list
            score = sum(agent_reward.values())

            # Stop episode if any agents have terminated
            if any(truncation.values()) or any(termination.values()):
                break

        rewards.append(score)

        # Record agent specific episodic reward for each agent
        for agent_id in agent_ids:
            indi_agent_rewards[agent_id].append(agent_reward[agent_id])

        print("-" * 15, f"Episode: {ep}", "-" * 15)
        print("Episodic Reward: ", rewards[-1])
        for agent_id, reward_list in indi_agent_rewards.items():
            print(f"{agent_id} reward: {reward_list[-1]}")

    env.close()

    # Save the gif to specified path
    gif_path = "./videos/"
    os.makedirs(gif_path, exist_ok=True)
    imageio.mimwrite(
        os.path.join("./videos/", "space_invaders.gif"), frames, duration=10
    )
System info
(py38) skr@skr-B650M-Pro-RS-WiFi:~/PettingZoo$ pip list
Package Version
accelerate 0.18.0
agilerl 0.1.19
antlr4-python3-runtime 4.9.3
appdirs 1.4.4
async-timeout 5.0.1
atari-py 0.2.9
AutoROM 0.6.1
AutoROM.accept-rom-license 0.6.1
blinker 1.8.2
cachetools 5.5.0
certifi 2024.8.30
cffi 1.17.1
cfgv 3.4.0
charset-normalizer 3.4.0
chess 1.7.0
click 8.1.7
cloudpickle 1.2.2
contourpy 1.1.1
cycler 0.12.1
dill 0.3.9
distlib 0.3.9
docker-pycreds 0.4.0
Farama-Notifications 0.0.4
fastrand 1.8.0
filelock 3.16.1
Flask 3.0.3
flatten-dict 0.4.2
fonttools 4.55.3
fsspec 2024.10.0
future 1.0.0
gitdb 4.0.11
GitPython 3.1.43
google-api-core 2.24.0
google-auth 2.37.0
google-cloud-core 2.4.1
google-cloud-storage 2.19.0
google-crc32c 1.5.0
google-resumable-media 2.7.2
googleapis-common-protos 1.66.0
gym 0.14.0
gym-notices 0.0.8
gymnasium 0.28.1
h5py 3.11.0
hanabi-learning-environment 0.0.4
huggingface-hub 0.27.0
hydra-core 1.3.2
identify 2.6.1
idna 3.10
imageio 2.35.1
importlib_metadata 8.5.0
importlib_resources 6.4.5
itsdangerous 2.2.0
jax-jumpy 1.0.0
Jinja2 3.1.4
kiwisolver 1.4.7
markdown-it-py 3.0.0
MarkupSafe 2.1.5
matplotlib 3.7.5
mdurl 0.1.2
minari 0.4.3
mpmath 1.3.0
multi-agent-ale-py 0.1.11
multiagent 0.0.1
networkx 3.1
nodeenv 1.9.1
numpy 1.24.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 9.1.0.70
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.20.5
nvidia-nvjitlink-cu12 12.6.85
nvidia-nvtx-cu12 12.1.105
omegaconf 2.3.0
opencv-python 4.10.0.84
packaging 24.2
pathtools 0.1.2
pettingzoo 1.23.1
pillow 10.4.0
pip 24.2
platformdirs 4.3.6
portion 2.6.0
pre-commit 3.5.0
proto-plus 1.25.0
protobuf 4.25.5
psutil 6.1.1
pyasn1 0.6.1
pyasn1_modules 0.4.1
pycparser 2.22
pygame 2.3.0
pyglet 1.3.2
Pygments 2.18.0
pyparsing 3.1.4
python-dateutil 2.9.0.post0
PyYAML 6.0.2
redis 4.6.0
regex 2024.11.6
requests 2.32.3
rich 13.9.4
rlcard 1.0.5
rsa 4.9
safetensors 0.4.5
scipy 1.10.1
sentry-sdk 2.19.2
setproctitle 1.3.4
setuptools 75.1.0
shellingham 1.5.4
six 1.16.0
smdv 0.1.1
smmap 5.0.1
sortedcontainers 2.4.0
SuperSuit 3.9.3
sympy 1.13.3
termcolor 1.1.0
tinyscaler 1.2.8
tokenizers 0.20.3
torch 2.4.1
torchvision 0.19.1
tqdm 4.67.1
transformers 4.46.3
triton 3.0.0
typer 0.15.1
typing_extensions 4.12.2
urllib3 2.2.3
virtualenv 20.28.0
wandb 0.13.11
websockets 13.1
Werkzeug 3.0.6
wheel 0.44.0
zipp 3.20.2
Additional context
Python 3.8
Checklist