
[BUG] target_value_network_params initialization bug in convert_to_functional() #2523

matinmoezzi opened this issue Oct 28, 2024 · 0 comments
matinmoezzi commented Oct 28, 2024

Bug Description

  1. During the initialization of LossModule, convert_to_functional() runs while the first-layer parameters are still UninitializedParameter(shape=torch.Size(-1)) (the MLP is built without in_features, so its first layer is lazy). Consequently, when target_value_network_params is cloned from them, the clone becomes Parameter(torch.Size(0)), which later fails in the forward pass with RuntimeError: mat2 must be a matrix, got 1-D tensor. (A minimal PyTorch-only illustration follows this list.)

  2. In the DiscreteCQLLoss class, when calling value_estimate, the params argument should reference target_params instead, as indicated in this line of code.
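
A minimal PyTorch-only sketch of the underlying behavior, using nn.LazyLinear directly (which, as far as I can tell, is what MLP falls back to when in_features is omitted):

import torch
import torch.nn as nn

# Illustration only (plain PyTorch, not torchrl): a lazy layer's weight is an
# UninitializedParameter with no materialized storage until the first forward pass.
lazy = nn.LazyLinear(8)
print(isinstance(lazy.weight, nn.parameter.UninitializedParameter))  # True

lazy(torch.randn(2, 4))   # the first forward pass materializes the parameters
print(lazy.weight.shape)  # torch.Size([8, 4])

Cloning the parameters for a target network before any forward pass has materialized them is what produces the empty tensors described above.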

To Reproduce

Steps to reproduce the behavior.

import torch
from tensordict import TensorDict
from torchrl.data import OneHotDiscreteTensorSpec
from torchrl.modules import DistributionalQValueActor, MLP
from torchrl.objectives import DistributionalDQNLoss

nbins = 3
batch_size = 5
action_dim = 2
module = MLP(out_features=(nbins, action_dim), depth=2)  # in_features omitted: the first layer stays lazy/uninitialized
action_spec = OneHotDiscreteTensorSpec(action_dim)
qvalue_actor = DistributionalQValueActor(
    module=module,
    spec=action_spec,
    support=torch.arange(nbins),
)

loss_module = DistributionalDQNLoss(
    qvalue_actor,
    gamma=0.99,
    delay_value=True,
)
td = TensorDict(
    {
        "observation": torch.randn(batch_size, 4),
        "action": torch.nn.functional.one_hot(
            torch.randint(0, action_dim, (batch_size,)), action_dim
        ).float(),
        "next": {
            "observation": torch.randn(batch_size, 4),
            "reward": torch.randn(batch_size, 1),
            "done": torch.zeros(batch_size, 1, dtype=torch.bool),
        },
    },
    batch_size=[batch_size],
)
loss = loss_module(td)
print("Computed loss:", loss)

Running the script fails inside the q-value network's first linear layer:

  File "../../python3.10/site-packages/torch/nn/modules/linear.py", line 117, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat2 must be a matrix, got 1-D tensor
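
For reference, the cloned target parameters can be inspected right after the loss module is constructed (before the failing call); the attribute name is the one from the issue title, and printing it is just a quick way to see the zero-element clone of the lazy first-layer weight:

# Inspection sketch: the target parameters cloned by convert_to_functional()
# contain an empty first-layer weight, cloned from an UninitializedParameter.
print(loss_module.target_value_network_params)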

System info

python=3.10.15
torchrl=0.5.0
torch=2.4.1

import torchrl, numpy, sys
print(torchrl.__version__, numpy.__version__, sys.version, sys.platform)

0.5.0 2.1.2 3.10.15 | packaged by conda-forge | (main, Oct 16 2024, 01:24:20) [Clang 17.0.6 ] darwin

Possible fixes

Specify the input features when constructing the MLP module:

module = MLP(in_features=4, out_features=(nbins, action_dim), depth=2)

This prevents the first-layer parameters from being UninitializedParameter, so the clone made for target_value_network_params has the correct shape.
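
An alternative user-side workaround (a sketch only, assuming the actor accepts a TensorDict with just an "observation" entry; it does not fix the underlying issue in torchrl) is to materialize the lazy parameters with a dummy forward pass before constructing the loss module, so that convert_to_functional() never sees UninitializedParameter entries:

# Workaround sketch (assumptions noted above): run one dummy forward pass
# through the actor to materialize the lazy first layer before the loss module
# clones the parameters into target_value_network_params.
dummy = TensorDict({"observation": torch.randn(1, 4)}, batch_size=[1])
qvalue_actor(dummy)  # initializes the lazy Linear layer inside the MLP

loss_module = DistributionalDQNLoss(qvalue_actor, gamma=0.99, delay_value=True)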

Checklist

  • I have checked that there is no similar issue in the repo (required)
  • I have read the documentation (required)
  • I have provided a minimal working example to reproduce the bug (required)

Thanks to @BY571 and @vmoens
