[Feature] Dispatch for DDPG loss module #1215
Open Questions
This pull request raises some open questions regarding the patch. Some of these questions may also apply to other loss modules.
    class DQNLoss(LossModule):
        ...
        @dispatch(
            source=[
                "observation",
                ("next", "observation"),
                "action",
                ("next", "reward"),
                ("next", "done"),
            ],
            dest=["loss"],
        )
        def forward(self, tensordict: TensorDictBase) -> TensorDict:
            ...

Possible solutions include:

    @property
    def out_keys(self):
        outs = ["loss_objective"]
        if self.critic_coef:
            outs.append("loss_critic")
        if self.entropy_bonus:
            outs.append("entropy")
            outs.append("loss_entropy")
        return outs

Do you see any problems with that approach?
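To make the property-based `out_keys` idea concrete, here is a minimal sketch in plain Python. It does not use tensordict or torchrl: `simple_dispatch`, `flatten_key`, and `ToyLoss` are hypothetical stand-ins invented for illustration, and plain dicts take the place of tensordicts. The sketch assumes that nested keys such as `("next", "reward")` are passed as keyword arguments joined with an underscore.

```python
from functools import wraps


def flatten_key(key, separator="_"):
    """Turn a nested key such as ("next", "reward") into "next_reward"."""
    return key if isinstance(key, str) else separator.join(key)


def simple_dispatch(func):
    """Hypothetical stand-in for a dispatch decorator, using plain dicts.

    When the method is called with keyword arguments instead of a dict,
    build the dict from self.in_keys and return the values listed in
    self.out_keys as a tuple. Both are read at call time, so a property
    that depends on the module's configuration works naturally.
    """

    @wraps(func)
    def wrapper(self, data=None, **kwargs):
        if data is not None:
            return func(self, data)  # regular dict-in, dict-out call
        data = {key: kwargs[flatten_key(key)] for key in self.in_keys}
        out = func(self, data)
        return tuple(out[key] for key in self.out_keys)

    return wrapper


class ToyLoss:
    """Toy loss whose set of outputs depends on its configuration."""

    in_keys = ["observation", ("next", "reward")]

    def __init__(self, critic_coef=0.0):
        self.critic_coef = critic_coef

    @property
    def out_keys(self):
        outs = ["loss_objective"]
        if self.critic_coef:
            outs.append("loss_critic")
        return outs

    @simple_dispatch
    def forward(self, data):
        out = {"loss_objective": data["observation"] - data[("next", "reward")]}
        if self.critic_coef:
            out["loss_critic"] = self.critic_coef * data["observation"]
        return out
```

With this sketch, `ToyLoss(critic_coef=0.5).forward(observation=2.0, next_reward=0.5)` returns a two-element tuple, while `ToyLoss().forward(...)` returns a one-element tuple, because `out_keys` is evaluated on each call rather than fixed in the decorator.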
You can ignore this one.
I'm open to a bc-breaking change in this case.
Two things here:

    value = data.get(key, None)
    if value is None:
        foo()
    else:
        bar()

In this case, we could have the user call
which will populate the tensordict with that key, and if not the resulting value will be None. In summary:
Yes I think they should be part of the
Yes, we should be using a property!
For example, ("next", "reward"), or better ("next", self.tensor_keys.reward), is used in all advantages.
That should be customisable.
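A rough sketch of how customisable key names and an optional entry could fit together, again in plain Python with dicts standing in for tensordicts. `TensorKeys`, `ToyAdvantage`, the `"next_value"` entry, and the `gamma` parameter are hypothetical names chosen for illustration, not the torchrl API; treating a missing done flag as "not terminated" is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class TensorKeys:
    """Hypothetical container for configurable key names."""

    reward: str = "reward"
    done: str = "done"


class ToyAdvantage:
    """Toy one-step target that reads its inputs through configurable keys."""

    def __init__(self, gamma=0.5, **key_overrides):
        self.gamma = gamma
        self.tensor_keys = TensorKeys(**key_overrides)

    @property
    def in_keys(self):
        # Built from tensor_keys, so renaming a key updates in_keys as well.
        return [
            ("next", self.tensor_keys.reward),
            ("next", self.tensor_keys.done),
        ]

    def forward(self, data):
        reward = data[("next", self.tensor_keys.reward)]
        # Optional entry: fall back to None when the key is absent.
        done = data.get(("next", self.tensor_keys.done), None)
        if done is None:
            done = False  # assumption: a missing flag means "not terminated"
        next_value = 0.0 if done else data["next_value"]
        return reward + self.gamma * next_value
```

For instance, `ToyAdvantage(reward="reward_custom")` looks up `("next", "reward_custom")` instead of `("next", "reward")`, and its `in_keys` property reflects the renamed key without any further code changes.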
Fantastic, really fancy!
On a high level: do you think that the in_keys will be recyclable across modules or will we need to re-code it every time?
Not sure whether there is really something common to all loss modules here. To me it seems too dependent on what actually happens inside the loss to recycle anything. But I will think about it; maybe I will get an idea.
Final review
Description
Enable dispatching arguments for the .forward() method of the DDPG loss module.
Types of changes
What types of changes does your code introduce? Remove all that do not apply:
Checklist
Go over all the following points, and put an x in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!