Issue with ACME's Distributional RL Implementation
Hi everyone,
I've been using ACME for some of my projects and ran into an issue in the implementation of distributional reinforcement learning. I thought it might be worth bringing up.
The Issue
While using the losses.categorical function, I noticed it expects both q_tm1 and q_t to be instances of DiscreteValuedDistribution. However, in learning.py these variables are generated as plain tensors. The problem arises when the function tries to access the values and logits attributes of q_tm1 and q_t: since q_t and q_tm1 are tensors, not instances of DiscreteValuedDistribution, this leads to an AttributeError.
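To make the mismatch concrete, here is a minimal, self-contained sketch. It is not ACME's actual code: the stand-in function and the shapes are made up for illustration, and it only shows the attribute accesses that trigger the error.

```python
import tensorflow as tf

def categorical_loss_sketch(q_tm1, r_t, d_t, q_t):
    # Stand-in for losses.categorical: only the attribute accesses that
    # matter here are shown; the real loss does much more.
    atoms = q_t.values      # fails if q_t is a plain tf.Tensor
    logits = q_tm1.logits   # fails if q_tm1 is a plain tf.Tensor
    return atoms, logits

# In learning.py the critic outputs arrive as plain tensors
# (shapes are illustrative: batch of 8, 51 atoms).
q_tm1 = tf.random.normal([8, 51])
q_t = tf.random.normal([8, 51])
r_t = tf.zeros([8])
d_t = tf.ones([8])

try:
    categorical_loss_sketch(q_tm1, r_t, d_t, q_t)
except AttributeError as err:
    print(err)  # 'Tensor' object has no attribute 'values' (or similar)
```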
Quick Questions
Is this intentional, or perhaps an oversight?
If it's intentional, what's the reasoning behind it?
If it's not, what's the best way to go about fixing it?
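On that last question, one direction I could imagine (untested, and assuming DiscreteValuedDistribution can be constructed directly from its atom support and logits, which is how I read acme/tf/networks/distributional.py) would be to wrap the raw critic outputs back into the expected distribution type before calling the loss:

```python
# Untested sketch of a possible workaround, not a confirmed fix.
# Assumes DiscreteValuedDistribution accepts `values` (the atom support)
# and `logits`; the surrounding variable names (q_tm1_logits, q_t_logits,
# atoms, r_t, d_t) are hypothetical placeholders.
from acme.tf import losses, networks

def wrap(logits, atoms):
    # Rebuild the distribution object that losses.categorical expects.
    return networks.DiscreteValuedDistribution(values=atoms, logits=logits)

# loss = losses.categorical(wrap(q_tm1_logits, atoms), r_t, d_t,
#                           wrap(q_t_logits, atoms))
```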
That's it! Would love to get some insights into this. Thanks!
Best,
Miguel Rangel