This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

Continuous Action Space #36

Open

MXD6 opened this issue Nov 16, 2021 · 0 comments

MXD6 commented Nov 16, 2021

Hello,
How can I apply V-trace to a continuous action space?
I treat policy_logits as the parameters (mean and standard deviation) of a normal distribution:

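import torch.distributions as tdist
import torch.nn as nn
import torch.nn.functional as F

def __init__(self, observation_shape, num_actions, use_lstm=False):
     ...
     # Two outputs: the mean and the unconstrained scale of a 1-D Gaussian policy.
     self.policy = nn.Linear(core_output_size, 2)
     ...
def forward(self, inputs, core_state=()):
     ...
     policy_logits = self.policy(core_output)
     # Split along the last (feature) dimension, not the batch dimension.
     mu = policy_logits[..., 0]
     # softplus keeps the standard deviation strictly positive.
     sigma = F.softplus(policy_logits[..., 1]) + 1e-5
     # sample() expects a shape, not an int; with no argument it draws one action per element.
     action = tdist.Normal(mu, sigma).sample()
     ...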
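
One possible direction, as a minimal sketch rather than this repository's implementation: V-trace only needs the log importance ratio log π(a|x) − log μ(a|x) between the learner (target) policy and the actor (behavior) policy, so for a Gaussian policy that ratio can be taken from `Normal.log_prob` instead of from categorical logits. The function names `gaussian_log_rhos` and `vtrace_from_log_rhos`, the time-major `[T, B]` tensor layout, the one-dimensional action, and the clipping thresholds of 1.0 are all assumptions made for illustration.

```python
import torch
import torch.distributions as tdist


def gaussian_log_rhos(actions, target_mu, target_sigma, behavior_mu, behavior_sigma):
    """Log importance ratio log pi(a|x) - log mu(a|x) for Gaussian policies.

    All tensors are assumed to have shape [T, B] (one scalar action per step).
    For multi-dimensional actions, sum the log-probabilities over the action axis.
    """
    target_logp = tdist.Normal(target_mu, target_sigma).log_prob(actions)
    behavior_logp = tdist.Normal(behavior_mu, behavior_sigma).log_prob(actions)
    return target_logp - behavior_logp


@torch.no_grad()
def vtrace_from_log_rhos(log_rhos, discounts, rewards, values, bootstrap_value,
                         clip_rho=1.0, clip_c=1.0):
    """V-trace value targets and policy-gradient advantages (hypothetical helper).

    log_rhos, discounts, rewards, values: [T, B]; bootstrap_value: [B].
    """
    rhos = torch.exp(log_rhos)
    clipped_rhos = torch.clamp(rhos, max=clip_rho)  # truncated rho_t
    cs = torch.clamp(rhos, max=clip_c)              # truncated trace coefficients c_t

    # V(x_{t+1}) for every step, using the bootstrap value after the last step.
    values_tp1 = torch.cat([values[1:], bootstrap_value.unsqueeze(0)], dim=0)
    deltas = clipped_rhos * (rewards + discounts * values_tp1 - values)

    # Backward recursion: (v_t - V_t) = delta_t + gamma_t * c_t * (v_{t+1} - V_{t+1}).
    acc = torch.zeros_like(bootstrap_value)
    vs_minus_v = []
    for t in reversed(range(log_rhos.shape[0])):
        acc = deltas[t] + discounts[t] * cs[t] * acc
        vs_minus_v.append(acc)
    vs_minus_v.reverse()
    vs = values + torch.stack(vs_minus_v, dim=0)

    # Advantages for the policy-gradient term use v_{t+1} as the bootstrap target.
    vs_tp1 = torch.cat([vs[1:], bootstrap_value.unsqueeze(0)], dim=0)
    pg_advantages = clipped_rhos * (rewards + discounts * vs_tp1 - values)
    return vs, pg_advantages
```

With this, the actor would need to store the `mu` and `sigma` it sampled from alongside each action, and the policy loss would use `-(log_prob(action) * pg_advantages)` with the learner's `log_prob`, again under the assumptions stated above.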
