I have a custom environment in OpenAI gym with 3 discrete action variables and 3 continuous state variables and 1 discrete state variable (observation_space). My question now is what exactly does the step function have to return? I have the following code:
#%% import
from gym import Env
from gym.spaces import Discrete, Box, Tuple, MultiDiscrete
import numpy as np

#%%
class Custom_Env(Env):

    def __init__(self):
        # Define the state space

        # State variables
        self.state_1 = 0
        self.state_2 = 0
        self.state_3 = 0
        self.state_4_currentTimeSlots = 0

        # Define the gym components
        self.action_space = MultiDiscrete([10, 10, 27])
        self.observation_space = Box(low=np.array([20, -20, 0, 0]),
                                     high=np.array([22, 250, 100, 287]),
                                     dtype=np.float32)

    def step(self, action):
        # Update state variables
        self.state_1 = self.state_1 + action[0]
        self.state_2 = self.state_2 + action[1]
        self.state_3 = self.state_3 + action[2]

        # Calculate reward
        reward = float(self.state_1 + self.state_2 + self.state_3)

        # Set placeholder for info
        info = {}

        # Check if it's the end of the day
        if self.state_4_currentTimeSlots >= 287:
            done = True
        else:
            done = False

        # Move to the next timeslot
        self.state_4_currentTimeSlots += 1

        state = np.array([self.state_1, self.state_2, self.state_3,
                          self.state_4_currentTimeSlots])

        # Return step information
        return state, reward, done, info

    def render(self):
        pass

    def reset(self):
        self.state_1 = 21
        self.state_2 = 0
        self.state_3 = 0
        self.state_4_currentTimeSlots = 0
        state = np.array([self.state_1, self.state_2, self.state_3,
                          self.state_4_currentTimeSlots])
        return state

#%% Set up the environment and check it
from stable_baselines3.common.env_checker import check_env

env = Custom_Env()
# check_env will check your custom environment and output additional warnings if needed
check_env(env)

from stable_baselines3 import A2C

model = A2C('MlpPolicy', env, verbose=1)
print("Learning started")
model.learn(total_timesteps=10000)
print("Learning ended")
When I check the custom environment with check_env from stable_baselines3.common.env_checker, I get the assertion error: "AssertionError: The observation returned by the step() method does not match the given observation space". I don't understand this, because I am actually returning 4 values with "state = np.array([self.state_1, self.state_2, self.state_3, self.state_4_currentTimeSlots])". Can you tell me what the problem might be?
Make sure you are returning values of the correct type as well (np.float32). I recommend you print out the observations (states) your environment returns, and double-check that everything is within the bounds you set.
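A sketch of what this means for the code above (not a drop-in patch, just an illustration): np.array over plain Python ints produces an integer array, which does not match a Box declared with dtype=np.float32 and trips the env checker. Casting explicitly when building the observation avoids the mismatch:

```python
import numpy as np

def make_obs(state_1, state_2, state_3, timeslot):
    # Cast explicitly so the returned array matches a
    # Box(..., dtype=np.float32) observation space; without the
    # dtype argument, this would be an integer array and fail
    # the observation-space check.
    return np.array([state_1, state_2, state_3, timeslot],
                    dtype=np.float32)

obs = make_obs(21, 0, 0, 0)
print(obs.dtype)  # float32
```

The same cast would go in both step() and reset(), since check_env validates the observations returned by both.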
Btw, I recommend you check the tips section of the docs, specifically the part about normalizing inputs / outputs.
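As a rough sketch of that tip (the exact scaling scheme is up to you): the bounds already declared in the Box space can be reused to map each observation component to roughly [-1, 1] before it reaches the policy network.

```python
import numpy as np

# Bounds taken from the Box space defined in the question.
low = np.array([20, -20, 0, 0], dtype=np.float32)
high = np.array([22, 250, 100, 287], dtype=np.float32)

def normalize_obs(obs):
    # Linearly map each component from [low, high] to [-1, 1];
    # normalized inputs generally make MLP policies train more stably.
    return 2.0 * (obs - low) / (high - low) - 1.0

# The midpoint of every range maps to 0.
obs = np.array([21, 115, 50, 143.5], dtype=np.float32)
norm = normalize_obs(obs)
```

Alternatively, stable-baselines3 provides the VecNormalize wrapper, which learns a running normalization instead of relying on fixed bounds.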
Do note that we do not offer extensive tech support for per-case questions. These issues are mainly for bug reports and enhancement proposals.