Clarification Needed on Implementing Action Masking in DQN with preprocess_fn in Collector #1159

NeoBerekov · 2024-06-07T15:28:16Z

Tag request: please add the "documentation request" tag.

Hi Tianshou Team,

I am currently working on a gymnasium DQN project with action masking and noticed that in Tianshou, all action masks need to be added to the Batch as a "mask" item so that DQNPolicy can handle the masking automatically. To achieve this, I tried passing a preprocess_fn hook when constructing the Collector class, as described in the documentation. However, I found the documentation a bit unclear and couldn't find any relevant examples in the referenced file (test/base/test_collector.py).
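To be concrete about what I mean by masking: actions whose mask entry is 0 should never win the greedy argmax over Q-values. Below is a minimal pure-Python sketch of that idea. `masked_greedy_action` is a hypothetical helper written for illustration and is not taken from Tianshou's code.

```python
def masked_greedy_action(q_values, mask):
    """Return the index of the highest Q-value among actions whose mask entry is truthy.

    Hypothetical helper for illustration only; not part of Tianshou.
    """
    if len(q_values) != len(mask):
        raise ValueError("q_values and mask must have the same length")
    # Indices of the actions that the mask allows.
    valid = [i for i, allowed in enumerate(mask) if allowed]
    if not valid:
        raise ValueError("mask excludes every action")
    # Greedy choice restricted to the allowed actions.
    return max(valid, key=lambda i: q_values[i])
```

For example, with Q-values `[1.0, 5.0, 3.0]` and mask `[1, 0, 1]`, action 1 is excluded even though it has the largest Q-value, so the sketch picks action 2.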

The documentation states:

The "preprocess_fn" is a function called before the data has been added to the buffer with batch format. It will receive only "obs" and "env_id" when the collector resets the environment, and will receive the keys "obs_next", "rew", "terminated", "truncated", "info", "policy" and "env_id" in a normal env step. Alternatively, it may also accept the keys "obs_next", "rew", "done", "info", "policy" and "env_id". It returns either a dict or a :class:`~tianshou.data.Batch` with the modified keys and values. Examples are in "test/base/test_collector.py".
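Read literally, this describes a hook with two call shapes, one for resets and one for steps. The sketch below is only my reading of that paragraph: it assumes the collector passes everything as keyword arguments and merges the returned keys back into its data. The `calls` counter is hypothetical bookkeeping added to make the two shapes visible.

```python
from collections import Counter

# Hypothetical bookkeeping, only here to show which call shape was seen.
calls = Counter()


def preprocess_fn(**kwargs):
    """Tell a reset call from a step call by the keys that were passed in.

    Returns only the keys it wants to override, as the quoted docs describe.
    """
    if "obs_next" in kwargs:
        # Normal env step: obs_next, rew, terminated/truncated (or done),
        # info, policy and env_id are expected here.
        calls["step"] += 1
        return {"obs_next": kwargs["obs_next"]}
    if "obs" in kwargs:
        # Environment reset: the docs mention only obs and env_id.
        calls["reset"] += 1
        return {"obs": kwargs["obs"]}
    # Nothing recognised: leave the collector's data untouched.
    return {}
```

Accepting `**kwargs` keeps the hook tolerant of both key sets mentioned in the documentation ("terminated"/"truncated" versus "done").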

In my current DQN project, the observation space is a dictionary that includes an action mask, which complicates things further:

self.observation_space = gym.spaces.Dict({
    'local_obs': gym.spaces.Box(low=-1, high=5000, shape=(9, local_obs_window, local_obs_window), dtype=np.int16),
    'global_obs': gym.spaces.Box(low=0, high=5000, shape=(9, map_size[0], map_size[1]), dtype=np.int16),
    'action_mask': gym.spaces.MultiBinary(11)
})
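My working assumption, which I have not been able to confirm from the documentation, is that DQNPolicy looks for the mask at `obs.mask` and passes `obs.obs` to the network. If that is right, each observation from the space above would need to be reshaped roughly as follows. `split_mask` is a hypothetical helper, and where it should be called (an environment wrapper or the collector hook) is exactly what I am unsure about.

```python
def split_mask(obs, mask_key="action_mask"):
    """Move the mask out of a dict observation into {"obs": ..., "mask": ...}.

    Hypothetical helper; the {"obs", "mask"} layout is an assumption about
    what the policy expects and should be checked against the installed version.
    """
    # Everything except the mask stays together as the actual observation.
    rest = {key: value for key, value in obs.items() if key != mask_key}
    return {"obs": rest, "mask": obs[mask_key]}
```

Applied to one observation, the `local_obs` and `global_obs` entries end up under `"obs"` and the `action_mask` entry ends up under `"mask"`.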

Do you have any plans to improve the documentation or provide examples regarding this issue? Or should the mask be added to the Batch in a different way other than using preprocess_fn, which I might have overlooked?

Here are the versions of the relevant libraries I am using:
Tianshou version: 0.5.0
Gym version: 0.29.1

Thank you for your assistance.

Best regards


MischaPanch · 2024-09-08T10:28:29Z

Hi @NeoBerekov, sorry for the late reply. You are using a very old version of Tianshou which is no longer supported.

New functionality that replaces preprocess_fn was recently added on the master branch. It is documented quite extensively in the code; please have a look at StepHook and EpisodeHook and at how they are used in Collector. The code for collect has also become much more readable overall and should be easier to understand now. Could you check whether these hooks fit your needs?

MischaPanch added the "question" (Further information is requested) label on Sep 8, 2024
