[Feature Request] Transform that stacks data for agents with identical specs #2566
Comments
Have you taken a look at the `group_map` argument? If you are using this setting, then imo there's an issue in the implementation of the grouping of agents in `UnityMLAgentsEnv`.
@thomasbbrunner, the […] One of the reasons why […] But I'm happy to change the behavior if there is good reason to do so. @vmoens, wdyt?
Ah, I see! Thanks for the explanation. It seems that the […]

Personally, I don't see the benefit of grouping heterogeneous agents. Maybe that makes sense in your use-case, but I'd argue that there is no difference between having one group for each agent and a group containing sub-groups of heterogeneous agents. I quite like the default behavior of […] Would be interested in hearing more about your use-case!
It would be nice if the behavior of all TorchRL environments were consistent with each other. It sometimes feels like multi-agent is not part of the "core" TorchRL capabilities, and I'm hoping we can change that!
I agree that we should make environments consistent with each other when possible. At the time, I didn't think that putting the Unity env agents under separate keys would be a significant inconsistency with other TorchRL envs (at least compared to VMAS, PettingZoo, and OpenSpiel); it just seemed like the right choice because it's more consistent with the underlying ML-Agents API. @vmoens, do you agree with this?
That sounds like a good idea to me. Feel free to submit an issue.
I don't either, but then again, I'm fairly new here. Maybe I took […]
I'm sorry this is the impression you have. @matteobettini put a great deal of effort into unifying the MARL APIs, but if there are inconsistencies we should address them!
I 100% agree on this. Now RE the "consistency" problem highlighted by @thomasbbrunner, as I said earlier, we should make sure that things can be made uniform at a low cost:

```python
env = UnityMLAgentsEnv(...)
transform = GroupMARLAgents(MARLGroups.ONE_GROUP_PER_AGENT)
env = env.append_transform(transform)
```

and then, internally, we make sure that the transform does its job for every MARL library we support. Would that make sense?

nb: Using metaclasses, we could even pass a group argument directly in each wrapper that automatically appends the transform if required.
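For illustration, the constructor-level shorthand mentioned in the nb could look as follows; the `group` keyword, like `GroupMARLAgents` and `MARLGroups` above, is part of the proposal and does not exist in TorchRL today:

```python
# Hypothetical: the wrapper receives the desired grouping at construction
# and appends the GroupMARLAgents transform internally if required.
env = UnityMLAgentsEnv(
    file_name="path/to/unity_build",  # illustrative build path
    group=MARLGroups.ONE_GROUP_PER_AGENT,  # proposed keyword, not an existing one
)
```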
Hey guys, I think there might have been some general confusion about the multi-agent environment API here. If you can, watch this section of this video to understand how it works: https://youtu.be/1tOIMgJf_VQ?si=1RJ7PGD3s5--hI2o&t=1235

Here is a recap:

- The choice of which agents should be stacked together and which kept separate (what we call grouping) has to be 100% up to the user. That is why multi-agent envs should take a group map that maps groups to the agents in that group (see the sketch after this list). The […]
- Environments then have a default behaviour for building the map when nothing is passed. To me, this was better than GROUP_HOMOGENEOUS because it is difficult to define a sensible and consistent behaviour for that. But I am open to discussing this.
- The choice of the grouping needs to be passed at environment construction. If MLAgents is overfitted to one specific choice, I believe we should extend it. Modifying the grouping in a transform later is very inefficient and I think should be avoided.
- MLAgents also has a concept of an agent type (or something similar) which I remember would be great for the default grouping strategy.
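For reference, a sketch of the existing construction-time grouping API that this recap describes, assuming the VMAS backend is installed (the scenario name here is just an example):

```python
from torchrl.envs.libs.vmas import VmasEnv
from torchrl.envs.utils import MarlGroupMapType

# Stack all agents under a single "agents" key.
env_stacked = VmasEnv(
    scenario="balance",
    num_envs=4,
    group_map=MarlGroupMapType.ALL_IN_ONE_GROUP,
)

# One group (and therefore one key) per agent.
env_split = VmasEnv(
    scenario="balance",
    num_envs=4,
    group_map=MarlGroupMapType.ONE_GROUP_PER_AGENT,
)
```

A plain dict mapping group names to lists of agent names is also accepted, which is what makes the grouping fully user-controlled.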
Maybe I am missing what the […]
Motivation
Some multi-agent environments, like `VmasEnv`, stack all of the tensors for observations, rewards, etc. for different agents that have identical specs. For instance, in one of these stacked environments, if there are 2 agents that each have 8 observations, the observation spec holds a single stacked entry of shape `[2, 8]`. In contrast, other environments, like `UnityMLAgentsEnv`, have separate keys for each agent, even if the agents' specs are identical: with 2 agents that each have 8 observations, there is one `[8]`-shaped entry per agent (both layouts are sketched below).

It is not easy to apply the same training script to two environments that use these two different formats. For instance, applying the multi-agent PPO tutorial to a Unity env is not straightforward.
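For concreteness, here is an illustrative sketch of the two spec layouts using TorchRL's `Composite` and `Unbounded` spec classes; the key names (`agents`, `agent_0`, `agent_1`) are assumptions rather than either env's actual output:

```python
import torch
from torchrl.data import Composite, Unbounded

# Stacked layout (VmasEnv-style): one entry whose leading dimension
# indexes the 2 agents, each contributing 8 observations.
stacked_obs_spec = Composite(
    agents=Composite(
        observation=Unbounded(shape=torch.Size([2, 8])),
        shape=torch.Size([2]),
    ),
)

# Per-agent layout (UnityMLAgentsEnv-style): a separate key per agent,
# each with its own 8-dimensional observation spec.
unstacked_obs_spec = Composite(
    agent_0=Composite(observation=Unbounded(shape=torch.Size([8]))),
    agent_1=Composite(observation=Unbounded(shape=torch.Size([8]))),
)
```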
Solution
If we had an environment transform that could stack all the data from different keys, we could convert an environment that uses the unstacked format into an environment that uses the stacked format. Then it should be straightforward to use the same (or almost the same) training script on the two different environments.
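A minimal sketch of the stacking logic itself, written against `tensordict` directly rather than as a TorchRL `Transform`; `stack_agents` and the key names are hypothetical, and a real transform would also need to remap the environment's specs:

```python
import torch
from tensordict import TensorDict

def stack_agents(td: TensorDict, agent_keys: list[str], group: str = "agents") -> TensorDict:
    # Pop each agent's sub-tensordict and stack along a new leading
    # dimension; identical specs guarantee the stack is well defined.
    agent_tds = [td.pop(key) for key in agent_keys]
    td.set(group, torch.stack(agent_tds, dim=0))
    return td

# Two agents with identical 8-dimensional observations.
td = TensorDict(
    {
        "agent_0": {"observation": torch.randn(8)},
        "agent_1": {"observation": torch.randn(8)},
    },
    batch_size=[],
)
td = stack_agents(td, ["agent_0", "agent_1"])
assert td["agents", "observation"].shape == torch.Size([2, 8])
```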
Alternatives
Additional context
Checklist