New Agent Abstraction [Part 1 of Multi-Agent RL] #1169
Conversation
@@ -149,6 +149,23 @@ def register_storage(cls, to_register=None, *, name: Optional[str] = None):
    def get_storage(cls, name: str):
        return cls._get_impl("storage", name)

    @classmethod
    def register_agent_access_mgr(
        cls, to_register=None, *, name: Optional[str] = None
Could you describe what `to_register` means?
Also, what is the use case of `name` being `None`? Is this for the single-agent trainer?
I am following the same model as the other registry functions for policies, storage, updaters, etc. This is to be used as a class decorator: `to_register` is the class being registered and `name` is the key used to refer to it in the registry. The registry is used in exactly the same way for the `AgentAccessMgr` as for the other registered objects. I also added a docstring, which hopefully clarifies usage further.
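To make the decorator semantics concrete, here is a minimal, self-contained sketch of the registry pattern described above. The class and method names mirror the discussion but are illustrative, not the actual habitat-baselines implementation; in particular the `_impls` dict and key defaulting are assumptions.

```python
from typing import Optional

class Registry:
    _impls: dict = {}

    @classmethod
    def register_agent_access_mgr(cls, to_register=None, *, name: Optional[str] = None):
        def wrap(klass):
            # When `name` is None, fall back to the class name as the key.
            key = name if name is not None else klass.__name__
            cls._impls[("agent_access_mgr", key)] = klass
            return klass
        # Support both bare `@register_agent_access_mgr` and the
        # parameterized `@register_agent_access_mgr(name=...)` forms.
        if to_register is None:
            return wrap
        return wrap(to_register)

    @classmethod
    def get_agent_access_mgr(cls, name: str):
        return cls._impls[("agent_access_mgr", name)]


@Registry.register_agent_access_mgr(name="single_agent")
class SingleAgentAccessMgr:
    pass
```

With this, `Registry.get_agent_access_mgr("single_agent")` returns the `SingleAgentAccessMgr` class itself, which the trainer can then instantiate.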
I am okay with these changes. Some minor comments. Please make sure there are no training regressions with either DDPPO or VER. No need for new tests since this does not add functionality.
    @abstractmethod
    def after_update(self) -> None:
        """
        Called after the updater has called `update` and the rollout `after_update` is called.
This sounds like the method is automatically called after the conditions are met, while I think you mean this method NEEDS to be called after the conditions are met. Am I correct?
Yes, I updated the docstring.
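The ordering contract being clarified here can be sketched as a toy training loop. This is illustrative only; the class names and loop structure are assumptions, not the actual trainer code:

```python
class Rollouts:
    def __init__(self):
        self.step = 0

    def after_update(self):
        # e.g. reset the rollout buffer's write pointer
        self.step = 0


class Agent:
    def __init__(self):
        self.updates_done = 0
        self.synced = False

    def update(self, rollouts):
        # stand-in for the PPO/VER updater step
        self.updates_done += 1

    def after_update(self):
        # The TRAINER is responsible for calling this hook only
        # after both `update` and `rollouts.after_update` have run;
        # it is not invoked automatically.
        self.synced = True


rollouts, agent = Rollouts(), Agent()
for _ in range(2):           # training loop
    agent.update(rollouts)   # 1) updater runs
    rollouts.after_update()  # 2) rollout buffer resets
    agent.after_update()     # 3) agent hook must run last
```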
            self.agent.optimizer.load_state_dict(resume_state["optim_state"])
        if self._is_distributed:
            self.agent.init_distributed(find_unused_params=False)  # type: ignore
        logger.add_filehandler(self.config.habitat_baselines.log_file)
Why is `add_filehandler` added here and not at the entry point of training, much earlier in the code?
It is actually occurring at the same point in the code: the previous code in `self._setup_actor_critic_agent` now happens in `self._create_agent`. I think this is the natural place to manage aspects of the config, such as creating the directories and log files it requires.
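A minimal sketch of why attaching the file handler belongs with agent creation rather than at program entry: the log directory may only exist once creation-time setup has run. All names here (`create_agent`, the `logs/train.log` path) are hypothetical, not the habitat-baselines code:

```python
import logging
import os
import tempfile

def create_agent(log_file: str) -> logging.Logger:
    # Directories required by the config are created during agent
    # creation...
    os.makedirs(os.path.dirname(log_file), exist_ok=True)
    # ...so the file handler can only be safely attached afterwards;
    # FileHandler opens (and creates) the file on construction.
    logger = logging.getLogger("trainer")
    logger.addHandler(logging.FileHandler(log_file))
    return logger

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "logs", "train.log")
    log = create_agent(path)
    created = os.path.exists(path)
    log.handlers[0].close()
```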
Ok, fair. The directories might not exist before then.
In general looks good to me. Thanks, @ASzot!
Do I understand correctly that with these changes all the previous code should be runnable/trainable? Should we have a green CI before merging this PR?
Make sure the tests pass correctly. Thanks for doing this!
* Added TrainerAgent
* VER changes
* Consolidated into Agent
* Integrated VER trainer
* Fixed CI
* Added pop play trainer
* Refactored agent access interface
* Functioning
* Pop play running fine
* Pre-commit fixes
* Updated naming
* removed multi-agent
* Removed unnecessary MA file
* Added docstring
* PR comments
* Syntax in VER
* Addressed PR comments
* CI tests
* Swap to val for tests
* Fixed test
Motivation and Context
Refactored trainer code to interface with a new agent abstraction which wraps the rollouts, policy, and updaters. This will enable multi-agent RL in Part 2 of the Multi-Agent RL PR.
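As a rough picture of the abstraction described above (a single object wrapping rollouts, policy, and updater so the trainer talks to one interface), here is a hedged, toy sketch. All class names and methods are illustrative stand-ins, not the real `AgentAccessMgr` API:

```python
class Policy:
    def act(self, obs):
        return 0  # dummy action


class Rollouts:
    def __init__(self):
        self.steps = []

    def insert(self, obs, action):
        self.steps.append((obs, action))


class Updater:
    def update(self, rollouts):
        # stand-in for a PPO/VER gradient update over the buffer
        return {"num_steps": len(rollouts.steps)}


class AgentAccessMgr:
    """Single entry point wrapping rollouts, policy, and updater."""

    def __init__(self):
        self.policy = Policy()
        self.rollouts = Rollouts()
        self.updater = Updater()

    def act_and_store(self, obs):
        action = self.policy.act(obs)
        self.rollouts.insert(obs, action)
        return action

    def update(self):
        # The trainer delegates the update through one interface,
        # which is what lets Part 2 swap in multiple agents.
        return self.updater.update(self.rollouts)


agent = AgentAccessMgr()
for obs in range(3):
    agent.act_and_store(obs)
stats = agent.update()
```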
How Has This Been Tested
Integration tests with training policies.
Types of changes