New Agent Abstraction [Part 1 of Multi-Agent RL] #1169
Conversation
@@ -149,6 +149,23 @@ def register_storage(cls, to_register=None, *, name: Optional[str] = None):
    def get_storage(cls, name: str):
        return cls._get_impl("storage", name)

    @classmethod
    def register_agent_access_mgr(
        cls, to_register=None, *, name: Optional[str] = None
Could you describe what `to_register` means?
Also, what is the use case of `name` being `None`? Is this for the single-agent trainer?
I am following the same model as the other registry functions for policies, storage, updaters, etc. This is to be used as a class decorator: `to_register` is the class being registered and `name` is the key used to refer to it in the registry. The registry is used in exactly the same way for the `AgentAccessMgr` as for the other registered objects. I also added a docstring, which hopefully clarifies usage further.
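To make the decorator semantics concrete, here is a minimal, self-contained sketch of the registry pattern described above. The class and method names mirror the discussion but are illustrative, not the actual habitat-baselines implementation; in particular the `_impls` dict and key defaulting are assumptions.

```python
from typing import Optional

class Registry:
    _impls: dict = {}

    @classmethod
    def register_agent_access_mgr(cls, to_register=None, *, name: Optional[str] = None):
        def wrap(klass):
            # When `name` is None, fall back to the class name as the key.
            key = name if name is not None else klass.__name__
            cls._impls[("agent_access_mgr", key)] = klass
            return klass
        # Support both bare `@register_agent_access_mgr` and the
        # parameterized `@register_agent_access_mgr(name=...)` forms.
        if to_register is None:
            return wrap
        return wrap(to_register)

    @classmethod
    def get_agent_access_mgr(cls, name: str):
        return cls._impls[("agent_access_mgr", name)]


@Registry.register_agent_access_mgr(name="single_agent")
class SingleAgentAccessMgr:
    pass
```

With this, `Registry.get_agent_access_mgr("single_agent")` returns the `SingleAgentAccessMgr` class itself, which the trainer can then instantiate.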
I am okay with these changes. Some minor comments. Please make sure there are no training regressions with either DDPPO or VER. No need for new tests since this does not add functionality.
    @abstractmethod
    def after_update(self) -> None:
        """
        Called after the updater has called `update` and the rollout `after_update` is called.
This sounds like the method is automatically called after the conditions are met, while I think you mean this method NEEDS to be called after the conditions are met. Am I correct?
Yes, I updated the docstring.
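The ordering contract being clarified here can be sketched as a toy training loop. This is illustrative only; the class names and loop structure are assumptions, not the actual trainer code:

```python
class Rollouts:
    def __init__(self):
        self.step = 0

    def after_update(self):
        # e.g. reset the rollout buffer's write pointer
        self.step = 0


class Agent:
    def __init__(self):
        self.updates_done = 0
        self.synced = False

    def update(self, rollouts):
        # stand-in for the PPO/VER updater step
        self.updates_done += 1

    def after_update(self):
        # The TRAINER is responsible for calling this hook only
        # after both `update` and `rollouts.after_update` have run;
        # it is not invoked automatically.
        self.synced = True


rollouts, agent = Rollouts(), Agent()
for _ in range(2):           # training loop
    agent.update(rollouts)   # 1) updater runs
    rollouts.after_update()  # 2) rollout buffer resets
    agent.after_update()     # 3) agent hook must run last
```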
            self.agent.optimizer.load_state_dict(resume_state["optim_state"])
        if self._is_distributed:
            self.agent.init_distributed(find_unused_params=False)  # type: ignore
        logger.add_filehandler(self.config.habitat_baselines.log_file)
Why is `add_filehandler` added here and not at the entry point of training, much earlier in the code?
It is actually occurring at the same point in the code: the previous code in `self._setup_actor_critic_agent` now happens in `self._create_agent`. I think this is the natural place to manage aspects of the config, such as creating the directories and log files it requires.
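A minimal sketch of why attaching the file handler belongs with agent creation rather than at program entry: the log directory may only exist once creation-time setup has run. All names here (`create_agent`, the `logs/train.log` path) are hypothetical, not the habitat-baselines code:

```python
import logging
import os
import tempfile

def create_agent(log_file: str) -> logging.Logger:
    # Directories required by the config are created during agent
    # creation...
    os.makedirs(os.path.dirname(log_file), exist_ok=True)
    # ...so the file handler can only be safely attached afterwards;
    # FileHandler opens (and creates) the file on construction.
    logger = logging.getLogger("trainer")
    logger.addHandler(logging.FileHandler(log_file))
    return logger

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "logs", "train.log")
    log = create_agent(path)
    created = os.path.exists(path)
    log.handlers[0].close()
```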
Ok, fair. The directories might not exist before then.
In general looks good to me. Thanks, @ASzot!
Do I understand correctly that with these changes all the previous code should be runnable/trainable? Should we have a green CI before merging this PR?
Make sure the tests pass correctly. Thanks for doing this!
* Added TrainerAgent
* VER changes
* Consolidated into Agent
* Integrated VER trainer
* Fixed CI
* Added pop play trainer
* Refactored agent access interface
* Functioning
* Pop play running fine
* Pre-commit fixes
* Updated naming
* removed multi-agent
* Removed unnecessary MA file
* Added docstring
* PR comments
* Syntax in VER
* Addressed PR comments
* CI tests
* Swap to val for tests
* Fixed test
Motivation and Context
Refactored trainer code to interface with a new agent abstraction which wraps the rollouts, policy, and updaters. This will enable multi-agent RL in Part 2 of the Multi-Agent RL PR.
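As a rough picture of the abstraction described above (a single object wrapping rollouts, policy, and updater so the trainer talks to one interface), here is a hedged, toy sketch. All class names and methods are illustrative stand-ins, not the real `AgentAccessMgr` API:

```python
class Policy:
    def act(self, obs):
        return 0  # dummy action


class Rollouts:
    def __init__(self):
        self.steps = []

    def insert(self, obs, action):
        self.steps.append((obs, action))


class Updater:
    def update(self, rollouts):
        # stand-in for a PPO/VER gradient update over the buffer
        return {"num_steps": len(rollouts.steps)}


class AgentAccessMgr:
    """Single entry point wrapping rollouts, policy, and updater."""

    def __init__(self):
        self.policy = Policy()
        self.rollouts = Rollouts()
        self.updater = Updater()

    def act_and_store(self, obs):
        action = self.policy.act(obs)
        self.rollouts.insert(obs, action)
        return action

    def update(self):
        # The trainer delegates the update through one interface,
        # which is what lets Part 2 swap in multiple agents.
        return self.updater.update(self.rollouts)


agent = AgentAccessMgr()
for obs in range(3):
    agent.act_and_store(obs)
stats = agent.update()
```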
How Has This Been Tested
Integration tests with training policies.
Types of changes