Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance improvement on lab #1120

Merged
merged 10 commits into from
Feb 8, 2023

Conversation

vincentpierre
Copy link
Contributor

@vincentpierre vincentpierre commented Feb 3, 2023

Motivation and Context

It seems checking if a key is in a Hydra DictConfig is rather slow. We cache the agent index to avoid having to recompute it every time.

Findings: If you have to access a configuration field several times per second, please store the value somewhere instead of accessing the configuration. The configuration will not change, I promise.

1 process interact benchmark
Before regression of November 15 2022 : 168.11183922196446
After regression of November 15 2022 : 134.35855893433697
After first commit of this PR : 158.04943632647434
After second commit of this PR : 163.95664144856636

How Has This Been Tested

Types of changes

  • Docs change / refactoring / dependency upgrade
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have completed my CLA (see CONTRIBUTING)
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@vincentpierre vincentpierre self-assigned this Feb 3, 2023
@facebook-github-bot facebook-github-bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label Feb 3, 2023
@Skylion007
Copy link
Contributor

@vincentpierre Another perf solution for this is to materialize the HydraConfig back into a slot enabled dataclass, and that would enable extremely fast access.

@vincentpierre
Copy link
Contributor Author

Another perf solution for this is to materialize the HydraConfig back into a slot enabled dataclass, and that would enable extremely fast access.

You have a reference link on how to do this ?

@Skylion007
Copy link
Contributor

Copy link
Contributor

@ASzot ASzot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are measures everywhere that access the Hydra config. If we follow your suggestion of Findings: If you have to access a configuration field several times per second, please store the value somewhere instead of accessing the configuration. this will require many more changes? (for example see all the reads from the config in this file).

@vincentpierre
Copy link
Contributor Author

@Skylion007
The documentation does not say one is faster than the other or what the trade offs are :

        :param structured_config_mode: Specify how Structured Configs (DictConfigs backed by a dataclass) are handled.
            - By default (``structured_config_mode=SCMode.DICT``) structured configs are converted to plain dicts.
            - If ``structured_config_mode=SCMode.DICT_CONFIG``, structured config nodes will remain as DictConfig.
            - If ``structured_config_mode=SCMode.INSTANTIATE``, this function will instantiate structured configs
              (DictConfigs backed by a dataclass), by creating an instance of the underlying dataclass.
              ```
 We often need to edit the configs at runtime and I am afraid converting the configs will cause issues within the code.

@ASzot
Copy link
Contributor

ASzot commented Feb 4, 2023

Many of the actions also constantly read from the Hydra configs.

@Skylion007
Copy link
Contributor

@Skylion007 The documentation does not say one is faster than the other or what the trade offs are :

        :param structured_config_mode: Specify how Structured Configs (DictConfigs backed by a dataclass) are handled.
            - By default (``structured_config_mode=SCMode.DICT``) structured configs are converted to plain dicts.
            - If ``structured_config_mode=SCMode.DICT_CONFIG``, structured config nodes will remain as DictConfig.
            - If ``structured_config_mode=SCMode.INSTANTIATE``, this function will instantiate structured configs
              (DictConfigs backed by a dataclass), by creating an instance of the underlying dataclass.
              ```
 We often need to edit the configs at runtime and I am afraid converting the configs will cause issues within the code.

It will be faster because slot enabled dataclass (or attrs) access is the fastest lookup in Python (except for caching them as variables of course). It's not necessarily faster for one time access, and it will have some overhead, but amortized it will be faster than accessing a normal Python dictionary and therefore faster than accessing an OmegaConf one. See https://threeofwands.com/attrs-ii-slots/

@Skylion007
Copy link
Contributor

@vincentpierre Python also has some other caching mechanism implemented on Python classes to make lookup more efficient than a naive dict implementation so it will definitely be more efficient to access.

class GymConfig(HabitatBaseConfig):
auto_name: str = ""
obs_keys: Optional[List[str]] = None
action_keys: Optional[List[str]] = None
achieved_goal_keys: List = field(default_factory=list)
desired_goal_keys: List[str] = field(default_factory=list)
achieved_goal_keys: List = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You still need to use an attrs.factory or the field alias unfortunately just like dataclasses: https://www.attrs.org/en/stable/api.html#attr.ib

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently, you can just attr.Factory as syntactic sugar: https://www.attrs.org/en/stable/examples.html#defaults

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look like I need to. I looked into the configuration that gets passed to run.py and it looks like all the fields (even lists) are correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vincentpierre It works for now, but it's bugprone behvaior. We have rule against it explicitly, but they don't work on attrs:

B006: Do not use mutable data structures for argument defaults. They are created during function definition time. All calls to the function reuse this one instance of that data structure, persisting changes between them.

^ this also applies to dataclasses and attrs

Copy link
Contributor

@Skylion007 Skylion007 Feb 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vincentpierre, it's less likely to be an issue since it would primarily be an issue with rewriting the configs, which we do sparingly, but it could lead to hard to spot bugs down the line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quote from this comment

Note that when OmegaConf uses the default value to create a ListConfig, it creates a separate container (i.e., adding element to the ListConfig will not add elements to the original Python list it was created from). As a result, it should actually be safe to provide a mutable list as default value (as long as you don't use that class elsewhere...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vincentpierre Still really yucky if you want to use the dataclasses without Hydra (like for UnitTesting). Also the attrs.slot thing doesn't buy us a whole lot of speed unless we are instantiating the hydra config back into the dataclass.

A better method for speedup is just to use instationate the config when storing it in the class. This converts it into the dataclass which will be faster to access than the DictConfig with or without slots.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vincentpierre You can even just make it pass all the config keys to the class like so: https://hydra.cc/docs/advanced/instantiate_objects/overview/

@vincentpierre vincentpierre requested a review from ASzot February 6, 2023 19:53
@@ -233,6 +242,9 @@ class ArmRelPosKinematicReducedActionStretch(RobotAction):
def __init__(self, *args, config, sim: RearrangeSim, **kwargs):
super().__init__(*args, config=config, sim=sim, **kwargs)
self.last_arm_action = None
self._delta_pos_limit = self._config.delta_pos_limit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of code refactoring could be safe here by just converting the omega config to the dataclass here and access those members, no?

Copy link
Contributor Author

@vincentpierre vincentpierre Feb 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will leave this refactor in place, just because I think it makes more sense.
I instantiated the configs for the RobotActions, but leaving the measures and sensors as is because they all use the config differently Trying this broke everything. I just want to merge this now.

Copy link
Contributor

@Skylion007 Skylion007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the issues with attr.Factory, I would actually prefer if they remained dataclasses. Either way this looks god to me now.

rpartsey added a commit that referenced this pull request Mar 3, 2023
dannymcy pushed a commit to dannymcy/habitat-lab that referenced this pull request Jul 8, 2024
* a few %

* adding anoter percent or two

* speedup grasp

* optimizing rearrange_sim

* adding the measures

* moving to attr.s

* converting the action classes to be faster

* typo

* resolving issues

---------

Co-authored-by: vincentpierre <vincentpierre@users.noreply.github.com>
dannymcy pushed a commit to dannymcy/habitat-lab that referenced this pull request Jul 8, 2024
HHYHRHY pushed a commit to SgtVincent/EMOS that referenced this pull request Aug 31, 2024
* a few %

* adding anoter percent or two

* speedup grasp

* optimizing rearrange_sim

* adding the measures

* moving to attr.s

* converting the action classes to be faster

* typo

* resolving issues

---------

Co-authored-by: vincentpierre <vincentpierre@users.noreply.github.com>
HHYHRHY pushed a commit to SgtVincent/EMOS that referenced this pull request Aug 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed Do not delete this pull request or issue due to inactivity.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants