-
Notifications
You must be signed in to change notification settings - Fork 505
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance improvement on lab #1120
Conversation
@vincentpierre Another perf solution for this is to materialize the HydraConfig back into a slot enabled dataclass, and that would enable extremely fast access. |
You have a reference link on how to do this ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are measures everywhere that access the Hydra config. If we follow your suggestion of Findings: If you have to access a configuration field several times per second, please store the value somewhere instead of accessing the configuration.
this will require many more changes? (for example see all the reads from the config in this file).
@Skylion007
|
Many of the actions also constantly read from the Hydra configs. |
It will be faster because slot enabled dataclass (or attrs) access is the fastest lookup in Python (except for caching them as variables of course). It's not necessarily faster for one time access, and it will have some overhead, but amortized it will be faster than accessing a normal Python dictionary and therefore faster than accessing an OmegaConf one. See https://threeofwands.com/attrs-ii-slots/ |
@vincentpierre Python also has some other caching mechanism implemented on Python classes to make lookup more efficient than a naive dict implementation so it will definitely be more efficient to access. |
class GymConfig(HabitatBaseConfig): | ||
auto_name: str = "" | ||
obs_keys: Optional[List[str]] = None | ||
action_keys: Optional[List[str]] = None | ||
achieved_goal_keys: List = field(default_factory=list) | ||
desired_goal_keys: List[str] = field(default_factory=list) | ||
achieved_goal_keys: List = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You still need to use an attrs.factory or the field alias unfortunately just like dataclasses: https://www.attrs.org/en/stable/api.html#attr.ib
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently, you can just attr.Factory as syntactic sugar: https://www.attrs.org/en/stable/examples.html#defaults
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look like I need to. I looked into the configuration that gets passed to run.py and it looks like all the fields (even lists) are correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vincentpierre It works for now, but it's bugprone behvaior. We have rule against it explicitly, but they don't work on attrs:
B006: Do not use mutable data structures for argument defaults. They are created during function definition time. All calls to the function reuse this one instance of that data structure, persisting changes between them.
^ this also applies to dataclasses and attrs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vincentpierre, it's less likely to be an issue since it would primarily be an issue with rewriting the configs, which we do sparingly, but it could lead to hard to spot bugs down the line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quote from this comment
Note that when OmegaConf uses the default value to create a ListConfig, it creates a separate container (i.e., adding element to the ListConfig will not add elements to the original Python list it was created from). As a result, it should actually be safe to provide a mutable list as default value (as long as you don't use that class elsewhere...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vincentpierre Still really yucky if you want to use the dataclasses without Hydra (like for UnitTesting). Also the attrs.slot thing doesn't buy us a whole lot of speed unless we are instantiating the hydra config back into the dataclass.
A better method for speedup is just to use instationate the config when storing it in the class. This converts it into the dataclass which will be faster to access than the DictConfig with or without slots.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vincentpierre You can even just make it pass all the config keys to the class like so: https://hydra.cc/docs/advanced/instantiate_objects/overview/
@@ -233,6 +242,9 @@ class ArmRelPosKinematicReducedActionStretch(RobotAction): | |||
def __init__(self, *args, config, sim: RearrangeSim, **kwargs): | |||
super().__init__(*args, config=config, sim=sim, **kwargs) | |||
self.last_arm_action = None | |||
self._delta_pos_limit = self._config.delta_pos_limit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of code refactoring could be safe here by just converting the omega config to the dataclass here and access those members, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will leave this refactor in place, just because I think it makes more sense.
I instantiated the configs for the RobotActions, but leaving the measures and sensors as is because they all use the config differently Trying this broke everything. I just want to merge this now.
ceeda41
to
ea9c00a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the issues with attr.Factory, I would actually prefer if they remained dataclasses. Either way this looks god to me now.
* a few % * adding anoter percent or two * speedup grasp * optimizing rearrange_sim * adding the measures * moving to attr.s * converting the action classes to be faster * typo * resolving issues --------- Co-authored-by: vincentpierre <vincentpierre@users.noreply.github.com>
* a few % * adding anoter percent or two * speedup grasp * optimizing rearrange_sim * adding the measures * moving to attr.s * converting the action classes to be faster * typo * resolving issues --------- Co-authored-by: vincentpierre <vincentpierre@users.noreply.github.com>
Motivation and Context
It seems checking if a key is in a Hydra DictConfig is rather slow. We cache the agent index to avoid having to recompute it every time.
Findings: If you have to access a configuration field several times per second, please store the value somewhere instead of accessing the configuration. The configuration will not change, I promise.
1 process interact benchmark
Before regression of November 15 2022 : 168.11183922196446
After regression of November 15 2022 : 134.35855893433697
After first commit of this PR : 158.04943632647434
After second commit of this PR : 163.95664144856636
How Has This Been Tested
Types of changes
Checklist