-
Notifications
You must be signed in to change notification settings - Fork 310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dynamics randomization for MuJoCo #51
Conversation
Reviewers, this PR was brought from rllab, so all of it has been reviewed previously. However, if you want to request a change, feel free to do so. Otherwise, please approve to merge this change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fixup comments and submit
@@ -0,0 +1,21 @@ | |||
from rllab.envs.mujoco import SwimmerEnv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please put this file in tests/
@@ -0,0 +1,4 @@ | |||
from rllab.envs.mujoco.dynamics_randomization.randomized_env import randomize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please name the package randomization
, since it's not actually specific to dynamics.
A simple data structure consisting of a list of variation objects was implemented. Each variation object is an instance of the Variation class that works as a container for each of the fields used to randomized a dynamic parameter within the simulation environment. This list of variations is further tested in script test_dynamics_rand.py to verify that fields within each variation can be set and get.
A setter for each field in the Variation class is used now instead of a constructor containing all fields as parameters. This allows a modular setting of fields for different configuration scenarios for dynamics randomization. To define the methods and distributions, two enumeration classes were created: VariationMethods and VariationDistributions.
Implement basic feature of a wrappered environment, which choose new randomized physics params in mujoco on every reset().
All the python files have file names in lowercase. To keep this standard, refactor RandomizeEnv.py to randomized_env.py.
Add error handling in constructor and reset(). Remove variables that doesn't depend on self._wrapped_env.sim in reset(). Reuse MODEL_DIR in mujoco_env.py Alphabetize imports.
Fix wrong AttributeError raising in constructor when there is element in xml. Add error handling towards the Variation.range attribute. When the range shape isn't the same as the attribute value shape, raise an AttributeError.
Finish the thread when the simulaton is interrupted.
Other miscellaneous changes include: - Rename classes VariationsMethods and VariationDistributions to VariationsMethod and VariationDistribution respectively. - The parsing of the XML string and fetch of the dynamic parameters to randomize is now done within the worker thread. - The file randomize_env.py was renamed to randomized_env.py
Before this commit, when there is an error raised when loading the xml object, only the worker_thread terminates. This commit fixes this bug by terminating all the processes. Fix some typo in the last commit.
Create an 10-length queue in MujocoModelGenerator to store the mujoco_models.
- Renamed classes VariationMethod and VariationDistribution to Method and Distribution. - Enforced the use of methods exclusive for uniform or normal distributions in the fluent interface pattern provided in class Variations by splitting the class into VariationsBase, VariationsGaussian and VariationsUniform. - Included the module os.path.osp in rllab.envs.mujoco modules. - Changed error types and improved messages for two errors in class MujocoModelGenerator.
- Delete unused threading.Event in MujocoModelGenerator. - Correct error types in MujocoModelGenerator. - Renamed classes RandomizedEnv to RandomizedDynamicsEnv. - Delete wrong try-except in RandomizedDynamicsEnv. - Use randomize_dynamics() in the launcher. - Format method chains onto multiple lines. - Correct wrong param name in Variation.
Solve the problem with v.elem=e, which calls the setter method in Variation. Replace this with a local cache of elements. Same with v.default in MujocoModelGenerator.
- Remove the setter in Variation - Add check of parameter shape in MujocoModelGenerator - Fix some typo
- Add more detailed information in handling the shape of the sampled value with the default value - Add timeout in the Queue.get() in MujocoModelGenerator so the main thread will catch error raised in worker thread
The cached property action_space found in mujoco_env.py produces an error in Linux for dynamics randomization. The idea behind the cached property is to avoid doing an expensive computation several times, so for regular execution, action_space is obtained from the model that is used for the entire training once, improving the performance. However, for dynamics randomization there's a new model for each episode, and that requires that the action_space is updated accordingly, but that does not happen because it's cached. To update action_space and not make an invasive change, the attribute is invalidated for each reset in the RandomizedEnv class.
- Move the dynamics_randomization package to rllab.envs.mujoco. - Delete tosser.xml, use xml in rllab/vendor/mujoco_model for test - The old test_dynamics_rand.py only tests for the Variations API, so rewrite it to test for both Variations API and RandomizedEnv. - Reorder imports. - Delete import os.path as osp in rllab/envs/mujoco/__init__.py. Previously added by mistake.
Package names should follow class names.
test_dynamics_rand.py is enough for testing. Remove trpo_swimmer.py
The code to initialize the variations and to generate the randomized parameters was moved into the Variations class. This will keep all the current code related to variations in the same file to improve the API of dynamics randomization, and will enable a more modular code for further features in the module.
There is a bottleneck at function load_model_from_xml from mujoco-py when using multi threading. In a single thread, a call to this function takes units of milliseconds, while in multi threading it takes tens of milliseconds. Maybe this is due to internal data structures that are required for both loading the model in the worker thread and performing the simulations in the main thread, causing the delay in load_model_from_xml and other functions that can be perceived in the cumulative time obtained by the profiler by running test_dynamics_rand.py. Due to this poor performance, the file mujoco_model_gen.py was removed since it serves no purpose now that the variations.py file contains methods to process the XML file, and the calls to obtain the randomized model are done directly in the class RandomizedEnv.
The changes in this commit include: - The modules in the __ini__.py file were sorted alphabetically. - The new line at the end of file was added in randomized_env.py. - Default values were assigned for VariationSpec, specifically for fields method, distribution, mean_std and var_range. Fields xpath, attrib, elem and default are specific to the model in the XML file provided by the user, so they cannot be default parameters. Further more, elem and default are obtained by parsing the XML file, so the user won't set them.
Alphabetize the imports order in the package
The dynamic randomization is set to single thread, so the comments contains 'MujocoModelGenerator' need to be changed or deleted.
Rename the dynamics_randmization to randomization since it's not actually specific to dynamics. Move tesrt_dynamic_rand.oy to tests/
6fc102e
to
360bef5
Compare
Add support for dynamics randomization for mujoco environment.
Includes a simple data structure consisting of a list of variation
objects, a wrapped environment of mujoco to perform dynamics
randomization.
Each variation object is an instance of the Variation class
that works as a container for each of the fields used to randomized a
dynamic parameter within the simulation environment.
The wrapper class of mujoco performs dynamics randomization on each
reset().
The data structure and the wrapper class are tested in
test_dynamics_rand.py.
Refer to: #14