-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dynamics randomization for MuJoCo #109
Conversation
tosser = ET.parse(TOSSER_XML) | ||
|
||
var_list = VariationsList().\ | ||
add_variation(".//motor[@name='a1']", "gear", "coefficient", "uniform", (0.5, 1.5)).\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A fluent interface would be more like
variations = Variations()
variations.randomize()
.attribute("gear")
.at_xpath(".//motor[@name='a1']")
.with_method(Variations.COEFFICIENT)
.sampled_from(Variations.UNIFORM)
.with_range(0.5, 1.5)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, that's what @CatherineSue suggested but I thought it's a little verbose, although more understandable.
So, Variations contains a list of variation objects right?
And if that is the case, is the randomize() method used to create a new variation object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well they contain some way of storing variations. a list of objects seems fine. randomize() create a new internal record of a variation.
fluent interfaces are verbose but easy to read. yes, you could use a very long list of arguments, but what if the relevant arguments change based on other arguments?
for instance, if i want a uniform distribution i need a range=(min,max) argument, but if I use a guassian i need a (mean, std) argument. it gets messy very quickly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. @ryanjulian, please review the new implementation. Also, me and @CatherineSue discussed the implementation of the wrapper for the randomized environment and work is in progress on that side as well.
|
||
@method.setter | ||
def method(self, method): | ||
self._method = method |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Variation objects should be immutable (i.e. there are no setters)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I just noticed this. There's a trade-off between having the fluent interface pattern and having immutable properties.
In order to set the members of the Variation object without setters, we have to pass their attributes in the constructor, and that breaks the interface pattern. Otherwise, without the setter decorators and with the members that correspond to each property not previously defined, we get this kind of errors:
Traceback (most recent call last):
File "rllab/dynamics_randomization/trpo_swimmer.py", line 13, in <module>
at_xpath(".//geom[@name='torso']").\
File "/home/aigonzal/ivanWorkspace/rllab/rllab/dynamics_randomization/variation.py", line 114, in at_xpath
self._list[-1].xpath = xpath
AttributeError: can't set attribute
Another way to do it is by setting the members of the class Variation directly in Variations, but that breaks the immutability of the property.
Any advice of how to achieve this is well appreciated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to use a dictionary to store the variation then add it to Variation class at last. But I found that we need the v.elem=e
in MujocoModelGenerator, this needs to call the elem
setter in Variation. So I didn’t think of a way to implement immutable Variation after all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For how to implement it, see builder pattern.
To solve the problem with v.elem = e
, the solution is to keep a local cache of those elems in MujocoModelGenerator, rather than storing the data inside variations. That would be the proper way to do it--I just used the dict in my example as a shortcut.
Note that you can use the variation object directly as a key in your dictionary, i.e.
elem_cache = {}
# caching elems
for v in variations:
# find elem, = e
elem_cache[v] = e
# making models
for v in variations:
e = elem_cache[v]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comment. Done
2b3d7be
to
9a9b71f
Compare
9a9b71f
to
60ea59f
Compare
The
I have reinstalled the conda env, mujoco150 and mujoco-py to ensure |
@gonzaiva found someone with a similar error message here: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, all the python files in rllab have file names in lowercase with each word separated by an underscore, so we should keep this standard with RandomizeEnv.py to randomized_env.py
import os.path as osp | ||
import numpy as np | ||
from rllab.dynamics_randomization import VariationMethods | ||
from rllab.dynamics_randomization.variation import VariationDistributions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change to from rllab.dynamics_randomization import VariationDistributions
from mujoco_py import MjSim | ||
from rllab.core import Serializable | ||
|
||
MODEL_DIR = osp.abspath( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reuse this variable from mujoco_env.py
elif v.method == VariationMethods.ABSOLUTE: | ||
e.attrib[v.attrib] = str(c) | ||
else: | ||
raise NotImplementedError("Unknown method") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add one NotImplementedError for the if statement of v.distribution as well.
self._wrapped_env.model = load_model_from_xml(model_xml) | ||
self._wrapped_env.sim = MjSim(self._wrapped_env.model) | ||
self._wrapped_env.data = self._wrapped_env.sim.data | ||
self._wrapped_env.viewer = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only set those variables that depend on self._wrapped_env.sim
return self._wrapped_env.horizon | ||
|
||
|
||
randomize = RandomizedEnv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this line for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't be here. will delete in the new commit.
@ryanjulian we're able to run our current implementation of Dynamics Randomization in OS X (tested in both @CatherineSue and @jonashen laptops), however Linux is a hard cookie to crack!
This is quite a show stopper in Linux, and it points to other bugs that may happen with mujoco in rllab. Any feedback you could give is highly appreciated. |
from rllab.envs.mujoco.mujoco_env import MODEL_DIR | ||
from rllab.core import Serializable | ||
from rllab.dynamics_randomization import VariationMethods | ||
from rllab.dynamics_randomization.variation import VariationDistributions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alphabetical order c->d->e
import os.path as osp | ||
import xml.etree.ElementTree as ET | ||
|
||
import numpy as np |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m->n
from rllab.baselines import LinearFeatureBaseline | ||
from rllab.envs.mujoco import SwimmerEnv | ||
from rllab.envs import normalize | ||
from rllab.policies import GaussianMLPPolicy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alphabetical order
The steps to produce the issue in Linux are:
|
930f894
to
83c2d48
Compare
3b6f4a8
to
9fb3106
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I apologize that I don't have time today for a full review of the fluent interface, but please see my comment about your implementation of the multithreading.
I'm very curious to see benchmarks of the threaded vs non-threaded versions.
""" | ||
if self._list: | ||
self._list[-1].distribution = distribution | ||
return self |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm. if you return self
then the API with_range()
will be available after the dot operator.
e.g.
v.randomize()
.attribute("gear")
.at_xpath("something")
.with_method(COEFFICIENT)
.sampled_from(GAUSSIAN)
.with_range(0.5, 1.5)
but .with_range()
is not valid for .sampled_from(GAUSSIAN)
. you would want something like .with_parameters((1, 0.5))
or .with_mean_std(1, 0.5)
One way to solve this would be to return an instance of some new class without the method .with_range()
, and then Python would throw an error at import time. That would be ideal but perhaps difficult to implement.
Another way is to change your API slightly to look like this
v.randomize()
.attribute(gear)
.at_xpath("something")
.with_method(COEFFICIENT)
.sampled_from(GAUSSIAN)
.with_range(0.5, 1.5)
.add()
Where add()
is void (it doesn't return anything), and the new variation record is not constructed and added to the list of variations until add() is called, e.g.
class Variations():
def randomize():
return VariationSpec(self)
class Variation():
def __init__(self, spec):
# check a bunch of conditions on spec, raise an exception if it's wrong or incomplete or overspecified
class VariationSpec():
def __init__(self, variations):
self._variations = variations
def with_xpath(xpath):
self._xpath = xpath
return self
def add():
self._variations.add(Variation(self))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this turns out too much a pain to implement, on the other hand, it might be a sign that a fluent interface is not the right fit here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Implemented the first approach by segmenting the Variations class into different classes, one base class, and two classes for Gaussian and uniform distributions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To remove the setter in Variation
, I changed the API using the second approach.
@@ -0,0 +1,96 @@ | |||
import os.path as osp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this should be a subpackage of rllab.envs.mujoco, as it's only relevant to MuJoCo environments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Should I replace the new rllab.env.mujoco import for all packages in rllab?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what you're asking. Can you clarify?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To address this issue, I added osp in the modules of the package rllab.env.mujoco, so now we can import osp as from rllab.envs.mujoco import osp
.
I noticed other modules in rllab import osp as import os.path as osp
. My question is if I should replace those imports withfrom rllab.envs.mujoco import osp
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean the dynamics_randomization package should be moved to rllab.envs.mujoco.dynamics_randomization?
for v in variations.get_list(): | ||
e = self._model.find(v.xpath) | ||
if not e: | ||
raise AttributeError("Can't find node in xml") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a ValueError, not an AttributeError.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your error messages would be more helpful if they told me which node could not be found in which file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
v.elem = e | ||
|
||
if v.attrib not in e.attrib: | ||
raise KeyError("Attribute doesn't exist") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Despite leading to a KeyError
, this is probably still a ValueError, not a KeyError.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be more helpful if it told me which attribute could not be found on which node in which file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
self._model = etree.parse(self._file_path) | ||
|
||
for v in variations.get_list(): | ||
e = self._model.find(v.xpath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Be careful to check for multiple results here.
This is an API question: what if an XPath matches multiples nodes
Should we:
- Raise an error
- Attempt to modify all matching nodes
- Something else?
I vote for (1) but I'm interested in your opinion. The problem I have with (2) is that it is hard to tell the difference between matching multiple nodes on accident (due to a malformed XPath) versus someone legitimately wanting to address multiple nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Working on it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read the lxml document. The ElementTree library in lxml supports:findall()
returns a list of matching Elements, find()
efficiently returns only the first match. So if we use find(), there will be no error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great!
|
||
variations = Variations() | ||
variations.randomize().\ | ||
attribute("gear").\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually when people format method chains onto multiple lines, they break before the .
e.g.
variations.randomize() \
.attribute("gear")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Working on it.
from enum import Enum | ||
|
||
|
||
class VariationMethod(Enum): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just call this Method
since its use is clear inside the package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
ABSOLUTE = 2 | ||
|
||
|
||
class VariationDistribution(Enum): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just call this Distribution
since its use is clear inside the package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
import atexit | ||
from queue import Queue | ||
from threading import Event | ||
from threading import Thread |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be important to benchmark this to ensure threads (as opposed to processes) actually speed things up.
Python has a design detail called the GIL (global interpreter lock), which means that usually only one thread inside a Python application may run at once. This does not necessarily apply to parts of Python programs which are implemented using native code (e.g. mujoco), but whether that code is blocked by the GIL is very implementation-specific.
self._mujoco_model = None | ||
# Communicates the calling thread with the worker thread by awaking | ||
# the worker thread so as to generate a new model. | ||
self._model_requested = Event() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great implementation, but I think you can make it even simpler by using the maxsize
parameter of Queue
When maxsize != 0, put()
blocks when the queue is full, until get()
is called. By default, get()
blocks when the queue is empty, until something is put()
onto the queue.
Benchmarks of running trpo_swimmermulti-thread$ cat bench_multi_itr40.txt | head -40
27602992 function calls (27506126 primitive calls) in 64.712 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
2469/1 0.108 0.000 64.868 64.868 {built-in method builtins.exec}
1 0.000 0.000 64.868 64.868 trpo_swimmer.py:1(<module>)
1 0.004 0.004 56.190 56.190 batch_polopt.py:114(train)
40 0.000 0.000 45.798 1.145 batch_polopt.py:24(obtain_samples)
40 0.000 0.000 45.796 1.145 parallel_sampler.py:98(sample_paths)
40 0.002 0.000 45.786 1.145 stateful_pool.py:121(run_collect)
320 0.293 0.001 45.761 0.143 parallel_sampler.py:92(_worker_collect_one_path)
320 1.198 0.004 45.467 0.142 utils.py:10(rollout)
160000 1.889 0.000 18.428 0.000 normalized_env.py:77(step)
160763 9.610 0.000 18.323 0.000 function_module.py:743(__call__)
160000 1.703 0.000 17.649 0.000 gaussian_mlp_policy.py:124(get_action)
205 0.003 0.000 16.644 0.081 __init__.py:1(<module>)
160000 0.132 0.000 15.236 0.000 randomized_env.py:46(step)
160000 1.980 0.000 15.104 0.000 swimmer_env.py:32(step)
40 0.001 0.000 9.903 0.248 npo.py:101(optimize_policy)
40 0.006 0.000 8.754 0.219 conjugate_gradient_optimizer.py:229(optimize)
6 0.000 0.000 8.112 1.352 ext.py:123(compile_function)
6 0.000 0.000 8.112 1.352 function.py:74(function)
6 0.000 0.000 8.109 1.351 pfunc.py:283(pfunc)
6 0.000 0.000 8.079 1.346 function_module.py:1765(orig_function)
440 0.010 0.000 6.604 0.015 conjugate_gradient_optimizer.py:49(eval)
160000 1.690 0.000 6.388 0.000 mujoco_env.py:177(forward_dynamics)
40 0.013 0.000 6.264 0.157 krylov.py:7(cg)
6 0.001 0.000 6.227 1.038 function_module.py:1428(__init__)
6 0.000 0.000 6.138 1.023 opt.py:102(__call__)
485/6 0.002 0.000 6.138 1.023 opt.py:85(optimize)
24/6 0.001 0.000 6.138 1.023 opt.py:223(apply)
487207 0.758 0.000 5.508 0.000 op.py:891(rval)
2600/70 0.013 0.000 5.354 0.076 <frozen importlib._bootstrap>:966(_find_and_load)
2600/70 0.010 0.000 5.352 0.076 <frozen importlib._bootstrap>:939(_find_and_load_unlocked)
3453/202 0.003 0.000 5.335 0.026 <frozen importlib._bootstrap>:214(_call_with_frames_removed)
2259/137 0.010 0.000 5.297 0.039 <frozen importlib._bootstrap>:659(_load_unlocked)
538 0.003 0.000 5.248 0.010 op.py:912(make_thunk)
420 0.005 0.000 5.233 0.012 op.py:826(make_c_thunk)
2005/72 0.007 0.000 5.171 0.072 <frozen importlib._bootstrap_external>:659(exec_module) single thread$ cat bench_single_itr40.txt | head -40
27753481 function calls (27656615 primitive calls) in 55.331 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
2469/1 0.107 0.000 55.475 55.475 {built-in method builtins.exec}
1 0.000 0.000 55.475 55.475 trpo_swimmer_single_thread.py:1(<module>)
1 0.006 0.006 46.967 46.967 batch_polopt.py:114(train)
40 0.000 0.000 36.747 0.919 batch_polopt.py:24(obtain_samples)
40 0.000 0.000 36.745 0.919 parallel_sampler.py:98(sample_paths)
40 0.002 0.000 36.734 0.918 stateful_pool.py:121(run_collect)
320 0.237 0.001 36.711 0.115 parallel_sampler.py:92(_worker_collect_one_path)
320 0.946 0.003 36.474 0.114 utils.py:10(rollout)
205 0.003 0.000 16.372 0.080 __init__.py:1(<module>)
160000 1.750 0.000 14.331 0.000 normalized_env.py:77(step)
160761 7.742 0.000 14.002 0.000 function_module.py:743(__call__)
160000 1.334 0.000 12.020 0.000 gaussian_mlp_policy.py:124(get_action)
160000 0.115 0.000 11.403 0.000 randomized_env_single_thread.py:69(step)
160000 1.733 0.000 11.288 0.000 swimmer_env.py:32(step)
40 0.001 0.000 9.754 0.244 npo.py:101(optimize_policy)
40 0.006 0.000 8.689 0.217 conjugate_gradient_optimizer.py:229(optimize)
6 0.000 0.000 8.038 1.340 ext.py:123(compile_function)
6 0.000 0.000 8.038 1.340 function.py:74(function)
6 0.000 0.000 8.035 1.339 pfunc.py:283(pfunc)
6 0.000 0.000 8.006 1.334 function_module.py:1765(orig_function)
440 0.010 0.000 6.592 0.015 conjugate_gradient_optimizer.py:49(eval)
320 0.001 0.000 6.279 0.020 normalized_env.py:51(reset)
320 1.087 0.003 6.278 0.020 randomized_env_single_thread.py:36(reset)
6 0.000 0.000 6.272 1.045 function_module.py:1428(__init__)
40 0.013 0.000 6.256 0.156 krylov.py:7(cg)
6 0.000 0.000 6.046 1.008 opt.py:102(__call__)
485/6 0.002 0.000 6.046 1.008 opt.py:85(optimize)
24/6 0.001 0.000 6.046 1.008 opt.py:223(apply)
2600/70 0.013 0.000 5.351 0.076 <frozen importlib._bootstrap>:966(_find_and_load)
2600/70 0.010 0.000 5.349 0.076 <frozen importlib._bootstrap>:939(_find_and_load_unlocked)
3453/202 0.004 0.000 5.331 0.026 <frozen importlib._bootstrap>:214(_call_with_frames_removed)
2259/137 0.010 0.000 5.290 0.039 <frozen importlib._bootstrap>:659(_load_unlocked)
2005/72 0.007 0.000 5.158 0.072 <frozen importlib._bootstrap_external>:659(exec_module)
538 0.003 0.000 5.116 0.010 op.py:912(make_thunk)
48 0.073 0.002 5.101 0.106 opt.py:2394(apply) |
from rllab.policies import GaussianMLPPolicy | ||
from rllab.dynamics_randomization import RandomizedEnv | ||
from rllab.dynamics_randomization import Variations | ||
from rllab.dynamics_randomization import VariationMethod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose we make this a wrapper which looks a lot like normalized()
.
So we put these classes in rllab.envs.mujoco.dynamics_randomization
. At the bottom of dynamics_randomization.py
you can put
# dynamics_randomization.py
randomize_dynamics = RandomizedDynamicsEnv
So that users can use it in a launcher like this
# in your launcher
from rllab.envs.mujoco.dynamics_randomization import randomize_dynamics # currently RandomizedEnv
from rllab.envs.mujoco.dynamics_randomization import Variations
from rllab.envs.mujoco.dynamics_randomization import Method # currently VariationMethod
from rllab.envs.mujoco.dynamics_randomization import Distribution # currently VariationDistribution
variations = Variations()
variations.randomize()\
.at_xpath(".//geom[@name='torso']")\
.attribute("density")\
.with_method(Method.COEFFICIENT)\
.sampled_from(Distribution.UNIFORM)\
.with_range(0.5, 1.5)
env = randomize_dynamics(SwimmerEnv(), variations)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Err, after trying the module I actually agree with some of your original naming. This module is general enough to randomize any part of an environment (good job!)
So the new vocabulary would be randomize = RandomizedEnv
@@ -0,0 +1,86 @@ | |||
<!--Author: vikash@openai.com--> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be removed, it was only in my sample for testing purposes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to use it in a test, you should be able to find it directly in the mujoco_py package. Or you can use some other mujoco XML file already in our own tree (preferable).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can also just define the XML inline in a string at the top of the file. That's probably the easiest option.
with_method(VariationMethod.COEFFICIENT).\ | ||
sampled_from(VariationDistribution.UNIFORM).\ | ||
with_range(0.5, 1.5).\ | ||
randomize().\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally you split new variations into different lines (e.g. no calling randomize twice in one statement). This is part of why I suggested that there be an add()
method which is void to finish the statement.
See the builder pattern
|
||
@method.setter | ||
def method(self, method): | ||
self._method = method |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For how to implement it, see builder pattern.
To solve the problem with v.elem = e
, the solution is to keep a local cache of those elems in MujocoModelGenerator, rather than storing the data inside variations. That would be the proper way to do it--I just used the dict in my example as a shortcut.
Note that you can use the variation object directly as a key in your dictionary, i.e.
elem_cache = {}
# caching elems
for v in variations:
# find elem, = e
elem_cache[v] = e
# making models
for v in variations:
e = elem_cache[v]
self._list[-1].distribution = distribution | ||
return self | ||
|
||
def with_range(self, low, high): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what stops me from using this method along with with_distribution()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two ideas to fix this. First one is to raise an error during the runtime. e.g.
variations = Variations()
variations.randomize() \
.attribute("gear") \
.at_xpath(".//motor[@name='a1']") \
.with_method(Method.COEFFICIENT) \
.sampled_from(Distribution.UNIFORM) \
.with_mean_std(0.5, 1.5) \
.add()
will raise
ValueError: Need to call with_range when sampled from Uniform
The current commit implements this idea.
Another idea is implement something like this
There is no with_mean_std
when user calls sampled_from(UNIFORM)
I tried using mixin to achieve this but failed. Because using inheritance will return either VariationsGaussian or VariationsUniform in sampled_from. There's no way to tell which one should be called before running time.
All the python files have file names in lowercase. To keep this standard, refactor RandomizeEnv.py to randomized_env.py.
Add error handling in constructor and reset(). Remove variables that doesn't depend on self._wrapped_env.sim in reset(). Reuse MODEL_DIR in mujoco_env.py Alphabetize imports.
Fix wrong AttributeError raising in constructor when there is element in xml. Add error handling towards the Variation.range attribute. When the range shape isn't the same as the attribute value shape, raise an AttributeError.
Finish the thread when the simulaton is interrupted.
Other miscellaneous changes include: - Rename classes VariationsMethods and VariationDistributions to VariationsMethod and VariationDistribution respectively. - The parsing of the XML string and fetch of the dynamic parameters to randomize is now done within the worker thread. - The file randomize_env.py was renamed to randomized_env.py
Before this commit, when there is an error raised when loading the xml object, only the worker_thread terminates. This commit fixes this bug by terminating all the processes. Fix some typo in the last commit.
Create an 10-length queue in MujocoModelGenerator to store the mujoco_models.
- Renamed classes VariationMethod and VariationDistribution to Method and Distribution. - Enforced the use of methods exclusive for uniform or normal distributions in the fluent interface pattern provided in class Variations by splitting the class into VariationsBase, VariationsGaussian and VariationsUniform. - Included the module os.path.osp in rllab.envs.mujoco modules. - Changed error types and improved messages for two errors in class MujocoModelGenerator.
- Delete unused threading.Event in MujocoModelGenerator. - Correct error types in MujocoModelGenerator. - Renamed classes RandomizedEnv to RandomizedDynamicsEnv. - Delete wrong try-except in RandomizedDynamicsEnv. - Use randomize_dynamics() in the launcher. - Format method chains onto multiple lines. - Correct wrong param name in Variation.
Solve the problem with v.elem=e, which calls the setter method in Variation. Replace this with a local cache of elements. Same with v.default in MujocoModelGenerator.
- Remove the setter in Variation - Add check of parameter shape in MujocoModelGenerator - Fix some typo
- Add more detailed information in handling the shape of the sampled value with the default value - Add timeout in the Queue.get() in MujocoModelGenerator so the main thread will catch error raised in worker thread
The cached property action_space found in mujoco_env.py produces an error in Linux for dynamics randomization. The idea behind the cached property is to avoid doing an expensive computation several times, so for regular execution, action_space is obtained from the model that is used for the entire training once, improving the performance. However, for dynamics randomization there's a new model for each episode, and that requires that the action_space is updated accordingly, but that does not happen because it's cached. To update action_space and not make an invasive change, the attribute is invalidated for each reset in the RandomizedEnv class.
- Move the dynamics_randomization package to rllab.envs.mujoco. - Delete tosser.xml, use xml in rllab/vendor/mujoco_model for test - The old test_dynamics_rand.py only tests for the Variations API, so rewrite it to test for both Variations API and RandomizedEnv. - Reorder imports. - Delete import os.path as osp in rllab/envs/mujoco/__init__.py. Previously added by mistake.
Package names should follow class names.
test_dynamics_rand.py is enough for testing. Remove trpo_swimmer.py
The code to initialize the variations and to generate the randomized parameters was moved into the Variations class. This will keep all the current code related to variations in the same file to improve the API of dynamics randomization, and will enable a more modular code for further features in the module.
There is a bottleneck at function load_model_from_xml from mujoco-py when using multi threading. In a single thread, a call to this function takes units of milliseconds, while in multi threading it takes tens of milliseconds. Maybe this is due to internal data structures that are required for both loading the model in the worker thread and performing the simulations in the main thread, causing the delay in load_model_from_xml and other functions that can be perceived in the cumulative time obtained by the profiler by running test_dynamics_rand.py. Due to this poor performance, the file mujoco_model_gen.py was removed since it serves no purpose now that the variations.py file contains methods to process the XML file, and the calls to obtain the randomized model are done directly in the class RandomizedEnv.
The changes in this commit include: - The modules in the __ini__.py file were sorted alphabetically. - The new line at the end of file was added in randomized_env.py. - Default values were assigned for VariationSpec, specifically for fields method, distribution, mean_std and var_range. Fields xpath, attrib, elem and default are specific to the model in the XML file provided by the user, so they cannot be default parameters. Further more, elem and default are obtained by parsing the XML file, so the user won't set them.
bd93226
to
ac0d449
Compare
Please reopen this PR against https://github.com/rlworkgroup/garage |
Imported to garage. See rlworkgroup/garage#51 |
A simple data structure consisting of a list of variation objects was
implemented. Each variation object is an instance of the Variation class
that works as a container for each of the fields used to randomized a
dynamic parameter within the simulation environment.
This list of variations is further tested in script
test_dynamics_rand.py to verify that fields within each variation can be
set and get. The former script was originally started at issue #61.