Dynamics randomization for MuJoCo #109

ghost · 2018-05-30T21:17:03Z

A simple data structure consisting of a list of variation objects was
implemented. Each variation object is an instance of the Variation class
that works as a container for each of the fields used to randomized a
dynamic parameter within the simulation environment.
This list of variations is further tested in script
test_dynamics_rand.py to verify that fields within each variation can be
set and get. The former script was originally started at issue #61.

ryanjulian · 2018-05-30T21:25:46Z

rllab/dynamics_randomization/test_dynamics_rand.py

+tosser = ET.parse(TOSSER_XML)
+
+var_list = VariationsList().\
+    add_variation(".//motor[@name='a1']", "gear", "coefficient", "uniform", (0.5, 1.5)).\


A fluent interface would be more like

variations = Variations() variations.randomize() .attribute("gear") .at_xpath(".//motor[@name='a1']") .with_method(Variations.COEFFICIENT) .sampled_from(Variations.UNIFORM) .with_range(0.5, 1.5)

Ok, that's what @CatherineSue suggested but I thought it's a little verbose, although more understandable.
So, Variations contains a list of variation objects right?
And if that is the case, is the randomize() method used to create a new variation object?

well they contain some way of storing variations. a list of objects seems fine. randomize() create a new internal record of a variation.

fluent interfaces are verbose but easy to read. yes, you could use a very long list of arguments, but what if the relevant arguments change based on other arguments?

for instance, if i want a uniform distribution i need a range=(min,max) argument, but if I use a guassian i need a (mean, std) argument. it gets messy very quickly.

Got it. @ryanjulian, please review the new implementation. Also, me and @CatherineSue discussed the implementation of the wrapper for the randomized environment and work is in progress on that side as well.

experimental PR

ryanjulian · 2018-05-31T19:45:40Z

rllab/dynamics_randomization/variation.py

+
+    @method.setter
+    def method(self, method):
+        self._method = method


Variation objects should be immutable (i.e. there are no setters)

Sorry, I just noticed this. There's a trade-off between having the fluent interface pattern and having immutable properties.
In order to set the members of the Variation object without setters, we have to pass their attributes in the constructor, and that breaks the interface pattern. Otherwise, without the setter decorators and with the members that correspond to each property not previously defined, we get this kind of errors:

Traceback (most recent call last): File "rllab/dynamics_randomization/trpo_swimmer.py", line 13, in <module> at_xpath(".//geom[@name='torso']").\ File "/home/aigonzal/ivanWorkspace/rllab/rllab/dynamics_randomization/variation.py", line 114, in at_xpath self._list[-1].xpath = xpath AttributeError: can't set attribute

Another way to do it is by setting the members of the class Variation directly in Variations, but that breaks the immutability of the property.
Any advice of how to achieve this is well appreciated.

I was trying to use a dictionary to store the variation then add it to Variation class at last. But I found that we need the v.elem=e in MujocoModelGenerator, this needs to call the elem setter in Variation. So I didn’t think of a way to implement immutable Variation after all.

For how to implement it, see builder pattern.

To solve the problem with v.elem = e, the solution is to keep a local cache of those elems in MujocoModelGenerator, rather than storing the data inside variations. That would be the proper way to do it--I just used the dict in my example as a shortcut.

Note that you can use the variation object directly as a key in your dictionary, i.e.

elem_cache = {} # caching elems for v in variations: # find elem, = e elem_cache[v] = e # making models for v in variations: e = elem_cache[v]

Thanks for the comment. Done

CatherineSue · 2018-05-31T22:00:26Z

The rllab/dynamics_randomization/trpo_swimmer.py tests the run of rllab.dynamics_randomization.RandomizeEnv. It runs successfully in
macOS, but it raises this exception in Linux:

mujoco_py.builder.MujocoException: Got MuJoCo Warning: Unknown warning type Time = 0.0000.

I have reinstalled the conda env, mujoco150 and mujoco-py to ensure
the test runs on the same env. So there may be missing configure in
Linux, or it's just a bug.

CatherineSue · 2018-05-31T22:02:32Z

@gonzaiva found someone with a similar error message here:
pfnet/gym-env-mujoco150#3

ghost

Also, all the python files in rllab have file names in lowercase with each word separated by an underscore, so we should keep this standard with RandomizeEnv.py to randomized_env.py

ghost · 2018-05-31T22:20:35Z

rllab/dynamics_randomization/RandomizeEnv.py

+import os.path as osp
+import numpy as np
+from rllab.dynamics_randomization import VariationMethods
+from rllab.dynamics_randomization.variation import VariationDistributions


Change to from rllab.dynamics_randomization import VariationDistributions

ghost · 2018-05-31T22:24:43Z

rllab/dynamics_randomization/RandomizeEnv.py

+from mujoco_py import MjSim
+from rllab.core import Serializable
+
+MODEL_DIR = osp.abspath(


Reuse this variable from mujoco_env.py

ghost · 2018-05-31T22:25:24Z

rllab/dynamics_randomization/RandomizeEnv.py

+            elif v.method == VariationMethods.ABSOLUTE:
+                e.attrib[v.attrib] = str(c)
+            else:
+                raise NotImplementedError("Unknown method")


Add one NotImplementedError for the if statement of v.distribution as well.

ghost · 2018-05-31T22:25:52Z

rllab/dynamics_randomization/RandomizeEnv.py

+        self._wrapped_env.model = load_model_from_xml(model_xml)
+        self._wrapped_env.sim = MjSim(self._wrapped_env.model)
+        self._wrapped_env.data = self._wrapped_env.sim.data
+        self._wrapped_env.viewer = None


Only set those variables that depend on self._wrapped_env.sim

ghost · 2018-05-31T22:37:13Z

rllab/dynamics_randomization/RandomizeEnv.py

+        return self._wrapped_env.horizon
+
+
+randomize = RandomizedEnv


What is this line for?

shouldn't be here. will delete in the new commit.

ghost · 2018-06-01T01:13:50Z

@ryanjulian we're able to run our current implementation of Dynamics Randomization in OS X (tested in both @CatherineSue and @jonashen laptops), however Linux is a hard cookie to crack!
According to mujoco-py, Linux installation requires some dependencies as specified here, while OS X does not require to install any dependency, at the first instance this seems to be problem. However, we tried installing the dependencies and the error keeps happening.
Another thing we tried is to downgrade the mujoco-py package but the error persists.
I still think there's a dependency we're missing but there's no clear clue of what it could be.
Another thing worth mentioning is that the error usually happens in the second episode of the first iteration, but sometimes it occurs in other iterations like iteration 19, so the error cannot be exactly reproduced all the time.
Full error is:

0% [######################        ] 100% | ETA: 00:00:00Traceback (most recent call last):
  File "rllab/dynamics_randomization/trpo_swimmer.py", line 39, in <module>
    algo.train()
  File "/home/aigonzal/ivanWorkspace/rllab/rllab/algos/batch_polopt.py", line 119, in train
    paths = self.sampler.obtain_samples(itr)
  File "/home/aigonzal/ivanWorkspace/rllab/rllab/algos/batch_polopt.py", line 30, in obtain_samples
    scope=self.algo.scope,
  File "/home/aigonzal/ivanWorkspace/rllab/rllab/sampler/parallel_sampler.py", line 125, in sample_paths
    show_prog_bar=True
  File "/home/aigonzal/ivanWorkspace/rllab/rllab/sampler/stateful_pool.py", line 181, in run_collect
    result, inc = collect_once(self.G, *args)
  File "/home/aigonzal/ivanWorkspace/rllab/rllab/sampler/parallel_sampler.py", line 94, in _worker_collect_one_path
    path = rollout(G.env, G.policy, max_path_length)
  File "/home/aigonzal/ivanWorkspace/rllab/rllab/sampler/utils.py", line 20, in rollout
    next_o, r, d, env_info = env.step(a)
  File "/home/aigonzal/ivanWorkspace/rllab/rllab/envs/normalized_env.py", line 86, in step
    wrapped_step = self._wrapped_env.step(scaled_action)
  File "/home/aigonzal/ivanWorkspace/rllab/rllab/dynamics_randomization/RandomizeEnv.py", line 74, in step
    return self._wrapped_env.step(action)
  File "/home/aigonzal/ivanWorkspace/rllab/rllab/envs/mujoco/swimmer_env.py", line 33, in step
    self.forward_dynamics(action)
  File "/home/aigonzal/ivanWorkspace/rllab/rllab/envs/mujoco/mujoco_env.py", line 180, in forward_dynamics
    self.sim.step()
  File "mjsim.pyx", line 119, in mujoco_py.cymj.MjSim.step
  File "cymj.pyx", line 115, in mujoco_py.cymj.wrap_mujoco_warning.__exit__
  File "cymj.pyx", line 75, in mujoco_py.cymj.c_warning_callback
  File "/home/aigonzal/anaconda2/envs/rllab3/lib/python3.5/site-packages/mujoco_py/builder.py", line 331, in user_warning_raise_exception
    raise MujocoException('Got MuJoCo Warning: {}'.format(warn))
mujoco_py.builder.MujocoException: Got MuJoCo Warning: Unknown warning type Time = 0.0000.

This is quite a show stopper in Linux, and it points to other bugs that may happen with mujoco in rllab. Any feedback you could give is highly appreciated.

jonashen · 2018-06-02T01:17:05Z

rllab/dynamics_randomization/randomize_env.py

+from rllab.envs.mujoco.mujoco_env import MODEL_DIR
+from rllab.core import Serializable
+from rllab.dynamics_randomization import VariationMethods
+from rllab.dynamics_randomization.variation import VariationDistributions


Alphabetical order c->d->e

jonashen · 2018-06-02T01:17:18Z

rllab/dynamics_randomization/test_dynamics_rand.py

+import os.path as osp
+import xml.etree.ElementTree as ET
+
+import numpy as np


jonashen · 2018-06-02T01:17:35Z

rllab/dynamics_randomization/trpo_swimmer.py

+from rllab.baselines import LinearFeatureBaseline
+from rllab.envs.mujoco import SwimmerEnv
+from rllab.envs import normalize
+from rllab.policies import GaussianMLPPolicy


Alphabetical order

ghost · 2018-06-04T18:22:12Z

The steps to produce the issue in Linux are:

Make a fresh install of rllab to include the update for mujoco 1.5 by executing the installers on tip of the feature branch named "dynamics_randomization".
Activate the rllab3 environment
Run the script rllab/dynamics_randomization/trpo_swimmer.py

ryanjulian

I apologize that I don't have time today for a full review of the fluent interface, but please see my comment about your implementation of the multithreading.

I'm very curious to see benchmarks of the threaded vs non-threaded versions.

ryanjulian · 2018-05-31T20:08:49Z

rllab/dynamics_randomization/variation.py

+        """
+        if self._list:
+            self._list[-1].distribution = distribution
+        return self


hmm. if you return self then the API with_range() will be available after the dot operator.

e.g.

v.randomize() .attribute("gear") .at_xpath("something") .with_method(COEFFICIENT) .sampled_from(GAUSSIAN) .with_range(0.5, 1.5)

but .with_range() is not valid for .sampled_from(GAUSSIAN). you would want something like .with_parameters((1, 0.5)) or .with_mean_std(1, 0.5)

One way to solve this would be to return an instance of some new class without the method .with_range(), and then Python would throw an error at import time. That would be ideal but perhaps difficult to implement.

Another way is to change your API slightly to look like this

v.randomize() .attribute(gear) .at_xpath("something") .with_method(COEFFICIENT) .sampled_from(GAUSSIAN) .with_range(0.5, 1.5) .add()

Where add() is void (it doesn't return anything), and the new variation record is not constructed and added to the list of variations until add() is called, e.g.

class Variations(): def randomize(): return VariationSpec(self) class Variation(): def __init__(self, spec): # check a bunch of conditions on spec, raise an exception if it's wrong or incomplete or overspecified class VariationSpec(): def __init__(self, variations): self._variations = variations def with_xpath(xpath): self._xpath = xpath return self def add(): self._variations.add(Variation(self))

if this turns out too much a pain to implement, on the other hand, it might be a sign that a fluent interface is not the right fit here

Done. Implemented the first approach by segmenting the Variations class into different classes, one base class, and two classes for Gaussian and uniform distributions.

To remove the setter in Variation, I changed the API using the second approach.

ryanjulian · 2018-05-31T23:15:04Z

rllab/dynamics_randomization/randomize_env.py

@@ -0,0 +1,96 @@
+import os.path as osp


i think this should be a subpackage of rllab.envs.mujoco, as it's only relevant to MuJoCo environments.

Done. Should I replace the new rllab.env.mujoco import for all packages in rllab?

I'm not sure what you're asking. Can you clarify?

To address this issue, I added osp in the modules of the package rllab.env.mujoco, so now we can import osp as from rllab.envs.mujoco import osp.
I noticed other modules in rllab import osp as import os.path as osp. My question is if I should replace those imports withfrom rllab.envs.mujoco import osp.

Do you mean the dynamics_randomization package should be moved to rllab.envs.mujoco.dynamics_randomization?

ryanjulian · 2018-05-31T23:18:28Z

rllab/dynamics_randomization/randomize_env.py

+        for v in variations.get_list():
+            e = self._model.find(v.xpath)
+            if not e:
+                raise AttributeError("Can't find node in xml")


This is a ValueError, not an AttributeError.

Your error messages would be more helpful if they told me which node could not be found in which file.

ryanjulian · 2018-05-31T23:20:31Z

rllab/dynamics_randomization/randomize_env.py

+            v.elem = e
+
+            if v.attrib not in e.attrib:
+                raise KeyError("Attribute doesn't exist")


Despite leading to a KeyError, this is probably still a ValueError, not a KeyError.

This would be more helpful if it told me which attribute could not be found on which node in which file.

ryanjulian · 2018-05-31T23:23:34Z

rllab/dynamics_randomization/randomize_env.py

+        self._model = etree.parse(self._file_path)
+
+        for v in variations.get_list():
+            e = self._model.find(v.xpath)


Be careful to check for multiple results here.

This is an API question: what if an XPath matches multiples nodes
Should we:

Raise an error

Attempt to modify all matching nodes

Something else?

I vote for (1) but I'm interested in your opinion. The problem I have with (2) is that it is hard to tell the difference between matching multiple nodes on accident (due to a malformed XPath) versus someone legitimately wanting to address multiple nodes.

Working on it

I read the lxml document. The ElementTree library in lxml supports:findall() returns a list of matching Elements, find() efficiently returns only the first match. So if we use find(), there will be no error.

ryanjulian · 2018-05-31T23:48:07Z

rllab/dynamics_randomization/test_dynamics_rand.py

+
+variations = Variations()
+variations.randomize().\
+        attribute("gear").\


Usually when people format method chains onto multiple lines, they break before the .
e.g.

variations.randomize() \ .attribute("gear")

Working on it.

ryanjulian · 2018-06-05T00:39:43Z

rllab/dynamics_randomization/variation.py

+from enum import Enum
+
+
+class VariationMethod(Enum):


You can just call this Method since its use is clear inside the package.

ryanjulian · 2018-06-05T00:39:51Z

rllab/dynamics_randomization/variation.py

+    ABSOLUTE = 2
+
+
+class VariationDistribution(Enum):


You can just call this Distribution since its use is clear inside the package.

ryanjulian · 2018-06-05T00:42:32Z

rllab/dynamics_randomization/mujoco_model_gen.py

+import atexit
+from queue import Queue
+from threading import Event
+from threading import Thread


It will be important to benchmark this to ensure threads (as opposed to processes) actually speed things up.

Python has a design detail called the GIL (global interpreter lock), which means that usually only one thread inside a Python application may run at once. This does not necessarily apply to parts of Python programs which are implemented using native code (e.g. mujoco), but whether that code is blocked by the GIL is very implementation-specific.

ryanjulian · 2018-06-05T00:51:36Z

rllab/dynamics_randomization/mujoco_model_gen.py

+        self._mujoco_model = None
+        # Communicates the calling thread with the worker thread by awaking
+        # the worker thread so as to generate a new model.
+        self._model_requested = Event()


This is a great implementation, but I think you can make it even simpler by using the maxsize parameter of Queue

When maxsize != 0, put() blocks when the queue is full, until get() is called. By default, get() blocks when the queue is empty, until something is put() onto the queue.

CatherineSue · 2018-06-05T16:24:32Z

Benchmarks of running trpo_swimmer

multi-thread

$ cat bench_multi_itr40.txt | head -40
         27602992 function calls (27506126 primitive calls) in 64.712 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   2469/1    0.108    0.000   64.868   64.868 {built-in method builtins.exec}
        1    0.000    0.000   64.868   64.868 trpo_swimmer.py:1(<module>)
        1    0.004    0.004   56.190   56.190 batch_polopt.py:114(train)
       40    0.000    0.000   45.798    1.145 batch_polopt.py:24(obtain_samples)
       40    0.000    0.000   45.796    1.145 parallel_sampler.py:98(sample_paths)
       40    0.002    0.000   45.786    1.145 stateful_pool.py:121(run_collect)
      320    0.293    0.001   45.761    0.143 parallel_sampler.py:92(_worker_collect_one_path)
      320    1.198    0.004   45.467    0.142 utils.py:10(rollout)
   160000    1.889    0.000   18.428    0.000 normalized_env.py:77(step)
   160763    9.610    0.000   18.323    0.000 function_module.py:743(__call__)
   160000    1.703    0.000   17.649    0.000 gaussian_mlp_policy.py:124(get_action)
      205    0.003    0.000   16.644    0.081 __init__.py:1(<module>)
   160000    0.132    0.000   15.236    0.000 randomized_env.py:46(step)
   160000    1.980    0.000   15.104    0.000 swimmer_env.py:32(step)
       40    0.001    0.000    9.903    0.248 npo.py:101(optimize_policy)
       40    0.006    0.000    8.754    0.219 conjugate_gradient_optimizer.py:229(optimize)
        6    0.000    0.000    8.112    1.352 ext.py:123(compile_function)
        6    0.000    0.000    8.112    1.352 function.py:74(function)
        6    0.000    0.000    8.109    1.351 pfunc.py:283(pfunc)
        6    0.000    0.000    8.079    1.346 function_module.py:1765(orig_function)
      440    0.010    0.000    6.604    0.015 conjugate_gradient_optimizer.py:49(eval)
   160000    1.690    0.000    6.388    0.000 mujoco_env.py:177(forward_dynamics)
       40    0.013    0.000    6.264    0.157 krylov.py:7(cg)
        6    0.001    0.000    6.227    1.038 function_module.py:1428(__init__)
        6    0.000    0.000    6.138    1.023 opt.py:102(__call__)
    485/6    0.002    0.000    6.138    1.023 opt.py:85(optimize)
     24/6    0.001    0.000    6.138    1.023 opt.py:223(apply)
   487207    0.758    0.000    5.508    0.000 op.py:891(rval)
  2600/70    0.013    0.000    5.354    0.076 <frozen importlib._bootstrap>:966(_find_and_load)
  2600/70    0.010    0.000    5.352    0.076 <frozen importlib._bootstrap>:939(_find_and_load_unlocked)
 3453/202    0.003    0.000    5.335    0.026 <frozen importlib._bootstrap>:214(_call_with_frames_removed)
 2259/137    0.010    0.000    5.297    0.039 <frozen importlib._bootstrap>:659(_load_unlocked)
      538    0.003    0.000    5.248    0.010 op.py:912(make_thunk)
      420    0.005    0.000    5.233    0.012 op.py:826(make_c_thunk)
  2005/72    0.007    0.000    5.171    0.072 <frozen importlib._bootstrap_external>:659(exec_module)

single thread

$ cat bench_single_itr40.txt | head -40
         27753481 function calls (27656615 primitive calls) in 55.331 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   2469/1    0.107    0.000   55.475   55.475 {built-in method builtins.exec}
        1    0.000    0.000   55.475   55.475 trpo_swimmer_single_thread.py:1(<module>)
        1    0.006    0.006   46.967   46.967 batch_polopt.py:114(train)
       40    0.000    0.000   36.747    0.919 batch_polopt.py:24(obtain_samples)
       40    0.000    0.000   36.745    0.919 parallel_sampler.py:98(sample_paths)
       40    0.002    0.000   36.734    0.918 stateful_pool.py:121(run_collect)
      320    0.237    0.001   36.711    0.115 parallel_sampler.py:92(_worker_collect_one_path)
      320    0.946    0.003   36.474    0.114 utils.py:10(rollout)
      205    0.003    0.000   16.372    0.080 __init__.py:1(<module>)
   160000    1.750    0.000   14.331    0.000 normalized_env.py:77(step)
   160761    7.742    0.000   14.002    0.000 function_module.py:743(__call__)
   160000    1.334    0.000   12.020    0.000 gaussian_mlp_policy.py:124(get_action)
   160000    0.115    0.000   11.403    0.000 randomized_env_single_thread.py:69(step)
   160000    1.733    0.000   11.288    0.000 swimmer_env.py:32(step)
       40    0.001    0.000    9.754    0.244 npo.py:101(optimize_policy)
       40    0.006    0.000    8.689    0.217 conjugate_gradient_optimizer.py:229(optimize)
        6    0.000    0.000    8.038    1.340 ext.py:123(compile_function)
        6    0.000    0.000    8.038    1.340 function.py:74(function)
        6    0.000    0.000    8.035    1.339 pfunc.py:283(pfunc)
        6    0.000    0.000    8.006    1.334 function_module.py:1765(orig_function)
      440    0.010    0.000    6.592    0.015 conjugate_gradient_optimizer.py:49(eval)
      320    0.001    0.000    6.279    0.020 normalized_env.py:51(reset)
      320    1.087    0.003    6.278    0.020 randomized_env_single_thread.py:36(reset)
        6    0.000    0.000    6.272    1.045 function_module.py:1428(__init__)
       40    0.013    0.000    6.256    0.156 krylov.py:7(cg)
        6    0.000    0.000    6.046    1.008 opt.py:102(__call__)
    485/6    0.002    0.000    6.046    1.008 opt.py:85(optimize)
     24/6    0.001    0.000    6.046    1.008 opt.py:223(apply)
  2600/70    0.013    0.000    5.351    0.076 <frozen importlib._bootstrap>:966(_find_and_load)
  2600/70    0.010    0.000    5.349    0.076 <frozen importlib._bootstrap>:939(_find_and_load_unlocked)
 3453/202    0.004    0.000    5.331    0.026 <frozen importlib._bootstrap>:214(_call_with_frames_removed)
 2259/137    0.010    0.000    5.290    0.039 <frozen importlib._bootstrap>:659(_load_unlocked)
  2005/72    0.007    0.000    5.158    0.072 <frozen importlib._bootstrap_external>:659(exec_module)
      538    0.003    0.000    5.116    0.010 op.py:912(make_thunk)
       48    0.073    0.002    5.101    0.106 opt.py:2394(apply)

ryanjulian · 2018-06-05T17:40:37Z

rllab/dynamics_randomization/trpo_swimmer.py

+from rllab.policies import GaussianMLPPolicy
+from rllab.dynamics_randomization import RandomizedEnv
+from rllab.dynamics_randomization import Variations
+from rllab.dynamics_randomization import VariationMethod


I propose we make this a wrapper which looks a lot like normalized().

So we put these classes in rllab.envs.mujoco.dynamics_randomization. At the bottom of dynamics_randomization.py you can put

# dynamics_randomization.py randomize_dynamics = RandomizedDynamicsEnv

So that users can use it in a launcher like this

# in your launcher from rllab.envs.mujoco.dynamics_randomization import randomize_dynamics # currently RandomizedEnv from rllab.envs.mujoco.dynamics_randomization import Variations from rllab.envs.mujoco.dynamics_randomization import Method # currently VariationMethod from rllab.envs.mujoco.dynamics_randomization import Distribution # currently VariationDistribution variations = Variations() variations.randomize()\ .at_xpath(".//geom[@name='torso']")\ .attribute("density")\ .with_method(Method.COEFFICIENT)\ .sampled_from(Distribution.UNIFORM)\ .with_range(0.5, 1.5) env = randomize_dynamics(SwimmerEnv(), variations)

Err, after trying the module I actually agree with some of your original naming. This module is general enough to randomize any part of an environment (good job!)

So the new vocabulary would be randomize = RandomizedEnv

ryanjulian · 2018-06-05T17:40:58Z

rllab/dynamics_randomization/tosser.xml

@@ -0,0 +1,86 @@
+<!--Author: vikash@openai.com-->


This can be removed, it was only in my sample for testing purposes.

If you want to use it in a test, you should be able to find it directly in the mujoco_py package. Or you can use some other mujoco XML file already in our own tree (preferable).

You can also just define the XML inline in a string at the top of the file. That's probably the easiest option.

ryanjulian · 2018-06-05T17:43:07Z

rllab/dynamics_randomization/test_dynamics_rand.py

+        with_method(VariationMethod.COEFFICIENT).\
+        sampled_from(VariationDistribution.UNIFORM).\
+        with_range(0.5, 1.5).\
+        randomize().\


Generally you split new variations into different lines (e.g. no calling randomize twice in one statement). This is part of why I suggested that there be an add() method which is void to finish the statement.

See the builder pattern

ryanjulian · 2018-06-05T17:49:14Z

rllab/dynamics_randomization/variation.py

+
+    @method.setter
+    def method(self, method):
+        self._method = method


For how to implement it, see builder pattern.

To solve the problem with v.elem = e, the solution is to keep a local cache of those elems in MujocoModelGenerator, rather than storing the data inside variations. That would be the proper way to do it--I just used the dict in my example as a shortcut.

Note that you can use the variation object directly as a key in your dictionary, i.e.

elem_cache = {} # caching elems for v in variations: # find elem, = e elem_cache[v] = e # making models for v in variations: e = elem_cache[v]

ryanjulian · 2018-06-05T17:49:57Z

rllab/dynamics_randomization/variation.py

+            self._list[-1].distribution = distribution
+        return self
+
+    def with_range(self, low, high):


what stops me from using this method along with with_distribution()?

Two ideas to fix this. First one is to raise an error during the runtime. e.g.

variations = Variations() variations.randomize() \ .attribute("gear") \ .at_xpath(".//motor[@name='a1']") \ .with_method(Method.COEFFICIENT) \ .sampled_from(Distribution.UNIFORM) \ .with_mean_std(0.5, 1.5) \ .add()

will raise

ValueError: Need to call with_range when sampled from Uniform

The current commit implements this idea.

Another idea is implement something like this

There is no with_mean_std when user calls sampled_from(UNIFORM)
I tried using mixin to achieve this but failed. Because using inheritance will return either VariationsGaussian or VariationsUniform in sampled_from. There's no way to tell which one should be called before running time.

All the python files have file names in lowercase. To keep this standard, refactor RandomizeEnv.py to randomized_env.py.

Add error handling in constructor and reset(). Remove variables that doesn't depend on self._wrapped_env.sim in reset(). Reuse MODEL_DIR in mujoco_env.py Alphabetize imports.

Fix wrong AttributeError raising in constructor when there is element in xml. Add error handling towards the Variation.range attribute. When the range shape isn't the same as the attribute value shape, raise an AttributeError.

Finish the thread when the simulaton is interrupted.

Other miscellaneous changes include: - Rename classes VariationsMethods and VariationDistributions to VariationsMethod and VariationDistribution respectively. - The parsing of the XML string and fetch of the dynamic parameters to randomize is now done within the worker thread. - The file randomize_env.py was renamed to randomized_env.py

Before this commit, when there is an error raised when loading the xml object, only the worker_thread terminates. This commit fixes this bug by terminating all the processes. Fix some typo in the last commit.

Create an 10-length queue in MujocoModelGenerator to store the mujoco_models.

- Renamed classes VariationMethod and VariationDistribution to Method and Distribution. - Enforced the use of methods exclusive for uniform or normal distributions in the fluent interface pattern provided in class Variations by splitting the class into VariationsBase, VariationsGaussian and VariationsUniform. - Included the module os.path.osp in rllab.envs.mujoco modules. - Changed error types and improved messages for two errors in class MujocoModelGenerator.

- Delete unused threading.Event in MujocoModelGenerator. - Correct error types in MujocoModelGenerator. - Renamed classes RandomizedEnv to RandomizedDynamicsEnv. - Delete wrong try-except in RandomizedDynamicsEnv. - Use randomize_dynamics() in the launcher. - Format method chains onto multiple lines. - Correct wrong param name in Variation.

Solve the problem with v.elem=e, which calls the setter method in Variation. Replace this with a local cache of elements. Same with v.default in MujocoModelGenerator.

- Remove the setter in Variation - Add check of parameter shape in MujocoModelGenerator - Fix some typo

- Add more detailed information in handling the shape of the sampled value with the default value - Add timeout in the Queue.get() in MujocoModelGenerator so the main thread will catch error raised in worker thread

The cached property action_space found in mujoco_env.py produces an error in Linux for dynamics randomization. The idea behind the cached property is to avoid doing an expensive computation several times, so for regular execution, action_space is obtained from the model that is used for the entire training once, improving the performance. However, for dynamics randomization there's a new model for each episode, and that requires that the action_space is updated accordingly, but that does not happen because it's cached. To update action_space and not make an invasive change, the attribute is invalidated for each reset in the RandomizedEnv class.

- Move the dynamics_randomization package to rllab.envs.mujoco. - Delete tosser.xml, use xml in rllab/vendor/mujoco_model for test - The old test_dynamics_rand.py only tests for the Variations API, so rewrite it to test for both Variations API and RandomizedEnv. - Reorder imports. - Delete import os.path as osp in rllab/envs/mujoco/__init__.py. Previously added by mistake.

Package names should follow class names.

test_dynamics_rand.py is enough for testing. Remove trpo_swimmer.py

The code to initialize the variations and to generate the randomized parameters was moved into the Variations class. This will keep all the current code related to variations in the same file to improve the API of dynamics randomization, and will enable a more modular code for further features in the module.

There is a bottleneck at function load_model_from_xml from mujoco-py when using multi threading. In a single thread, a call to this function takes units of milliseconds, while in multi threading it takes tens of milliseconds. Maybe this is due to internal data structures that are required for both loading the model in the worker thread and performing the simulations in the main thread, causing the delay in load_model_from_xml and other functions that can be perceived in the cumulative time obtained by the profiler by running test_dynamics_rand.py. Due to this poor performance, the file mujoco_model_gen.py was removed since it serves no purpose now that the variations.py file contains methods to process the XML file, and the calls to obtain the randomized model are done directly in the class RandomizedEnv.

The changes in this commit include: - The modules in the __ini__.py file were sorted alphabetically. - The new line at the end of file was added in randomized_env.py. - Default values were assigned for VariationSpec, specifically for fields method, distribution, mean_std and var_range. Fields xpath, attrib, elem and default are specific to the model in the XML file provided by the user, so they cannot be default parameters. Further more, elem and default are obtained by parsing the XML file, so the user won't set them.

ryanjulian · 2018-06-11T04:54:29Z

Please reopen this PR against https://github.com/rlworkgroup/garage

ghost · 2018-06-14T21:33:07Z

Imported to garage. See rlworkgroup/garage#51

ghost assigned CatherineSue and ghost May 30, 2018

ghost requested review from ryanjulian, eric-heiden, CatherineSue and hejia-zhang May 30, 2018 21:17

CatherineSue previously approved these changes May 30, 2018

View reviewed changes

ryanjulian reviewed May 30, 2018

View reviewed changes

hejia-zhang approved these changes May 31, 2018

View reviewed changes

ryanjulian reviewed May 31, 2018

View reviewed changes

CatherineSue force-pushed the dynamics_randomization branch from 2b3d7be to 9a9b71f Compare May 31, 2018 21:30

ghost force-pushed the dynamics_randomization branch from 9a9b71f to 60ea59f Compare May 31, 2018 21:44

ghost commented May 31, 2018

View reviewed changes

ghost requested a review from jonashen May 31, 2018 23:27

jonashen reviewed Jun 2, 2018

View reviewed changes

ghost force-pushed the dynamics_randomization branch from 930f894 to 83c2d48 Compare June 4, 2018 23:27

CatherineSue force-pushed the dynamics_randomization branch 2 times, most recently from 3b6f4a8 to 9fb3106 Compare June 5, 2018 00:31

ryanjulian reviewed Jun 5, 2018

View reviewed changes

Chang and others added 24 commits June 7, 2018 21:00

Rename RandomizeEnv.py to randomized_env.py

9eba2da

All the python files have file names in lowercase. To keep this standard, refactor RandomizeEnv.py to randomized_env.py.

Add features and remove variables in RandomizedEnv

6fc661b

Add error handling in constructor and reset(). Remove variables that doesn't depend on self._wrapped_env.sim in reset(). Reuse MODEL_DIR in mujoco_env.py Alphabetize imports.

Fix bugs in error handling

e4ce9b1

Fix wrong AttributeError raising in constructor when there is element in xml. Add error handling towards the Variation.range attribute. When the range shape isn't the same as the attribute value shape, raise an AttributeError.

Add thread to generate Mujoco models

28d73e8

Add thread terminate in mujoco_model_gen

21e42ee

Add interrupt handler to mujoco model gen

009ddbf

Finish the thread when the simulaton is interrupted.

Fix bug in MujocoModelGenerator

4442f12

Before this commit, when there is an error raised when loading the xml object, only the worker_thread terminates. This commit fixes this bug by terminating all the processes. Fix some typo in the last commit.

Change multi-thread MujocoModelGen to n-length queue

4a96217

Create an 10-length queue in MujocoModelGenerator to store the mujoco_models.

Fix PEP8 formatting in file mujoco_model_gen.py

cf45d0c

Add local cache of elems in MujocoModelGenerator

25db0fe

Solve the problem with v.elem=e, which calls the setter method in Variation. Replace this with a local cache of elements. Same with v.default in MujocoModelGenerator.

Add miscellaneous changes to improve the code

22d69af

- Remove the setter in Variation - Add check of parameter shape in MujocoModelGenerator - Fix some typo

Fix PEP formatting with YAPF

a4c8442

Add error handling to MujocoModelGenerator

43594ea

- Add more detailed information in handling the shape of the sampled value with the default value - Add timeout in the Queue.get() in MujocoModelGenerator so the main thread will catch error raised in worker thread

Rename RandomizedDynamicsEnv to RandomizedEnv

4221002

Rename mujoco_model_gen to mujoco_model_generator

f18780e

Package names should follow class names.

Delete trpo_swimmer in dynamics_randomization

62331f2

test_dynamics_rand.py is enough for testing. Remove trpo_swimmer.py

ghost force-pushed the dynamics_randomization branch from bd93226 to ac0d449 Compare June 8, 2018 04:01

Fix PEP8 formatting in randomized_env.py

56d2b60

ryanjulian approved these changes Jun 9, 2018

View reviewed changes

ghost closed this Jun 14, 2018

This pull request was closed.

Dynamics randomization for MuJoCo #109

Dynamics randomization for MuJoCo #109

Conversation

ghost commented May 30, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

CatherineSue commented May 31, 2018

CatherineSue commented May 31, 2018

ghost left a comment • edited by ghost Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ghost commented Jun 1, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ghost commented Jun 4, 2018

ryanjulian left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

CatherineSue commented Jun 5, 2018

Benchmarks of running trpo_swimmer

multi-thread

single thread

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ryanjulian commented Jun 11, 2018

ghost commented Jun 14, 2018

ghost left a comment •

edited by ghost

Loading