In this work, we introduce task selection based on prior experience into a meta-learning algorithm by conceptualizing the learner and the active meta-learning setting using a probabilistic latent variable model.
This repository implements the models and algorithms necessary to reproduce experiments (i)-(iii).
To reproduce the results, you can run batch.sh (please do not change the default values for certain parameters in run.py).
The core components of the repository are:
- run.py: script to run the PAML algorithm, including all parameters
- env: directory for configuring and observing environments
  - controls.py: generates control signals
  - environment_configurator.py: configures dm_control environments
  - to.py: observes trajectories of the environments given controls
- models: directory for the PAML model
  - meta_learner.py: trains the model and infers latent task variables
  - mlgp.py: the meta-learning (sparse variational) Gaussian process model
  - tp.py: predicts trajectories (for evaluation)
- utility_functions: directory for the utility functions and baselines used in the paper
  - paml.py: PAML
  - lhs.py: Latin Hypercube Sampling
  - uni.py: Uniform sampling
- utils: directory for miscellaneous tools
  - algorithm_utils.py: separated key steps of the PAML algorithm
  - dataset.py: stores and prepares trajectory observations
  - evaluation.py: evaluates the model's performance on test tasks
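The two baselines listed under utility_functions differ in how they spread task configurations over the interval. The sketch below illustrates that difference in plain Python; it is not the code in lhs.py or uni.py, and the function names, the seed, and the number of points are assumptions made for illustration.

```python
import random


def uniform_samples(n, bounds, rng):
    """Draw n points independently and uniformly from the box given by bounds."""
    return [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n)]


def latin_hypercube_samples(n, bounds, rng):
    """Draw n points so that every dimension has exactly one point per stratum."""
    columns = []
    for lo, hi in bounds:
        width = (hi - lo) / n
        # One point inside each of the n equal-width strata of [lo, hi).
        column = [lo + (i + rng.random()) * width for i in range(n)]
        # Shuffle so that strata are paired at random across dimensions.
        rng.shuffle(column)
        columns.append(column)
    # Transpose from per-dimension columns to per-point rows.
    return [list(point) for point in zip(*columns)]


rng = random.Random(1)  # illustrative seed
bounds = [(0.5, 3.0)]   # illustrative one-dimensional configuration interval
lhs_points = latin_hypercube_samples(5, bounds, rng)
uni_points = uniform_samples(5, bounds, rng)
print("LHS:", sorted(p[0] for p in lhs_points))
print("UNI:", sorted(p[0] for p in uni_points))
```

With Latin Hypercube Sampling each of the five equal-width sub-intervals receives exactly one point, whereas independent uniform draws can cluster and leave parts of the interval unexplored.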
This code was tested with Python 3.7.
The dependencies are listed in requirements.txt.
- Download and install MuJoCo Pro 2.00
  - You need a license; a 30-day trial license can be requested
  - At installation time, dm_control looks for the MuJoCo headers in ~/.mujoco/mujoco200_$PLATFORM/include
  - At runtime, dm_control looks for the MuJoCo license key file at ~/.mujoco/mjkey.txt
- Install all dependencies with: pip install -r requirements.txt
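Before installing the dependencies, it can help to confirm that the MuJoCo files are where dm_control expects them. The following is a minimal sketch of such a check, assuming only the two paths named above; the function name check_mujoco_install is hypothetical, and $PLATFORM is matched with a wildcard rather than guessed.

```python
import glob
import os


def check_mujoco_install(mujoco_dir="~/.mujoco"):
    """Report whether the MuJoCo 2.00 headers and license key are in place."""
    root = os.path.expanduser(mujoco_dir)
    # Match mujoco200_$PLATFORM/include for any value of $PLATFORM.
    pattern = os.path.join(root, "mujoco200_*", "include")
    header_dirs = [path for path in glob.glob(pattern) if os.path.isdir(path)]
    key_file = os.path.join(root, "mjkey.txt")
    return {
        "header_dirs": header_dirs,
        "license_key_found": os.path.isfile(key_file),
    }


status = check_mujoco_install()
print("MuJoCo header directories:", status["header_dirs"])
print("License key found:", status["license_key_found"])
```

An empty header list or a missing license key suggests revisiting the MuJoCo setup before running pip.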
# Under-specified cart-pole environment
python3 run.py --env_name="cartpole" --utility_function="PAML" --seed=1 --under_specified_system True --observed_config_space_dim=1
# Fully-specified cart-pole environment
python3 run.py --env_name="cartpole" --utility_function="PAML" --seed=1
# Fully-specified pendubot environment
python3 run.py --env_name="pendubot" --utility_function="PAML" --seed=1
# Fully-specified cart-double-pole environment
python3 run.py --env_name="cartdoublepole" --utility_function="PAML" --seed=1
# Over-specified cart-pole environment
python3 run.py --env_name="cartpole" --utility_function="PAML" --seed=1 --over_specified_system True --observed_config_space_dim=3 --config_space_dim=2
Parameters that require string values:
- --env_name: 'cartpole', 'cartdoublepole', 'pendubot'
- --utility_function: 'PAML', 'LHS', 'UNI'
- --policy: 'ALTERNATE'
- --initial_training_configurations: 'LHS', 'UNI'
Parameters that require boolean values:
- --verbose: printing additional information
- --evaluation: evaluation of the MLGP on a test task grid
- --under_specified_system: enables an unobserved, stochastic configuration dimension
- --oracle: initial training on the test task grid
- --data_normalization: normalization of training data over all dimensions
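Boolean parameters are passed with an explicit value on the command line (for example --under_specified_system True). How run.py parses them is not shown in this README; the sketch below is one common argparse pattern for such flags, with a hypothetical str2bool helper and illustrative defaults. A dedicated converter matters because argparse with type=bool would treat any non-empty string, including "False", as true.

```python
import argparse


def str2bool(value):
    """Convert a command-line string such as 'True' or 'false' to a bool."""
    if isinstance(value, bool):
        return value
    text = value.strip().lower()
    if text in ("true", "t", "yes", "y", "1"):
        return True
    if text in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


parser = argparse.ArgumentParser()
# Flag names come from this README; the defaults are assumptions for illustration.
parser.add_argument("--verbose", type=str2bool, default=False)
parser.add_argument("--under_specified_system", type=str2bool, default=False)

args = parser.parse_args(["--under_specified_system", "True"])
print(args.under_specified_system, args.verbose)
```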
The task parameter interval can be specified on the command line, e.g.,
# By default, the following command runs an experiment with cart-pole tasks with pendulum mass in [0.5, 3.0] kg
python3 run.py --env_name="cartpole" --utility_function="PAML" --seed=1 --config_interval_lower_bound_dim_1=0.5 --config_interval_upper_bound_dim_1=3.0
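To compare utility functions over several seeds on the same interval, the command lines can be generated programmatically. This is a minimal sketch that uses only flags shown in this README; the helper build_command, the choice of seeds 1 to 3, and the sweep itself are assumptions for illustration and do not reproduce batch.sh.

```python
import itertools
import shlex


def build_command(env_name, utility_function, seed, lower=None, upper=None):
    """Assemble one run.py command line from flags documented in this README."""
    cmd = [
        "python3", "run.py",
        f"--env_name={env_name}",
        f"--utility_function={utility_function}",
        f"--seed={seed}",
    ]
    if lower is not None and upper is not None:
        if lower >= upper:
            raise ValueError("lower bound must be smaller than upper bound")
        cmd.append(f"--config_interval_lower_bound_dim_1={lower}")
        cmd.append(f"--config_interval_upper_bound_dim_1={upper}")
    return cmd


utility_functions = ["PAML", "LHS", "UNI"]
seeds = range(1, 4)  # illustrative choice of seeds
commands = [
    build_command("cartpole", uf, seed, lower=0.5, upper=3.0)
    for uf, seed in itertools.product(utility_functions, seeds)
]
for cmd in commands:
    # Quote each argument so the printed line can be pasted into a shell.
    print(" ".join(shlex.quote(part) for part in cmd))
```

The commands are only printed here; each one could be passed as a list to subprocess.run to launch the experiments sequentially.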
To change the environment's parameterization (e.g., which configuration interval dimension corresponds to mass, length, radius, etc.), see env/environment_configurator.py.
@inproceedings{kaddour2020paml,
title={Probabilistic Active-Meta Learning},
author={Kaddour, Jean and Saemundsson, Steindor and Deisenroth, Marc Peter},
booktitle={Advances in Neural Information Processing Systems},
year={2020}
}