Add more info on const_args vs. tunable_params; start the DEVNOTES (microsoft#630)

Co-authored-by: Brian Kroth <bpkroth@users.noreply.github.com>
motus and bpkroth authored Jan 16, 2024
1 parent 769e43e commit 8d5239c
Showing 2 changed files with 67 additions and 0 deletions.
56 changes: 56 additions & 0 deletions mlos_bench/mlos_bench/DEVNOTES.md
@@ -0,0 +1,56 @@
# Notes to the developer

This document is a developer's perspective of the `mlos_bench` framework.
It is a work in progress; we will keep extending it as we develop the code.

## Environment

At the center of the `mlos_bench` framework is the `Environment` class.
The environment implements the `.setup()`, `.run()`, and `.teardown()` stages of a trial, and also encapsulates the configuration and the tunable parameters of the benchmarking environment.
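
The three stages can be pictured with a minimal sketch. `ToyEnvironment` and `run_trial` are hypothetical names used only for illustration; they are not the actual `mlos_bench` classes or signatures, and the choice to always call `.teardown()` is an assumption of this sketch.

```python
class ToyEnvironment:
    """Hypothetical stand-in for an Environment; not the mlos_bench class."""

    def __init__(self, name):
        self.name = name
        self.log = []  # records the order of the lifecycle calls

    def setup(self, params):
        self.log.append(("setup", dict(params)))
        return True  # report that the environment is ready

    def run(self):
        self.log.append(("run", None))
        return {"score": 1.0}  # placeholder benchmark result

    def teardown(self):
        self.log.append(("teardown", None))


def run_trial(env, params):
    """Run one trial: set up, run, and (by assumption) always tear down."""
    try:
        if not env.setup(params):
            return None
        return env.run()
    finally:
        env.teardown()


env = ToyEnvironment("toy")
result = run_trial(env, {"foo": "bar"})
```

After the call, `env.log` holds the `setup`, `run`, and `teardown` entries in that order.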

For most use cases, there is no need to implement custom `Environment` classes, as `mlos_bench` already has a library of ready-to-use `Environment` implementations, e.g., for running setup/run/teardown scripts locally or on a remote host.

At runtime, `mlos_bench` instantiates the `Environment` objects from config files.
Each `Environment` config is a JSON5 file with the following structure:

```javascript
{
    "name": "Mock environment",  // Environment name / ID
    "class": "mlos_bench.environments.mock_env.MockEnv",  // A class to instantiate

    "config": {
        "tunable_params": [
            // Groups of variable parameters passed to .setup() on each trial
            // (e.g., suggested by the optimizer):
            "linux-kernel-boot",
            "linux-scheduler",
            "linux-swap",
            // ...
        ],
        "const_args": {
            // Additional .setup() parameters that do not change from trial to trial:
            "foo": "bar",
            // ...
        },
        // Environment constructor parameters
        // (specific to the Environment class being instantiated):
        "seed": 42,
        // ...
    }
}
```

### `const_args` and `tunable_params`

Note that in the config above we have three groups of parameters.
`tunable_params` are the configuration parameters that are passed to the `.setup()` call on each trial.
They are external to the environment, and are usually either suggested by the optimizer, or specified explicitly by the user (e.g., when benchmarking a certain configuration).
`const_args` is a loose collection of key/value pairs that complement the `tunable_params` values.
These values do not change from one trial to the next (though they can change from one `mlos_bench` `run.py` invocation to another, e.g., when they are read from environment variables), but they are also passed as input parameters to each `Environment.setup()` call, e.g., for use in the `setup` and `run` scripts.
Other config parameters, like `seed`, are class-specific and appear as the constructor arguments during the class instantiation.
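
To make the three groups concrete, here is a small sketch of how a `config` section like the one above could be split. The helper `split_config` and the tunable name `sched_latency_ns` are hypothetical and only illustrate the idea; the framework's actual loading code may work differently.

```python
def split_config(config):
    """Split an Environment "config" section into its three groups."""
    config = dict(config)  # shallow copy; leave the caller's dict untouched
    tunable_groups = config.pop("tunable_params", [])
    const_args = config.pop("const_args", {})
    ctor_kwargs = config  # whatever remains is class-specific
    return tunable_groups, const_args, ctor_kwargs


config = {
    "tunable_params": ["linux-kernel-boot", "linux-scheduler", "linux-swap"],
    "const_args": {"foo": "bar"},
    "seed": 42,
}
groups, const_args, ctor_kwargs = split_config(config)

# Hypothetical value an optimizer might suggest for one trial.
suggested = {"sched_latency_ns": 1000000}
# .setup() sees the constants together with the trial's tunable values.
setup_params = {**const_args, **suggested}
```

Here `ctor_kwargs` contains only `seed`, which would go to the class constructor, while `setup_params` is what a single trial's `.setup()` call would receive.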

## Service

Some functionality is shared across several environments, so it makes sense to factor it out into separate classes and configs.
Environments can include configs for the `Service` classes, and access the methods of the services internally.
`Service` classes provide generic APIs to certain cloud functionality; e.g., an `AzureVMService` for provisioning and managing VMs on Azure can have a drop-in replacement for the analogous functionality on AWS, etc.
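
The drop-in replacement idea can be sketched as follows. All names here (`VMService`, `FakeAzureVMService`, `FakeAWSVMService`, `ToyVMEnvironment`, and the `provision` method) are hypothetical and do not reflect the real `Service` API; the point is only that the environment depends on an interface rather than on a specific cloud provider.

```python
from typing import Protocol


class VMService(Protocol):
    """Hypothetical interface; the real Service API may differ."""

    def provision(self, vm_name: str) -> str: ...


class FakeAzureVMService:
    def provision(self, vm_name: str) -> str:
        return f"azure:{vm_name}"


class FakeAWSVMService:
    def provision(self, vm_name: str) -> str:
        return f"aws:{vm_name}"


class ToyVMEnvironment:
    """Depends only on the VMService interface, not on a cloud provider."""

    def __init__(self, service: VMService):
        self._service = service

    def setup(self, params: dict) -> str:
        return self._service.provision(params["vm_name"])


params = {"vm_name": "bench-01"}
azure_id = ToyVMEnvironment(FakeAzureVMService()).setup(params)
aws_id = ToyVMEnvironment(FakeAWSVMService()).setup(params)
```

Swapping the service object changes where the VM is provisioned without touching the environment code.
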
11 changes: 11 additions & 0 deletions mlos_bench/mlos_bench/config/README.md
@@ -8,6 +8,17 @@ In general the `config` directory layout follows that of the `mlos_bench` module

Full end-to-end examples are provided in the [`cli`](./cli/) directory, and typically make use of the root [`CompositeEnvironments`](./environments/root/) to combine multiple [`Environments`](./environments/), also referencing [`Services`](./services/), [`Storage`](./storage/), and [`Optimizer`](./optimizers/) configs, into a single [`mlos_bench`](../run.py) run.

## Config parameters

An `Environment` configuration can have two sections, `const_args` and `tunable_params`.
At runtime, on each trial, the data from both sections is merged and passed to the environment as a single dictionary of key/value pairs.
The difference between `const_args` and `tunable_params` is that the `tunable_params` values change from one trial to the next (e.g., when the optimizer proposes a new configuration), whereas values of the `const_args` usually stay constant or get their values from sources other than the optimizer.
Having these two sections allows the user not only to define which environment parameters to tune, but also to customize the environment without changing the Python code or shell scripts.
Sometimes values for the `const_args` come from outside of the Environment configuration, e.g., from global parameters (see the next section) or storage.
To enforce the presence of such parameters, the user can declare their IDs in the `required_args` section of the configuration.
The system will make sure that each parameter in the `required_args` has a value (either specified in the `const_args` or provided externally) and report an error if the required parameters are missing.
The storage system will only save the values of the `tunable_params` for each trial; we rely on git to preserve the config files along with the `const_args` and `required_args` values, and save the commit hash of the configs for each experiment.
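
The merge and the `required_args` check described above can be sketched in a few lines. The function `resolve_params` and the parameter names `vmName` and `swappiness` are hypothetical; this is an illustration under the assumption that external (global) values take precedence, not the framework's actual implementation.

```python
def resolve_params(const_args, required_args, global_params, tunable_values):
    """Merge parameter sources and check that required ones are present."""
    params = dict(const_args)
    # Fill required parameters from external (global) values when available.
    # Assumption: external values take precedence over const_args.
    for key in required_args:
        if key in global_params:
            params[key] = global_params[key]
    missing = [key for key in required_args if key not in params]
    if missing:
        raise ValueError(f"Missing required parameters: {missing}")
    # Tunable values are the part that changes from trial to trial.
    params.update(tunable_values)
    return params


params = resolve_params(
    const_args={"foo": "bar"},
    required_args=["foo", "vmName"],
    global_params={"vmName": "bench-01"},
    tunable_values={"swappiness": 10},
)
```

If `vmName` were absent from both `const_args` and the global parameters, the call would raise a `ValueError` instead of silently running the trial with an incomplete configuration.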

## Globals

As mentioned in the [mlos_bench/README.md](../../README.md), a general rule is that the parameters from the global configs like `global_config_azure.jsonc` and/or `experiment_MyAppBench.jsonc` override the corresponding parameters in other configurations.