Merge pull request #46 from automl/yaml_config_space
Enable option to create Pipeline_space via yaml file
danrgll authored Dec 27, 2023
2 parents eca37d5 + 676ac4d commit a14e716
Showing 19 changed files with 1,199 additions and 21 deletions.
158 changes: 158 additions & 0 deletions docs/pipeline_space.md
@@ -0,0 +1,158 @@
# Initializing the Search Space

In NePS, defining the Search Space is one of two essential tasks. You can define it through a Python dictionary,
a YAML file, or ConfigSpace. This section provides examples and instructions for all three methods.

## Option 1: Using a Python Dictionary

To define the Search Space using a Python dictionary, follow these steps:

Create a Python dictionary that specifies the parameters and their respective ranges. For example:

```python
import neps

search_space = {
    # Sampled on a log scale between 1e-5 and 0.1
    "learning_rate": neps.FloatParameter(lower=0.00001, upper=0.1, log=True),
    # Fidelity parameter: number of training epochs
    "num_epochs": neps.IntegerParameter(lower=3, upper=30, is_fidelity=True),
    "optimizer": neps.CategoricalParameter(choices=["adam", "sgd", "rmsprop"]),
    # Fixed value, not searched over
    "dropout_rate": neps.ConstantParameter(value=0.5),
}
```

## Option 2: Using a YAML File

Create a YAML file (e.g., search_space.yaml) with the parameter definitions following this structure.

```yaml
search_space: # important to start with
  learning_rate:
    lower: 2e-3
    upper: 0.1
    log: true

  num_epochs:
    type: int # or "integer"
    lower: 3
    upper: 30
    is_fidelity: True

  optimizer:
    choices: ["adam", "sgd", "rmsprop"]

  dropout_rate:
    value: 0.5
...
```

Ensure your YAML file starts with `search_space:`.
This is the root key under which all parameter configurations are defined.

## Option 3: Using ConfigSpace

Users familiar with the ConfigSpace library can also define the Search Space through a
`ConfigurationSpace` object:

```python
from ConfigSpace import ConfigurationSpace, UniformFloatHyperparameter

configspace = ConfigurationSpace()
configspace.add_hyperparameter(
    UniformFloatHyperparameter("learning_rate", 0.00001, 0.1, log=True)
)
```

For additional information on ConfigSpace and its features, please visit the following link:
https://github.com/automl/ConfigSpace

## Supported Hyperparameter Types using a YAML File

### Float/Integer Parameter

- **Expected Arguments:**
    - `lower`: The minimum value of the parameter.
    - `upper`: The maximum value of the parameter.
        - Accepted Values: Int or Float, depending on the parameter type you want to use.
- **Optional Arguments:**
    - `type`: Specifies the data type of the parameter.
        - Accepted Values: 'int', 'integer', or 'float'.
        - Note: If `type` is not specified, values written in e-notation (e.g. `1e3`) are converted to float.
    - `log`: Boolean that indicates whether the parameter uses a logarithmic scale (default: False).
        - [Details on how YAML interprets Boolean values](#important-note-on-yaml-data-type-interpretation)
    - `is_fidelity`: Boolean that marks the parameter as a fidelity parameter (default: False).
    - `default`: Sets a prior central value for the parameter (default: None).
        - Note: Currently, if you define a prior for one parameter, you must do so for all your variables.
    - `default_confidence`: Specifies the confidence level of the default value,
      indicating how strongly the prior should be considered (default: "low").
        - Accepted Values: 'low', 'medium', or 'high'.

### Categorical Parameter

- **Expected Arguments:**
    - `choices`: A list of discrete options (int | float | str) that the parameter can take.
- **Optional Arguments:**
    - `type`: Specifies the data type of the parameter.
        - Accepted Values: 'cat' or 'categorical'.
    - `is_fidelity`: Marks the parameter as a fidelity parameter (default: False).
        - [Details on how YAML interprets Boolean values](#important-note-on-yaml-data-type-interpretation)
    - `default`: Sets a prior central value for the parameter (default: None).
        - Note: Currently, if you define a prior for one parameter, you must do so for all your variables.
    - `default_confidence`: Specifies the confidence level of the default value,
      indicating how strongly the prior should be considered (default: "low").
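
The note on priors applies to numerical and categorical parameters alike. As a minimal sketch, assuming the YAML file has already been parsed into a plain dictionary, a check along these lines could flag an incomplete set of priors before a run starts. The helper `missing_priors` is hypothetical and not part of NePS; treating entries with a `value` key as constants that need no prior is also an assumption.

```python
from __future__ import annotations

from typing import Any


def missing_priors(space: dict[str, dict[str, Any]]) -> list[str]:
    """Return the non-constant parameters that lack a `default`,
    but only when at least one other parameter defines one."""
    # Assumption: entries with a `value` key are constants and need no prior.
    tunable = {name: spec for name, spec in space.items() if "value" not in spec}
    with_prior = [
        name for name, spec in tunable.items() if spec.get("default") is not None
    ]
    if not with_prior:
        return []  # no priors anywhere is a consistent state
    return sorted(name for name in tunable if name not in with_prior)


space = {
    "learning_rate": {"lower": 0.002, "upper": 0.1, "log": True, "default": 0.01},
    "optimizer": {"choices": ["adam", "sgd", "rmsprop"]},  # prior missing
    "dropout_rate": {"value": 0.5},  # constant, skipped
}
print(missing_priors(space))  # ['optimizer']
```

An empty list means the priors are either absent everywhere or present everywhere, which is the state the note asks for.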

### Constant Parameter

- **Expected Arguments:**
    - `value`: The fixed value (int | float | str) for the parameter.
- **Optional Arguments:**
    - `type`: Specifies the data type of the parameter.
        - Accepted Values: 'const' or 'constant'.
    - `is_fidelity`: Marks the parameter as a fidelity parameter (default: False).
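
Since `type` is optional for every parameter kind, a loader has to work out the kind from the remaining keys. The following is a rough sketch of one way this could be done, not the NePS implementation: `infer_kind` and `TYPE_ALIASES` are hypothetical names, the input is assumed to be the already-parsed YAML content, and the rule that two integer bounds without a `type` yield an integer parameter is an assumption.

```python
from __future__ import annotations

from typing import Any

# Aliases listed in the sections above, mapped to one canonical name each.
TYPE_ALIASES = {
    "int": "integer",
    "integer": "integer",
    "float": "float",
    "cat": "categorical",
    "categorical": "categorical",
    "const": "constant",
    "constant": "constant",
}


def infer_kind(spec: dict[str, Any]) -> str:
    """Guess the parameter kind of one entry under `search_space`."""
    declared = spec.get("type")
    if declared is not None:
        if declared not in TYPE_ALIASES:
            raise ValueError(f"unknown type: {declared!r}")
        return TYPE_ALIASES[declared]
    # No explicit type: fall back on which expected arguments are present.
    if "choices" in spec:
        return "categorical"
    if "value" in spec:
        return "constant"
    if "lower" in spec and "upper" in spec:
        # Assumption: without `type`, two int bounds mean an integer parameter.
        bounds = (spec["lower"], spec["upper"])
        return "integer" if all(type(b) is int for b in bounds) else "float"
    raise ValueError("cannot infer parameter kind from keys: " + ", ".join(sorted(spec)))


# Parsed form of the example YAML file from Option 2.
parsed = {
    "search_space": {
        "learning_rate": {"lower": 0.002, "upper": 0.1, "log": True},
        "num_epochs": {"type": "int", "lower": 3, "upper": 30, "is_fidelity": True},
        "optimizer": {"choices": ["adam", "sgd", "rmsprop"]},
        "dropout_rate": {"value": 0.5},
    }
}
kinds = {name: infer_kind(spec) for name, spec in parsed["search_space"].items()}
print(kinds)
```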

### Important Note on YAML Data Type Interpretation

When working with YAML files, it's essential to understand how the format interprets different data types:

1. **Strings in Quotes:**

    - Any value enclosed in single (`'`) or double (`"`) quotes is treated as a string.
    - Example: `"true"`, `'123'` are read as strings.

2. **Boolean Interpretation:**

    - Specific unquoted values are interpreted as booleans. This includes:
        - `true`, `True`, `TRUE`
        - `false`, `False`, `FALSE`
        - `on`, `On`, `ON`
        - `off`, `Off`, `OFF`
        - `yes`, `Yes`, `YES`
        - `no`, `No`, `NO`

3. **Numbers:**

    - Unquoted numeric values are interpreted as integers or floating-point numbers, depending on their format.
    - Example: `123` is an integer, `4.56` is a float, `1e3` can be either an integer or a floating-point number,
      depending on the type specified by the user. By default, `1e3` is treated as a floating-point number.
      This interpretation is unique to our system.

4. **Empty Strings:**

    - A key with no value is treated as `null` in YAML. Note that standard YAML parsers read a quoted
      empty string `""` as an empty string rather than `null`.

5. **Unquoted Non-Boolean, Non-Numeric Strings:**

    - Unquoted values that don't match boolean patterns or numeric formats are treated as strings.
    - Example: `example` is a string.

Remember to use appropriate quotes and formats to ensure values are interpreted as intended.
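
The e-notation rule from item 3 can be illustrated with a small sketch. Some YAML parsers may hand a value such as `1e3` back as a string rather than a number, so a loader has to convert it itself. The helper `coerce_number` below is hypothetical and only mirrors the behavior described above (float by default, integer when `type` asks for it); it is not the NePS implementation.

```python
from __future__ import annotations


def coerce_number(
    raw: int | float | str, declared_type: str | None = None
) -> int | float:
    """Convert a parsed YAML scalar to int or float, following the documented rule."""
    if isinstance(raw, bool):
        raise TypeError("booleans are not numeric parameter values")
    if declared_type is None and isinstance(raw, (int, float)):
        return raw  # already parsed as a number, keep it as-is
    value = float(raw)  # accepts 1000, 4.56, "1e3", "2e-3"
    if declared_type in ("int", "integer"):
        if not value.is_integer():
            raise ValueError(f"{raw!r} is not a whole number")
        return int(value)
    if declared_type in (None, "float"):
        return value
    raise ValueError(f"unsupported type: {declared_type!r}")


print(coerce_number("1e3"))         # 1000.0 (float by default)
print(coerce_number("1e3", "int"))  # 1000
print(coerce_number(123))           # 123
```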

## Supported ArchitectureParameter Types

**Note**: The definition of Search Space from a YAML file is limited to supporting only Hyperparameter Types.

If you are interested in exploring Architecture, particularly Hierarchical parameters, you can find detailed examples and usage in the following resources:

- [Basic Usage Examples](https://github.com/automl/neps/tree/master/neps_examples/basic_usage) - Basic usage
examples that can help you understand the fundamentals of Architecture parameters.

- [Experimental Examples](https://github.com/automl/neps/tree/master/neps_examples/experimental) - For more advanced and experimental use cases, including Hierarchical parameters, check out this collection of examples.
14 changes: 10 additions & 4 deletions neps/api.py
@@ -14,7 +14,11 @@
 from .optimizers import BaseOptimizer, SearcherMapping
 from .plot.tensorboard_eval import tblogger
 from .search_spaces.parameter import Parameter
-from .search_spaces.search_space import SearchSpace, pipeline_space_from_configspace
+from .search_spaces.search_space import (
+    SearchSpace,
+    pipeline_space_from_configspace,
+    pipeline_space_from_yaml,
+)
 from .status.status import post_run_csv
 from .utils.common import get_searcher_data, get_value
 from .utils.result_utils import get_loss
@@ -94,9 +98,8 @@ def write_loss_and_config(file_handle, loss_, config_id_, config_):
 def run(
     run_pipeline: Callable,
     root_directory: str | Path,
-    pipeline_space: dict[str, Parameter | CS.ConfigurationSpace]
-    | CS.ConfigurationSpace
-    | None = None,
+    pipeline_space: dict[str, Parameter | CS.ConfigurationSpace] | str | Path |
+    CS.ConfigurationSpace | None = None,
     overwrite_working_directory: bool = False,
     post_run_summary: bool = False,
     development_stage_id=None,
@@ -311,6 +314,9 @@ def _run_args(
         # Support pipeline space as ConfigurationSpace definition
         if isinstance(pipeline_space, CS.ConfigurationSpace):
             pipeline_space = pipeline_space_from_configspace(pipeline_space)
+        # Support pipeline space as YAML file
+        elif isinstance(pipeline_space, (str, Path)):
+            pipeline_space = pipeline_space_from_yaml(pipeline_space)

         # Support pipeline space as mix of ConfigurationSpace and neps parameters
         new_pipeline_space: dict[str, Parameter] = dict()
5 changes: 1 addition & 4 deletions neps/search_spaces/hyperparameters/categorical.py
@@ -2,11 +2,10 @@

 import random
 from copy import copy, deepcopy
-from typing import Iterable
+from typing import Iterable, Literal

 import numpy as np
 import numpy.typing as npt
-from typing_extensions import Literal

 from ..parameter import Parameter

@@ -32,9 +31,7 @@ def __init__(
         self.upper = default
         self.default_confidence_score = CATEGORICAL_CONFIDENCE_SCORES[default_confidence]
         self.has_prior = self.default is not None
-
         self.is_fidelity = is_fidelity
-
         self.choices = list(choices)
         self.num_choices = len(self.choices)
         self.probabilities: list[npt.NDArray] = list(
3 changes: 1 addition & 2 deletions neps/search_spaces/hyperparameters/float.py
@@ -2,10 +2,10 @@

 import math
 from copy import deepcopy
+from typing import Literal

 import numpy as np
 import scipy.stats
-from typing_extensions import Literal

 from .numerical import NumericalParameter

@@ -37,7 +37,6 @@ def __init__(

         if self.lower >= self.upper:
             raise ValueError("Float parameter: bounds error (lower >= upper).")
-
         self.log = log

         if self.log: