SMAC optimizer does not support mixed input space #666

ephoris · 2024-02-05T17:14:00Z

Using the SMAC optimizer with a parameter space that mixes numeric types (here a float and an integer) throws an error. A minimal reproducible example follows.

import pandas as pd
import mlos_core.optimizers
import ConfigSpace as CS


def objective(x: pd.DataFrame):
    out = x.values
    out = out.sum()

    return out


def run_optimization(optimizer: mlos_core.optimizers.BaseOptimizer):
    suggested_value = optimizer.suggest()
    target_value = objective(suggested_value)
    print(f"{suggested_value}")
    print(f"{target_value=}")
    optimizer.register(suggested_value, pd.Series([target_value]))

    return


if __name__ == "__main__":
    parameter_space = CS.ConfigurationSpace(seed=0)
    dummy_val_one = CS.Float("dummy_one", (1, 10.0), default=5.0)
    dummy_val_two = CS.Integer("dummy_two", (1, 10), default=5)
    parameter_space.add_hyperparameters([dummy_val_one, dummy_val_two])

    optimizer = mlos_core.optimizers.SmacOptimizer(parameter_space=parameter_space)

    n_iterations = 10
    for _ in range(n_iterations):
        run_optimization(optimizer)

This results in the following error:

Traceback (most recent call last):
  File "*/tmp/mlos_tmp.py", line 33, in <module>
    run_optimization(optimizer)
  File "*/tmp/mlos_tmp.py", line 18, in run_optimization
    optimizer.register(suggested_value, pd.Series([target_value]))
  File "*/anaconda3/lib/python3.9/site-packages/mlos_core/optimizers/optimizer.py", line 91, in register
    return self._register(configurations, scores, context)
  File "*/anaconda3/lib/python3.9/site-packages/mlos_core/optimizers/bayesian_optimizers/smac_optimizer.py", line 247, in _register
    for config, score in zip(self._to_configspace_configs(configurations), scores.tolist()):
  File "*/anaconda3/lib/python3.9/site-packages/mlos_core/optimizers/bayesian_optimizers/smac_optimizer.py", line 337, in _to_configspace_configs
    return [
  File "*/anaconda3/lib/python3.9/site-packages/mlos_core/optimizers/bayesian_optimizers/smac_optimizer.py", line 338, in <listcomp>
    ConfigSpace.Configuration(self.optimizer_parameter_space, values=config.to_dict())
  File "*/anaconda3/lib/python3.9/site-packages/ConfigSpace/configuration.py", line 90, in __init__
    raise IllegalValueError(hp, value)
ConfigSpace.exceptions.IllegalValueError: Value 1.0: (<class 'float'>) is not allowed for hyperparameter dummy_two, Type: UniformInteger, Range: [1, 10], Default: 5

The SMAC optimizer's register function calls SmacOptimizer._to_configspace_configs, which iterates over the configurations with DataFrame.iterrows(). As per the pandas documentation on pandas.DataFrame.iterrows, iterrows does not preserve dtypes across a row (see note 1): each row is returned as a Series, so the integer value of dummy_two is upcast to a float, which ConfigSpace then rejects.
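The upcast can be reproduced without mlos_core or ConfigSpace. The snippet below is a minimal sketch, assuming only pandas; it reuses the column names from the example above, and the one-row DataFrame is made up for illustration.

```python
import pandas as pd

# One float column and one integer column, mirroring dummy_one / dummy_two.
df = pd.DataFrame({"dummy_one": [5.0], "dummy_two": [5]})

# The frame itself keeps a separate dtype per column.
column_dtypes = df.dtypes.astype(str).to_dict()

# iterrows() yields each row as a single Series, and a Series has exactly
# one dtype, so the integer value is upcast to float to sit next to 5.0.
_, row = next(df.iterrows())
row_dtype = str(row.dtype)
row_values = row.to_dict()

print(column_dtypes)
print(row_dtype)
print(row_values)
```

The value that comes back for dummy_two is a float, which is the kind of value the IllegalValueError in the traceback complains about.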

Unless I am using the API wrong, the fix seems simple: change the SmacOptimizer._to_configspace_configs method to iterate over NamedTuples rather than the Series returned by iterrows. I can submit a PR if this issue is real.


bpkroth · 2024-02-05T17:27:44Z

Hi @ephoris , thanks so much for reporting this. This does indeed look like a bug, though I'm surprised we missed that test case. We'd be happy to accept a PR. Could you please include a test case for it as well? Thanks!

bpkroth · 2024-02-05T17:28:57Z

@motus - fyi
I seem to recall you mentioning something about that converter recently.

bpkroth · 2024-02-05T17:31:49Z

We should probably replace all instances of iterrows with itertuples if it's easy. There are only 4 cases atm.

ephoris · 2024-02-05T19:40:23Z

I know that itertuples will instead return a NamedTuple. To prevent too many downstream changes, NamedTuples can be converted back to dictionaries using the _asdict method, albeit it's a bit slow.
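As a rough illustration of that conversion, here is a minimal sketch assuming only pandas. The DataFrame and the `configs` name are made up for this example and are not taken from the mlos_core code.

```python
import numbers

import pandas as pd

df = pd.DataFrame({"dummy_one": [5.0], "dummy_two": [5]})

# itertuples() builds each row from the individual columns, so every value
# keeps the type of its own column. index=False leaves the index out of the
# tuple, and _asdict() turns the NamedTuple back into a dictionary.
configs = [row._asdict() for row in df.itertuples(index=False)]

first = configs[0]
print(first)
print({name: type(value).__name__ for name, value in first.items()})

# The integer column is still an integer after the round trip.
print(isinstance(first["dummy_two"], numbers.Integral))
```

One caveat from the pandas documentation: itertuples renames columns to positional names when they are not valid Python identifiers, are repeated, or start with an underscore, so the dictionary keys only match the column names when the parameter names are plain identifiers like the ones here.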

I wrote a small test following the format of some of the others. It seems the bug only applies if you have multiple numeric types and nothing else. If you add in a CategoricalHyperparameter, all dtypes are preserved.
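A likely explanation for that observation is that a string column forces each row Series to the object dtype, which stores the values as they are instead of upcasting them. The sketch below assumes only pandas; the `dummy_cat` column is a hypothetical stand-in for a categorical parameter.

```python
import numbers

import pandas as pd

numeric_only = pd.DataFrame({"dummy_one": [5.0], "dummy_two": [5]})
# Hypothetical categorical column added next to the two numeric ones.
with_category = numeric_only.assign(dummy_cat=["a"])

_, numeric_row = next(numeric_only.iterrows())
_, mixed_row = next(with_category.iterrows())

# Float and int alone share a common numeric dtype, so the int is upcast.
print(numeric_row.dtype, type(numeric_row["dummy_two"]).__name__)

# With a string in the row the only common dtype is object, and the
# integer is stored unchanged.
print(mixed_row.dtype, type(mixed_row["dummy_two"]).__name__)
print(isinstance(mixed_row["dummy_two"], numbers.Integral))
```

This also suggests why a test space that includes a categorical parameter would not have caught the bug.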

I created a test but my fix doesn't seem to work in the pytest infrastructure. I can take a closer look to see what's going on.

bpkroth · 2024-02-05T20:25:42Z

Please at least submit a draft PR for now. Then we can try and help take a look. Thanks!


bpkroth added the bug, tests, and mlos-core labels Feb 5, 2024

ephoris mentioned this issue Feb 5, 2024

Fix mixed numeric datatypes for optimizers #667

Merged

bpkroth linked a pull request Feb 6, 2024 that will close this issue

Fix mixed numeric datatypes for optimizers #667

Merged

bpkroth closed this as completed in #667 Feb 7, 2024

bpkroth added a commit that referenced this issue Feb 7, 2024

Fix mixed numeric datatypes for optimizers (#667)

9175d18

Addressing issue discussed in #666 --------- Co-authored-by: Sergiy Matusevych <sergiy.matusevych@gmail.com> Co-authored-by: Brian Kroth <bpkroth@users.noreply.github.com>
