
HyperOpt hyperparameter space conflicts with ray.tune #1

Closed
PanyiDong opened this issue Apr 8, 2022 · 2 comments

PanyiDong commented Apr 8, 2022

Problem

In the general case, different methods may share a hyperparameter of the same name (for kNN-style imputation methods, a hyperparameter k is critical). With ray.tune, this is not a problem: hyperparameters with the same name coming from different methods are automatically recognized and distinguished. With HyperOpt, however, each hyperparameter is identified by its dictionary key together with a globally unique label. So, when defining the default hyperparameter space, imbalance_threshold in SimpleRandomOverSampling and imbalance_threshold in SimpleRandomUnderSampling can, for example, be distinguished as follows:

{
        "balancing": "SimpleRandomOverSampling",
        "imbalance_threshold": hp.uniform(
            "SimpleRandomOverSampling_imbalance_threshold", 0.8, 1
        ),
},
{
        "balancing": "SimpleRandomUnderSampling",
        "imbalance_threshold": hp.uniform(
            "SimpleRandomUnderSampling_imbalance_threshold", 0.8, 1
        ),
},

However, for general-purpose use, I designed the hyperparameter space in ray.tune style, which does not allow such a naming structure; it is defined as follows:

{
        "balancing": "SimpleRandomOverSampling",
        "imbalance_threshold": tune.uniform(0.8, 1),
},
{
        "balancing": "SimpleRandomUnderSampling",
        "imbalance_threshold": tune.uniform(0.8, 1),
},

So, when using Grid Search/Random Search, no error is raised, since this is supported by ray.tune. However, when calling the HyperOpt search algorithm, a duplicate-label error occurs: in the above case, both imbalance_threshold parameters are identified as balancing/imbalance_threshold, which leaves HyperOpt unable to properly read the hyperparameter space.
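The collision can be illustrated without running Ray at all: HyperOpt derives one label per sampled parameter from its position in the config dictionary, so both branches map to the same label. A minimal pure-Python sketch (the tuples stand in for tune samplers, and the "balancing/" prefix mimics how the converted space is keyed; both are illustrative assumptions, not the actual conversion code):

```python
# Stand-in for the two tune.choice branches; tuples mark sampled parameters.
space = [
    {"balancing": "SimpleRandomOverSampling", "imbalance_threshold": ("uniform", 0.8, 1)},
    {"balancing": "SimpleRandomUnderSampling", "imbalance_threshold": ("uniform", 0.8, 1)},
]

def hyperopt_labels(branch, prefix="balancing"):
    # A HyperOpt label is formed from the config path, e.g. "balancing/imbalance_threshold".
    return [f"{prefix}/{key}" for key, value in branch.items() if isinstance(value, tuple)]

labels = [label for branch in space for label in hyperopt_labels(branch)]
print(labels)                           # both branches yield 'balancing/imbalance_threshold'
print(len(set(labels)) < len(labels))   # True: duplicate labels, which HyperOpt rejects
```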

Reproduction of the problem

Here, I provide a simple example to demonstrate how the problem can occur:

from ray import tune
from ray.tune.suggest.basic_variant import BasicVariantGenerator
from ray.tune.suggest.hyperopt import HyperOptSearch

space = [
    {
        "balancing": "SimpleRandomOverSampling",
        "imbalance_threshold": tune.uniform(0.8, 1),
    },
    {
        "balancing": "SimpleRandomUnderSampling",
        "imbalance_threshold": tune.uniform(0.8, 1),
    },
]


def evaluate(config):
    # "balancing" holds one sampled dict from the space above
    _config = config["balancing"]
    loss = _config["imbalance_threshold"]

    tune.report(loss=loss)


# Grid/random search: runs without error
analysis1 = tune.run(
    evaluate,
    config={"balancing": tune.choice(space)},
    num_samples=5,
    mode="min",
    metric="loss",
    search_alg=BasicVariantGenerator(),
)

# HyperOpt: fails on the duplicated parameter name
analysis2 = tune.run(
    evaluate,
    config={"balancing": tune.choice(space)},
    num_samples=5,
    mode="min",
    metric="loss",
    search_alg=HyperOptSearch(),
)

In analysis1 the search works smoothly, while analysis2 raises a DuplicateLabel balancing/imbalance_threshold error.

Current Idea on Solution

Since the problem occurs when converting the ray.tune space to a HyperOpt space, I think the method names can be prepended to the hyperparameter names when defining the default hyperparameter space. When the methods are called, these prefixes are removed again so that the actual hyperparameter names are used and the hyperparameters can be passed properly.

I'm still working on the problem. For now, the GridSearch/RandomSearch options for the search algorithm work fine.

@PanyiDong

Actual commit should be 4f67da9.

@PanyiDong

Solution

For the above case, the hyperparameter space is defined in the current version as:

{
        "balancing_1": "SimpleRandomOverSampling",
        "SimpleRandomOverSampling_imbalance_threshold": tune.uniform(0.8, 1),
},
{
        "balancing_2": "SimpleRandomUnderSampling",
        "SimpleRandomUnderSampling_imbalance_threshold": tune.uniform(0.8, 1),
},

When configuring the hyperparameter search space, the keys are all unique, so HyperOpt can distinguish them; in the training phase, the redundant prefix ("SimpleRandomUnderSampling_", etc.) and suffix ("_1", etc.) are removed for dict/argument matching.
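The stripping step can be sketched as follows. This is an illustrative sketch only: the helper name and the hard-coded method list are assumptions, not the actual implementation in the commit above.

```python
import re

# Hypothetical method names used only for this illustration.
KNOWN_METHODS = ("SimpleRandomOverSampling", "SimpleRandomUnderSampling")

def restore_names(config):
    """Strip the '<Method>_' prefix and '_<n>' suffix from sampled config
    keys so the original argument names can be passed to the method."""
    clean = {}
    for key, value in config.items():
        key = re.sub(r"_\d+$", "", key)        # "balancing_1" -> "balancing"
        for method in KNOWN_METHODS:
            if key.startswith(method + "_"):
                key = key[len(method) + 1:]    # drop the "<Method>_" prefix
        clean[key] = value
    return clean

cfg = {
    "balancing_1": "SimpleRandomOverSampling",
    "SimpleRandomOverSampling_imbalance_threshold": 0.9,
}
print(restore_names(cfg))
# {'balancing': 'SimpleRandomOverSampling', 'imbalance_threshold': 0.9}
```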
