Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create Optimizer in BO Suggestion only for the first run #1057

Merged

Conversation

andreyvelich
Copy link
Member

@andreyvelich andreyvelich commented Feb 17, 2020

Fixes: #1039.

In this PR I added is_first_run flag.
We must create Optimizer instance using search_space only for the first GetSuggestions call.

/assign @gaocegege @johnugeorge


This change is Reviewable

@andreyvelich
Copy link
Member Author

See comment: #1039 (comment)

Copy link
Member

@gaocegege gaocegege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

It is a great optimization.

@andreyvelich
Copy link
Member Author

@gaocegege Thanks!

@andreyvelich
Copy link
Member Author

/hold
Hold until I test it on long-term experiment

@k8s-ci-robot k8s-ci-robot removed the lgtm label Feb 18, 2020
@andreyvelich
Copy link
Member Author

andreyvelich commented Feb 18, 2020

After testing BO on big amount of Trials I found few problems.

  1. Since we are using one Optimizer instance during whole experiment, each GetSuggestion call will run tell method which will add information about all Trials to Optimizer. Xi and yi will have duplicate values (See attributes here: https://scikit-optimize.github.io/modules/generated/skopt.Optimizer.html#skopt.Optimizer) which is not correct and slow performance.
    To prevent it, I add recorded_trials dict, where key = loss value and value = List of Trial Assignment list for this value. Before running tell method, I am checking if this information already contains in the dict and I don't run tell method for the recorded Trials.

  2. We don't need to iterate over trials if count of succeeded Trials is the same. For some cases, Katib controller can run GetSuggestion without new succeeded trials (For Example, some trials failed). We don't need to run tell method in that situation.

  3. I tried to change ask method with n_points=request_number, see (https://scikit-optimize.github.io/modules/generated/skopt.Optimizer.html#skopt.Optimizer.ask). It indicates how many Trials do we want to create. Unfortunately, after 35 Trials this method takes so much time to run (more than 1 minute). I am not sure how we can fix it. Any thoughts @gaocegege ?

Also, I added few logging prints in BO.

/hold cancel

/cc @johnugeorge

@andreyvelich
Copy link
Member Author

andreyvelich commented Feb 18, 2020

According to the first point, instead of map it is easier to have list of recorded trials name, since they are unique.
I just forgot that we send Trial name to the Suggestion.

@andreyvelich
Copy link
Member Author

/retest

1 similar comment
@johnugeorge
Copy link
Member

/retest

@johnugeorge johnugeorge reopened this Feb 19, 2020
@gaocegege
Copy link
Member

/lgtm

@ever-cheng
Copy link

Doesn't SkoptService exist as a singleton instance for the system, to accept all GetSuggestions request?
How can it accept experiment change if it was bound to a fixed base_service?
if self.is_first_run:
self.base_service = BaseSkoptService(...)
...
self.is_first_run = False

@johnugeorge
Copy link
Member

@ever-cheng No. algorithm service is deployed per experiment

@andreyvelich andreyvelich mentioned this pull request Feb 21, 2020
@johnugeorge
Copy link
Member

/approve

@k8s-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: johnugeorge

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit ed0778e into kubeflow:master Feb 21, 2020
@andreyvelich andreyvelich deleted the issue-1039-bo-suggestion-first-run branch October 6, 2021 00:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bayesian optimization experiment suggests same hyperparameters
5 participants