Gaussian Process regressor does not fit all the points #513

Closed
jorisparet opened this issue Jul 30, 2024 · 4 comments

@jorisparet

Hello,

I am quite new to the package, so it is likely that I am missing something obvious (my sincerest apologies if this is the case). I have not been able to find an explanation for what I observed, though, so it is probably better to bring it up just in case.

The bug
The Gaussian Process regressor in optimizer._gp does not fit all the points sampled by the optimizer.maximize method. The last point seems to be left out.

Reproduction
I basically ran an example adapted from the documentation (before the recent acquisition function API redesign in #447). Here is the code I used:

from bayes_opt import BayesianOptimization
from bayes_opt.util import UtilityFunction
import numpy as np
import matplotlib.pyplot as plt

def target(x):
    return np.exp(-(x - 2)**2) + np.exp(-(x - 6)**2/10) + 1/ (x**2 + 1)

# Optimization
pbounds = {'x': (-2, 10)}
optimizer = BayesianOptimization(f=target,
                                 pbounds=pbounds,
                                 random_state=27)
acquisition_function = UtilityFunction(kind='ucb',
                                       kappa=5)
optimizer.maximize(init_points=5,
                   n_iter=5,
                   acquisition_function=acquisition_function)

# Observed samples
x_obs = np.array([[res["params"]["x"]] for res in optimizer.res])
y_obs = np.array([res["target"] for res in optimizer.res])

# Plot the results
x = np.linspace(-2, 10, 100).reshape(-1, 1)
y_true = target(x)
y_gp, sigma = optimizer._gp.predict(x, return_std=True)
plt.plot(x, y_true, label='Truth')
plt.plot(x, y_gp, label='GP', c='tab:orange')
plt.fill_between(x.flatten(), y_gp + sigma, y_gp - sigma, alpha=0.25, color='tab:orange')
plt.scatter(x_obs, y_obs, c='tab:orange')
plt.legend()
plt.show()

Expected behavior
The predicted mean from the GP should pass through all the points in x_obs. However, the point at $x=5.199$ is not taken into account by the GP. Note that this is the last point sampled by optimizer.maximize. I tried different values of random_state and n_iter, and the behavior seems to be systematic.

Screenshot
image

Environment (please complete the following information):

  • OS: Red Hat Enterprise Linux 9.4
  • python version 3.10.9
  • numpy version 1.26.4
  • scipy version 1.14.0
  • bayesian-optimization version 1.5.1

Thank you for your help, cheers.

@till-m
Member

till-m commented Jul 30, 2024

Hi @jorisparet,

Fitting the GP is expensive. If you don't continue with the optimization, then fitting the GP is not necessary, unless you manually use it the way you do (but the maximize loop can't know that). Hence, the GP is only fitted before the suggest step, and it will always miss the last point.
You can always fit manually instead.

Hope that helps

@jorisparet
Author

Hi @till-m,

OK, I understand the idea. Although, if the optimizer prints out the last evaluated value of the target function during the maximize loop, I would (naively) expect it to be taken into account in the GP as well, since one may want to use the GP as a surrogate in addition to only finding the maximum of the target function.

Again, I'm new to the package, so perhaps my view is a bit different from someone more familiar with Bayesian optimization, but it might be useful to mention it at least in the docs that maximize does not automatically fit the last point.

I'll let you close the issue, in case you think an action is needed (e.g. add a comment in the docs) or not.

Thanks for your help and the quick reply. 🙂

@till-m
Member

till-m commented Jul 31, 2024

Hey @jorisparet,

IIRC this is not the first time the last point being unfitted has caused confusion. I think you have a good point in that a comment in the documentation would be helpful. I'll leave this issue open for now to track that.

@till-m
Member

till-m commented Sep 7, 2024

Added, see here :)

@till-m till-m closed this as completed Sep 7, 2024