Gaussian Process regressor does not fit all the points #513

jorisparet · 2024-07-30T15:20:27Z

Hello,

I am quite new to the package, so it is likely that I am missing something obvious (my sincerest apologies if this is the case). I have not been able to find an explanation for what I observed, though, so it is probably better to bring it up just in case.

The bug
The Gaussian Process regressor in optimizer._gp does not fit all the points sampled by the optimizer.maximize method. The last point seems to be left out.

Reproduction
I basically ran an example adapted from the documentation (before the recent acquisition function API redesign in #447). Here is the code I used:

from bayes_opt import BayesianOptimization
from bayes_opt.util import UtilityFunction
import numpy as np
import matplotlib.pyplot as plt

def target(x):
    return np.exp(-(x - 2)**2) + np.exp(-(x - 6)**2/10) + 1/ (x**2 + 1)

# Optimization
pbounds = {'x': (-2, 10)}
optimizer = BayesianOptimization(f=target,
                                 pbounds=pbounds,
                                 random_state=27)
acquisition_function = UtilityFunction(kind='ucb',
                                       kappa=5)
optimizer.maximize(init_points=5,
                   n_iter=5,
                   acquisition_function=acquisition_function)

# Observed samples
x_obs = np.array([[res["params"]["x"]] for res in optimizer.res])
y_obs = np.array([res["target"] for res in optimizer.res])

# Plot the results
x = np.linspace(-2, 10, 100).reshape(-1, 1)
y_true = target(x)
y_gp, sigma = optimizer._gp.predict(x, return_std=True)
plt.plot(x, y_true, label='Truth')
plt.plot(x, y_gp, label='GP', c='tab:orange')
plt.fill_between(x.flatten(), y_gp + sigma, y_gp - sigma, alpha=0.25, color='tab:orange')
plt.scatter(x_obs, y_obs, c='tab:orange')
plt.legend()
plt.show()

Expected behavior
The predicted mean from the GP should pass through all the points in x_obs. The point at $x=5.199$ is not taken into account by the GP. Note that this is the last point sampled by optimizer.maximize. I tried different values of random_state and n_iter, and the behavior seems to be systematic.

Screenshot

Environment (please complete the following information):

OS: Red Hat Enterprise Linux 9.4
python version 3.10.9
numpy version 1.26.4
scipy version 1.14.0
bayesian-optimization version 1.5.1

Thank you for your help, cheers.


till-m · 2024-07-30T15:41:42Z

Hi @jorisparet,

Fitting the GP is expensive. If you don't continue with the optimization, fitting the GP again is not necessary, unless you use it manually the way you do (but the maximize loop can't know that). Hence, the GP is only fitted right before each suggest step, so it will always miss the last point.
You can always fit manually instead.
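To make the ordering concrete, here is a minimal pure-Python sketch of a loop that fits only right before suggesting. Everything in it is hypothetical and for illustration only: RecordingSurrogate stands in for the GP, the random draw stands in for maximizing the acquisition function, and the toy maximize function is not the library's implementation. In the script above, the analogous manual refit would presumably be optimizer._gp.fit(optimizer.space.params, optimizer.space.target), but treat that call as an assumption about version 1.5.1 rather than something confirmed in this thread.

```python
import random


class RecordingSurrogate:
    """Hypothetical stand-in for the GP: it only remembers what it was fitted on."""

    def __init__(self):
        self.fitted_x = []

    def fit(self, xs, ys):
        # A real GP would condition on (xs, ys); here we just record the inputs.
        self.fitted_x = list(xs)


def target(x):
    # Toy objective (not the one from the issue).
    return -(x - 2.0) ** 2


def maximize(surrogate, n_iter, rng):
    """Toy loop: the surrogate is fitted only right before a point is suggested."""
    xs, ys = [], []
    for _ in range(n_iter):
        if xs:
            surrogate.fit(xs, ys)  # sees every point registered so far
        x_next = rng.uniform(-2.0, 10.0)  # placeholder for the acquisition step
        xs.append(x_next)  # registered AFTER the fit above
        ys.append(target(x_next))
    return xs, ys


rng = random.Random(27)
surrogate = RecordingSurrogate()
xs, ys = maximize(surrogate, n_iter=5, rng=rng)

# After the loop, the surrogate lags one point behind the observations.
n_before = len(surrogate.fitted_x)
missing = [x for x in xs if x not in surrogate.fitted_x]

# Manual refit on all observations, the analogue of fitting the GP by hand.
surrogate.fit(xs, ys)
n_after = len(surrogate.fitted_x)

print(f"fitted before manual refit: {n_before} of {len(xs)}")
print(f"fitted after manual refit:  {n_after} of {len(xs)}")
```

The point that ends up in missing is always the last one sampled, because nothing triggers another fit once the loop exits.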

Hope that helps

jorisparet · 2024-07-31T08:54:33Z

Hi @till-m,

OK, I understand the idea. Although, if the optimizer prints out the last evaluated value of the target function during the maximize loop, I would (naively) expect it to be taken into account in the GP as well, since one may want to use the GP as a surrogate in addition to only finding the maximum of the target function.

Again, I'm new to the package, so perhaps my view is a bit different from someone more familiar with Bayesian optimization, but it might be useful to mention it at least in the docs that maximize does not automatically fit the last point.

I'll let you close the issue, in case you think an action is needed (e.g. add a comment in the docs) or not.

Thanks for your help and the quick reply. 🙂

till-m · 2024-07-31T09:03:04Z

Hey @jorisparet,

IIRC this is not the first time the last point being unfitted has caused confusion. I think you have a good point that a comment in the documentation would be helpful. I'll leave this issue open for now to track that.

till-m · 2024-09-07T11:57:22Z

Added, see here :)

jorisparet added the bug and enhancement labels Jul 30, 2024

till-m mentioned this issue Aug 9, 2024

PR 1/2: Add versioned docs #509

Merged

till-m closed this as completed Sep 7, 2024

till-m mentioned this issue Mar 8, 2025

Prediction for the last point incorrect #550

Closed
