Reproducibility: different result on first run with gpu_hist on single GPU #8820

Open · cstefansen opened this issue Feb 16, 2023 · 5 comments

cstefansen commented Feb 16, 2023

Based on #5023 it seems like XGBoost aims to guarantee reproducibility for single GPU training with gpu_hist. That is, training again on the same hardware with the same data and the same seed should give precisely the same model bit for bit.

However, I am consistently seeing different results on the very first training run (on a freshly started Python interpreter - this is important) for the following code:

import numpy as np
import xgboost


n_rows = 25_000
n_features = 1_000
n_rows_val = 12_500

np.random.seed(0)

x = np.random.normal(-3.0e-05, 0.5, (n_rows, n_features))
y = np.random.normal(-0.05, 1.0, n_rows)
w = np.clip(np.random.normal(7.25e+06, 1.25e+07, n_rows), 0.0, None)

x_val = np.random.normal(-3.0e-05, 0.5, (n_rows_val, n_features))
y_val = np.random.normal(-0.05, 1.0, n_rows_val)
w_val = np.clip(np.random.normal(7.25e+06, 1.25e+07, n_rows_val), 0.0, None)


print(f'XGBoost version {xgboost.__version__}')

models = []

for i in range(3):
    xgbr = xgboost.XGBRegressor(
        gpu_id=0,
        tree_method='gpu_hist',
        sampling_method='gradient_based',
        verbosity=0,
        booster='gbtree',
        n_jobs=1,
        random_state=np.random.RandomState(0),
        seed=0,
        single_precision_histogram=False,
        max_delta_step = 0,
        colsample_bylevel = 1.0,
        scale_pos_weight = 1.0,
        base_score = 0.0,
        colsample_bynode=0.5,
        colsample_bytree=0.13,
        gamma=7_500, 
        objective='reg:squarederror',
        learning_rate=0.007,
        max_depth=6,
        min_child_weight=30_000,
        n_estimators=2_500,
        reg_alpha=8.0,
        reg_lambda=0.5,
        subsample=0.45,
    )

    xgbr.fit(x, y, sample_weight=w)
    score = xgbr.score(x_val, y_val, sample_weight=w_val)
    models.append(xgbr)
    print(i, score)

which results in

0 -0.009656077054927659
1 -0.010103391088486235
2 -0.010103391088486235

This is on Linux with a Tesla T4 and CUDA 11.7.
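
To compare the trained models bit for bit rather than by validation score, the serialized boosters can be hashed (a small sketch; save_raw() returns the raw bytes of the underlying Booster):

import hashlib

# Differing digests mean the serialized models themselves differ,
# not just the validation scores.
for i, m in enumerate(models):
    raw = bytes(m.get_booster().save_raw())
    print(i, hashlib.sha256(raw).hexdigest())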

Is there another seed that needs to be set to ensure that the first run works off of the same seed as the subsequent runs? Or is this potentially a bug?

trivialfis (Member) commented Feb 16, 2023

I think it's caused by the global random engine used inside xgboost. The booster trained in the second iteration is affected by the one from the first iteration as they share the same random engine.

trivialfis (Member) commented
This is similar to calling

x = np.random.normal(-3.0e-05, 0.5, (n_rows, n_features))

twice: even with the NumPy seed set once at the start, the second x is different because the first call advances the engine's state.

cstefansen (Author) commented
@trivialfis, to stay with your analogy, it is possible to get reproducible results by reseeding before each call:

np.random.seed(0)
x1 = np.random.normal(-3.0e-05, 0.5, (n_rows, n_features))

np.random.seed(0)
x2 = np.random.normal(-3.0e-05, 0.5, (n_rows, n_features))

np.testing.assert_equal(x1, x2)

Is there a way to achieve the same reproducibility for XGBoost?

The example in the original repro does in fact produce reproducible (i.e., identical) results in each iteration when run on a CPU. However, when run on a single GPU, the first run is always different from the subsequent runs. (I ran this with 1,000 iterations and got 999 identical models after the first one.)

This looks like a seed/RNG initialization issue within the GPU code: once the Python interpreter has run the repro once, it produces identical results on every subsequent run, and to reproduce the discrepancy I have to restart the interpreter so that the first iteration again yields a different model.
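
As a possible workaround (a sketch, not a fix): running each fit in a freshly spawned subprocess makes every run a "first run", so the global engine starts from the same state each time. The data and parameters below are trimmed down for brevity:

from multiprocessing import get_context

def train_once(_):
    # Import inside the worker so each spawned interpreter initializes
    # xgboost (and its global random engine) from scratch.
    import numpy as np
    import xgboost

    rng = np.random.RandomState(0)
    x = rng.normal(-3.0e-05, 0.5, (1_000, 50))
    y = rng.normal(-0.05, 1.0, 1_000)
    model = xgboost.XGBRegressor(
        gpu_id=0,
        tree_method='gpu_hist',
        sampling_method='gradient_based',
        subsample=0.45,
        n_estimators=50,
        random_state=0,
    )
    model.fit(x, y)
    return bytes(model.get_booster().save_raw())

if __name__ == '__main__':
    # maxtasksperchild=1 gives every task its own fresh worker process.
    with get_context('spawn').Pool(1, maxtasksperchild=1) as pool:
        raws = pool.map(train_once, range(3))
    # All True if every spawned run produced a bit-identical model.
    print([a == b for a, b in zip(raws, raws[1:])])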

mingli-ts commented
@trivialfis, just want to bump this up. Do you know if there is a way to set the seed of the global random engine you mentioned? As the example above shows, setting np.random.seed does not fix the issue. The weird part is that only the first run is non-deterministic; all subsequent runs give the same results.
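
One more data point worth collecting (a sketch, assuming the same synthetic data as the original repro): whether the native API shows the same first-run drift when the seed is passed explicitly as a booster parameter, to rule out the sklearn wrapper:

import numpy as np
import xgboost

rng = np.random.RandomState(0)
x = rng.normal(-3.0e-05, 0.5, (25_000, 1_000))
y = rng.normal(-0.05, 1.0, 25_000)
w = np.clip(rng.normal(7.25e+06, 1.25e+07, 25_000), 0.0, None)

# Native API with the seed set directly in the booster parameters.
dtrain = xgboost.DMatrix(x, label=y, weight=w)
params = {
    'gpu_id': 0,
    'tree_method': 'gpu_hist',
    'sampling_method': 'gradient_based',
    'subsample': 0.45,
    'seed': 0,
}
bst = xgboost.train(params, dtrain, num_boost_round=100)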

trivialfis (Member) commented
Let me take another look later.
