
Unable to reproduce model on GPU using 'gpu_hist', but works with 'hist' #5458

Closed

Zethson opened this issue Mar 31, 2020 · 4 comments

@Zethson
Zethson commented Mar 31, 2020

Dear everyone,

I have read all the discussions on #5023 and the related issues, but I have not found a solution that works.
My problem is that my merror differs between runs, but only when using the 'gpu_hist' tree method.

Code (including dataset):

#!/home/user/miniconda/envs/xgboost-1.0.2-cuda-10.1/bin/python
import xgboost as xgb
import numpy as np
from sklearn.datasets import fetch_covtype
from sklearn.model_selection import train_test_split
import time
import random

# Fetch dataset using sklearn
cov = fetch_covtype()
X = cov.data
y = cov.target

np.random.seed(0)
random.seed(0) # Python general seed

# Create 0.75/0.25 train/test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, train_size=0.75, random_state=0)

# Leave most parameters as default
param = {'objective': 'multi:softmax',
         'num_class': 8,
         'seed': 0,
        # 'single_precision_histogram': True
         }

# Convert input data from numpy to XGBoost format
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

# Specify sufficient boosting iterations to reach a minimum
num_round = 100

# GPU Training
param['tree_method'] = 'gpu_hist'
gpu_res = {}
tmp = time.time()
xgb.train(param, dtrain, num_round, evals=[(dtest, 'test')], evals_result=gpu_res)
print("GPU Training Time: %s seconds" % (time.time() - tmp))

I have set the numpy seed, the Python random seed, the train/test split seed, and the XGBoost seed. Furthermore, I tried the single_precision_histogram option.

The results are reproducible when using the 'hist' tree_method.

My environment:
Docker container based on CUDA 10.1, Python 3.7, and the following conda environment:

name: xgboost-1.0.2-cuda-10.1
channels:
    - conda-forge
    - defaults
dependencies:
    - defaults::cudatoolkit=10.1
    - conda-forge::graphviz=2.40.1
    - conda-forge::python-graphviz=0.13.2
    - conda-forge::scikit-learn==0.22.1
    - pip
    - pip:
      - xgboost==1.0.2

I hope that I am not duplicating an existing issue or overlooking something trivial.

Thank you very much!

@trivialfis
Member

Right now you need the master branch for reproducible GPU hist.

@Zethson
Author

Zethson commented Mar 31, 2020

So I need to wait for the next release or install the current master branch manually?

@trivialfis
Member

Yes. See: #5337
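For reference, installing the master branch manually means building from source with CUDA enabled. A rough sketch, following the general shape of the XGBoost build instructions (exact paths and build options may differ for your setup):

```shell
# Clone the repository together with its submodules
git clone --recursive https://github.com/dmlc/xgboost
cd xgboost

# Configure and build the native library with CUDA support
mkdir build && cd build
cmake .. -DUSE_CUDA=ON
make -j"$(nproc)"

# Install the Python package from the built source tree
cd ../python-package
pip install .
```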

@Zethson
Author

Zethson commented Mar 31, 2020

All right. Thank you for your swift help.

@Zethson Zethson closed this as completed Mar 31, 2020