Describe the bug
The error happens if the following conditions are met:
- the input is fp32
- n_rows >= 65536
- the normalize=True solver parameter is used
Symptoms:
The input feature matrix X has NaN values here. This results in the initial correlations being NaN, so we exit at the first iteration.
Things to note:
- The error only happens if all three of the conditions above are true. This might indicate a problem in the interaction between the CuPy preprocessing and our cpp solver.
- A cpp unit test with large input works properly.
- Adding print("X number of nans ", cp.sum(cp.isnan(X))) before and after the call to the cpp solver shows that there are 0 NaNs in X according to CuPy.
[E] [11:27:28.775891] Correlation is not finite, aborting.
[D] [11:27:28.776140] /mydata/cuml_lars/cpp/src/solver/lars_impl.cuh:781 Iteration 0, selected feature 0 with correlation nan
Expected behavior
Fit the model correctly. Here is a sample output from fp64 fit:
If X.shape[0] > 65535, then x_scale becomes fp64; otherwise it is fp32. The data type of X after scaling will be the same as that of x_scale. The pointer to X was passed to the cpp layer cast to the fp32 data type, so the cpp solver used X assuming fp32 while the actual data was fp64. This produced NaNs and, in general, invalid values.
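The mechanism can be illustrated on the host with NumPy (a sketch; the real bug involved a device buffer handed to the cpp solver): reinterpreting an fp64 buffer as fp32 yields meaningless values.

```python
import numpy as np

# fp64 data handed to code that assumes fp32: the bytes get reinterpreted.
x64 = np.array([1.0, 2.0, 3.0], dtype=np.float64)
x32 = x64.view(np.float32)  # same buffer, wrong element type

# Twice as many "elements", none of them the intended values; arithmetic
# on such garbage (norms, correlations) readily degenerates into NaN.
assert x32.size == 2 * x64.size
assert not np.array_equal(x32[: x64.size], x64.astype(np.float32))
```

This is also consistent with the cp.isnan check reporting 0 NaNs: CuPy reads the buffer with its correct fp64 type, and only the cpp solver's fp32 interpretation is wrong.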
The fix is trivial: an explicit type cast, X.dtype.type(X.shape[0]). Additionally, type checks were added before the pointers are passed to the cpp solver. The fix was implemented in bda29cc
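A host-side NumPy sketch of the failure mode and the cast (the exact scaling expression in the cuML preprocessing is not shown in the issue, so the norm-based x_scale below is only illustrative):

```python
import numpy as np

X = np.random.rand(70000, 3).astype(np.float32)

# Buggy path: if x_scale comes out fp64 (forced here to mimic the promotion
# the issue describes), the division silently promotes X to fp64, while the
# caller still passes the pointer to the cpp layer as fp32.
x_scale = (np.linalg.norm(X, axis=0) / np.sqrt(X.shape[0])).astype(np.float64)
assert (X / x_scale).dtype == np.float64  # silent promotion

# Fixed path: cast the row count to X's dtype up front, as in
# X.dtype.type(X.shape[0]); every intermediate then stays fp32.
n_rows = X.dtype.type(X.shape[0])
x_scale = np.linalg.norm(X, axis=0) / np.sqrt(n_rows)
assert (X / x_scale).dtype == np.float32
```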
Steps/Code to reproduce bug
fp32 support is disabled by default; one needs to enable it by changing the source of the Cython wrappers.
Output:
The last value is the score; it should be very close to 1 even in fp32.