While the error that resulted in this PR was raised by `auroc()`, the issue stems from the `predict()` function. The lack of a more explicit warning against near-zero-variance features in `block.splsda()` will be addressed in a separate PR.

For the framework presented in the reprex in the associated GitHub issue (here):

Take a given feature in one of the predictor blocks. If it's all 0s:

- It is centered and scaled in `block.splsda()`. This results in the same all-zero vector, with center = 0 and scale = 0.
- `object$X` is used as the `newdata` parameter for the `predict()` call in `auroc()`.
- In `predict()`, the 0 values have 0 subtracted from them (centering) and are divided by 0 (scaling), resulting in `NaN` in those predictor values. (In R, `0/0` evaluates to `NaN`.)
- `NaN`s can be handled safely (ignored) by the remainder of the function, resulting in valid predictions.

If that feature is all the same non-zero value (e.g. all equal to 1):

- It is centered and scaled in `block.splsda()`. This results in an all-zero vector, but with center = 1 and scale = 0. Stored in `object$X`.
- `object$X` is used as the `newdata` parameter for the `predict()` call in `auroc()`.
- In `predict()`, the 0 values have 1 subtracted from them (centering) and are divided by 0 (scaling), resulting in infinite values in those predictor values. (In R, `1/0` evaluates to `Inf` and `-1/0` to `-Inf`.)
- `Inf`s CANNOT be handled safely by the remainder of the function, causing all predictions made on that block to be `NaN`. This results in downstream issues, like the error raised by `auroc() -> statauc() -> roc.default() -> roc.utils.perfs.fast.all.threshold() -> cut()`.

Hence, when the `newdata` parameter is centered and scaled using the attributes of `object$X`, the function now checks whether any of the values are not finite. If so, `Inf` or `-Inf` values are replaced by `NaN`.
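
The two cases and the replacement step can be reproduced with a minimal sketch in base R. The helper `rescale_newdata()`, the toy matrix, and the `ctr`/`scl` vectors are hypothetical stand-ins for illustration only; they are not the actual `predict()` internals, and the sketch assumes centering and scaling are applied column-wise with `sweep()`. Note that in this sketch the constant non-zero column comes out as `-Inf`, because `(0 - 1)/0` is negative.

```r
# Hypothetical stand-in for the stored, already scaled block (object$X):
# both constant features were turned into all-zero columns.
stored_x <- matrix(0, nrow = 3, ncol = 2,
                   dimnames = list(NULL, c("all_zero", "all_one")))

# Assumed centering and scaling attributes recorded for each feature.
ctr <- c(all_zero = 0, all_one = 1)
scl <- c(all_zero = 0, all_one = 0)

# Re-applying centering and scaling without any guard.
raw <- sweep(sweep(stored_x, 2, ctr, FUN = "-"), 2, scl, FUN = "/")
# "all_zero" becomes NaN (0/0); "all_one" becomes -Inf ((0 - 1)/0).

# Illustrative helper (not the package's code): same rescaling,
# but infinite values are replaced by NaN afterwards.
rescale_newdata <- function(newdata, center, scale) {
  out <- sweep(sweep(newdata, 2, center, FUN = "-"), 2, scale, FUN = "/")
  out[is.infinite(out)] <- NaN  # only Inf and -Inf are touched
  out
}

fixed <- rescale_newdata(stored_x, ctr, scl)
```

After the replacement, both constant columns hold `NaN`, which is the value the remainder of the function is described as handling safely.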