predict() produces all NaNs for a given block #266

Closed
Max-Bladen opened this issue Nov 15, 2022 · 0 comments · Fixed by #267

🐞 Describe the bug:

When auroc() is applied to a block.splsda object, the following error is raised if any predictor has a variance of 0 and a center that is not 0:

Error in cut.default(cases, thresholds) : invalid number of intervals.

If the center of the zero-variance feature(s) is 0, no error is raised. If the center is not 0, the error is raised.
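
One possible reading of why the center matters, offered as an assumption rather than a confirmed diagnosis: if a feature is rescaled as (x - center) / sd with sd equal to 0, base R returns NaN when the numerator is 0 and Inf or -Inf when it is not. The sketch below only demonstrates that arithmetic. The rescale() helper and the input value of 0 are hypothetical and are not taken from the package.

```r
# Hypothetical helper for illustration only; not code from the package.
rescale <- function(x, center, scale) (x - center) / scale

# Assumption: the value being rescaled is 0 and the feature's sd is 0.
nan_case <- rescale(0, center = 0, scale = 0)  # 0 / 0
inf_case <- rescale(0, center = 1, scale = 0)  # -1 / 0

is.nan(nan_case)       # TRUE: 0 / 0 is NaN
is.infinite(inf_case)  # TRUE: -1 / 0 is -Inf
```

Under this reading, a center of 0 yields NaN and a non-zero center yields an infinite value, which would match the difference between the two cases in the reprex.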


🔍 reprex results from reproducible example including sessioninfo():

X1 <- data.frame(matrix(rnorm(100000, 5, 5), nrow = 100))
X2 <- data.frame(matrix(rnorm(150000, 5, 5), nrow = 100))
Y <- c(rep("A", 50), rep("B", 50))

X <- list(block1=X1, block2=X2)

list.keepX <- list(block1=c(15, 15), block2=c(30,30))


## NO ERROR RAISED
X$block1[, 1] <- rep(0, 100)
model <- block.splsda(X = X, Y = Y, ncomp = 2,
                      keepX = list.keepX, design = "full")
#> Warning in cor(A[[k]], variates.A[[k]]): the standard deviation is zero
auc.splsda <- auroc(model)
#> $block1
#> $block1$comp1
#>           AUC   p-value
#> A vs B 0.8364 6.723e-09
#> 
#> $block1$comp2
#>           AUC   p-value
#> A vs B 0.9104 1.515e-12
#> 
#> 
#> $block2
#> $block2$comp1
#>           AUC   p-value
#> A vs B 0.9404 3.197e-14
#> 
#> $block2$comp2
#>           AUC p-value
#> A vs B 0.9812       0

## ERROR RAISED
X$block1[, 1] <- rep(1, 100)
model <- block.splsda(X = X, Y = Y, ncomp = 2,
                      keepX = list.keepX, design = "full")
#> Warning in cor(A[[k]], variates.A[[k]]): the standard deviation is zero
auc.splsda <- auroc(model)
#> Error in cut.default(cases, thresholds): invalid number of intervals

Created on 2022-11-15 with reprex v2.0.2

Note: when the first feature of the first block is all 0s, auroc() runs normally. When that feature is all 1s, auroc() fails.


🤔 Expected behavior:

Features with zero variance should not be allowed. This will be addressed in its own branch and pull request.

A fail safe should be implemented as a temporary fix.
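
Until zero-variance features are rejected by the package itself, a user-side check before fitting might look like the sketch below. zero_var_cols() is a hypothetical helper, not a mixOmics function, and it assumes numeric columns with at least one non-missing value each.

```r
# Hypothetical helper (not part of mixOmics): names of constant columns.
# A column is treated as zero-variance when its max and min are equal.
zero_var_cols <- function(df) {
  is_const <- vapply(
    df,
    function(col) diff(range(col, na.rm = TRUE)) == 0,
    logical(1)
  )
  names(df)[is_const]
}

# Toy block: columns a and c are constant, b is not.
toy_block <- data.frame(a = rep(1, 10), b = seq_len(10), c = rep(0, 10))
zero_var_cols(toy_block)
```

Applied to each block in X before calling block.splsda(), a check like this would flag the offending feature in both reprex cases, regardless of its center.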


💡 Possible solution:

Requires exploration
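
The commit messages later in this thread describe the eventual fail safe as changing Inf and -Inf values in the transformed newdata data frame to NaN. A minimal sketch of that kind of replacement follows. inf_to_nan() and the toy data are hypothetical, and this is not the code from the actual fix.

```r
# Hypothetical helper: replace Inf and -Inf with NaN in numeric columns.
inf_to_nan <- function(df) {
  df[] <- lapply(df, function(col) {
    if (is.numeric(col)) col[is.infinite(col)] <- NaN
    col
  })
  df
}

# Toy stand-in for a transformed newdata block with an all -Inf feature.
newdata <- data.frame(f1 = c(-Inf, -Inf, -Inf), f2 = c(0.5, -1.2, 2.0))
cleaned <- inf_to_nan(newdata)

all(is.nan(cleaned$f1))  # TRUE: the infinite values are now NaN
```

Finite values are left untouched, so only the features affected by the division by zero change.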

Max-Bladen added the bug ("Something isn't working") label Nov 15, 2022
Max-Bladen self-assigned this Nov 15, 2022
Max-Bladen added the wip ("work-in-progress") label Nov 15, 2022
Max-Bladen added a commit that referenced this issue Nov 15, 2022
fix: added fail safe for when `Inf` or `-Inf` are found in transformed `newdata` data frame.

Changes them to `NaN` which can be safely handled by downstream functions
Max-Bladen added a commit that referenced this issue Nov 15, 2022
tests: added test to maintain coverage
Max-Bladen added a commit that referenced this issue Nov 15, 2022
* Fix for Issue #266

fix: added fail safe for when `Inf` or `-Inf` are found in transformed `newdata` data frame.

Changes them to `NaN` which can be safely handled by downstream functions

* Fix for Issue #266

tests: added test to maintain coverage
Max-Bladen linked a pull request Nov 15, 2022 that will close this issue
Max-Bladen removed the wip ("work-in-progress") label Nov 15, 2022