Fix for Issue #268 #269

Merged
merged 4 commits into from
Nov 17, 2022

Max-Bladen
Collaborator

Adjusted the lines relating to `nzr` in the `Check.entry.wrapper.mint.block()` function. Previously, if `nzr$Position` had non-zero length, features were removed only if the model was not a DA framework AND the block being processed was the Y dataframe, i.e. the Y dataframe of a `block.(s)pls` model. As a result, filtering was not applied to X blocks in `block.(s)plsda` contexts. The only case where nzv filtering should NOT be applied is the Y dataframe in DA frameworks.

The logic now first checks whether the block has any nzr features; if not, the block is skipped. If it's a DA framework AND the block is the Y dataframe, the block is also skipped. Otherwise, the filtering is applied.
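The decision described above can be sketched as a small predicate. This is an illustration only, not the code in `Check.entry.wrapper.mint.block()`: the helper name `should_filter_block` and its arguments `nzv_positions`, `is_DA` and `is_Y_block` are hypothetical.

```r
# Hypothetical helper (not the mixOmics source): decides whether
# near-zero-variance filtering should be applied to a single block.
should_filter_block <- function(nzv_positions, is_DA, is_Y_block) {
  # No flagged features: there is nothing to remove, so skip the block.
  if (length(nzv_positions) == 0) {
    return(FALSE)
  }
  # In a DA framework the Y block is the only block that is never filtered.
  if (is_DA && is_Y_block) {
    return(FALSE)
  }
  # Every other case (X blocks in any framework, Y in non-DA) is filtered.
  TRUE
}

should_filter_block(integer(0), is_DA = FALSE, is_Y_block = FALSE)  # FALSE
should_filter_block(c(2L, 5L),  is_DA = TRUE,  is_Y_block = TRUE)   # FALSE
should_filter_block(c(2L, 5L),  is_DA = TRUE,  is_Y_block = FALSE)  # TRUE
should_filter_block(c(2L, 5L),  is_DA = FALSE, is_Y_block = TRUE)   # TRUE
```

The third call is the case the old condition missed: an X block in a DA context with flagged features.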

This introduced a downstream issue in `predict()` when called via `auroc()`. nzr filtering is applied to `newdata`, which by default is equal to `object$X`. If nzr is non-null for a block, the filtering is applied to that block of `newdata`. Since `object$X` has already been filtered, this could result in filtering being applied twice with unadjusted indices, meaning high-variance features may be removed accidentally.
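A toy example of why reapplying positional indices is harmful. The matrix `X` and the positions in `nzv_pos` are made up for illustration and are not taken from the package or its tests.

```r
# Toy block: f1 and f2 are constant, f3 to f5 carry real variation.
X <- cbind(
  f1 = c(0, 0, 0, 0),
  f2 = c(1, 1, 1, 1),
  f3 = c(1, 5, 2, 8),
  f4 = c(3, 1, 4, 1),
  f5 = c(2, 7, 1, 8)
)

# Positions flagged against the ORIGINAL column order (assumed values).
nzv_pos <- c(1L, 2L)

# First pass: removes the two constant columns, as intended.
once <- X[, -nzv_pos, drop = FALSE]

# Second pass with the same, unadjusted positions: columns 1 and 2 of the
# already-filtered block are f3 and f4, so informative features are dropped.
twice <- once[, -nzv_pos, drop = FALSE]

colnames(once)   # "f3" "f4" "f5"
colnames(twice)  # "f5"
```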

Hence, a check was implemented in `predict()`. The feature names of a block are checked against the feature names in the nzr object. If the nzr features are not found in the block, filtration is NOT applied.
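A minimal sketch of a name-based guard of this kind. It assumes the names of the flagged features are available as a character vector; `filter_if_present` and `nzv_names` are hypothetical, and the actual check in `predict()` may differ, for example in how it treats a partial match.

```r
# Hypothetical helper (not the mixOmics source): filter by feature NAME, and
# only when the flagged features are still present in the block.
filter_if_present <- function(block, nzv_names) {
  # Assumption: if any flagged name is missing, the block is treated as
  # already filtered and returned unchanged.
  if (!all(nzv_names %in% colnames(block))) {
    return(block)
  }
  block[, !(colnames(block) %in% nzv_names), drop = FALSE]
}

raw <- cbind(
  f1 = c(0, 0, 0, 0),
  f2 = c(1, 5, 2, 8),
  f3 = c(3, 1, 4, 1)
)
nzv_names <- "f1"

filtered <- filter_if_present(raw, nzv_names)
colnames(filtered)  # "f2" "f3"

# A second call is a no-op because "f1" is no longer in the block.
again <- filter_if_present(filtered, nzv_names)
identical(again, filtered)  # TRUE
```

Matching on names rather than positions makes the operation idempotent, which is what prevents the double removal.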

An additional check was added at the end of `Check.entry.wrapper.mint.block()` for safety. It ensures that no zero-variance features remain; if any do, the function stops with an error.
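A failsafe of this shape could look roughly like the following. The helper name `assert_no_zero_variance`, its arguments and the error message are assumptions for illustration, not the wording used in the package.

```r
# Hypothetical helper (not the mixOmics source): stop if any column of a
# block has exactly zero variance.
assert_no_zero_variance <- function(block, block_name = "block") {
  variances <- apply(block, 2, var)
  # Guard against NA variances (e.g. columns containing missing values).
  is_zero <- !is.na(variances) & variances == 0
  if (any(is_zero)) {
    stop(
      "Zero-variance features remain in ", block_name, ": ",
      paste(colnames(block)[is_zero], collapse = ", ")
    )
  }
  invisible(TRUE)
}

ok_block  <- cbind(a = c(1, 2, 3), b = c(4, 6, 5))
bad_block <- cbind(a = c(1, 2, 3), b = c(7, 7, 7))

assert_no_zero_variance(ok_block)  # passes silently

# The constant column "b" triggers the error; capture the message here.
msg <- tryCatch(
  assert_no_zero_variance(bad_block, "X1"),
  error = function(e) conditionMessage(e)
)
```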

fix: improved nzv feature handling for block contexts, particularly via `auroc()`

Filtration applied more consistently via `Check.entry.wrapper.mint.block()`. Additional failsafe added here for zero variance features. `predict()` also now checks whether filtration has already been applied, to prevent it from applying filtering twice.
@Max-Bladen added the enhancement-request (New feature or request) and wip (work-in-progress) labels on Nov 16, 2022
@Max-Bladen self-assigned this on Nov 16, 2022
fix: adjusted new test to ensure it passes
@Max-Bladen merged commit 3869fb0 into master on Nov 17, 2022
@Max-Bladen deleted the issue-268 branch on November 17, 2022 00:59
@Max-Bladen removed the wip (work-in-progress) label on Nov 17, 2022
Labels
enhancement-request New feature or request
Development

Successfully merging this pull request may close these issues.

Check.entry... and predict() have poor handling of near zero var features