Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

limit n_neighbors to n_samples before matching #38

Merged
merged 1 commit into from
Jul 18, 2022

Conversation

mmdanziger
Copy link
Contributor

Closes #37 .

Because of how matching is executed based on sklearn, both directions are matched even if only one is requested. The filtering by direction happens when the results are reported, not at match time. Therefore you can have a situation as reported in the linked issue in which one direction has enough neighbors to match while the other direction does not and neither directions work.

The proposed solution is to enable one direction of matching even when the other has too few samples. In reality it does execute the matching in both directions, but the problematic direction is limited to n_samples == n_neighbors and a warning is printed to screen.

Failing unit test which is reproduces the problem and is fixed by this code is added in this PR as well.

@CLAassistant
Copy link

CLAassistant commented Jul 13, 2022

CLA assistant check
All committers have signed the CLA.

this is to enable one direction of matching even when the other
direction is not well-defined, in response to
BiomedSciAI#37

Signed-off-by: Michael M Danziger <Michael.Danziger@ibm.com>
@mmdanziger mmdanziger force-pushed the matching-n-neighbors-fix branch from 4713b06 to b64bb18 Compare July 18, 2022 10:09
@ehudkr ehudkr self-requested a review July 18, 2022 12:17
Copy link
Collaborator

@ehudkr ehudkr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
Thanks

@ehudkr ehudkr merged commit bd949ad into BiomedSciAI:master Jul 18, 2022
@mmdanziger mmdanziger deleted the matching-n-neighbors-fix branch July 20, 2022 12:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Matching more neighbors than the number of examples in the treatment group
3 participants