Suggestion: help in constructing custom match_fun's. #65

Open
dmurdoch opened this issue Feb 21, 2020 · 0 comments
It would be nice to have a simple way to get the match_fun that stringdist_join() uses internally, so that a custom match_fun could be built on top of it. For example, this SO post https://stackoverflow.com/q/60336083/2554330 asks for fuzzy matching that applies only when the first letter is an exact match. It would be nice to be able to write a solution like this:

# stringdist_match() is the proposed helper; it would take the args from
# stringdist_join() that are used in its match_fun
fuzzy_match <- stringdist_match(max_dist = 2, ...)

first_letter_match <- function(col1, col2) {
  sub("(^.).*", "\\1", col1) == sub("(^.).*", "\\1", col2)
}

custom_match <- function(col1, col2) {
  first_letter_match(col1, col2) & fuzzy_match(col1, col2)
}

fuzzy_inner_join(df1, df2, by = "name", match_fun = custom_match)
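
Until something like this exists, a rough stand-in can be sketched by wrapping stringdist::stringdist() in a closure. This is only an approximation of the idea, not the match_fun that stringdist_join() actually builds: the name make_stringdist_match is hypothetical, it ignores options such as ignore_case, and it assumes match_fun is called with two equal-length character vectors and should return a logical vector.

```r
# Hypothetical helper (not part of fuzzyjoin): returns a match function that
# compares two equal-length character vectors element by element and flags
# the pairs whose string distance is at most max_dist.
make_stringdist_match <- function(max_dist = 2, method = "osa", ...) {
  function(col1, col2) {
    stringdist::stringdist(col1, col2, method = method, ...) <= max_dist
  }
}

fuzzy_match <- make_stringdist_match(max_dist = 2)

# Exact match on the first character only.
first_letter_match <- function(col1, col2) {
  substr(col1, 1, 1) == substr(col2, 1, 1)
}

# Both conditions must hold for a pair to count as a match.
custom_match <- function(col1, col2) {
  first_letter_match(col1, col2) & fuzzy_match(col1, col2)
}

# Quick check on plain vectors: only the first pair is both close enough
# and shares a first letter.
custom_match(c("apple", "apple", "apple"), c("appel", "maple", "avocado"))
```

Under those assumptions, custom_match could be passed as match_fun to fuzzy_inner_join(df1, df2, by = "name", ...) exactly as in the call above.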