improved string normalization for deduplication #1370
I staged this to take a look and there were three notable changes:
I had a look, and they appear to be improvements that are only marked as failing because of brittle test cases.
Looks good to me. Is there anything else to do here before merging?
LGTM
54d1c34 to 8d34ffa
I fixed some test issues and also now use our fancy new
This is good to merge once the failing test cases I mentioned above have been investigated.
8d34ffa to d7f7b16
d7f7b16 to 8b5875f
Force-pushed to rebase.
There is one notable test failure but I switched to
The three failing test cases mentioned in the previous comment are no longer failing.
This PR improves the normalization function used for deduplication: