-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dedupe attributions that are substrings of others #2453
Dedupe attributions that are substrings of others #2453
Conversation
(This isn't a full resolution of 1804, just a short-term one that works for, e.g., the Satellite Streets style provided in Studio) |
@@ -49,6 +49,15 @@ Attribution.prototype = util.inherit(Control, { | |||
} | |||
} | |||
|
|||
// remove any entries that are duplicates or strict substrings of another entry. | |||
attributions.sort(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you sort the attributions before filtering them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lucaswoj i figured sorting the list first means that, in the filter function for the i-th
item, you can safely start the loop at i+1
. Without this, you have to check every item, and I think it gets a little weird to correctly filter out exact duplicates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whether you sort or not, you're comparing every attribution string to every other attribution string. I don't think the sort is necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're comparing every attribution string to every other attribution string
Agreed, but it's also directional, since we're looking for a substring match and want to keep the longer of the two if there's a match.
I agree the sort isn't necessary, but removing it does make the logic in the filter function a little more particular... I think it would then need to be something like:
attribution = attribution.filter(function (a, i) {
// check if `a` is a substring of another string in the list
var isSubsumedByAnother = attribution.some(function (b, j) {
// skip if we're checking the same element
if (i === j) { return false }
// if a === b, then only call it a match if a's index is higher; otherwise
// we would filter out both elements when there's an exact match
if (a === b) { return i > j }
// check if a is a substring of b
if (b.indexOf(a) >= 0) { return true }
})
return !isSubsumedByAnother
})
@lucaswoj I considered this, and ended up feeling that both approaches were equally bad... lemme know if you find this way clearer and I can change it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, but it's also directional, since we're looking for a substring match and want to keep the longer of the two if there's a match.
That's a good point. We should sort by length not alphabetic ordering, then.
@lucaswoj Updated:
|
I'm wondering why the builds fail here... The error is not very telling, something about query tests. |
@mourner I found that this happened on at least one other PR, and I wondered if it could be some kind of Circle config issue relating to this branch being on a fork rather than the main repo? |
@mourner Oh yeah, I remember now--here's the message that made me think that last time:
|
The build is failing because the branch exists in @anandthakker's repo, not the @mapbox repo. It appears that Coveralls doesn't work with pull requests originating in other repos. We should fail silently in this case. I'll cut a separate ticket. |
I'm feeling 👍 on merging this. Let's wait a day to give @tmcw @jfirebaugh @ansis @planemad a chance to weigh in. |
looks good to me |
Refs #1804