Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dedupe attributions that are substrings of others #2453

Merged
merged 2 commits into from
Apr 20, 2016

Conversation

anandthakker
Copy link
Contributor

Refs #1804

@anandthakker
Copy link
Contributor Author

(This isn't a full resolution of 1804, just a short-term one that works for, e.g., the Satellite Streets style provided in Studio)

@@ -49,6 +49,15 @@ Attribution.prototype = util.inherit(Control, {
}
}

// remove any entries that are duplicates or strict substrings of another entry.
attributions.sort();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you sort the attributions before filtering them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lucaswoj i figured sorting the list first means that, in the filter function for the i-th item, you can safely start the loop at i+1. Without this, you have to check every item, and I think it gets a little weird to correctly filter out exact duplicates.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whether you sort or not, you're comparing every attribution string to every other attribution string. I don't think the sort is necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're comparing every attribution string to every other attribution string

Agreed, but it's also directional, since we're looking for a substring match and want to keep the longer of the two if there's a match.

I agree the sort isn't necessary, but removing it does make the logic in the filter function a little more particular... I think it would then need to be something like:

attribution = attribution.filter(function (a, i) {
  // check if `a` is a substring of another string in the list
  var isSubsumedByAnother = attribution.some(function (b, j) {
    // skip if we're checking the same element
    if (i === j) { return false }
    // if a === b, then only call it a match if a's index is higher; otherwise
    // we would filter out both elements when there's an exact match
    if (a === b) { return i > j }
    // check if a is a substring of b
    if (b.indexOf(a) >= 0) { return true }
  })
  return !isSubsumedByAnother
})

@lucaswoj I considered this, and ended up feeling that both approaches were equally bad... lemme know if you find this way clearer and I can change it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, but it's also directional, since we're looking for a substring match and want to keep the longer of the two if there's a match.

That's a good point. We should sort by length not alphabetic ordering, then.

@anandthakker
Copy link
Contributor Author

@lucaswoj Updated:

  • Fixed sort to be by length
  • Pulled out the source -> final attribution string into a separate method on Attribution itself to make it testable without needing a Map instance
  • Added test for deduping logic

@mourner
Copy link
Member

mourner commented Apr 19, 2016

I'm wondering why the builds fail here... The error is not very telling, something about query tests.

@anandthakker
Copy link
Contributor Author

@mourner I found that this happened on at least one other PR, and I wondered if it could be some kind of Circle config issue relating to this branch being on a fork rather than the main repo?

@anandthakker
Copy link
Contributor Author

@mourner Oh yeah, I remember now--here's the message that made me think that last time:

/home/ubuntu/mapbox-gl-js/node_modules/coveralls/bin/coveralls.js:18
        throw err;
        ^
Bad response: 422 {"message":"Couldn't find a repository matching this job.","error":true}

@lucaswoj
Copy link
Contributor

The build is failing because the branch exists in @anandthakker's repo, not the @mapbox repo. It appears that Coveralls doesn't work with pull requests originating in other repos. We should fail silently in this case. I'll cut a separate ticket.

@lucaswoj
Copy link
Contributor

I'm feeling 👍 on merging this.

Let's wait a day to give @tmcw @jfirebaugh @ansis @planemad a chance to weigh in.

@ansis
Copy link
Contributor

ansis commented Apr 19, 2016

looks good to me

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants