This repository has been archived by the owner on Oct 30, 2021. It is now read-only.

phrasematch performance improvements #22

Closed
wants to merge 4 commits into from


yhahn
Member

@yhahn yhahn commented May 20, 2015

Some nice speedups.

  • Test in carmen + IRL

Roughly the goal is to return a list of max-scored phrases from a set. Previously this amounted to two loops:

  • Loop once to determine what the highest score of all the phrases is
  • Loop a second time to collect phrases that match that highest score
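For reference, the two-loop version described above could be sketched roughly like this. The `Phrase` struct and the function name are hypothetical stand-ins, since the actual data structures are not shown in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record type (an assumption for illustration only).
struct Phrase {
    std::uint64_t id;
    int score;
};

// Two-pass selection: find the maximum score, then collect every phrase that has it.
std::vector<std::uint64_t> maxScoredTwoPass(const std::vector<Phrase>& phrases) {
    std::vector<std::uint64_t> result;
    if (phrases.empty()) return result;

    // Pass 1: determine the highest score among all phrases.
    int best = phrases.front().score;
    for (const Phrase& p : phrases) {
        if (p.score > best) best = p.score;
    }

    // Pass 2: collect the ids of the phrases that reach that score.
    for (const Phrase& p : phrases) {
        if (p.score == best) result.push_back(p.id);
    }

    // At least one phrase always attains the maximum of a non-empty set.
    assert(!result.empty());
    return result;
}
```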

The optimization is to fill up to 6 candidate lists of phrases during the first loop, one for each possible highest score value (a phrase with score 5 goes in the phrases5 list, score 4 in the phrases4 list, etc.), with early exits as soon as we know that certain scores cannot be the highest (e.g. once we encounter a score of 4, we no longer need to consider phrases with score < 4).
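A minimal single-pass sketch of this idea, assuming integer scores in the range 0 to 5 (six buckets) and a hypothetical `Phrase` struct; it is not the code from this PR, and it only shows the "skip anything below the best score seen so far" part of the early exit:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record type (an assumption for illustration only).
struct Phrase {
    std::uint64_t id;
    int score;
};

// Assumption: scores are integers in [0, kMaxScore], giving six buckets.
constexpr int kMaxScore = 5;

// Single pass: keep one list per score and stop feeding lists that can no longer win.
std::vector<std::uint64_t> maxScoredOnePass(const std::vector<Phrase>& phrases) {
    std::array<std::vector<std::uint64_t>, kMaxScore + 1> buckets;
    int best = 0;  // highest score seen so far

    for (const Phrase& p : phrases) {
        assert(p.score >= 0 && p.score <= kMaxScore);
        // Early exit: a phrase below the best score so far cannot be max-scored.
        if (p.score < best) continue;
        best = p.score;
        buckets[static_cast<std::size_t>(p.score)].push_back(p.id);
    }

    // Only the bucket for the final best score is returned; lower buckets are discarded.
    return buckets[static_cast<std::size_t>(best)];
}
```

The trade-off is a few small allocations for buckets that end up unused, in exchange for not walking the input a second time.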

@@ -984,31 +993,36 @@ void _phrasematchPhraseRelev(uv_work_t* req) {

// get relev back to float-land.
relev = relev / total;
Contributor


Best practice here would be:

relev = relev / static_cast<double>(total);

Just to make explicit that you are going to doubles. If relev were not already a double, you might suffer from integer division bugs: mapbox/mapbox-gl-native#1406
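To illustrate the pitfall, here is a small standalone sketch. The function names and the `int` operands are hypothetical and only mimic the case where neither side of the division is already a floating-point value:

```cpp
#include <cassert>

// Hypothetical helper: both operands are ints, so the division truncates
// before the result is converted to double.
double ratioTruncated(int sum, int total) {
    assert(total != 0);
    return sum / total;
}

// Hypothetical helper: casting one operand forces floating-point division.
double ratioExact(int sum, int total) {
    assert(total != 0);
    return sum / static_cast<double>(total);
}
```

With `sum = 7` and `total = 2`, `ratioTruncated` yields `3.0` while `ratioExact` yields `3.5`. When the left operand is already a double the cast does not change the result, but it documents the intent.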

@yhahn
Member Author

yhahn commented May 20, 2015

This approach is too strict, in particular the very early exit when relev = 5. Closing this out for now.

@yhahn yhahn closed this May 20, 2015
@yhahn yhahn deleted the phrasematchperf branch May 20, 2015 19:58
2 participants