Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Header compiler args resolution heuristic doesn't work well for projects with separate include and source directories #441

Closed
betasheet opened this issue Jul 17, 2019 · 3 comments

Comments

@betasheet
Copy link

If used with compile_commands.json, ccls needs to infer compiler arguments for a header file from a related entry in the compilation database. To choose this entry, it uses a "similar file name" heuristic (ComputeGuessScore() in src/project.cc).

This heuristic currently computes a similarity score by comparing the prefix + suffix and counting the number of different directories between two files. This works fine for headers that are close to their respective sources (e.g. in same directory), but not so well for projects with separate include and source directories, e.g. "inc/long/path/to/my_file.h" and "src/long/path/to/my_file.cc", because neither prefix nor suffix match.

Would it be possible to update the heuristic to take into account the matching substring in the middle?

For example, I modified ComputeGuessScore locally as below, which works well for my project with separate src/include directories:

// Computes a score based on how well |a| and |b| match. This is used for
// argument guessing.
int ComputeGuessScore(std::string_view a, std::string_view b) {
  // Increase score based on common prefix and suffix. Prefixes are prioritized.
  if (a.size() > b.size())
    std::swap(a, b);
  size_t i = std::mismatch(a.begin(), a.end(), b.begin()).first - a.begin();
  size_t j = std::mismatch(a.rbegin(), a.rend(), b.rbegin()).first - a.rbegin();
  int score = 10 * i + j;
  if (i + j < a.size())
    score -= 100 * (std::count(a.begin() + i, a.end() - j, '/') +
                    std::count(b.begin() + i, b.end() - j, '/'));
  // Increase score for a common substring in the middle (e.g. in case of
  // separate source and include dirs.
  size_t max_middle = 0;
  size_t max_middle_dirs = 0;
  for (; i < a.size(); ++i) {
    for (size_t k = j; k < b.size(); ++k) {
      size_t matching = std::mismatch(a.begin() + i, a.end(), b.begin() + k, b.end()).first - (a.begin() + i);
      size_t dirs = std::count(a.begin() + i, a.begin() + i + matching, '/');
      max_middle = std::max(max_middle, matching);
      max_middle_dirs = std::max(max_middle_dirs, dirs);
    }
  }
  score += 10 * max_middle;
  score += 100 * max_middle_dirs;
  return score;
}
@MaskRay
Copy link
Owner

MaskRay commented Jul 20, 2019

for (; i < a.size(); ++i) {
for (size_t k = j; k < b.size(); ++k) {
size_t matching = std::mismatch(a.begin() + i, a.end(), b.begin() + k, b.end()).first - (a.begin() + i);

The snippet is O(n^3). We should be a bit careful with the performance here.

Can you try #445?

@betasheet
Copy link
Author

#445 seems to work well with one small modification :) b5ee0f7#r305709023

Thanks!

@MaskRay
Copy link
Owner

MaskRay commented Jul 22, 2019

Merged #445

@MaskRay MaskRay closed this as completed Jul 22, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants