-
-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve sorting for the file picker #1066
Conversation
So I was imagining this would be more like a composite score where closer files are given a small boost to their match score. (Though I think the algorithm we use now already does this by giving a bonus to matches at the start of the string, if I search for "ca" the root Similarly when I'm using the picker I want my query to matter the most, because that's what I'm searching for -- the current implementation will prioritize bad matches just because they happen to sit in the workspace root. |
I think there is a misunderstanding: What I implemented doesn't change anything if the current directory is the workspace root (except fix a bug where previously the entries were sometimes ordered arbitrarily when they had the same match score). The only thing it changes, is that if the current directory is not the workspace root, it prioritizes files that are (directly or indirectly) contained in the current directory.
The current master branch actually prioritizes There are some cases where this leads to (probably) undesired sorting: Searching "mod" in the above example will display |
Yeah this was my point. You don't want to do strict sorting (first sort by A, then B then C), instead you want a composite weighed score where the fuzzy matcher score is ~70% of the score, and the directory is 30% of the score. |
That's because helix will automatically try and find a workspace root and use it instead (we could make this configurable). I had the opposite experience of accidentally opening the editor in a subfolder and wondering why I can only see a part of my codebase. |
Okay that makes sense. If that's the direction we want to go, I can experiment a bit with a composite score. I wanted to avoid it because then we rely on the details of the fuzzy matcher (for example if they change the default values of SkimScoreConfig, we could suddenly get weird results, even though the order that the scores induce stays the same), but maybe that's the only way to get good results. |
I've changed it to just add a score boost of 5 to files in the file picker that are in the current working directory. 5 should be fairly conservative, let me know what you think, though. The only arguably weird thing I noticed is that files in the root directory are still prioritized over files in the current directory:
|
@@ -2783,6 +2786,7 @@ fn symbol_picker(cx: &mut Context) { | |||
)); | |||
Some((path, line)) | |||
}, | |||
|_, l_score, _, r_score| l_score.cmp(&r_score).reverse(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to do this everywhere? Why not just have it sorted in the order it will be displayed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sort function specifies the order it is displayed in. If we'd replace this with a different sort function, the matches would no longer be sorted by their matching score, so we need this line to mimick the old behavior.
I thought about making the sort function parameter an Option and using this as a default if None is given. I don't think that's a great idea though, especially because most pickers can probably be improved by adding additional criteria to their sort functions.
I have this idea as well. Nice to see you implementing it. But what @archseer said is correct, it should be a composite score, it should depends on multiple factor with each factor having a different weight rather than getting score by order. |
Yes, the composite score definitely worked better than my first attempt at it. In the code, the composite score is modelled by adding a constant bonus (5) to the score, but this is equivalent to for example having a composite score where being in the current directory has a score of 1 and then composing the scores with weights 5/6 and 1/6. The advantage is that this way we only have one constant instead of 2 redundant ones. The timestamp is just a tie-breaker, basically the same as before. I don't think there's a reasonable way to assign a score bonus to a timestamp difference so the old behavior is good here. |
Implement sorting for the file picker such that: * Sorting is primarily based on the matching score with files that are in the current directory being given a small boost * Recency of access/modification/creating is used as a tie-breaker, similar to the previous implementation
adb0672
to
11bf9a8
Compare
I believe you might want to put in some comments regarding the current algorithm for ranking. The one you just explained should be good enough. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me (not tested). I think this is good but I think I would need to have thumbs up from another maintainer.
I don't have time to test/review right now so apologies if this is a non-existent concern, but has this been tested with symlinks ? I don't know what the expected behavior should be, but I wonder how it currently behaves. |
The file picker in helix doesn't follow symlinks so there is not really anything to test. I'm also not sure in what scenario you'd want the file picker to follow symlinks. |
That's ok, I never tested the picker with symlinks so I wondered how it behaved. Not following them at all is fine to me, thanks for the answer ! |
The main motivation for this change is that files under the current working directory should be prioritized over files that are somewhere else in the same workspace.
Matches of the file picker are now sorted by:
This sort of sorting could now similiarly be implemented for other pickers (for example for buffer access recency).
EDIT: I've previously used a strictly hierarchical sorting method which didn't work as well, I've updated this text to avoid any confusion.