-
Notifications
You must be signed in to change notification settings - Fork 13.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sort doc search results by their total module path length #16735
Sort doc search results by their total module path length #16735
Conversation
If I type in Relevance is the most important metric here. Short paths are a rather poor approximation of relevance. I think it's an important signal to use for determining relevance, but it's not at all the most important. |
I concur with kballard here. At very least, exact matches should always be prioritized first. After that I guess this heuristic can subsort the results? |
Ah I see! Easy enough to try-- I updated the code on https://carols10cents.github.io/rustdoc-playground/std/index.html?search=read as well. Now the search results for Wdyt? |
Seems maybe a bit odd for the results to change dramatically going from "rea" to "read". I don't really know though. Maybe it's ok. What I'd really like to see is simply more intelligent searching. If I search for "ptr::re" I get lots of results that don't even have "ptr" in the name. My personal expectation of how the search should work is basically finding all results that match case-insensitively if you turn the search string into a glob pattern by inserting a Search is a hard problem. |
cc @cmr, any thoughts? |
The behavior you observe helps with misspellings. |
I have no particular feelings on search, though the other comments seem good. |
With the assumption that more commonly used items are going to be higher in the module hierarchy and have shorter names, so this is a good heuristic for more important and more relevant to the person searching the docs.
HAH, oops, my original code had a bug where it wasn't including the name. I've put it back to have the new heuristic before the exact match heuristic again, and now |
@steveklabnik do you mean the current behavior of the search helps with misspellings? Do you have an example? |
@carols10cents The search results incorporate levenshtein distance into their ordering as of this PR: #15385 |
@gankro yup, and it's currently ranking all non-levenshtein matches above all levenshtein. I've based my changes and tests from master; this change does not interfere with the levenshtein distance as it is on master. |
@carols10cents Ah, yep, you're right. (Also wow, how over-engineered is that sort function?) |
@gankro hah, yep, it's a doozy :) At least it's commented...? |
This has the primary advantage of not interfering with browser default behavior for links like being able to cmd/ctrl+click on a result to open the result in a new tab but leave the current page as-is (previous behavior both opened a new tab and changed the current tab's location to the result's). I've done my best to keep the rest of the behavior and the appearance the same-- the whole row still highlights, still has a hand cursor, still moves to the result page with a normal click, arrows+enter still work. If the result is on the current page, the search is simply hidden. The biggest difference in behavior is that people using tab to navigate through the links will have to hit tab twice for each row, since each cell has its own `a` tag.. I could fix this by switching to `div`s and `span`s instead of a table, but that's potentially more CSS finicky? The biggest difference in appearance is probably that all the text in the search results is Fira Sans now, instead of just the method name with the rest of the text in Source Serif Pro. I can put this appearance back, but it looks like all links anywhere on the page are Fira Sans. Only the name was in an `a` tag before, but the whole row was ACTING like a link, so I think this is actually more consistent. [I've pushed these changes to a gh-pages repo](https://carols10cents.github.io/rustdoc-playground/std/index.html?search=t) if you'd like to take a look at the effects; note that I also have my changes for PR #16735 there too so the search results will be sorted differently than on master.
This PR has gotten pretty stale. What's the consensus here? Is this something we want to do? |
I fiddled around with some searches to see how the results compare. For unambiguous searches it's a total wash. Minor shuffling. For ambiguous searches there's some pretty big shifts, but it's not clear if the results are better or worse as a result. I think the strongest thing I can say about this functionality is that it visually smooths out the search results, making them easier to parse, in my subjective opinion. I'm overall positive on the feature, it's just that the gains are just very marginal if real at all. That said, it's a small, easily-reversible, addition to what is already an abomination of a function. Thankfully it's not my call 😅 |
@carols10cents First, I want to apologize for the very long wait on hearing anything definitive here. We try not to let PRs fall through the cracks, but this one has. Please feel free to reach out to any of the Rust developers on IRC if you're not hearing back. Second, I want to thank you for making the PR in the first place. All that said, I think I'm going to close this PR for now. While I agree that our current sorting is not ideal, I don't think this is a clear enough improvement to warrant the churn. I think @kballard made some good suggestions about a deeper change here that might be worth pursuing. Finally, don't hesitate to reach out if you'd like guidance on contributions! |
@aturon Thanks for the response!! To be frank, I forgot about this PR too :) Thank you to everyone on this issue for your feedback, I might take another crack at this sometime and I totally understand the decision not to merge this change. 💖 |
minor: Remove outdated comment about static SmolStrs Closes rust-lang/rust-analyzer#16734
Hi! This is my first attempt at a rust contribution (:fireworks:)! I think this could improve the documentation search results.
Using the assumption that more commonly used items are going to be higher in the module hierarchy and have shorter names, I've added a sort using the total module path length where longer will be ranked later. I think this is a good heuristic for more important and more relevant to the person searching the docs.
I've put the docs with my change up here, if you'd like to play with it to see the difference: https://carols10cents.github.io/rustdoc-playground/std/index.html?search=read
Here are some screenshots of search results before and after this change, where before is the nightly docs as of today on rust-lang.org. I haven't been using rust long enough to be able to say this is always going to be the better ranking, but it doesn't seem to make the results in general any worse, and in some cases makes them dramatically better.
An added bonus is that you can see a bit better how the results are being sorted; it just looks less random when they're ordered (in groups) shortest to longest.
I'd love to hear what more experienced rust documentation users think!