Result deduplication #8569
I don't have access to the recently announced code search preview yet, but I wanted to chime in with my main problem with the existing public version of code search: duplication is the most common thing that slows me down. For example: https://github.com/search?q=ntsettimerresolution&type=Code

When I do a search like this, I want to see the term used in context so I can understand common pitfalls and avoid them. In this case, though, almost all the results come from various forks or copies of the ReactOS source tree. Once I've seen one of those, I've basically seen them all. They might differ slightly, but those differences matter less than finding the same search term in different contexts.

It would be awesome if code search could deduplicate results by either:
Replies: 1 comment 1 reply
That's good feedback. Indeed, showing a representative set of results is quite a challenge, especially when indexing forks and branches, where there is a lot of duplication. We're definitely trying to make this more useful with the new code search. Here are the paths for the first 20 files returned; you can see there are no duplicate paths reported.

We'd love to hear any suggestions you have for further improvement, especially once you've had a chance to play with it!
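
To make the "no duplicate paths" idea concrete, here is a minimal sketch of path-based deduplication: keep the first hit for each file path and drop later hits that share it, which collapses forks that mirror the same tree. This is only an illustration, not how GitHub code search is implemented; the `Hit` type, the `dedupe_by_path` function, and the repository names and paths are all hypothetical.

```python
from typing import Iterable, NamedTuple


class Hit(NamedTuple):
    """A hypothetical search result: the repository and the file path inside it."""
    repo: str
    path: str


def dedupe_by_path(hits: Iterable[Hit]) -> list[Hit]:
    """Keep only the first hit for each distinct file path, preserving rank order."""
    seen: set[str] = set()
    unique: list[Hit] = []
    for hit in hits:
        if hit.path in seen:
            # Same path already shown, most likely a fork or copy of the same tree.
            continue
        seen.add(hit.path)
        unique.append(hit)
    return unique


# Hypothetical ranked results: two forks mirror the upstream file layout.
ranked = [
    Hit("upstream/project", "src/timer/resolution.c"),
    Hit("fork-a/project", "src/timer/resolution.c"),
    Hit("other/app", "lib/clock.c"),
    Hit("fork-b/project", "src/timer/resolution.c"),
]
deduped = dedupe_by_path(ranked)
```

Path equality is a coarse signal: it would miss copies that were moved or renamed, and it would wrongly merge unrelated files that happen to share a path, so a real system would probably also need to compare content.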