Relevance for Lucene needs to be tweaked #411

Closed
half-ogre opened this issue Feb 15, 2012 · 7 comments

@half-ogre
Contributor

We need to tweak the relevance weights for the search. As an example, searching for 'Entity' should put EntityFramework at the top: it's a partial ID match, partial title match, and has a crazy amount of downloads. This might be as easy as adding the download count as a factor.

I don't think we need to get this perfect from day 1 (relevance is something we might always tweak), but this is a good example of one we should be able to get right.

@half-ogre
Contributor Author

I said "at the top", but I really mean "closer to the top". It's currently 7th on the list even though it has way more downloads than the other search results. While I think we have to be careful not to give download count too much weight in a relevance search, I think giving it some weight is appropriate. I also think part of the problem might be that we need to give a partial ID match even more weight compared to a partial title match, and that the relative weight of the description might be too high.
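
To make the trade-off above concrete, here is a minimal sketch of one way to combine per-field weights with a damped download count. It is not the Gallery's actual scoring and does not use Lucene: the field weights, the `download_weight` parameter, and the log damping are all assumptions chosen for illustration.

```python
import math

# Hypothetical per-field weights: ID above title, description lowest.
FIELD_WEIGHTS = {"id": 3.0, "title": 2.0, "description": 0.5}


def field_score(query, package):
    """Sum the weights of every field that contains the query (case-insensitive)."""
    q = query.lower()
    return sum(
        weight
        for field, weight in FIELD_WEIGHTS.items()
        if q in package[field].lower()
    )


def relevance(query, package, download_weight=0.2):
    """Text score multiplied by a log-damped popularity factor.

    Because the popularity factor only multiplies the text score, a package
    with no text match scores 0 regardless of how many downloads it has.
    """
    text = field_score(query, package)
    # log10 compresses large download counts into a small multiplier.
    popularity = 1.0 + download_weight * math.log10(1 + package["downloads"])
    return text * popularity
```

With a multiplier like this, downloads can reorder packages that match the query about equally well, but cannot surface a package that does not match at all.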

@pranavkm
Contributor

There's a way to add weights to records as a whole; we could try experimenting with that. The only trouble is that we'd have to recreate the index every so often (perhaps every couple of days?).

@osbornm
Contributor

osbornm commented Feb 20, 2012

Just playing devil's advocate here, but we should make sure that it's easy for new packages to get seen. I would want to make sure exact ID matches always outweigh download count so a new package can get exposure.

@half-ogre
Contributor Author

@osbornm Yes, I agree that exact ID matches should definitely have the strongest weight, and should be at the top.

@pranavkm
Contributor

It does that today; unfortunately, the match has to be strict, i.e. EntityFramework would result in an exact match, but Entity Framework (with a space) would not. I could tweak that.
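
One possible way to loosen the exact match described above is to normalize both the query and the package ID before comparing them. This is only a sketch: the helper names are hypothetical, and treating dots, dashes, and underscores as ignorable separators is an assumption that may be too lenient for some IDs.

```python
import re


def normalize_id(text):
    """Lowercase the text and drop whitespace, dots, dashes, and underscores."""
    return re.sub(r"[\s._-]+", "", text).lower()


def is_exact_id_match(query, package_id):
    """True when the query equals the package ID after normalization."""
    return normalize_id(query) == normalize_id(package_id)
```

Under this normalization, `Entity Framework` and `entityframework` both count as exact matches for `EntityFramework`, while a partial query such as `Entity` does not.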

@BenPhegan

@half-ogre +1. We run the same commit of NuGetGallery as nuget.org internally, and a lot of our packages are named something like "Organisation.NameSpace.ArchitectCrap.Widget.Library". Searching by ID is almost impossible with the existing toolset (Gallery or NuGet), as it now seems to default to a Lucene search on keywords split at '.' or something similar, which works really badly with a lot of packages like the one above. It would be nice for an explicit, exact ID match to return only one package (even if it has to be quoted), or at the very least to put that package at the top.

Today, if you search for "NuGet.Extensions" on http://nuget.org, it turns up mid-way down the second page, despite the exact package id being entered.
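
The "exact ID at the top" behaviour requested here can be sketched as a re-ranking step applied to whatever the search engine returns. The `(package_id, score)` pair shape and the `rank` function are hypothetical, and the scores in the usage example are made up.

```python
def rank(query, results):
    """Order results so that an exact ID match precedes everything else.

    `results` is a list of (package_id, score) pairs. Within each group
    (exact match or not), higher scores come first.
    """
    q = query.strip().lower()
    # False sorts before True, so exact matches (key False) lead the list.
    return sorted(results, key=lambda r: (r[0].lower() != q, -r[1]))


results = [
    ("NuGet.Core", 9.0),
    ("NuGet.Extensions", 1.5),
    ("Some.Extensions", 4.0),
]
ranked = rank("NuGet.Extensions", results)
```

Because the exact-match flag is the first component of the sort key, no score, and therefore no download count folded into it, can push another package above an exact ID match.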

@ghost ghost assigned howarddierking Sep 12, 2012
@analogrelay
Contributor

This specific issue is fixed. We're always reviewing our Lucene setup, though.

joelverhagen pushed a commit that referenced this issue Jul 29, 2024
#411)

* Function for adding new package source mappings to nuget.config files.

* Defaulting to `$PWD` for relative paths

* Array of patterns, so we save once.
Package source mapping section presence check.