Marketplace search improvements #154

Closed
14 of 17 tasks
isidorn opened this issue Aug 19, 2021 · 47 comments
Assignees
Labels
enhancement New feature or request search

Comments

@isidorn
Collaborator

isidorn commented Aug 19, 2021

Hi, VS Code PM here 👋

I understand that there are already search issues open in this repository; however, I wanted to have one super issue that links all of them.
Improving search is the number one thing we would like the Marketplace team to work on 💯

If needed, we can provide many more examples where search can be improved, but here's a list of issues to start with. These examples are mostly from the Extensions view in VS Code, which uses the Marketplace search API.

@jeff-hykin

As one of the last people who commented on one of the main downstream issues 2 years ago: I'm really glad to see this being opened! 👍

@christianshay

Some special characters (for example "/") that are neither required nor documented are breaking tag search.

This tag link brings up any extension that includes "PL" or "SQL" instead of only those tagged "PL/SQL":
https://marketplace.visualstudio.com/search?term=tag%3APL%2FSQL&target=VSCode&category=All%20categories&sortBy=Relevance
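
One plausible cause, sketched below, is that the tag is split on non-alphanumeric characters before matching. This is only an assumption about the behaviour, not a description of the actual Marketplace implementation; the extension names, tags, and function names are hypothetical.

```python
import re

# Hypothetical extensions and their tags (illustrative data only).
EXTENSIONS = {
    "Example PL/SQL Tools": ["PL/SQL", "Oracle"],
    "Example SQL Formatter": ["SQL"],
    "Example PL/I Support": ["PL"],
}

def fragments(text):
    """Split on anything that is not a letter or digit, e.g. 'PL/SQL' -> {'pl', 'sql'}."""
    return {f.lower() for f in re.split(r"[^A-Za-z0-9]+", text) if f}

def split_tag_match(query_tag, tags):
    """Assumed buggy behaviour: any shared fragment counts as a match."""
    query_fragments = fragments(query_tag)
    return any(query_fragments & fragments(tag) for tag in tags)

def exact_tag_match(query_tag, tags):
    """Expected behaviour: the whole tag, '/' included, is one token."""
    return query_tag.lower() in {tag.lower() for tag in tags}

def search(query_tag, matcher):
    return [name for name, tags in EXTENSIONS.items() if matcher(query_tag, tags)]

print(search("PL/SQL", split_tag_match))  # every extension tagged PL or SQL
print(search("PL/SQL", exact_tag_match))  # only the PL/SQL-tagged extension
```

Treating the tag as a single opaque token keeps "PL/SQL" distinct from "PL" and "SQL".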

@isidorn
Collaborator Author

isidorn commented Mar 22, 2022

One more example:

image

@bpasero
Member

bpasero commented Mar 22, 2022

Just to clarify the above example: even if I search for the exact name of the extension, the extension I am searching for is not at the top:

image

@prashantvc
Contributor

prashantvc commented May 16, 2022

We at VS Marketplace have started work on improving search relevancy.
As part of this effort, we need your help curating the search results to define a baseline KPI.

How can you help?

  1. Visit https://ms-extensions.azurewebsites.net/
  2. Search for an extension (any of your favourite extensions)
  3. Observe the search results: do they make sense?
  4. Rearrange the results by dragging and dropping: should any item be at the top, or ranked lower?
  5. Fill in the feedback with the rationale behind your rearrangement
  6. Hit submit and repeat for different search terms

@jeff-hykin

jeff-hykin commented May 16, 2022

There are lots of examples from previous posts that are still unfixed, but sure, I'll oblige and redo the searches/screenshots.

Example 1

Prolog Syntax
Screen Shot 2022-05-16 at 3 29 28 PM

Want to see & Justification

(The "Prolog" extension being at the top is fine)

  1. My C++ syntax extension shouldn't be the 2nd option
  2. A theme called "syntax" shouldn't be the 3rd option
  3. My "Better Prolog Syntax" extension should be above them considering it has:
  • a title with 100% of the search query
  • keywords that match 100% of the search query
  • been updated more recently
  • and isn't being dragged down by missing or low ratings (5/5 stars)

Me wanting the "Prolog Language" extension higher needs more justification, though:

Justification/Expected Feature: Uncommon Word Importance

If someone searches for "Code Entschuldigung", showing the most popular extension with "Code" in the name is like showing the most popular extension with the word "The" in the name, because "Code" is absurdly common. In contrast, the other term, "Entschuldigung", is extremely uncommon. If there is an extension that matches an extremely uncommon word, then it's almost certain that's what the user is looking for. This applies to the "Prolog Syntax" search: "Prolog" is uncommon, "Syntax" is very common.

The equation for this is simple, it's just Bayes' rule:

  • How many times does a word appear across all extensions?
    • "Language", "Code", "Syntax" will have really high numbers
    • terms like "Prolog" or "Entschuldigung" will have extremely low numbers
  • Inversely weight a word based on its frequency across all extensions
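
The weighting described in this list is what information-retrieval texts usually call inverse document frequency (IDF). Below is a minimal sketch of it, assuming a tiny hypothetical catalog of titles and a smoothed log formula; it is not the Marketplace's actual ranking.

```python
import math
from collections import Counter

# Hypothetical mini-catalog of extension titles (illustrative data only).
TITLES = [
    "Prolog",
    "Better Prolog Syntax",
    "Better C++ Syntax",
    "Syntax Theme",
    "Code Spell Checker",
    "Code Runner",
    "Code Syntax Highlighter",
]

def tokenize(text):
    return text.lower().split()

def idf_weights(titles):
    """Weight each word inversely to the number of titles that contain it."""
    doc_freq = Counter()
    for title in titles:
        doc_freq.update(set(tokenize(title)))
    n = len(titles)
    # Smoothed IDF: rare words get large weights, common words small ones.
    return {word: math.log((n + 1) / (df + 1)) + 1 for word, df in doc_freq.items()}

def score(query, title, weights):
    """Sum the weights of the query words that appear in the title."""
    title_words = set(tokenize(title))
    return sum(weights.get(w, 0.0) for w in tokenize(query) if w in title_words)

def rank(query, titles):
    weights = idf_weights(titles)
    return sorted(titles, key=lambda t: score(query, t, weights), reverse=True)

for title in rank("Prolog Syntax", TITLES)[:3]:
    print(title)
```

With this data, "Prolog" carries more weight than "Syntax", so titles containing "Prolog" rank ahead of titles that only contain "Syntax".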

I mean, come on, this is search-101 methodology.

Example 2

Prolog Language
Screen Shot 2022-05-16 at 3 45 22 PM

Want to see

The one with the arrow should be above Angular

Justification/Expected Feature: Negative Relevance

  • If an extension title contains niche topics (high Bayes score) like "Japanese" and "Angular" that are NOT in the search, that should be a penalty.
    => If someone were searching for Angular or Japanese, they would type "Angular" or "Japanese", not "Prolog"
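
A rough sketch of this penalty, assuming the same rarity weighting as above: rare title words that the user did not type are subtracted from the score. The titles, the `penalty` factor of 0.5, and the function names are hypothetical choices for illustration.

```python
import math
from collections import Counter

# Hypothetical titles (illustrative data only).
TITLES = [
    "Angular Language Service",
    "Japanese Language Pack",
    "Prolog Language",
    "Language Support for Java",
    "Python Language Tools",
]

def tokenize(text):
    return text.lower().split()

def idf_weights(titles):
    """Rare words get high weights, common words (like 'language') low ones."""
    doc_freq = Counter()
    for title in titles:
        doc_freq.update(set(tokenize(title)))
    n = len(titles)
    return {w: math.log((n + 1) / (df + 1)) + 1 for w, df in doc_freq.items()}

def score(query, title, weights, penalty=0.5):
    query_words = set(tokenize(query))
    title_words = set(tokenize(title))
    matched = sum(weights[w] for w in title_words & query_words)
    # Niche title words the user did not type count against the result.
    unmatched = sum(weights[w] for w in title_words - query_words)
    return matched - penalty * unmatched

weights = idf_weights(TITLES)
ranked = sorted(TITLES, key=lambda t: score("Prolog Language", t, weights), reverse=True)
print(ranked[0])
```

Setting `penalty=0` recovers plain match scoring, which makes it easy to compare rankings with and without the negative-relevance term.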

Example 3

code eol
Screen Shot 2022-05-16 at 3 33 41 PM

(Keywords of the "Code-eol (Line Endings)" extension)
Screen Shot 2022-05-16 at 3 34 50 PM

Want to see

I'm going to assume the reader is intelligent and gets the idea.

As a side note, I would expect word2vec similarity metrics, but I think VS Code Marketplace needs to start with the basics before I can request that.

@prashantvc
Contributor

@jeff-hykin Thanks a lot for taking the time to write up a detailed answer. We are following a data-driven, iterative process to improve search relevancy. Yes, we are starting with the basics; you can expect better tokenisation, and word2vec will eventually make it into the Marketplace.

@lramos15
Member

lramos15 commented Jun 3, 2022

Here's a gif of searching for Azure Repos. I need to type the full name before it is even in view.
Recording 2022-06-03 at 12 09 58

@benibenj

Here is a comparison between the old and new search results. I entered the exact name of an extension. In the old version it shows up in second place, which is fine (though I think it should be first on an exact match). In the new version it's placed a lot further down!

Old results:

New results:

@prashantvc
Contributor

@benibenj thanks a lot for the report! We will investigate it.
But generally, how do you like the new search service?

@benibenj

It seems that smaller extensions (fewer downloads) are harder to find if they have multiple words in their name (Python C++ Debugger, for example).

@hediet
Member

hediet commented Oct 13, 2022

GitLens should be first when searching for Git Lens:

image

Version Lens should be below, as it does not even mention git.

@kj0171

kj0171 commented Oct 20, 2022

We have made some bug fixes and improvements to the Marketplace search service. The changes are live in VS Code Insiders. Do try them out and share feedback :)

Search.Relevancy.Update.Slow.mp4

@hediet
Member

hediet commented Oct 20, 2022

Nice work on the update!

However, I find this problematic:

image

The GitLens extension does not show up here.

@isidorn
Collaborator Author

isidorn commented Oct 21, 2022

@hediet thanks for the feedback, we are looking into this exact case. The issue is that we are doing prefix matching instead of fuzzy matching.
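
To illustrate the difference, here is a small sketch contrasting word-prefix matching with a simple fuzzy similarity. It assumes whitespace tokenization and uses Python's standard `difflib.SequenceMatcher` as a stand-in for fuzzy matching; the actual service may work quite differently.

```python
from difflib import SequenceMatcher

def prefix_match(query, title):
    """True if every query word is a prefix of some word in the title."""
    title_words = title.lower().split()
    return all(
        any(t.startswith(q) for t in title_words)
        for q in query.lower().split()
    )

def fuzzy_score(query, title):
    """Similarity in [0, 1] after lowercasing and removing spaces."""
    a = query.lower().replace(" ", "")
    b = title.lower().replace(" ", "")
    return SequenceMatcher(None, a, b).ratio()

# "lens" is not a prefix of the single word "gitlens", so prefix matching fails.
print(prefix_match("Git Lens", "GitLens"))
# Ignoring the space, the query and the title are identical.
print(fuzzy_score("Git Lens", "GitLens"))
print(fuzzy_score("Git Lens", "Version Lens"))
```

Under these assumptions, a fuzzy score ranks GitLens above Version Lens for the query "Git Lens", while pure prefix matching rejects GitLens outright.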

@lramos15
Member

Maybe an edge case, but when searching by publisher, e.g. Matt Bierner, the top result doesn't make sense, and the third result is a somewhat weaker match, I'd say.
image

@isidorn
Collaborator Author

isidorn commented Oct 25, 2022

@lramos15 good catch, thanks for reporting this.
@SaiKanth007 @kj0171 let's double check this one once we do the improvements we agreed on.

@kj0171 kj0171 self-assigned this Oct 31, 2022
@pcjmfranken

I'd like simple filtering options for statistical properties such as last updated date, download count, verification status, etc. Preferably, multiple such filters could be applied simultaneously.

This would, for example, allow me to display only extensions updated within the last 30 days, with a download count of at least 2500, and only by verified publishers.

This data is already available to the marketplace search results page (took a quick peek at the devtools network panel), so why not use it?
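
As a rough illustration of combining such filters, here is a sketch over a hypothetical `Extension` record. The field names are assumptions made for the example and are not the actual Marketplace response schema.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Extension:
    # Hypothetical fields; the real API payload may name these differently.
    name: str
    last_updated: date
    downloads: int
    verified_publisher: bool

def apply_filters(extensions, today, max_age_days=None, min_downloads=None,
                  verified_only=False):
    """Keep only extensions that pass every filter that is switched on."""
    results = []
    for ext in extensions:
        if max_age_days is not None and today - ext.last_updated > timedelta(days=max_age_days):
            continue
        if min_downloads is not None and ext.downloads < min_downloads:
            continue
        if verified_only and not ext.verified_publisher:
            continue
        results.append(ext)
    return results

catalog = [
    Extension("a", date(2022, 11, 1), 5000, True),
    Extension("b", date(2022, 1, 1), 9000, True),
    Extension("c", date(2022, 11, 10), 100, True),
    Extension("d", date(2022, 11, 5), 3000, False),
]
recent = apply_filters(catalog, today=date(2022, 11, 18), max_age_days=30,
                       min_downloads=2500, verified_only=True)
print([ext.name for ext in recent])
```

Each filter is optional, so any subset can be applied at once.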

@kj0171

kj0171 commented Nov 18, 2022

Thanks for the feedback. We are aware of this and will try to address it in the near future.

@kj0171

kj0171 commented Nov 18, 2022

We have rolled out search enhancements (Details here: #154 (comment)).

Please share feedback and reopen if necessary.

@kj0171 kj0171 closed this as completed Nov 18, 2022
@isidorn
Collaborator Author

isidorn commented Nov 18, 2022

Thanks @kj0171

To clarify, most of the improvements can be seen in VS Code Insiders.
We plan to bring these improvements to VS Code Stable soon.
Try them out in VS Code Insiders and let us know what you think.

@alefragnani

Just noticed something in the Insiders release that could lead to the impersonation issue discussed here https://vscode-dev-community.slack.com/archives/C74CB59NE/p1673358662096609, based on a post at https://blog.aquasec.com/can-you-trust-your-vscode-extensions

The marketplace search does not respect exact match if you use the publisher field.

I searched for an extension and clicked on the publisher name to see other extensions by that publisher (myself), but the new search also returns extensions from authors with similar names.

image

Based on this comment it seems only well-known publishers are handled, but I would argue that this change should be revisited.

Also, if you mistype the search, using publisher:"microsoft instead (yes, just missing the closing double quote), the result is not good at all. This error, on the other hand, happens on both the Stable and Insiders releases.

image

Thank you

@kj0171

kj0171 commented Jan 11, 2023

Thank you for pointing out the issue. For a few reasons, we aren't able to roll out a fix right away, but we are aware of the issue and are working on it. It will be fixed soon, and we will keep you updated.

@kj0171 kj0171 reopened this Jan 11, 2023
@prashantvc
Contributor

prashantvc commented Jan 11, 2023

We deployed the search improvements.
Note: This is an ongoing effort, and more improvements will follow. We would appreciate any feedback you have for the team.

You will start seeing more relevant results, and support for the following features:

  • Inclusion and Exclusion (+, -): force the search results to include or exclude search terms
  • Special Word Support (AND, OR, NOT)
  • Overall relevancy improvements
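
To make the operators concrete, here is a minimal sketch of how `+`, `-`, and `NOT` could be interpreted. The parsing rules and function names are assumptions for illustration only; the real Marketplace semantics (including AND/OR, which are left out here) may differ.

```python
def parse_query(query):
    """Split a query into required (+), excluded (- or NOT), and optional terms."""
    required, excluded, optional = [], [], []
    negate_next = False
    for token in query.split():
        if token == "NOT":
            negate_next = True
        elif negate_next or (token.startswith("-") and len(token) > 1):
            excluded.append(token.lstrip("+-").lower())
            negate_next = False
        elif token.startswith("+") and len(token) > 1:
            required.append(token[1:].lower())
        else:
            optional.append(token.lower())
    return required, excluded, optional

def matches(query, text):
    required, excluded, optional = parse_query(query)
    words = set(text.lower().split())
    if any(term in words for term in excluded):
        return False
    if any(term not in words for term in required):
        return False
    # Without required terms, at least one optional term has to match.
    return bool(required) or not optional or any(term in words for term in optional)

print(matches("+python -debugger", "Python Linter"))
print(matches("theme NOT dark", "Dark Theme"))
```

In this sketch, excluded terms always win, required terms must all be present, and optional terms only decide the match when nothing is required.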

Light Theme - Previous Search
Dark Theme - New and Improved Search

Inclusion and Exclusion (+, -) [#20]
image

Multi-word search
image

AND/OR
You can now use AND/OR operations with multi-word searches

Screenshot 2023-01-11 at 15 39 04

NOT
Exclude unwanted search terms
Screenshot 2023-01-11 at 15 39 39

@prashantvc
Contributor

The marketplace search does not respect exact match if you use the publisher field.

@alefragnani This is quite a unique problem, thanks a lot for reporting it. There are two opinions within the team and among VS Code users.
We are still figuring out the intent behind the query. Did the user mean to apply a filter? Did they try to search for extensions by publisher name? It may take a while to reach a conclusion.

The missing double quote can be handled better; it's on our list to fix.
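
One possible way to tolerate the missing quote is sketched below, together with an exact publisher comparison. Both the regular expression and the function names are hypothetical; this is not how the Marketplace parses queries, and which intent to honour is still undecided as noted above.

```python
import re

# Accepts publisher:"name", publisher:"name (missing closing quote), and publisher:name.
_PUBLISHER = re.compile(r'publisher:(?:"([^"]*)"?|(\S+))', re.IGNORECASE)

def parse_publisher_filter(query):
    """Return (publisher or None, remaining free-text terms)."""
    m = _PUBLISHER.search(query)
    if not m:
        return None, query.strip()
    value = m.group(1) if m.group(1) is not None else m.group(2)
    rest = (query[:m.start()] + query[m.end():]).strip()
    return value.strip() or None, rest

def publisher_matches(filter_value, publisher_name):
    """Exact, case-insensitive comparison; similar names do not match."""
    return filter_value.casefold() == publisher_name.casefold()

print(parse_publisher_filter('publisher:"microsoft'))
print(parse_publisher_filter('publisher:"Matt Bierner" markdown'))
```

A caveat of this sketch: when the closing quote is missing, everything after the opening quote is treated as the publisher name, so trailing search terms would be absorbed into it.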

@prashantvc
Contributor

Hey All,

We have deployed (7th Feb) a number of changes to VS Code Insiders that improve relevancy, especially for multi-word searches, and the overall search experience. The improvements will make their way into VS Code Stable in a couple of weeks.

Please give it a try and let us know what you think. We will continue to work on improving search in VS Code as well as in the Marketplace; your continued support and feedback will help us make it better for the community.

Thank you all for participating in the discussion. Please feel free to leave comments or contact me directly; we can chat about ideas and possible improvements (Booking Link)

I am closing this issue; we will continue the discussion in the open issues listed in the description.

@isidorn
Collaborator Author

isidorn commented Feb 9, 2023

I created this follow-up issue to make sure @alefragnani's publisher bug is still captured:
#580

@SaiKanth007 mentioned to me that this should be fixed by the end of March.

@xavierdecoster xavierdecoster removed this from the Planned milestone Mar 28, 2023
@gerroon

gerroon commented Oct 18, 2024

This is not working for me. Is this implemented?

@enabled NOT @category:"themes"
