Update fetchLinkSuggestions to sort results by relevancy #62397

noisysocks · 2024-06-07T06:18:36Z

What?

Updates fetchLinkSuggestions to sort its results by relevancy to the search query that the user typed in. fetchLinkSuggestions is used by LinkControl, which powers the popover that appears when you insert a link.

Why?

When inserting a link, the user can search for posts, pages, tags, categories, post formats, and media. The search is done via the /wp/v2/search REST API endpoint.

Unfortunately, WordPress can only search one database table at a time. For example, it can't search wp_posts and wp_terms using one SQL query. The REST API does not hide this limitation and forces you to call /wp/v2/search with just one type param.

To get around this limitation, fetchLinkSuggestions currently makes 4 requests (posts, taxonomies, post formats, media) at once using a Promise.all, and then concatenates the results together. There is a max of 20 results per request for a combined max of 80 results. The first 20 results of the combined results are shown to the user.

There's a problem with this approach. Say the user is searching for a category named "Travel Tips" and that there are 20 pages or posts containing the words "travel" or "tips". The "Travel Tips" category will never be shown because it is crowded out by the 20 posts that appear in the combined results before any single category appears.
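To make the crowding-out concrete, here is a minimal sketch of the combine-then-truncate step. It is not the actual fetchLinkSuggestions code: the `Suggestion` shape, the `SearchFn` fetcher, and the names `combineAndTruncate` and `fetchCombined` are all assumptions made for illustration.

```typescript
// Hypothetical result shape; the real type in the codebase may differ.
interface Suggestion {
	title: string;
	kind: 'post' | 'term' | 'post-format' | 'attachment';
}

// Hypothetical fetcher: one call per kind, since the search endpoint
// accepts only one type per request.
type SearchFn = (
	kind: Suggestion[ 'kind' ],
	query: string
) => Promise< Suggestion[] >;

const KINDS: Suggestion[ 'kind' ][] = [
	'post',
	'term',
	'post-format',
	'attachment',
];

// Concatenate the per-kind results in request order, then keep the
// first `limit` entries. Nothing is reordered.
function combineAndTruncate(
	groups: Suggestion[][],
	limit = 20
): Suggestion[] {
	return ( [] as Suggestion[] ).concat( ...groups ).slice( 0, limit );
}

// Fire all requests at once and combine them.
async function fetchCombined(
	search: SearchFn,
	query: string
): Promise< Suggestion[] > {
	const groups = await Promise.all(
		KINDS.map( ( kind ) => search( kind, query ) )
	);
	return combineAndTruncate( groups );
}
```

With 20 post results in the first group and a single "Travel Tips" term in the second, `combineAndTruncate` returns only the 20 posts, so the term never reaches the user.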

How?

My fix for this issue is to sort the combined results before we take the first 20 and show them to the user.

I'm sorting the results by how similar the title is to the search query that the user provided. This is done using cosine similarity. We treat the search query as one document and the result title as another document. Then, we build term frequency map vectors for each document. Finally, we calculate the relevancy score by taking the cosine similarity of the two vectors.
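For reference, the cosine similarity approach can be sketched roughly like this. It is an illustration only: the ASCII-only tokenizer and the function names `tokenize`, `termFrequencies`, and `cosineScore` are assumptions, not the code in this PR.

```typescript
// Split text into lowercase word tokens. ASCII-only for simplicity;
// this tokenizer is an assumption made for the sketch.
function tokenize( text: string ): string[] {
	return text.toLowerCase().match( /[a-z0-9]+/g ) || [];
}

// Build a term frequency map: token -> number of occurrences.
function termFrequencies( tokens: string[] ): Map< string, number > {
	const counts = new Map< string, number >();
	for ( const token of tokens ) {
		counts.set( token, ( counts.get( token ) || 0 ) + 1 );
	}
	return counts;
}

// Cosine similarity of two term frequency vectors:
// dot( a, b ) / ( |a| * |b| ), or 0 if either vector is empty.
function cosineSimilarity(
	a: Map< string, number >,
	b: Map< string, number >
): number {
	let dot = 0;
	let normA = 0;
	let normB = 0;
	a.forEach( ( count, token ) => {
		dot += count * ( b.get( token ) || 0 );
		normA += count * count;
	} );
	b.forEach( ( count ) => {
		normB += count * count;
	} );
	if ( normA === 0 || normB === 0 ) {
		return 0;
	}
	return dot / Math.sqrt( normA * normB );
}

// Treat the title and the query as two documents and compare them.
function cosineScore( title: string, query: string ): number {
	return cosineSimilarity(
		termFrequencies( tokenize( title ) ),
		termFrequencies( tokenize( query ) )
	);
}
```

With this sketch, `cosineScore( 'Travel Tips', 'travel tips' )` is 1, while `cosineScore( 'Travel Safety Tips for Solo Travelers', 'travel tips' )` is about 0.58, because the extra words lengthen the title's vector.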

This PR now sorts the results by scoring each one. The score is the number of tokens in the title that also appear in the search query, divided by the total number of tokens in the title. This gives us a score between 0 and 1, where 1 is a perfect match. This achieves good enough results and is much simpler to understand than the previous approach described above.
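A rough sketch of that scoring, under some assumptions: the ASCII-only tokenizer and the names `tokenize`, `overlapScore`, and `sortByRelevancy` are illustrative and may not match the code in this PR.

```typescript
// Split text into lowercase word tokens. ASCII-only for simplicity;
// this tokenizer is an assumption made for the sketch.
function tokenize( text: string ): string[] {
	return text.toLowerCase().match( /[a-z0-9]+/g ) || [];
}

// Score = title tokens that also appear in the query / all title tokens.
// Returns a number between 0 and 1; an empty title scores 0.
function overlapScore( title: string, query: string ): number {
	const titleTokens = tokenize( title );
	if ( titleTokens.length === 0 ) {
		return 0;
	}
	const queryTokens = new Set( tokenize( query ) );
	const matches = titleTokens.filter( ( token ) =>
		queryTokens.has( token )
	).length;
	return matches / titleTokens.length;
}

// Sort the combined results by score, highest first, and only then
// keep the first `limit` entries.
function sortByRelevancy< T extends { title: string } >(
	results: T[],
	query: string,
	limit = 20
): T[] {
	return results
		.map( ( result ) => ( {
			result,
			score: overlapScore( result.title, query ),
		} ) )
		.sort( ( a, b ) => b.score - a.score )
		.slice( 0, limit )
		.map( ( { result } ) => result );
}
```

For the query "travel tips", a title of "Travel Tips" scores 2 / 2 = 1, while "Travel Safety Tips for Solo Travelers" scores 2 / 6, so the exact match sorts first regardless of which request returned it.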

Alternative approaches

Ideally this would all be handled at the database level using full text search. I don't think we can assume that every WordPress installation's MySQL database has full text search enabled, however.

We could look at doing logic similar to what's in this PR at the server level. I'm not sure if we should, though. The REST API is arguably correct to not encapsulate this limitation. Not all clients will want to order and combine results the same way.

Doing this in the client doesn't prevent us from moving the logic to the server in the future, so it's a good place to start.

I tried simple keyword matching as an alternative algorithm for ranking results. This is where you award 1 point per keyword that the title and search term have in common. It didn't work as well for me in testing, though, because cosine similarity will give smaller titles that are similar to the query an edge over long titles that contain lots of irrelevant words in addition to the query.
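To illustrate why plain keyword matching falls short, here is a small sketch of it. The tokenizer and the name `keywordPoints` are assumptions for this example, not code from the PR.

```typescript
// Split text into lowercase word tokens. ASCII-only for simplicity;
// this tokenizer is an assumption made for the sketch.
function tokenize( text: string ): string[] {
	return text.toLowerCase().match( /[a-z0-9]+/g ) || [];
}

// Award 1 point per distinct keyword shared by the title and the query.
// Title length is ignored, so irrelevant words carry no penalty.
function keywordPoints( title: string, query: string ): number {
	const queryTokens = new Set( tokenize( query ) );
	const titleTokens = new Set( tokenize( title ) );
	let points = 0;
	titleTokens.forEach( ( token ) => {
		if ( queryTokens.has( token ) ) {
			points++;
		}
	} );
	return points;
}
```

For the query "travel tips", both "Travel Tips" and "Travel Safety Tips for Solo Travelers" earn 2 points, so the short exact match is not ranked above the long title. A score that accounts for title length breaks that tie.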

Housekeeping

This PR also contains some housekeeping while I'm in this part of the codebase:

~~Rename __experimentalFetchLinkSuggestions to fetchLinkSuggestions.~~
Rewrite fetchLinkSuggestions in TypeScript. (It was already partially JSDoc typed.)

Testing Instructions

You really don't notice this bug unless you're testing with real data and have more than 20 posts, tags, etc.

If you don't have any real data, you can use WP-CLI to create some realistic-enough posts, pages, tags, and categories. Here's some test data that I had ChatGPT spit out:

Test data

# If you're using wp-env, this alias is helpful.
alias wp='npx wp-env run cli wp'

wp post create --post_type=post --post_title="Exploring the Streets of Paris" --post_status=publish
wp post create --post_type=post --post_title="A Weekend in Rome: What to See and Do" --post_status=publish
wp post create --post_type=post --post_title="Hidden Beaches of Thailand" --post_status=publish
wp post create --post_type=post --post_title="Hiking Trails in the Swiss Alps" --post_status=publish
wp post create --post_type=post --post_title="Cultural Delights of Tokyo" --post_status=publish
wp post create --post_type=post --post_title="Road Trip Through the Australian Outback" --post_status=publish
wp post create --post_type=post --post_title="Discovering Ancient Ruins in Greece" --post_status=publish
wp post create --post_type=post --post_title="Foodie Adventures in Mexico City" --post_status=publish
wp post create --post_type=post --post_title="Safari Experiences in Kenya" --post_status=publish
wp post create --post_type=post --post_title="Island Hopping in the Philippines" --post_status=publish
wp post create --post_type=post --post_title="Exploring the Markets of Marrakech" --post_status=publish
wp post create --post_type=post --post_title="Cityscape Views from New York" --post_status=publish
wp post create --post_type=post --post_title="The Nightlife of Berlin" --post_status=publish
wp post create --post_type=post --post_title="Adventure Sports in New Zealand" --post_status=publish
wp post create --post_type=post --post_title="Luxury Travel in Dubai" --post_status=publish
wp post create --post_type=post --post_title="Historical Sites in Egypt" --post_status=publish
wp post create --post_type=post --post_title="Wine Tasting in Napa Valley" --post_status=publish
wp post create --post_type=post --post_title="Cruising the Caribbean" --post_status=publish
wp post create --post_type=post --post_title="Exploring National Parks in the USA" --post_status=publish
wp post create --post_type=post --post_title="City Guide to Barcelona" --post_status=publish

wp post create --post_type=page --post_title="Top Travel Destinations 2024" --post_status=publish
wp post create --post_type=page --post_title="Ultimate Packing Guide for Travelers" --post_status=publish
wp post create --post_type=page --post_title="How to Plan a Budget Trip" --post_status=publish
wp post create --post_type=page --post_title="Travel Safety Tips for Solo Travelers" --post_status=publish
wp post create --post_type=page --post_title="Family Travel: Best Destinations" --post_status=publish
wp post create --post_type=page --post_title="Romantic Getaways Around the World" --post_status=publish
wp post create --post_type=page --post_title="Top Cultural Festivals to Attend" --post_status=publish
wp post create --post_type=page --post_title="Best Road Trips in the USA" --post_status=publish
wp post create --post_type=page --post_title="Luxury Hotels and Resorts" --post_status=publish
wp post create --post_type=page --post_title="Backpacking Tips for Beginners" --post_status=publish
wp post create --post_type=page --post_title="How to Travel Sustainably" --post_status=publish
wp post create --post_type=page --post_title="Best Cruise Lines for 2024" --post_status=publish
wp post create --post_type=page --post_title="Guide to Adventure Travel" --post_status=publish
wp post create --post_type=page --post_title="Exploring Local Cuisines" --post_status=publish
wp post create --post_type=page --post_title="Top Travel Apps You Need" --post_status=publish
wp post create --post_type=page --post_title="Best Beaches in the World" --post_status=publish
wp post create --post_type=page --post_title="Traveling with Pets: What You Need to Know" --post_status=publish
wp post create --post_type=page --post_title="City Guides: Where to Go and What to See" --post_status=publish
wp post create --post_type=page --post_title="Travel Insurance: Do You Need It?" --post_status=publish
wp post create --post_type=page --post_title="How to Travel with Kids" --post_status=publish

wp term create category "European Adventures" --description="Explore the best of Europe."
wp term create category "Asian Escapades" --description="Discover the wonders of Asia."
wp term create category "African Safaris" --description="Experience the wildlife of Africa."
wp term create category "American Road Trips" --description="Best road trips in the USA."
wp term create category "Oceania Discoveries" --description="Explore Australia and New Zealand."
wp term create category "South American Journeys" --description="Adventure through South America."
wp term create category "City Guides" --description="Top cities to visit around the world."
wp term create category "Beach Holidays" --description="Best beach destinations."
wp term create category "Cultural Experiences" --description="Immerse in different cultures."
wp term create category "Historical Travels" --description="Visit historical sites."
wp term create category "Luxury Travels" --description="Travel in luxury."
wp term create category "Budget Travels" --description="Travel on a budget."
wp term create category "Family Vacations" --description="Best destinations for families."
wp term create category "Romantic Getaways" --description="Perfect destinations for couples."
wp term create category "Adventure Travel" --description="For the adventurous souls."
wp term create category "Food and Travel" --description="Explore culinary delights."
wp term create category "Sustainable Travel" --description="Eco-friendly travel tips."
wp term create category "Solo Travel" --description="Tips for traveling alone."
wp term create category "Travel Tips" --description="General travel advice."
wp term create category "Cruise Holidays" --description="Best cruises to take."

wp term create post_tag "Travel Tips" --description="Essential tips for travelers."
wp term create post_tag "Adventure" --description="For the thrill-seekers."
wp term create post_tag "Beaches" --description="Best beaches around the world."
wp term create post_tag "Cultural" --description="Cultural experiences and festivals."
wp term create post_tag "Historical" --description="Visit historical sites and monuments."
wp term create post_tag "Luxury" --description="Luxury travel experiences."
wp term create post_tag "Budget" --description="How to travel on a budget."
wp term create post_tag "Family" --description="Best destinations for families."
wp term create post_tag "Romantic" --description="Romantic getaways for couples."
wp term create post_tag "Solo" --description="Tips for solo travelers."
wp term create post_tag "Foodie" --description="Explore local cuisines."
wp term create post_tag "Nature" --description="Nature and wildlife experiences."
wp term create post_tag "Urban" --description="City travel guides."
wp term create post_tag "Road Trip" --description="Best road trips."
wp term create post_tag "Cruise" --description="Cruise holidays."
wp term create post_tag "Hiking" --description="Best hiking trails."
wp term create post_tag "Beach" --description="Top beach destinations."
wp term create post_tag "Mountain" --description="Mountain adventures."
wp term create post_tag "Island" --description="Island hopping experiences."
wp term create post_tag "Festival" --description="Top festivals to attend."

Now:

Edit a template or post.
Insert a link. A good place to test this is in the Navigation block.
Search for something that's not a post, e.g. a tag or a category.

Screenshots or screencast

Before:

Kapture.2024-06-07.at.16.16.16.mp4

After:

Kapture.2024-06-07.at.16.17.18.mp4

github-actions · 2024-06-07T06:18:54Z

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Unlinked Accounts

The following contributors have not linked their GitHub and WordPress.org accounts: @scrobbleme.

Contributors, please read how to link your accounts to ensure your work is properly credited in WordPress releases.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Unlinked contributors: scrobbleme.

Co-authored-by: noisysocks <noisysocks@git.wordpress.org>
Co-authored-by: ellatrix <ellatrix@git.wordpress.org>
Co-authored-by: ntsekouras <ntsekouras@git.wordpress.org>
Co-authored-by: ramonjd <ramonopoly@git.wordpress.org>
Co-authored-by: andrewserong <andrewserong@git.wordpress.org>
Co-authored-by: skorasaurus <skorasaurus@git.wordpress.org>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

ellatrix · 2024-06-07T06:30:19Z

Is this really a bug? Should this be backported for WP 6.6? If so, please add the Backport to WP Beta/RC label. Thanks!

packages/core-data/src/fetch/index.js

noisysocks · 2024-06-07T06:37:52Z

> Is this really a bug? Should this be backported for WP 6.6? If so, please add the Backport to WP Beta/RC label. Thanks!

I think so. The existing experience is broken. I don't think it needs to be backported to 6.6.

skorasaurus · 2024-06-07T19:07:52Z

Thanks for this. As I understand it, this would also close #39964.

- Rename __experimentalFetchLinkSuggestions to fetchLinkSuggestions.
- Rewrite fetchLinkSuggestions in TypeScript.
- Sort results by relevancy using cosine similarity between term frequency vectors.

github-actions · 2024-06-12T06:07:28Z

Size Change: +922 B (+0.05%)

Total Size: 1.76 MB

Filename	Size	Change
`build/block-editor/index.min.js`	262 kB	-242 B (-0.09%)
`build/block-editor/style-rtl.css`	15.6 kB	-63 B (-0.4%)
`build/block-editor/style.css`	15.5 kB	-62 B (-0.4%)
`build/block-library/index.min.js`	219 kB	+224 B (+0.1%)
`build/core-data/index.min.js`	72.6 kB	+68 B (+0.09%)
`build/edit-site/index.min.js`	207 kB	-90 B (-0.04%)
`build/edit-site/posts-rtl.css`	6.35 kB	+25 B (+0.4%)
`build/edit-site/posts.css`	6.35 kB	+25 B (+0.4%)
`build/edit-site/style-rtl.css`	11.7 kB	+32 B (+0.27%)
`build/edit-site/style.css`	11.7 kB	+30 B (+0.26%)
`build/editor/index.min.js`	98 kB	+35 B (+0.04%)
`build/patterns/index.min.js`	7.22 kB	+755 B (+11.67%)	⚠️
`build/patterns/style-rtl.css`	687 B	+93 B (+15.66%)	⚠️
`build/patterns/style.css`	685 B	+92 B (+15.51%)	⚠️

ℹ️ View Unchanged

Filename	Size
`build/a11y/index.min.js`	951 B
`build/annotations/index.min.js`	2.26 kB
`build/api-fetch/index.min.js`	2.31 kB
`build/autop/index.min.js`	2.12 kB
`build/blob/index.min.js`	579 B
`build/block-directory/index.min.js`	7.31 kB
`build/block-directory/style-rtl.css`	1.02 kB
`build/block-directory/style.css`	1.02 kB
`build/block-editor/content-rtl.css`	4.57 kB
`build/block-editor/content.css`	4.57 kB
`build/block-editor/default-editor-styles-rtl.css`	394 B
`build/block-editor/default-editor-styles.css`	394 B
`build/block-library/blocks/archives/editor-rtl.css`	61 B
`build/block-library/blocks/archives/editor.css`	60 B
`build/block-library/blocks/archives/style-rtl.css`	90 B
`build/block-library/blocks/archives/style.css`	90 B
`build/block-library/blocks/audio/editor-rtl.css`	149 B
`build/block-library/blocks/audio/editor.css`	151 B
`build/block-library/blocks/audio/style-rtl.css`	125 B
`build/block-library/blocks/audio/style.css`	125 B
`build/block-library/blocks/audio/theme-rtl.css`	126 B
`build/block-library/blocks/audio/theme.css`	126 B
`build/block-library/blocks/avatar/editor-rtl.css`	115 B
`build/block-library/blocks/avatar/editor.css`	115 B
`build/block-library/blocks/avatar/style-rtl.css`	104 B
`build/block-library/blocks/avatar/style.css`	104 B
`build/block-library/blocks/button/editor-rtl.css`	310 B
`build/block-library/blocks/button/editor.css`	310 B
`build/block-library/blocks/button/style-rtl.css`	538 B
`build/block-library/blocks/button/style.css`	538 B
`build/block-library/blocks/buttons/editor-rtl.css`	336 B
`build/block-library/blocks/buttons/editor.css`	336 B
`build/block-library/blocks/buttons/style-rtl.css`	328 B
`build/block-library/blocks/buttons/style.css`	328 B
`build/block-library/blocks/calendar/style-rtl.css`	240 B
`build/block-library/blocks/calendar/style.css`	240 B
`build/block-library/blocks/categories/editor-rtl.css`	113 B
`build/block-library/blocks/categories/editor.css`	112 B
`build/block-library/blocks/categories/style-rtl.css`	124 B
`build/block-library/blocks/categories/style.css`	124 B
`build/block-library/blocks/code/editor-rtl.css`	53 B
`build/block-library/blocks/code/editor.css`	53 B
`build/block-library/blocks/code/style-rtl.css`	121 B
`build/block-library/blocks/code/style.css`	121 B
`build/block-library/blocks/code/theme-rtl.css`	122 B
`build/block-library/blocks/code/theme.css`	122 B
`build/block-library/blocks/columns/editor-rtl.css`	108 B
`build/block-library/blocks/columns/editor.css`	108 B
`build/block-library/blocks/columns/style-rtl.css`	420 B
`build/block-library/blocks/columns/style.css`	420 B
`build/block-library/blocks/comment-author-avatar/editor-rtl.css`	124 B
`build/block-library/blocks/comment-author-avatar/editor.css`	124 B
`build/block-library/blocks/comment-content/style-rtl.css`	90 B
`build/block-library/blocks/comment-content/style.css`	90 B
`build/block-library/blocks/comment-template/style-rtl.css`	200 B
`build/block-library/blocks/comment-template/style.css`	199 B
`build/block-library/blocks/comments-pagination-numbers/editor-rtl.css`	122 B
`build/block-library/blocks/comments-pagination-numbers/editor.css`	121 B
`build/block-library/blocks/comments-pagination/editor-rtl.css`	221 B
`build/block-library/blocks/comments-pagination/editor.css`	211 B
`build/block-library/blocks/comments-pagination/style-rtl.css`	234 B
`build/block-library/blocks/comments-pagination/style.css`	231 B
`build/block-library/blocks/comments-title/editor-rtl.css`	75 B
`build/block-library/blocks/comments-title/editor.css`	75 B
`build/block-library/blocks/comments/editor-rtl.css`	832 B
`build/block-library/blocks/comments/editor.css`	832 B
`build/block-library/blocks/comments/style-rtl.css`	632 B
`build/block-library/blocks/comments/style.css`	631 B
`build/block-library/blocks/cover/editor-rtl.css`	668 B
`build/block-library/blocks/cover/editor.css`	669 B
`build/block-library/blocks/cover/style-rtl.css`	1.62 kB
`build/block-library/blocks/cover/style.css`	1.6 kB
`build/block-library/blocks/details/editor-rtl.css`	65 B
`build/block-library/blocks/details/editor.css`	65 B
`build/block-library/blocks/details/style-rtl.css`	86 B
`build/block-library/blocks/details/style.css`	86 B
`build/block-library/blocks/embed/editor-rtl.css`	314 B
`build/block-library/blocks/embed/editor.css`	314 B
`build/block-library/blocks/embed/style-rtl.css`	411 B
`build/block-library/blocks/embed/style.css`	411 B
`build/block-library/blocks/embed/theme-rtl.css`	126 B
`build/block-library/blocks/embed/theme.css`	126 B
`build/block-library/blocks/file/editor-rtl.css`	326 B
`build/block-library/blocks/file/editor.css`	326 B
`build/block-library/blocks/file/style-rtl.css`	278 B
`build/block-library/blocks/file/style.css`	279 B
`build/block-library/blocks/file/view.min.js`	324 B
`build/block-library/blocks/footnotes/style-rtl.css`	198 B
`build/block-library/blocks/footnotes/style.css`	197 B
`build/block-library/blocks/form-input/editor-rtl.css`	229 B
`build/block-library/blocks/form-input/editor.css`	229 B
`build/block-library/blocks/form-input/style-rtl.css`	342 B
`build/block-library/blocks/form-input/style.css`	342 B
`build/block-library/blocks/form-submission-notification/editor-rtl.css`	344 B
`build/block-library/blocks/form-submission-notification/editor.css`	341 B
`build/block-library/blocks/form-submit-button/style-rtl.css`	69 B
`build/block-library/blocks/form-submit-button/style.css`	69 B
`build/block-library/blocks/form/view.min.js`	470 B
`build/block-library/blocks/freeform/editor-rtl.css`	2.6 kB
`build/block-library/blocks/freeform/editor.css`	2.6 kB
`build/block-library/blocks/gallery/editor-rtl.css`	958 B
`build/block-library/blocks/gallery/editor.css`	962 B
`build/block-library/blocks/gallery/style-rtl.css`	1.71 kB
`build/block-library/blocks/gallery/style.css`	1.71 kB
`build/block-library/blocks/gallery/theme-rtl.css`	108 B
`build/block-library/blocks/gallery/theme.css`	108 B
`build/block-library/blocks/group/editor-rtl.css`	402 B
`build/block-library/blocks/group/editor.css`	402 B
`build/block-library/blocks/group/style-rtl.css`	103 B
`build/block-library/blocks/group/style.css`	103 B
`build/block-library/blocks/group/theme-rtl.css`	79 B
`build/block-library/blocks/group/theme.css`	79 B
`build/block-library/blocks/heading/style-rtl.css`	188 B
`build/block-library/blocks/heading/style.css`	188 B
`build/block-library/blocks/html/editor-rtl.css`	346 B
`build/block-library/blocks/html/editor.css`	347 B
`build/block-library/blocks/image/editor-rtl.css`	890 B
`build/block-library/blocks/image/editor.css`	889 B
`build/block-library/blocks/image/style-rtl.css`	1.52 kB
`build/block-library/blocks/image/style.css`	1.51 kB
`build/block-library/blocks/image/theme-rtl.css`	137 B
`build/block-library/blocks/image/theme.css`	137 B
`build/block-library/blocks/image/view.min.js`	1.54 kB
`build/block-library/blocks/latest-comments/style-rtl.css`	355 B
`build/block-library/blocks/latest-comments/style.css`	354 B
`build/block-library/blocks/latest-posts/editor-rtl.css`	204 B
`build/block-library/blocks/latest-posts/editor.css`	204 B
`build/block-library/blocks/latest-posts/style-rtl.css`	509 B
`build/block-library/blocks/latest-posts/style.css`	510 B
`build/block-library/blocks/list/style-rtl.css`	104 B
`build/block-library/blocks/list/style.css`	104 B
`build/block-library/blocks/media-text/editor-rtl.css`	304 B
`build/block-library/blocks/media-text/editor.css`	303 B
`build/block-library/blocks/media-text/style-rtl.css`	506 B
`build/block-library/blocks/media-text/style.css`	504 B
`build/block-library/blocks/more/editor-rtl.css`	427 B
`build/block-library/blocks/more/editor.css`	427 B
`build/block-library/blocks/navigation-link/editor-rtl.css`	663 B
`build/block-library/blocks/navigation-link/editor.css`	664 B
`build/block-library/blocks/navigation-link/style-rtl.css`	192 B
`build/block-library/blocks/navigation-link/style.css`	191 B
`build/block-library/blocks/navigation-submenu/editor-rtl.css`	295 B
`build/block-library/blocks/navigation-submenu/editor.css`	294 B
`build/block-library/blocks/navigation/editor-rtl.css`	2.2 kB
`build/block-library/blocks/navigation/editor.css`	2.21 kB
`build/block-library/blocks/navigation/style-rtl.css`	2.25 kB
`build/block-library/blocks/navigation/style.css`	2.24 kB
`build/block-library/blocks/navigation/view.min.js`	1.03 kB
`build/block-library/blocks/nextpage/editor-rtl.css`	392 B
`build/block-library/blocks/nextpage/editor.css`	392 B
`build/block-library/blocks/page-list/editor-rtl.css`	378 B
`build/block-library/blocks/page-list/editor.css`	378 B
`build/block-library/blocks/page-list/style-rtl.css`	175 B
`build/block-library/blocks/page-list/style.css`	175 B
`build/block-library/blocks/paragraph/editor-rtl.css`	236 B
`build/block-library/blocks/paragraph/editor.css`	236 B
`build/block-library/blocks/paragraph/style-rtl.css`	341 B
`build/block-library/blocks/paragraph/style.css`	340 B
`build/block-library/blocks/post-author/style-rtl.css`	175 B
`build/block-library/blocks/post-author/style.css`	176 B
`build/block-library/blocks/post-comments-form/editor-rtl.css`	96 B
`build/block-library/blocks/post-comments-form/editor.css`	96 B
`build/block-library/blocks/post-comments-form/style-rtl.css`	506 B
`build/block-library/blocks/post-comments-form/style.css`	506 B
`build/block-library/blocks/post-content/editor-rtl.css`	74 B
`build/block-library/blocks/post-content/editor.css`	74 B
`build/block-library/blocks/post-date/style-rtl.css`	62 B
`build/block-library/blocks/post-date/style.css`	62 B
`build/block-library/blocks/post-excerpt/editor-rtl.css`	71 B
`build/block-library/blocks/post-excerpt/editor.css`	71 B
`build/block-library/blocks/post-excerpt/style-rtl.css`	141 B
`build/block-library/blocks/post-excerpt/style.css`	141 B
`build/block-library/blocks/post-featured-image/editor-rtl.css`	729 B
`build/block-library/blocks/post-featured-image/editor.css`	726 B
`build/block-library/blocks/post-featured-image/style-rtl.css`	341 B
`build/block-library/blocks/post-featured-image/style.css`	341 B
`build/block-library/blocks/post-navigation-link/style-rtl.css`	215 B
`build/block-library/blocks/post-navigation-link/style.css`	214 B
`build/block-library/blocks/post-template/editor-rtl.css`	99 B
`build/block-library/blocks/post-template/editor.css`	98 B
`build/block-library/blocks/post-template/style-rtl.css`	399 B
`build/block-library/blocks/post-template/style.css`	398 B
`build/block-library/blocks/post-terms/style-rtl.css`	96 B
`build/block-library/blocks/post-terms/style.css`	96 B
`build/block-library/blocks/post-time-to-read/style-rtl.css`	70 B
`build/block-library/blocks/post-time-to-read/style.css`	70 B
`build/block-library/blocks/post-title/style-rtl.css`	100 B
`build/block-library/blocks/post-title/style.css`	100 B
`build/block-library/blocks/preformatted/style-rtl.css`	125 B
`build/block-library/blocks/preformatted/style.css`	125 B
`build/block-library/blocks/pullquote/editor-rtl.css`	134 B
`build/block-library/blocks/pullquote/editor.css`	134 B
`build/block-library/blocks/pullquote/style-rtl.css`	342 B
`build/block-library/blocks/pullquote/style.css`	342 B
`build/block-library/blocks/pullquote/theme-rtl.css`	167 B
`build/block-library/blocks/pullquote/theme.css`	167 B
`build/block-library/blocks/query-pagination-numbers/editor-rtl.css`	121 B
`build/block-library/blocks/query-pagination-numbers/editor.css`	118 B
`build/block-library/blocks/query-pagination/editor-rtl.css`	220 B
`build/block-library/blocks/query-pagination/editor.css`	208 B
`build/block-library/blocks/query-pagination/style-rtl.css`	287 B
`build/block-library/blocks/query-pagination/style.css`	283 B
`build/block-library/blocks/query-title/style-rtl.css`	64 B
`build/block-library/blocks/query-title/style.css`	64 B
`build/block-library/blocks/query/editor-rtl.css`	502 B
`build/block-library/blocks/query/editor.css`	502 B
`build/block-library/blocks/query/view.min.js`	958 B
`build/block-library/blocks/quote/style-rtl.css`	238 B
`build/block-library/blocks/quote/style.css`	238 B
`build/block-library/blocks/quote/theme-rtl.css`	221 B
`build/block-library/blocks/quote/theme.css`	225 B
`build/block-library/blocks/read-more/style-rtl.css`	138 B
`build/block-library/blocks/read-more/style.css`	138 B
`build/block-library/blocks/rss/editor-rtl.css`	101 B
`build/block-library/blocks/rss/editor.css`	101 B
`build/block-library/blocks/rss/style-rtl.css`	288 B
`build/block-library/blocks/rss/style.css`	287 B
`build/block-library/blocks/search/editor-rtl.css`	183 B
`build/block-library/blocks/search/editor.css`	183 B
`build/block-library/blocks/search/style-rtl.css`	684 B
`build/block-library/blocks/search/style.css`	683 B
`build/block-library/blocks/search/theme-rtl.css`	113 B
`build/block-library/blocks/search/theme.css`	113 B
`build/block-library/blocks/search/view.min.js`	475 B
`build/block-library/blocks/separator/editor-rtl.css`	100 B
`build/block-library/blocks/separator/editor.css`	100 B
`build/block-library/blocks/separator/style-rtl.css`	248 B
`build/block-library/blocks/separator/style.css`	248 B
`build/block-library/blocks/separator/theme-rtl.css`	195 B
`build/block-library/blocks/separator/theme.css`	195 B
`build/block-library/blocks/shortcode/editor-rtl.css`	286 B
`build/block-library/blocks/shortcode/editor.css`	286 B
`build/block-library/blocks/site-logo/editor-rtl.css`	806 B
`build/block-library/blocks/site-logo/editor.css`	803 B
`build/block-library/blocks/site-logo/style-rtl.css`	218 B
`build/block-library/blocks/site-logo/style.css`	218 B
`build/block-library/blocks/site-tagline/editor-rtl.css`	87 B
`build/block-library/blocks/site-tagline/editor.css`	87 B
`build/block-library/blocks/site-title/editor-rtl.css`	123 B
`build/block-library/blocks/site-title/editor.css`	123 B
`build/block-library/blocks/site-title/style-rtl.css`	71 B
`build/block-library/blocks/site-title/style.css`	71 B
`build/block-library/blocks/social-link/editor-rtl.css`	338 B
`build/block-library/blocks/social-link/editor.css`	338 B
`build/block-library/blocks/social-links/editor-rtl.css`	676 B
`build/block-library/blocks/social-links/editor.css`	675 B
`build/block-library/blocks/social-links/style-rtl.css`	1.5 kB
`build/block-library/blocks/social-links/style.css`	1.5 kB
`build/block-library/blocks/spacer/editor-rtl.css`	346 B
`build/block-library/blocks/spacer/editor.css`	346 B
`build/block-library/blocks/spacer/style-rtl.css`	48 B
`build/block-library/blocks/spacer/style.css`	48 B
`build/block-library/blocks/table/editor-rtl.css`	394 B
`build/block-library/blocks/table/editor.css`	394 B
`build/block-library/blocks/table/style-rtl.css`	640 B
`build/block-library/blocks/table/style.css`	639 B
`build/block-library/blocks/table/theme-rtl.css`	145 B
`build/block-library/blocks/table/theme.css`	145 B
`build/block-library/blocks/tag-cloud/style-rtl.css`	266 B
`build/block-library/blocks/tag-cloud/style.css`	265 B
`build/block-library/blocks/template-part/editor-rtl.css`	393 B
`build/block-library/blocks/template-part/editor.css`	393 B
`build/block-library/blocks/template-part/theme-rtl.css`	113 B
`build/block-library/blocks/template-part/theme.css`	113 B
`build/block-library/blocks/term-description/style-rtl.css`	108 B
`build/block-library/blocks/term-description/style.css`	108 B
`build/block-library/blocks/text-columns/editor-rtl.css`	95 B
`build/block-library/blocks/text-columns/editor.css`	95 B
`build/block-library/blocks/text-columns/style-rtl.css`	165 B
`build/block-library/blocks/text-columns/style.css`	165 B
`build/block-library/blocks/verse/style-rtl.css`	98 B
`build/block-library/blocks/verse/style.css`	98 B
`build/block-library/blocks/video/editor-rtl.css`	553 B
`build/block-library/blocks/video/editor.css`	554 B
`build/block-library/blocks/video/style-rtl.css`	186 B
`build/block-library/blocks/video/style.css`	186 B
`build/block-library/blocks/video/theme-rtl.css`	126 B
`build/block-library/blocks/video/theme.css`	126 B
`build/block-library/classic-rtl.css`	179 B
`build/block-library/classic.css`	179 B
`build/block-library/common-rtl.css`	1.11 kB
`build/block-library/common.css`	1.11 kB
`build/block-library/editor-elements-rtl.css`	75 B
`build/block-library/editor-elements.css`	75 B
`build/block-library/editor-rtl.css`	12 kB
`build/block-library/editor.css`	11.9 kB
`build/block-library/elements-rtl.css`	54 B
`build/block-library/elements.css`	54 B
`build/block-library/reset-rtl.css`	470 B
`build/block-library/reset.css`	470 B
`build/block-library/style-rtl.css`	14.6 kB
`build/block-library/style.css`	14.6 kB
`build/block-library/theme-rtl.css`	698 B
`build/block-library/theme.css`	703 B
`build/block-serialization-default-parser/index.min.js`	1.12 kB
`build/block-serialization-spec-parser/index.min.js`	2.87 kB
`build/blocks/index.min.js`	52.2 kB
`build/commands/index.min.js`	15.2 kB
`build/commands/style-rtl.css`	955 B
`build/commands/style.css`	952 B
`build/components/index.min.js`	223 kB
`build/components/style-rtl.css`	12 kB
`build/components/style.css`	12 kB
`build/compose/index.min.js`	12.9 kB
`build/core-commands/index.min.js`	2.74 kB
`build/customize-widgets/index.min.js`	10.9 kB
`build/customize-widgets/style-rtl.css`	1.35 kB
`build/customize-widgets/style.css`	1.35 kB
`build/data-controls/index.min.js`	641 B
`build/data/index.min.js`	8.99 kB
`build/date/index.min.js`	18 kB
`build/deprecated/index.min.js`	458 B
`build/dom-ready/index.min.js`	325 B
`build/dom/index.min.js`	4.65 kB
`build/edit-post/classic-rtl.css`	578 B
`build/edit-post/classic.css`	580 B
`build/edit-post/index.min.js`	12.4 kB
`build/edit-post/style-rtl.css`	2.31 kB
`build/edit-post/style.css`	2.31 kB
`build/edit-widgets/index.min.js`	17.6 kB
`build/edit-widgets/style-rtl.css`	4.19 kB
`build/edit-widgets/style.css`	4.19 kB
`build/editor/style-rtl.css`	9.22 kB
`build/editor/style.css`	9.22 kB
`build/element/index.min.js`	4.83 kB
`build/escape-html/index.min.js`	537 B
`build/format-library/index.min.js`	8.1 kB
`build/format-library/style-rtl.css`	494 B
`build/format-library/style.css`	493 B
`build/hooks/index.min.js`	1.54 kB
`build/html-entities/index.min.js`	445 B
`build/i18n/index.min.js`	3.58 kB
`build/interactivity/debug.min.js`	16.5 kB
`build/interactivity/file.min.js`	447 B
`build/interactivity/image.min.js`	1.68 kB
`build/interactivity/index.min.js`	13.4 kB
`build/interactivity/navigation.min.js`	1.16 kB
`build/interactivity/query.min.js`	742 B
`build/interactivity/router.min.js`	2.8 kB
`build/interactivity/search.min.js`	615 B
`build/is-shallow-equal/index.min.js`	526 B
`build/keyboard-shortcuts/index.min.js`	1.31 kB
`build/keycodes/index.min.js`	1.46 kB
`build/list-reusable-blocks/index.min.js`	2.17 kB
`build/list-reusable-blocks/style-rtl.css`	846 B
`build/list-reusable-blocks/style.css`	846 B
`build/media-utils/index.min.js`	2.92 kB
`build/modules/importmap-polyfill.min.js`	12.3 kB
`build/notices/index.min.js`	946 B
`build/nux/index.min.js`	1.58 kB
`build/nux/style-rtl.css`	749 B
`build/nux/style.css`	745 B
`build/plugins/index.min.js`	1.81 kB
`build/preferences-persistence/index.min.js`	2.06 kB
`build/preferences/index.min.js`	2.89 kB
`build/preferences/style-rtl.css`	715 B
`build/preferences/style.css`	715 B
`build/primitives/index.min.js`	829 B
`build/priority-queue/index.min.js`	1.54 kB
`build/private-apis/index.min.js`	994 B
`build/react-i18n/index.min.js`	630 B
`build/react-refresh-entry/index.min.js`	9.47 kB
`build/react-refresh-runtime/index.min.js`	6.76 kB
`build/redux-routine/index.min.js`	2.69 kB
`build/reusable-blocks/index.min.js`	2.72 kB
`build/reusable-blocks/style-rtl.css`	256 B
`build/reusable-blocks/style.css`	256 B
`build/rich-text/index.min.js`	10.1 kB
`build/router/index.min.js`	1.95 kB
`build/server-side-render/index.min.js`	1.94 kB
`build/shortcode/index.min.js`	1.4 kB
`build/style-engine/index.min.js`	2.01 kB
`build/token-list/index.min.js`	579 B
`build/url/index.min.js`	3.85 kB
`build/vendors/react-dom.min.js`	42.8 kB
`build/vendors/react-jsx-runtime.min.js`	560 B
`build/vendors/react.min.js`	2.65 kB
`build/viewport/index.min.js`	965 B
`build/warning/index.min.js`	250 B
`build/widgets/index.min.js`	7.19 kB
`build/widgets/style-rtl.css`	1.16 kB
`build/widgets/style.css`	1.16 kB
`build/wordcount/index.min.js`	1.03 kB

compressed-size-action

noisysocks · 2024-06-12T07:00:57Z

Rebased this and fixed the tests. It's ready for review.

I tried simple keyword matching as an alternative algorithm for ranking results. This is where you award 1 point per keyword that the title and search term have in common. It didn't work as well for me in testing, though, because cosine similarity will give smaller titles that are similar to the query an edge over long titles that contain lots of irrelevant words in addition to the query.

The simpler approach achieves good enough results so I am happy to switch to it if cosine similarity is too difficult to understand or more than we need. Let me know, I won't be offended.
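
For anyone following along, the difference between the two ranking ideas can be sketched roughly as below. This is an illustration only, not the code in this PR: the helper names (`tokenize`, `keywordScore`, `termFrequencies`, `cosineSimilarity`) and the whitespace-based tokenizer are assumptions made for the example.

```typescript
// Illustrative sketch only: these helpers are hypothetical, not the code in this PR.

// Assumption: tokens are lowercase words separated by whitespace.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/\s+/).filter((token) => token.length > 0);
}

// Keyword matching: 1 point per distinct keyword shared by title and query.
function keywordScore(title: string, query: string): number {
  const queryTokens = new Set(tokenize(query));
  const titleTokens = new Set(tokenize(title));
  let score = 0;
  titleTokens.forEach((token) => {
    if (queryTokens.has(token)) score += 1;
  });
  return score;
}

// Term frequency map: how many times each token occurs in a document.
function termFrequencies(tokens: string[]): Map<string, number> {
  const frequencies = new Map<string, number>();
  for (const token of tokens) {
    frequencies.set(token, (frequencies.get(token) ?? 0) + 1);
  }
  return frequencies;
}

// Euclidean length of a term frequency vector.
function magnitude(vector: Map<string, number>): number {
  let sumOfSquares = 0;
  vector.forEach((count) => {
    sumOfSquares += count * count;
  });
  return Math.sqrt(sumOfSquares);
}

// Cosine similarity between the title and query term frequency vectors.
function cosineSimilarity(title: string, query: string): number {
  const a = termFrequencies(tokenize(title));
  const b = termFrequencies(tokenize(query));
  let dot = 0;
  a.forEach((count, token) => {
    dot += count * (b.get(token) ?? 0);
  });
  const denominator = magnitude(a) * magnitude(b);
  return denominator === 0 ? 0 : dot / denominator;
}

const query = "travel tips";
const shortTitle = "Travel Tips";
const longTitle = "Travel tips for a long weekend in Paris";

// Keyword matching cannot tell the two titles apart.
console.log(keywordScore(shortTitle, query), keywordScore(longTitle, query)); // prints: 2 2

// Cosine similarity favours the shorter, more focused title.
console.log(cosineSimilarity(shortTitle, query) > cosineSimilarity(longTitle, query)); // prints: true
```

Both titles earn the same keyword score because the extra words in the long title cost nothing, whereas the long title's larger vector magnitude lowers its cosine similarity.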

ramonjd

I've been testing this with lots and lots of taxonomy terms, posts and pages in English, German and Japanese.

It's hard to see the benefit of these changes at first — trunk behaves mostly the same until you have 100s of pages/posts with similar keywords.

Here's me looking for "Paris"

Kapture.2024-06-13.at.14.37.58.mp4

Great stuff - overall I think it's a big improvement to have tags/cats surfaced this way.

packages/core-data/src/fetch/fetch-link-suggestions.ts

ramonjd · 2024-06-13T05:16:06Z

> The simpler approach achieves good enough results so I am happy to switch to it if cosine similarity is too difficult to understand or more than we need. Let me know, I won't be offended.

ChatGPT and "ELI5" set me straight 😄

But it is very clever, so I paused at whether this will be accessible to folks who want to iterate on the feature.

Is the "simpler" approach less code? Does it perform as well? Easier to read?

If the answer is "yes" to these questions I'd probably consider using the simpler approach, or at least stating a convincing reason to go with cosine similarity, e.g., it produces much better results more of the time.

What do you think?

andrewserong

Great work here, and thanks for the detailed explanations for how this works, and for the test wp cli commands 👍

> It didn't work as well for me in testing, though, because cosine similarity will give smaller titles that are similar to the query an edge over long titles that contain lots of irrelevant words in addition to the query.

IMO I think the advantage of smaller titles that more directly match the query could be worth it, especially if you wind up having a simple category like Travel where you'd expect it to be at the top. With this PR applied, it's working very nicely for me:

Whereas on trunk I get pages first of all, and categories and tags are way down at the bottom of the list.

[Screenshots: trunk top of list, trunk bottom of list]

Just left a few questions, but overall I think this is a big improvement, and I also like the idea of stabilising the API for it. The function has been around for a long time and it seems general purpose enough to be useful in situations outside of Gutenberg to me (i.e. I could imagine a plugin in an admin area wanting a quick way of fetching link suggestions).

Would it be worth getting a second opinion / see if anyone objects to the more complex cosine similarity approach?

packages/core-data/src/fetch/fetch-link-suggestions.ts

noisysocks · 2024-06-14T05:18:14Z

I tried a scoring approach where we simply divide the number of tokens in the title that are also in the search query by the total number of tokens in the title. This means shorter titles receive higher scores, fixing the problem I noted in #62397 (comment).

It seems to work well enough in my testing and is much simpler to understand. I'd appreciate if you can test it with various search terms, etc. again though 🙂
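
A rough sketch of the scoring described above, for illustration. The names `tokenize`, `scoreTitle` and `sortByRelevancy`, the whitespace tokenizer, and the `{ title }` result shape are assumptions for this example and may not match the actual code in `fetch-link-suggestions.ts`.

```typescript
// Illustrative sketch only: names and result shape are hypothetical.

// Assumption: tokens are lowercase words separated by whitespace.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/\s+/).filter((token) => token.length > 0);
}

// Score = (title tokens that also appear in the query) / (total title tokens).
// The result is between 0 and 1, where 1 means every title token matched.
function scoreTitle(title: string, query: string): number {
  const titleTokens = tokenize(title);
  if (titleTokens.length === 0) {
    return 0; // Avoid dividing by zero for empty titles.
  }
  const queryTokens = new Set(tokenize(query));
  const matches = titleTokens.filter((token) => queryTokens.has(token)).length;
  return matches / titleTokens.length;
}

// Sort combined results by descending score without mutating the input.
function sortByRelevancy<T extends { title: string }>(results: T[], query: string): T[] {
  return results
    .map((result) => ({ result, score: scoreTitle(result.title, query) }))
    .sort((a, b) => b.score - a.score)
    .map((entry) => entry.result);
}

const combined = [
  { title: "Travel Tips for Europe", type: "post" },
  { title: "Budget travel", type: "page" },
  { title: "Travel", type: "category" },
];

// Sort first, then keep the first 20 results to show to the user.
const shown = sortByRelevancy(combined, "travel").slice(0, 20);
console.log(shown.map((result) => result.title));
// prints: [ 'Travel', 'Budget travel', 'Travel Tips for Europe' ]
```

Dividing by the title length is what lets a one-word category such as "Travel" (score 1) outrank a longer post title that merely contains the word (score 0.25).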

noisysocks · 2024-06-14T05:57:32Z

The more I look at the API of fetchLinkSuggestions the less I like it 😅 so I'm going to keep it experimental in this PR and come back to stabilising it / cleaning it up in a follow-up PR.

andrewserong · 2024-06-14T05:57:52Z

This is still testing nicely for me after the latest change 👍

> Keep experimental for now

Is this since we're still iterating on the logic? Sounds reasonable to defer stabilising it for now.

noisysocks · 2024-06-14T06:04:24Z

Updated PR description. This is altogether a much simpler PR now 😅

> Is this since we're still iterating on the logic? Sounds reasonable to defer stabilising it for now.

No but that's not a bad reason either.

andrewserong

This is still testing nicely for me, and I like that the logic is simpler, easier to read and to maintain. Not for now, but another potential future enhancement could be to look at partial matches with tokens / fuzzy search, too, as I need to type the whole word "travel" before the travel category gets to the top of the list:

"trave"

"travel"

This is already a big improvement, though, so just jotting this down as a thought, nothing to worry about for now.

LGTM! 🚀
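
To make the partial-match idea concrete, here is one possible shape it could take. This is a guess for illustration, assuming a whitespace tokenizer and prefix matching; the names `tokenize`, `scoreTitleExact` and `scoreTitleWithPrefixes` are hypothetical, and nothing here describes how any follow-up actually implements it.

```typescript
// Illustrative sketch only: a hypothetical prefix-matching variant of the score.

// Assumption: tokens are lowercase words separated by whitespace.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/\s+/).filter((token) => token.length > 0);
}

// Exact matching: a title token counts only if it equals a query token.
function scoreTitleExact(title: string, query: string): number {
  const titleTokens = tokenize(title);
  if (titleTokens.length === 0) {
    return 0;
  }
  const queryTokens = new Set(tokenize(query));
  const matches = titleTokens.filter((token) => queryTokens.has(token)).length;
  return matches / titleTokens.length;
}

// Prefix matching: a title token counts if it starts with any query token,
// so a partially typed word can still match.
function scoreTitleWithPrefixes(title: string, query: string): number {
  const titleTokens = tokenize(title);
  if (titleTokens.length === 0) {
    return 0;
  }
  const queryTokens = tokenize(query);
  const matches = titleTokens.filter((token) =>
    queryTokens.some((queryToken) => token.startsWith(queryToken))
  ).length;
  return matches / titleTokens.length;
}

console.log(scoreTitleExact("Travel", "trave")); // prints: 0
console.log(scoreTitleWithPrefixes("Travel", "trave")); // prints: 1
```

With exact matching the "Travel" category scores 0 until the whole word is typed, which matches the behaviour noted above; the prefix variant would let it rise as soon as "trave" is entered.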

ramonjd

Ran through similar tests as before. 500+ database entities with different languages.
Tags and categories are surfaced as expected. Snappy results.

Kapture.2024-06-14.at.16.22.11.mp4

Very nice. 🚢

noisysocks · 2024-06-14T06:31:48Z

Thanks for bearing 🐻 with me while I ventured unnecessarily deep into rabbit 🐰 holes!

…2397)

* Update fetchLinkSuggestions to sort results by relevancy
  - Rename __experimentalFetchLinkSuggestions to fetchLinkSuggestions.
  - Rewrite fetchLinkSuggestions in TypeScript.
  - Sort results by relevancy using cosine similarity between term frequency vectors.
* Fix tsc errors
* Update @wordpress/core-data imports
* Make tokenize unicode aware
* Remove unnecessary mutation
* Add tests for all the helper functions
* Simpler scoring function
* Keep experimental for now

Unlinked contributors: scrobbleme.

Co-authored-by: noisysocks <noisysocks@git.wordpress.org>
Co-authored-by: ellatrix <ellatrix@git.wordpress.org>
Co-authored-by: ntsekouras <ntsekouras@git.wordpress.org>
Co-authored-by: ramonjd <ramonopoly@git.wordpress.org>
Co-authored-by: andrewserong <andrewserong@git.wordpress.org>
Co-authored-by: skorasaurus <skorasaurus@git.wordpress.org>

noisysocks added the [Type] Bug, [Block] Navigation, and [Feature] Link Editing labels Jun 7, 2024

noisysocks requested a review from nerrad as a code owner June 7, 2024 06:18

ellatrix reviewed Jun 7, 2024

packages/core-data/src/fetch/index.js (outdated)

noisysocks self-assigned this Jun 11, 2024

noisysocks added the No Core Sync Required label Jun 12, 2024

noisysocks mentioned this pull request Jun 12, 2024

Improve the search for tags when a substring doesn't appear in the first 20 results. #39964

Closed

noisysocks added 3 commits June 12, 2024 15:46

Update fetchLinkSuggestions to sort results by relevancy

74e2277

- Rename __experimentalFetchLinkSuggestions to fetchLinkSuggestions.
- Rewrite fetchLinkSuggestions in TypeScript.
- Sort results by relevancy using cosine similarity between term frequency vectors.

Fix tsc errors

84cc16e

Update @wordpress/core-data imports

16628ee

noisysocks force-pushed the fix/link-suggestions branch from 2b7a770 to 16628ee Compare June 12, 2024 05:56

noisysocks requested review from draganescu, adamziel and kevin940726 as code owners June 12, 2024 05:56

ramonjd reviewed Jun 13, 2024

packages/core-data/src/fetch/fetch-link-suggestions.ts (outdated)

andrewserong reviewed Jun 13, 2024

packages/core-data/src/fetch/fetch-link-suggestions.ts (3 outdated review threads)

noisysocks added 2 commits June 13, 2024 16:28

Make tokenize unicode aware

567fa16

Remove unnecessary mutation

c69f553

Add tests for all the helper functions

fdaa886

Simpler scoring function

cf4ab37

Keep experimental for now

a163f5b

andrewserong approved these changes Jun 14, 2024

ramonjd approved these changes Jun 14, 2024

noisysocks enabled auto-merge (squash) June 14, 2024 06:28

noisysocks merged commit 18676a8 into trunk Jun 14, 2024
66 of 67 checks passed

noisysocks deleted the fix/link-suggestions branch June 14, 2024 06:30

github-actions bot added this to the Gutenberg 18.6 milestone Jun 14, 2024

noisysocks mentioned this pull request Jun 14, 2024

fetchLinkSuggestions: Allow for partial matching #62570

Merged

noisysocks mentioned this pull request Jul 24, 2024

[Type] Enhancement : post higher priority in link search results #63836

Open
