Give posts (any type) higher priority in link search results #63683
Comments
Take this too literally and we'll cause a regression of #56478 😀 I think we want to prioritise posts and pages but not always place them above every other result. It's important that users can easily link to tags and categories especially from the Navigation block. |
I'd say you are many times more likely to search for and link to a page than a tag or category archive. I would say given a page, tag, and attachment matched with Perhaps deprioritizing attachments would meet the expectations better? |
Yeah, in your example screenshot I'd expect "Composing with patterns" to be first, but if you searched for "patterns-1" I'd expect patterns-1 to appear first. The key thing to bear in mind is that we don't want to regress #56478, as that bug made creating some types of navigation basically impossible. I think giving posts a slight boost (25%? need to play with the exact number) and attachments a slight penalty (25%?) should work. |
I'm down for trying that. |
After updating to WordPress 6.7, this is something we've heard several client teams complain about. They are having a much harder time finding the relevant content than they did before the update, which is impacting their workflows. In all honesty, I would love an option to remove attachments from the results altogether. On most sites, linking to attachments simply isn't something that editors need to do. And if they do, they can add the link manually 🤔 The exploration of giving less prominence to attachments sounds like a good first step though 👍 I'm also going to add the |
Hey @noisysocks, I explored implementing the 25% boost for post types and the 25% penalty for attachments. Here are my findings:

Initially, I directly applied the adjustments like this:

    // Boost for post types, penalty for attachments
    if (result.kind === 'post-type') {
        relevanceScore *= 1.25;
    } else if (result.kind === 'media') {
        relevanceScore *= 0.75;
    }

However, I noticed that the recently added sorting logic was penalizing results with longer titles. This was due to the score calculation formula:

    const exactMatchScore =
        (exactMatchingTokens.length / titleTokens.length) * 10;
    const subMatchScore = subMatchingTokens.length / titleTokens.length;

To address this, I modified the logic to depend on the length of the search query instead of the title:

    const exactMatchScore =
        (exactMatchingTokens.length / searchTokens.length) * 10;
    const subMatchScore =
        subMatchingTokens.length / searchTokens.length;

This worked better, but exact string matches were still being ranked lower than post types. To resolve this, I added a significant boost for exact title matches to ensure they appear at the top:

    // Significant boost for exact title matches
    if (result.title.toLowerCase() === search.toLowerCase()) {
        relevanceScore *= 100;
    }

The ranking logic currently behaves as shown in the recordings under "Preview". I’ve also retested the previously implemented fixes to ensure the changes aren’t breaking anything, and everything appears to be working as expected. Do you think this approach is good to proceed with, or would you suggest any additional changes? If everything looks good, I’ll raise a PR with these updates. Thanks!

Complete code:

    export function sortResults( results: SearchResult[], search: string ) {
        const searchTokens = tokenize( search );
        const scores = {};
        for ( const result of results ) {
            if ( result.title ) {
                const titleTokens = tokenize( result.title );
                const exactMatchingTokens = titleTokens.filter( ( titleToken ) =>
                    searchTokens.some(
                        ( searchToken ) => titleToken === searchToken
                    )
                );
                const subMatchingTokens = titleTokens.filter( ( titleToken ) =>
                    searchTokens.some(
                        ( searchToken ) =>
                            titleToken !== searchToken &&
                            titleToken.includes( searchToken )
                    )
                );
                // The score is a combination of exact matches and sub-matches.
                // More weight is given to exact matches, as they are more relevant (e.g. "cat" vs "caterpillar").
                // Dividing by the number of tokens in the search query normalizes the score
                // without penalizing results that have longer titles.
                const exactMatchScore =
                    ( exactMatchingTokens.length / searchTokens.length ) * 10;
                const subMatchScore =
                    subMatchingTokens.length / searchTokens.length;
                let relevanceScore = exactMatchScore + subMatchScore;
                // Boost for post types, penalty for attachments
                if ( result.kind === 'post-type' ) {
                    relevanceScore *= 1.25;
                } else if ( result.kind === 'media' ) {
                    relevanceScore *= 0.75;
                }
                // Significant boost for exact title matches
                if ( result.title.toLowerCase() === search.toLowerCase() ) {
                    relevanceScore *= 100;
                }
                scores[ result.id ] = relevanceScore;
            } else {
                scores[ result.id ] = 0;
            }
        }
        return results.sort( ( a, b ) => scores[ b.id ] - scores[ a.id ] );
    }

Preview
Current implementation: Screen.Recording.2024-12-09.at.7.50.25.PM.mov
Tested whether the current changes break the previously added fix in #67367: Screen.Recording.2024-12-09.at.7.51.50.PM.mov |
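To make the difference between the two normalizations concrete, here is a minimal, self-contained sketch. It is not the Gutenberg implementation: the `tokenize` helper is a simplified ASCII-only stand-in, the `scoreResult` function, its `normalizeBy` parameter, and the sample titles are hypothetical, and only the 1.25 and 0.75 factors come from the comment above.

```typescript
type Kind = 'post-type' | 'taxonomy' | 'media';

interface SearchResult {
	id: number;
	title: string;
	kind: Kind;
}

// Simplified stand-in tokenizer (assumption): lowercase ASCII letters and digits only.
function tokenize( text: string ): string[] {
	return text.toLowerCase().match( /[a-z0-9]+/g ) || [];
}

// Scores a single result. `normalizeBy` (hypothetical) picks the denominator:
// 'title' mirrors the earlier formula, 'search' mirrors the proposed change.
function scoreResult(
	result: SearchResult,
	search: string,
	normalizeBy: 'title' | 'search'
): number {
	const searchTokens = tokenize( search );
	const titleTokens = tokenize( result.title );
	const denominator =
		normalizeBy === 'title' ? titleTokens.length : searchTokens.length;
	if ( denominator === 0 ) {
		// Avoid dividing by zero for empty titles or empty queries.
		return 0;
	}
	const exact = titleTokens.filter( ( t ) =>
		searchTokens.some( ( s ) => t === s )
	).length;
	// indexOf is used here as an equivalent of String.prototype.includes.
	const sub = titleTokens.filter( ( t ) =>
		searchTokens.some( ( s ) => t !== s && t.indexOf( s ) !== -1 )
	).length;
	let relevance = ( exact / denominator ) * 10 + sub / denominator;
	// Boost for post types, penalty for attachments.
	if ( result.kind === 'post-type' ) {
		relevance *= 1.25;
	} else if ( result.kind === 'media' ) {
		relevance *= 0.75;
	}
	return relevance;
}

// Hypothetical sample data: a page with a four-word title and an image.
const page: SearchResult = { id: 1, title: 'Getting started with patterns', kind: 'post-type' };
const image: SearchResult = { id: 2, title: 'patterns-1', kind: 'media' };
```

Under these assumptions, a search for "patterns" ranks the image above the page when the score is normalized by title length, and ranks the page above the image when it is normalized by query length.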
There's #67563 which has a working prioritisation of Posts. I would love for some reviews on that and/or code contributions to tweak this towards what we need. |
Hi @getdave, I tested the solution in #67563, and it seems to work well for me overall. Initially, I thought it might cause the regression mentioned in #56478, but after further testing, I was unable to reproduce the issue. Specifically, I created multiple posts using the following commands:
Additionally, I created one category and one attachment named "Adventure". In my tests, the sorting behavior appears to work correctly.
2024-12-10.21-15-04.mp4
@noisysocks, could you confirm whether the test cases for this scenario accurately validate the issue described in #56478? I’d appreciate your thoughts on this, @getdave. |
I'm beginning to think that an approach that relies solely on weighting may not be able to fully solve this problem. Search results are always limited to a maximum of 20 results, but what users want to prioritize can vary infinitely depending on user preferences and site content. Maybe the UI itself needs some improvements, like the following (please excuse the clumsy design 😅):
- Add a button to load more search results
- Allow search results to be filtered by type
@WordPress/gutenberg-design Any ideas? |
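The two UI ideas in the comment above could be modelled as a small piece of view state: a selected type filter plus a visible-result limit that grows when "load more" is pressed. The sketch below is only an illustration of that idea; `LinkSearchView`, `visibleResults`, `loadMore`, and `PAGE_SIZE` are hypothetical names and not part of LinkControl.

```typescript
type SuggestionKind = 'post-type' | 'taxonomy' | 'media';

interface Suggestion {
	id: number;
	title: string;
	kind: SuggestionKind;
}

// Hypothetical view state: the active type filter and how many results to show.
interface LinkSearchView {
	kind: SuggestionKind | 'all';
	limit: number;
}

// Assumed page size, chosen to match the 20-result cap mentioned in the thread.
const PAGE_SIZE = 20;

// Applies the type filter first, then truncates to the current limit.
function visibleResults( all: Suggestion[], view: LinkSearchView ): Suggestion[] {
	const filtered =
		view.kind === 'all' ? all : all.filter( ( r ) => r.kind === view.kind );
	return filtered.slice( 0, view.limit );
}

// "Load more" keeps the filter and raises the limit by one page.
function loadMore( view: LinkSearchView ): LinkSearchView {
	return { kind: view.kind, limit: view.limit + PAGE_SIZE };
}
```

Filtering before truncating matters here: if the cap were applied first, a filter could hide every one of the 20 results even though matching items exist further down the list.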
A dropdown to let you filter by type could be useful (I could see a filter dropdown live inside the input). The only hesitancy there is that this doesn't solve the main issue at hand, which is that the default search should either emphasize things that are not attachments, or de-emphasize attachments. We might even omit attachments as suggestions entirely, IMO the main flow for linking such is to use the media library. |
To provide everyone with context, attachments were added because there are users who want to link to documents (e.g. PDFs). It's quite common. I would support @t-hamano's proposal in conjunction with improving the weighting. What I would say in terms of design is that I remember this being explored previously and we quickly realised we'd need additional tabs other than Thanks for the dialogue here. Great to see 👍 |
I agree that filtering would be a useful enhancement, and can theoretically be used to solve this issue. A simple way to start might be to add two tabs;
|
Would love to see (1) loading 20 additional results when scrolling down to the bottom and (2) a way to filter these results, at least as a developer, so we can exclude content temporarily while a widespread solution is found. For example, we might have 40 categories that have the word "recipes" in them, and there's no way for us to get to the one we need (i.e. the parent category called "recipes" without any other words). Similarly, images will often share very similar names with the parent post, for SEO purposes, and now linking to a given post is very tedious.

Separating content and media is a good start, but I agree that the user could easily need additional tabs. For example, one of my clients has many old tags that they don't want to delete but also never want to link to. If it were possible to filter the results, we could develop a plugin that would allow her to disable specific content types from showing up.

We've developed a rather hacky solution to exclude attachments temporarily:

    add_action('enqueue_block_editor_assets', 'exclude_attachment_link_suggestions_init');
    function exclude_attachment_link_suggestions_init()
    {
        wp_add_inline_script(
            'wp-core-data',
            '(function() {
                function initLinkFilter() {
                    const settings = wp.data.select("core/block-editor").getSettings();
                    const originalFetch = settings.__experimentalFetchLinkSuggestions;
                    if (originalFetch) {
                        wp.data.dispatch("core/block-editor").updateSettings({
                            __experimentalFetchLinkSuggestions: async (search, config) => {
                                const results = await originalFetch(search, config);
                                return results.filter(item => item.type !== "attachment");
                            }
                        });
                    }
                }
                const observer = new MutationObserver((mutations) => {
                    mutations.forEach((mutation) => {
                        if (mutation.type === "childList" && mutation.addedNodes.length > 0) {
                            initLinkFilter();
                        }
                    });
                });
                window.addEventListener("load", () => {
                    const root = document.querySelector(".editor-visual-editor");
                    if (root) {
                        observer.observe(root, {
                            childList: true,
                            subtree: true,
                            attributes: false,
                            characterData: false
                        });
                    }
                    initLinkFilter();
                });
            })();'
        );
    }
|
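Stripped of the editor wiring, the core idea of the workaround above is to wrap the suggestion fetcher and filter its output. The sketch below shows only that wrapping step, under stated assumptions: `LinkSuggestion`, `FetchSuggestions`, `filterSuggestions`, and `withExcludedTypes` are hypothetical names, the fetcher signature is simplified to a single `search` argument, and the `type` field is assumed to carry values such as "attachment", as in the snippet above. It does not touch `wp.data` and is not a tested fix for any WordPress version.

```typescript
interface LinkSuggestion {
	id: number;
	title: string;
	type: string;
}

// Simplified fetcher signature (assumption): the real one also takes a config argument.
type FetchSuggestions = ( search: string ) => Promise< LinkSuggestion[] >;

// Pure helper: drops every suggestion whose type is in the excluded list.
function filterSuggestions(
	results: LinkSuggestion[],
	excluded: string[]
): LinkSuggestion[] {
	return results.filter( ( item ) => excluded.indexOf( item.type ) === -1 );
}

// Wraps an existing fetcher so that callers only ever see the filtered results.
function withExcludedTypes(
	fetchFn: FetchSuggestions,
	excluded: string[]
): FetchSuggestions {
	return ( search ) =>
		fetchFn( search ).then( ( results ) =>
			filterSuggestions( results, excluded )
		);
}
```

Taking the excluded types as a parameter, rather than hard-coding "attachment", would also cover the case mentioned above of a site that never wants to link to its old tags.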
I'm just adding some weight to the case for enhancing the link search. I am working on a client site that has a load of tags as well as posts, pages, and WooCommerce products with similar naming. I find that tags and attachments are often the first items to match, and that WooCommerce product ("product" post type) results hardly ever appear. I do appreciate that the ability to link to attachments (in particular, uploaded PDFs) is a good feature for the link search and will be very useful for some sites, but it would be great to have a way to prevent certain post types, tags, and categories from showing in these results. @graylaurenm I tried to test your example code, but it didn't seem to work for me in WordPress 6.7.2. Is that code still working for you? |
In LinkControl you can search for content existing on your site. This is great, but I found that attachments were surfaced higher in search results than posts matching the search terms.
I propose that pages and posts of all types are prioritized in the search results, above all others. Users are much more likely to link to pages and posts than to attachments.