Add tokenized search support to Quick Open dialog and FileSystem filter #88660

MajorMcDoom · 2024-02-22T09:50:41Z

Closes #9145.

Tokenization

You can now search for files in the Quick Open dialog and the FileSystem filter using multiple tokens instead of a single continuous string.
This is useful for when you know multiple parts that the file path or name contains, but not the order, or what is between them.
This also means you don't have to use accurate naming conventions like snake_case or PascalCase or dash-case.
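As an illustration only, the tokenized matching described above can be sketched in a few lines of Python. This is not the PR's C++ code: the function name `tokenized_match`, the case-insensitive comparison, and the sample paths are assumptions made for the sketch.

```python
def tokenized_match(query: str, path: str) -> bool:
    """Return True if every space-separated token occurs somewhere in path.

    Assumption: matching is case-insensitive and purely substring-based.
    """
    tokens = query.lower().split()
    haystack = path.lower()
    return all(token in haystack for token in tokens)


# Hypothetical project paths, used only for this example.
paths = [
    "res://scenes/player/player_controller.gd",
    "res://scenes/enemy/EnemyController.gd",
    "res://ui/main_menu.tscn",
]

# Token order and the naming convention used in the path do not matter.
matches = [p for p in paths if tokenized_match("controller player", p)]
```

Here `matches` holds only the first path: it contains both tokens, even though they were typed in the opposite order.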

Smart Sorting for Quick Open

Prioritizes paths in which search tokens appear in order.
Prioritizes paths where the search tokens appear in the file name as opposed to folders.
Prioritizes paths in which search tokens appear at the front of folder or file names as opposed to the middle.
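The three priorities above could be combined into a single score along the following lines. This is a rough sketch, not the scoring used in the PR: the function name `score`, the bonus weights (3, 2, 1), and the choice to look only at the first occurrence of each token are all assumptions.

```python
def score(query: str, path: str) -> int:
    """Score a path against a tokenized query; -1 means no match.

    The weights are hypothetical and only illustrate the three priorities.
    """
    tokens = query.lower().split()
    lowered = path.lower()
    file_start = lowered.rfind("/") + 1  # index where the file name begins
    total = 0
    previous = -1
    in_order = True
    for token in tokens:
        pos = lowered.find(token)  # simplification: first occurrence only
        if pos == -1:
            return -1
        if pos >= file_start:
            total += 2  # token appears in the file name, not a folder
        if pos == 0 or lowered[pos - 1] == "/":
            total += 1  # token sits at the front of a folder or file name
        if pos < previous:
            in_order = False
        previous = pos
    if in_order:
        total += 3  # tokens appear in the order they were typed
    return total


query = "player controller"
candidates = [
    "controller/player.gd",
    "player/my_controller.gd",
    "player/controller.gd",
]
ranked = sorted(candidates, key=lambda p: score(query, p), reverse=True)
```

With these weights, `ranked` places `player/controller.gd` first, then `player/my_controller.gd`, then `controller/player.gd`.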

Notes

This required a rewrite of the Quick Open dialog's search functionality, as there was no easy way to reconcile a tokenized approach with the previous approach, which pretty much assumed the user will be typing out actual paths with separators. There was even a case for exact match, which is highly improbable, and definitely not "Quick" as the name suggests.
The previous approach did allow some leniency for typos, due to its use of Sorensen-Dice similarity. In practice, however, it is unpredictable and can be either too lenient or too punishing. Typos are not accounted for by most search functionality in the editor, so it was strange to include it in this one feature, especially since it also required you to be flawlessly accurate in other ways.
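For readers unfamiliar with Sorensen-Dice similarity, here is a minimal sketch of the coefficient, 2|X∩Y| / (|X| + |Y|), computed over sets of character bigrams. The set-based bigram variant and the function names are assumptions for illustration; the editor's own implementation may differ in detail.

```python
def bigrams(text: str) -> set:
    """Return the set of adjacent character pairs in text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def dice_similarity(a: str, b: str) -> float:
    """Sorensen-Dice coefficient over bigram sets, in the range 0.0 to 1.0."""
    x, y = bigrams(a.lower()), bigrams(b.lower())
    if not x and not y:
        # Strings too short to have bigrams: fall back to equality.
        return 1.0 if a.lower() == b.lower() else 0.0
    return 2 * len(x & y) / (len(x) + len(y))


exact = dice_similarity("player", "player")
typo = dice_similarity("player", "plyaer")
unrelated = dice_similarity("player", "enemy")
```

In this sketch a single transposition ("plyaer") drops the score from 1.0 to 0.4, because only the bigrams "pl" and "er" survive. This is one way to see why the resulting ranking can feel hard to predict.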

Cammymoop · 2024-02-23T05:10:08Z

related #82200

AdriaandeJongh · 2024-02-23T05:51:11Z

As nobody else has commented about this: I would appreciate a bit of leniency towards typos. It's not as 'strange' of a feature if it helps people find the things they are looking for.

MajorMcDoom · 2024-02-23T07:01:04Z

> As nobody else has commented about this: I would appreciate a bit of leniency towards typos. It's not as 'strange' of a feature if it helps people find the things they are looking for.

That's totally fair, and my wording was not the greatest. I do think that my gripe was with the current implementation, which felt like a strange combination of high demand and leniency, and is also an odd exception amongst other instances of searching in the editor which do not suffer from the same issues despite not having typo-protection.

I do think that the impact of typos is already greatly reduced with the leniency introduced by tokenization. And one must also consider that a search function needs to also eliminate results that do not match. False positives are also likely to slow down the search, particularly if the file you are searching for is not the first result and you have to look through a list of results.

That said, something like what @Cammymoop linked could work well! I'll give it a try soon.

Mickeon · 2024-02-23T19:12:18Z

Ah, "multiple tokens". Is that what they're called? I referred to them as "terms" in both #65315 and #65352.
I approve of this already because of consistency, but the more this logic is used, the more it should be unified together.

MajorMcDoom · 2024-02-23T20:02:03Z

> but the more this logic is used, the more it should be unified together.

I totally thought this too, but after looking at the different use cases, it turned out to be more different than similar. Handling paths for example is very different from handling just names. And handling search could also be different from handling filtering, depending on the UX requirements. For example, you might want typo-forgiveness in search, but you wouldn't want that in filtering.

The only similarity turned out to be looping through an array of tokens obtained from String::split, and imo that's not enough to justify unification just yet. But I agree, if we do end up with identical use cases, we should unify.

MajorMcDoom · 2024-02-23T20:08:52Z

@Mickeon Oh hmmm, you know, upon closer inspection, it looks like the FileSystem filter could definitely use your utility logic!
It is gonna be a bigger task though. Perhaps that could be another PR. My current PR was a straightforward integration into the current system, so it wasn't a huge task. It might be a good stop-gap?

Mickeon · 2024-02-23T20:45:20Z

It absolutely needs to be tackled separately (and maybe by someone that has even more understanding of the codebase) because it'd be a huge undertaking, and it would have to be applied literally everywhere. There are many filter LineEdits out there, and they all would benefit from mostly the same changes.

I'm just pointing it out because code duplication is nasty.
Plus, I do remember @Calinou often bringing up better filters. I do not know the exact wording of it, but it's something about accented characters not being filtered as one may expect.

KoBeWi · 2024-03-12T21:52:25Z

> This is useful for when you know multiple parts that the file path or name contains, but not the order, or what is between them.
> This also means you don't have to use accurate naming conventions like snake_case or PascalCase or dash-case.

This was already supported (except the order part, but it's not often used I think?), with the caveat that you can't make spaces. From 4.3 dev4:

KXBEchqhld.mp4

akien-mga · 2024-04-19T14:31:42Z

Thanks!

KoBeWi · 2024-05-01T10:11:33Z

> This was already supported (except the order part, but it's not often used I think?), with the caveat that you can't make spaces.

Apparently this no longer works. You are forced to make spaces now. I preferred the old way tbh >_>

brevven · 2024-08-02T10:49:59Z

Requiring spaces seems somewhat nonstandard when taking into consideration things like VScode and ctrlp.vim. Is requiring spaces something we could make optional?

MajorMcDoom · 2024-08-02T11:51:53Z

> Requiring spaces seems somewhat nonstandard when taking into consideration things like VScode and ctrlp.vim. Is requiring spaces something we could make optional?

See earlier comments in the thread. Both the old and new solutions have big caveats. A new search algorithm would have to be implemented in another PR, and it would have to be used throughout the editor's search features, both for consistency and to limit code duplication. Candidates were also suggested.

brevven · 2024-08-05T21:17:13Z

Thanks, understood, and can certainly understand the desire to standardize and DRY things up. Given that that's a lot of work, in the meantime, would adding an option on whether or not to require spaces be something the project would support? Not sure that I have time to contribute this, but wondering if it's the type of thing that's worth considering.

MajorMcDoom · 2024-08-05T21:28:55Z

> Thanks, understood, and can certainly understand the desire to standardize and DRY things up. Given that that's a lot of work, in the meantime, would adding an option on whether or not to require spaces be something the project would support? Not sure that I have time to contribute this, but wondering if it's the type of thing that's worth considering.

I'm not sure how far away we are from a standard, but I would like to just add an FYI, if this is something someone wants to pursue:

The option you're describing cannot be achieved as an option within the current algorithm. It would have to be a toggle between the old behaviour and the current behaviour, which would affect not only ordering of search results, but also whether certain ones show up at all, because they are fundamentally different algorithms. The current one cannot just toggle "requires spaces", because it fundamentally works by splitting the search string in the first place.
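To make the difference concrete, here is a small sketch contrasting the two styles of matching. It assumes the old behaviour amounts to subsequence matching (characters in order, gaps allowed), as the linked issue title "no longer allows for skipped characters" suggests, and that the new behaviour is token-substring matching. Both functions and the sample path are illustrative, not the editor's actual code.

```python
def subsequence_match(query: str, path: str) -> bool:
    """Assumed old style: query characters appear in order, gaps allowed."""
    remaining = iter(path.lower())
    # `ch in remaining` advances the iterator until ch is found.
    return all(ch in remaining for ch in query.lower())


def tokenized_match(query: str, path: str) -> bool:
    """Assumed new style: every space-separated token is a substring."""
    haystack = path.lower()
    return all(token in haystack for token in query.lower().split())


path = "scenes/player_controller.gd"  # hypothetical path

# Skipped characters: accepted by the subsequence matcher only.
skipped = (subsequence_match("plrctl", path), tokenized_match("plrctl", path))

# Out-of-order tokens with a space: accepted by the tokenized matcher only.
reordered = (
    subsequence_match("controller player", path),
    tokenized_match("controller player", path),
)
```

Each matcher accepts a query that the other rejects, so neither result set contains the other. That is consistent with the point above that the choice would be a toggle between two algorithms rather than a flag inside one.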

a-johnston · 2024-09-25T21:06:43Z

I think these comments about space/no space handling have been effectively handled by PR #82200. I made some changes which further improved matching on 3 personal projects (1 large, 2 small) which I used for testing. I'm not sure how much further work it would be to integrate that into the other search/filter modals though; potentially, rather than have a static class, we could configure class instances as matching contexts to tune each instance to the particular task, but still reuse the core matching logic.
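One possible shape for the "matching context" idea is sketched below. The class name `MatchContext` and its two options are hypothetical and do not come from #82200 or from the editor's code. They only show how per-dialog settings could wrap one shared matching routine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchContext:
    """Hypothetical per-dialog configuration around shared matching logic."""

    case_sensitive: bool = False
    match_file_name_only: bool = False

    def matches(self, query: str, path: str) -> bool:
        # Choose what to search in, then run the same token-substring check.
        target = path.rsplit("/", 1)[-1] if self.match_file_name_only else path
        if not self.case_sensitive:
            query, target = query.lower(), target.lower()
        return all(token in target for token in query.split())


# Two instances tuned for different (hypothetical) dialogs.
quick_open = MatchContext()
name_filter = MatchContext(match_file_name_only=True)

full_path_hit = quick_open.matches("scenes player", "scenes/Player.gd")
name_only_hit = name_filter.matches("scenes player", "scenes/Player.gd")
```

In this sketch `full_path_hit` is true while `name_only_hit` is false, because the second context ignores folder names. The core check is written once and shared by both instances.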

KoBeWi · 2024-10-01T22:18:35Z

There is also #56772, which seems to change the behavior too (supporting both old search and the tokenized one).

MajorMcDoom requested a review from a team as a code owner February 22, 2024 09:50

MajorMcDoom mentioned this pull request Feb 22, 2024

Make search boxes for files treat spaces the same way as search boxes for nodes godotengine/godot-proposals#9145

Closed

MajorMcDoom force-pushed the tokenized-file-search branch from 42774ca to c36f9eb Compare February 22, 2024 09:53

AThousandShips added enhancement topic:editor labels Feb 22, 2024

AThousandShips added this to the 4.x milestone Feb 22, 2024

KoBeWi reviewed Mar 12, 2024

editor/filesystem_dock.cpp (outdated)

KoBeWi reviewed Mar 12, 2024

editor/filesystem_dock.h (outdated)

KoBeWi reviewed Mar 12, 2024

editor/editor_quick_open.cpp

KoBeWi reviewed Mar 12, 2024

editor/editor_quick_open.cpp (outdated)

MajorMcDoom force-pushed the tokenized-file-search branch from c36f9eb to b8b698a Compare April 18, 2024 01:38

Added tokenized search support to Quick Open dialog and FileSystem filter.

fbfda46

MajorMcDoom force-pushed the tokenized-file-search branch from b8b698a to fbfda46 Compare April 18, 2024 02:13

KoBeWi approved these changes Apr 18, 2024

akien-mga modified the milestones: 4.x, 4.3 Apr 18, 2024

akien-mga merged commit 3acd14d into godotengine:master Apr 19, 2024
16 checks passed

mihe mentioned this pull request May 31, 2024

Redesign Quick Open #56772

Merged

KoBeWi mentioned this pull request Jul 29, 2024

Editor quick open no longer allows for skipped characters, instead requires substrings #94907

Closed

brevven mentioned this pull request Oct 3, 2024

Script editor back button doesn't work as advertised (regression) #97031

Open
