Extended search behaviour options #2894
Closed
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Summary
Extended chosen with several options to improve flexibility of searching significantly.
package.json
.Backstory
A client of mine used his brand new system for about a month and then commented something like:
"Overall I love it! It's faster than the old system and it looks great. It's just a pity the search is so lousy."
When I dug a bit deeper it turned out he was referring to the Chosen selects I had implemented everywhere in an effort to make it easier to find stuff.
One issue for example, is that he has a bunch of Chosens where he can choose a customer based on their name. Now, some people may have the same first and last name, and some of them may have middle names. They are listed with other info that makes it possible to distinguish who's who if only they turn up in the search.
What my client means by "lousy" is that if he enters a persons first and last name, that person doesn't show up in the list if his middle name happens to also be in the database.
At this point I had to agree that "the search is lousy".
So I set out to improve it, and I believe what I have now is worth sharing.
What it is
New options:
search_term_delimiter
What:
true
,false
,string
orRegExp
Why: Massive improvement to the flexibility of searches - this was my primary goal
How: When set to
true
whitespaces are considered query separators, so a query with multiple words is split into terms that may be found in separate positions in the option. It also accepts astring
or aRegExp
, so if someone fx. wants to interpret spaces as literal, but have both asterix, question mark and comma work as "wildcards" that is possible (together withsearch_contains
).Defaults to
false
so it doesn't affect default behaviour or backwards compatibility - the entire query is interpreted as one literal term just the way it always has been unless this is set to something other thanfalse
.search_word_boundary
What:
string
Why: The default
^|\\b|\\s
is remarkably bad at finding word boundaries in languages that use non-unicode characters. A word boundary is basically detected after every non-ascii character (fx. ü, å, ø and æ to mention just a few - but there are many*). I've looked into possibilities, and unfortunately there doesn't seem to be any way to get decent word-boundary detection for anything except ascii in javascripts RegExp implementation... without either using a third-party library or including some 4k+ characters in the string. Turning it into an option means that people can at least set something appropriate for their language and individual use case if they care. I know I've seen this discussed on GitHub, but can't find it at this moment.How: Whatever this is set to will be used as the word boundary in the regexes when enable_split_word_search is true, but search_contains is false (which are the defaults!).
Defaults to
^|\\b|\\s
so it doesn't affect default behaviour or backwards compatibility (I'm actually not 100% sure here - I think I saw that this was changed quite recently somewhere?).search_in_order
What:
Boolean
Why: More flexibility
How: When
true
terms will be matched in any order. Fx.Smith John
would matchJohn Smith
. When false options will only match if the terms appear in exactly the same order they are entered in the query (but other characters may appear in between depending on other settings).Defaults to
true
- because when you want to match individual terms, you most likely want to match them even in a different order, as long as the option contains all the entered terms. Does not affect default behaviour or backwards compatibility because the query is interpreted a single literal term.search_highlight_full_match
What:
Boolean
Why: Because in certain situations it just makes more sense.
How: When
true
the entire match will be highlighted - ie. from the first matched letter to the last.Defaults to
false
- because in the vast majority of cases it is much clearer to highlight only the typed letters.group_search (Not really new!)
What:
Boolean
Why: I didn't actually add this, but it wasn't documented anywhere, so I've added it to the documentation and examples. The behaviour should be the same as previously.
How: When set to
false
chosen will not search inside group labels and show the entire group among the results.Defaults to
true
, which means do search in group labels.I've added a new section to the demo-page that makes it easier to see the result of all the different settings.
Compatibility and testing
I believe this correctly solves the problem that previous "fuzzy search" PR's tried to solve - but without the issues they each had.
As far as my own testing goes:
One known issue that I haven't solved is that with
search_in_order = false
,search_contains = true
and a query containing many very short terms that are present in a lot of options it will be rather slow. For instance typing ins a i n t h e l e n a a s c e n s
will become pretty sluggish after the first 5-10 "terms". Add just one two-letter term and it dramatically improves. For now I have decided this is enough of an edge-case to not be worth fixing... I simply don't see it becoming relevant in any real-world query. I'm sure it can be fixed if anyone can come up with an example where it actually would be relevant.Cleaning up the naming...?
I'll take the opportunity to also propose the following changes to pre-existing search-behaviour related options (which I have not included in this PR):
These changes can easily be implemented without breaking compatibility by preserving the current names as alternatives (maybe with a deprecation warning in the console) for the next x months / years / versions.
References
Previous PR's attempting to push flexibility into chosens master branch
#2599
#2811
#2028
#858
#867
#1037
#2028
A couple of open issues that would be fixed
#2752
#2831
... there are probably several others