Narrow down the candidates by multiple keywords #1

Closed
mykola-hrebenyuk opened this issue Nov 5, 2013 · 11 comments

Comments

@mykola-hrebenyuk

It would be very handy if fzf could match multiple keywords separated by a space (" ").

For example, to find candidates that meet both conditions "foo" and "bar":

foo bar

You can also specify negative conditions with an exclamation mark "!". This matches candidates that meet "foo" but do not meet "bar":

foo !bar
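
To make the request concrete, here is a minimal sketch of the proposed semantics. It assumes plain substring matching for each keyword, and the function name `matches` and the sample file names are made up for illustration; it is not fzf's or zaw's implementation.

```python
def matches(query: str, candidate: str) -> bool:
    """Return True if the candidate satisfies every space-separated term.

    A term that starts with "!" is a negative condition: the rest of the
    term must NOT occur in the candidate. Substring matching is an
    assumption of this sketch.
    """
    for term in query.split():
        if term.startswith("!") and len(term) > 1:
            if term[1:] in candidate:
                return False
        elif term not in candidate:
            return False
    return True


# Hypothetical candidates, for illustration only.
candidates = ["foo-bar.txt", "foo.txt", "bar.txt", "barfoo.md"]
print([c for c in candidates if matches("foo bar", c)])
print([c for c in candidates if matches("foo !bar", c)])
```

Because every term is tested independently, the order of the keywords in the query does not affect the result.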

@junegunn
Owner

junegunn commented Nov 6, 2013

Hi, thanks for the suggestion.

It surely is a nice-to-have feature, but I usually take a minimalist approach: to keep the implementation simple, I try not to add features that are not necessary.

  1. We can just write foobar or barfoo. Of course, neither matches both foo-bar and bar-foo at the same time, or foboar, but I personally haven't felt the need for that, especially when working with a list of files. So I'm not sure. Can you give me an example where this feature would be a great benefit?
  2. The user will not be able to match a literal space character, so we should introduce a command-line option to enable or disable this feature. Alternatively, maybe we could allow the user to split the query with the TAB key or something.
  3. Regarding !, I'm reluctant to introduce a custom syntax, which can add complexity. I want fzf to be instantly obvious to anyone so that it does not need any more explanation. Also, as in the case of the space, the user will not be able to match a literal ! character.
  4. FYI, fzf caches the intermediate results to speed up the search and improve the user experience (e.g. appl -> apple, or pple -> apple). The code is going to be much more complex if we allow multiple matchers. So I'm a bit worried about that.
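
The caching concern in point 4 can be illustrated with a small sketch. This is not fzf's actual code; the class `CachedFilter` and its layout are assumptions made for illustration. It relies on one property of single-pattern fuzzy matching: if a cached query is a contiguous part of the new query, every match for the new query is already among the cached matches.

```python
def fuzzy_match(pattern: str, text: str) -> bool:
    """True if the characters of pattern appear in text in the same order."""
    it = iter(text)
    return all(ch in it for ch in pattern)


class CachedFilter:
    """Hypothetical result cache keyed by query string."""

    def __init__(self, items):
        self.items = list(items)
        self.cache = {"": self.items}  # the empty query matches everything

    def search(self, query: str):
        if query in self.cache:
            return self.cache[query]
        # Any item matching `query` also matches every cached query that is
        # a substring of it, so scan the smallest such cached result.
        pool = min(
            (hits for q, hits in self.cache.items() if q in query),
            key=len,
        )
        hits = [item for item in pool if fuzzy_match(query, item)]
        self.cache[query] = hits
        return hits


f = CachedFilter(["apple", "maple", "grape", "banana"])
f.search("appl")   # scans all four items
f.search("apple")  # scans only the cached hits for "appl"
```

With several terms, and especially with negation, this shortcut no longer holds in general: extending `!b` to `!ba` can make the result set larger, not smaller, so a cache would need to track each term separately.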

So the answer is not yes or no at this point. I'll think about it. Thanks.

@mykola-hrebenyuk
Author

Thanks for your explanations.

Actually, the idea was borrowed from my extensive use of zaw with extended-search enabled.
If you are interested, you can read about it by clicking on the link above and scrolling to the bottom of the page.

Anyway, thanks for a great command-line booster!

@junegunn
Owner

junegunn commented Nov 8, 2013

zaw looks interesting, thanks for the link!

I'm currently trying out zaw. One thing I noticed is that it does not support fuzzy matching (I can't type in ape to find apple), and it seems to me that one of the benefits of its extended-search mode is that it can compensate for this limitation (ap e will match apple).

Having an advanced, extended search mode in fzf would be nice, but because of the inherent difference between fzf and zaw, we may have to come up with an implementation that better suits fzf, rather than just following the way of zaw.

@mykola-hrebenyuk
Author

> I'm currently trying out zaw.

You can see zaw in action in this blog entry.

> One thing I noticed is that it does not support fuzzy-matching

Yes, it filters entries in its own way, by the tokens typed at its prompt. And it is very handy indeed.
For example, if you are looking for an XML file but are not sure exactly which one, you can start with:

xml$

After that, the entries are filtered to match only XML files. At this point you recall that the file has "party" in its path or name, so you type:

par

As the entries narrow down further on your screen, you see from the results that the file is actually related to identity provider configuration, so you type:

idp

Finally, the whole query looks like xml$ par idp, which gets you:

.../.../.../idp-shibboleth/.../.../relaying-parties.xml

In other words, your filter query develops as the thoughts in your head develop (become clearer). With this approach the order of the query keywords doesn't matter, whereas with a fuzzy matcher it does. (You can't just type xmlparidp and get the same result.)
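
The narrowing workflow described above can be sketched as follows. This is only an approximation of the behavior described in this comment, not zaw's code: tokens are treated as substrings, a trailing `$` anchors a token to the end of the line, and the helper names and paths are made up for illustration.

```python
def token_matches(token: str, line: str) -> bool:
    """Substring test; a trailing "$" anchors the token to the end of the line."""
    if len(token) > 1 and token.endswith("$"):
        return line.endswith(token[:-1])
    return token in line


def narrow(query: str, lines: list[str]) -> list[str]:
    """Keep the lines that satisfy every space-separated token, in any order."""
    tokens = query.split()
    return [ln for ln in lines if all(token_matches(t, ln) for t in tokens)]


# Made-up paths, for illustration only.
paths = [
    "conf/idp/relaying-parties.xml",
    "conf/idp/attributes.xml",
    "conf/sp/party-list.xml",
    "docs/party.txt",
]

# The query grows as the user remembers more; the last line reorders it.
for query in ("xml$", "xml$ par", "xml$ par idp", "idp par xml$"):
    print(f"{query!r:16} -> {narrow(query, paths)}")
```

Each added token can only shrink the list, and reordering the tokens leaves the result unchanged.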

I believe it would be great to have this power by adding a 'keywords' mode to fzf.
Let the end user choose what better suits their workflow.

@junegunn
Owner

Thanks for the detailed explanation. The usage pattern you mentioned makes perfect sense.
At the same time, it highlights the contrast between the keyword mode and the current behavior of fzf
and leaves me with some questions.

  • Should fzf try to replace zaw? Why not use both? (Of course one can't use zaw if not on zsh though.)
  • Does it make sense to have a non-fuzzy-finding method in fzf which stands for "fuzzy finder"?
  • If we've decided to implement one, why should we support only ^ and $, instead of full regular expression? Wouldn't it be better to be able to write something like (ya?ml|xml)$ [0-9]+?

I don't have answers to these questions yet.
Maybe I need more time (or more experience with the tool) to be able to evaluate the options.

@mykola-hrebenyuk
Author

> Should fzf try to replace zaw? Why not use both? (Of course one can't use zaw if not on zsh though.)

Zaw's killer feature, matching by multiple keywords in any order, was the main reason I switched to zsh.
Though I personally use it only as a better alternative to incremental history search, it speeds up my work enormously. Having the feature in fzf would mean freedom of choice: no matter which shell you use or which workflow you stick with, you would always have a 'universal filter tool', fzf, that suits your particular needs.

> Does it make sense to have a non-fuzzy-finding method in fzf which stands for "fuzzy finder"?

I don't see any problem with that. Of course, the final decision is up to you.

> If we've decided to implement one, why should we support only ^ and $, instead of full regular expression? Wouldn't it be better to be able to write something like (ya?ml|xml)$ [0-9]+?

From my perspective, full regex would be overkill, because fzf is not sed. We run fzf to be more productive, so we want to spend as little time in it as possible and leave as soon as our query finds a match.

@junegunn
Owner

At this point, I think allowing out-of-order matching of patterns could be a nice add-on to fzf, but I'm not so sure about supporting ^ or $, since they invalidate the basic premise of fzf, fuzzy matching.

So for example, if we look for an item that includes apple, banana, and orange, but are not sure about the order in which they appear, we may type in something like aple ornge bnna, which is out-of-order (OOO) fuzzy matching. I'm not against having such a feature, although the increased complexity of the implementation is still a concern.
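
A rough sketch of what out-of-order fuzzy matching could look like: each space-separated token is fuzzy-matched on its own, and an item is kept only if all tokens match. The function names and the sample item are assumptions for illustration, not fzf's implementation.

```python
def fuzzy_match(pattern: str, text: str) -> bool:
    """True if the characters of pattern appear in text in the same order."""
    it = iter(text)
    return all(ch in it for ch in pattern)


def ooo_fuzzy_match(query: str, text: str) -> bool:
    """Every token must fuzzy-match, but the tokens may match in any order."""
    return all(fuzzy_match(token, text) for token in query.split())


item = "banana-orange-apple.txt"  # hypothetical item
print(ooo_fuzzy_match("aple ornge bnna", item))  # tokens checked independently
print(fuzzy_match("apleorngebnna", item))        # one pattern: order matters
```

The first call succeeds because each token is a subsequence of the item; the second fails because, after matching `aple` near the end, nothing is left to match `ornge`.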

@junegunn
Owner

An idea:

  • Treat tokens without ^ or $ as fuzzy matching patterns
  • If a token starts with ^ or ends with $, it is matched as it is.
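
One way to read this idea, as a sketch: tokens without anchors are fuzzy-matched, while a token with a leading `^` or trailing `$` is compared literally as a prefix or suffix. The interpretation of "matched as it is" as an exact prefix/suffix test is an assumption, as are the function names and sample paths.

```python
def fuzzy_match(pattern: str, text: str) -> bool:
    """True if the characters of pattern appear in text in the same order."""
    it = iter(text)
    return all(ch in it for ch in pattern)


def term_matches(term: str, text: str) -> bool:
    """Anchored terms are compared literally; all other terms are fuzzy.

    A term carrying both anchors is not handled specially in this sketch.
    """
    if len(term) > 1 and term.startswith("^"):
        return text.startswith(term[1:])
    if len(term) > 1 and term.endswith("$"):
        return text.endswith(term[:-1])
    return fuzzy_match(term, text)


def extended_match(query: str, text: str) -> bool:
    """All space-separated terms must match, in any order."""
    return all(term_matches(term, text) for term in query.split())


# Hypothetical paths, for illustration only.
items = ["src/main.go", "src/main_test.go", "docs/main.md"]
print([i for i in items if extended_match("^src .go$ mn", i)])
print([i for i in items if extended_match("tst .go$", i)])
```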

@mykola-hrebenyuk
Author

Or add an option (+t, --tokens) that mimics zaw's search engine.

@junegunn
Owner

Oh, I don't mean to make the scheme I mentioned above the default behavior of fzf; it will have to be explicitly enabled by the user with a command-line option. In my opinion, the proposed scheme is an improvement over zaw's method because it allows fuzzy matching.

@junegunn
Owner

Check out https://github.com/junegunn/fzf#extended-search-mode

It's not exactly the same as the extended-search mode of zaw: as mentioned in the previous comment, it uses the fuzzy matcher by default, unless a word starts with a single quote. I haven't given this new syntax enough thought, so I'm not confident about it, but it looks okay to me at the moment. (FYI, it was borrowed from similar syntax in Clojure.)

junegunn added a commit that referenced this issue Apr 3, 2014
e.g.
  Match region #1: [-----------]
  Match region #2:       [---]
  Match region #3:         [------]
liskin added a commit to liskin/fzf that referenced this issue Nov 11, 2020
This prevents mistakes like the one fixed by the previous commit, and
also speeds bash startup a tiny bit:

before:

    $ HISTFILE=/tmp/bashhist hyperfine 'bash --rcfile shell/completion.bash -i'
    Benchmark junegunn#1: bash --rcfile shell/completion.bash -i
      Time (mean ± σ):      22.4 ms ±   0.6 ms    [User: 28.7 ms, System: 7.8 ms]
      Range (min … max):    21.7 ms …  25.2 ms    123 runs

after:

    $ HISTFILE=/tmp/bashhist hyperfine 'bash --rcfile shell/completion.bash -i'
    Benchmark junegunn#1: bash --rcfile shell/completion.bash -i
      Time (mean ± σ):      21.2 ms ±   0.3 ms    [User: 24.9 ms, System: 6.4 ms]
      Range (min … max):    20.7 ms …  23.3 ms    132 runs
liskin added a commit to liskin/fzf that referenced this issue Nov 11, 2020
Commit d4ad4a2 slowed loading of completion.bash significantly (on my
laptop from 10 ms to 30 ms), then 54891d1 improved that (to 20 ms) but
it still stands out as the heavy part of my .bashrc.

Rewriting __fzf_orig_completion_filter to pure bash without forking to
sed/awk brings this back under 10 ms.

before:

    $ HISTFILE=/tmp/bashhist hyperfine 'bash --rcfile shell/completion.bash -i'
    Benchmark junegunn#1: bash --rcfile shell/completion.bash -i
      Time (mean ± σ):      21.2 ms ±   0.3 ms    [User: 24.9 ms, System: 6.4 ms]
      Range (min … max):    20.7 ms …  23.3 ms    132 runs

after:

    $ HISTFILE=/tmp/bashhist hyperfine 'bash --rcfile shell/completion.bash -i'
    Benchmark junegunn#1: bash --rcfile shell/completion.bash -i
      Time (mean ± σ):       9.6 ms ±   0.3 ms    [User: 8.0 ms, System: 2.2 ms]
      Range (min … max):     9.3 ms …  11.4 ms    298 runs
junegunn pushed a commit that referenced this issue Nov 12, 2020
This prevents mistakes like the one fixed by the previous commit, and
also speeds bash startup a tiny bit:

before:

    $ HISTFILE=/tmp/bashhist hyperfine 'bash --rcfile shell/completion.bash -i'
    Benchmark #1: bash --rcfile shell/completion.bash -i
      Time (mean ± σ):      22.4 ms ±   0.6 ms    [User: 28.7 ms, System: 7.8 ms]
      Range (min … max):    21.7 ms …  25.2 ms    123 runs

after:

    $ HISTFILE=/tmp/bashhist hyperfine 'bash --rcfile shell/completion.bash -i'
    Benchmark #1: bash --rcfile shell/completion.bash -i
      Time (mean ± σ):      21.2 ms ±   0.3 ms    [User: 24.9 ms, System: 6.4 ms]
      Range (min … max):    20.7 ms …  23.3 ms    132 runs
junegunn pushed a commit that referenced this issue Nov 12, 2020
Commit d4ad4a2 slowed loading of completion.bash significantly (on my
laptop from 10 ms to 30 ms), then 54891d1 improved that (to 20 ms) but
it still stands out as the heavy part of my .bashrc.

Rewriting __fzf_orig_completion_filter to pure bash without forking to
sed/awk brings this back under 10 ms.

before:

    $ HISTFILE=/tmp/bashhist hyperfine 'bash --rcfile shell/completion.bash -i'
    Benchmark #1: bash --rcfile shell/completion.bash -i
      Time (mean ± σ):      21.2 ms ±   0.3 ms    [User: 24.9 ms, System: 6.4 ms]
      Range (min … max):    20.7 ms …  23.3 ms    132 runs

after:

    $ HISTFILE=/tmp/bashhist hyperfine 'bash --rcfile shell/completion.bash -i'
    Benchmark #1: bash --rcfile shell/completion.bash -i
      Time (mean ± σ):       9.6 ms ±   0.3 ms    [User: 8.0 ms, System: 2.2 ms]
      Range (min … max):     9.3 ms …  11.4 ms    298 runs

Fixes: d4ad4a2 ("[bash-completion] Fix default alias/variable completion")
Fixes: 54891d1 ("[bash-completion] Minor optimization")
junegunn added a commit that referenced this issue Aug 2, 2022
Favors the line with shorter matched chunk. A chunk is a set of
consecutive non-whitespace characters.

Unlike the default `length`, this new scheme works well with tabular input.

  # length prefers item #1, because the whole line is shorter,
  # chunk prefers item #2, because the matched chunk ("foo") is shorter
  fzf --height=6 --header-lines=2 --tiebreak=chunk --reverse --query=fo << "EOF"
  N | Field1 | Field2 | Field3
  - | ------ | ------ | ------
  1 | hello  | foobar | baz
  2 | world  | foo    | bazbaz
  EOF

If the input does not contain any spaces, `chunk` is equivalent to
`length`. But we're not going to set it as the default because it is
computationally more expensive.

Close #2285
Close #2537
- Not the exact solution to --tiebreak=length not taking --nth into account,
  but this should work. And the added benefit is that it works well even
  when --nth is not provided.
- Adding a bonus point to the last character of a word didn't turn out great.
  The order of the result suddenly changes when you type in the last
  character in the word producing a jarring effect.
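
The `chunk` tiebreak described in the commit message above can be sketched roughly as follows. This is an illustration of the stated definition (a chunk is a run of consecutive non-whitespace characters), not fzf's code; the function name is hypothetical, and the match position is found with a plain substring search for simplicity.

```python
def matched_chunk_length(line: str, start: int, end: int) -> int:
    """Length of the non-whitespace run that contains line[start:end].

    Assumes the match lies within a single chunk; if it spans whitespace,
    this measures from the start of the first chunk to the end of the last.
    """
    while start > 0 and not line[start - 1].isspace():
        start -= 1
    while end < len(line) and not line[end].isspace():
        end += 1
    return end - start


rows = [
    "1 | hello  | foobar | baz",
    "2 | world  | foo    | bazbaz",
]
query = "fo"
for row in rows:
    pos = row.find(query)  # stand-in for the real matcher's result
    chunk = matched_chunk_length(row, pos, pos + len(query))
    print(f"line length {len(row)}, matched chunk length {chunk}: {row}")
```

Sorting by line length would rank row 1 first, while sorting by matched chunk length ranks row 2 first, which is the difference the commit message describes.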