Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

globset (and thus ripgrep) interpret braces ({}) as alternative ("a or b") whereas git doesn't #1221

Open
anntzer opened this issue Mar 13, 2019 · 4 comments
Labels
bug A bug.

Comments

@anntzer
Copy link
Contributor

anntzer commented Mar 13, 2019

What version of ripgrep are you using?

ripgrep 0.9.0
-SIMD -AVX

How did you install ripgrep?

Fedora 29 distro package.

What operating system are you using ripgrep on?

Fedora 29.

Describe your question, feature request, or bug.

globset (and thus ripgrep) interpret braces ({}) as alternative ("a or b") whereas git doesn't, causing a different interpretation of entries containing braces in .gitignore.

If this is a bug, what are the steps to reproduce the behavior?

git init foo && cd foo
touch 'test.txt' 'test.txt{}'
echo '*.txt{}' >.gitignore
git status
rg --files --debug

If this is a bug, what is the actual behavior?

$ git status
On branch master

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)

        .gitignore
        foo.txt

nothing added to commit but untracked files present (use "git add" to track)

$ rg --files --debug
DEBUG/grep::search//usr/share/cargo/registry/grep-0.1.9/src/search.rs:195: original regex HIR pattern:
(?:[Zz]{0})*
DEBUG/grep::search//usr/share/cargo/registry/grep-0.1.9/src/search.rs:197: transformed regex HIR pattern:
(?:[Zz]{0})*
DEBUG/globset//usr/share/cargo/registry/globset-0.4.1/src/lib.rs:429: built glob set; 0 literals, 1 basenames, 0 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG/globset//usr/share/cargo/registry/globset-0.4.1/src/lib.rs:424: glob converted to regex: Glob { glob: "**/*.txt{}", re: "(?-u)^(?:/?|.*/).*\\.txt$", opts: GlobOptions { case_insensitive: false, literal_separator: false, backslash_escape: true }, tokens: Tokens([RecursivePrefix, ZeroOrMore, Literal('.'), Literal('t'), Literal('x'), Literal('t'), Alternates([Tokens([])])]) }
DEBUG/globset//usr/share/cargo/registry/globset-0.4.1/src/lib.rs:429: built glob set; 0 literals, 0 basenames, 0 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 1 regexes
DEBUG/ignore::walk//usr/share/cargo/registry/ignore-0.4.3/src/walk.rs:1414: ignoring ./foo.txt: Ignore(IgnoreMatch(Gitignore(Glob { from: Some("./.gitignore"), original: "*.txt{}", actual: "**/*.txt{}", is_whitelist: false, is_only_dir: false })))
DEBUG/ignore::walk//usr/share/cargo/registry/ignore-0.4.3/src/walk.rs:1414: ignoring ./.git: Ignore(IgnoreMatch(Override(Glob(Matched(Glob { from: None, original: "!.git", actual: "**/.git", is_whitelist: true, is_only_dir: false })))))
.gitignore
foo.txt{}

If this is a bug, what is the expected behavior?

rg --files should list foo.txt and ignore foo.txt{} (matching git's behavior), not the opposite.

@okdana
Copy link
Contributor

okdana commented Mar 13, 2019

IMO, one thing that ripgrep probably should do, regardless of whether it disables brace expansion in .gitignore files (which i suppose would require a setting to toggle, communication of that setting, &c.), is let brace sets without alternates just fall through. This is how bash and zsh both behave:

% print -r {}
{}
% print -r {foo}
{foo}
% print -r {foo,bar}
foo bar

It still wouldn't match Git's behaviour exactly, but i imagine it'd be a lot harder to trigger the difference in practice.

@yroyon
Copy link

yroyon commented Mar 31, 2019

I was about to open a new issue, but my problem is the same as this one.

Reproducer:

$ ls -A
.gitignore
$ cat .gitignore
{{a}}
$ rg whatevs
./.gitignore: line 1: error parsing glob '{{a}}': nested alternate groups are not allowed
$ rg --version
ripgrep 0.10.0 (rev 0913972104)

Note that {{something}} is a legal filename. There are tools out there, data shippers for Elasticsearch, that use dirnames like {{cookiecutter.beat}}. I'm getting boatloads of error messages whenever I use ripgrep near those guys.

@BurntSushi
Copy link
Owner

This ticket is a duplicate of #1183, but since this one has more activity, I'm going to close #1183.

@BurntSushi BurntSushi added the bug A bug. label Apr 12, 2019
ilyagr added a commit to ilyagr/ripgrep that referenced this issue Mar 9, 2021
This syntax does not exist in `git`, so it is not documented in `man gitignore`. There is
a question of whether it *should* exist, but as long as it does, it should be documented
somewhere.

See also:
BurntSushi#1221
BurntSushi#1368
BurntSushi pushed a commit that referenced this issue May 30, 2021
This syntax does not exist in `git`, so it is not documented in `man
gitignore`. There is a question of whether it *should* exist, but as
long as it does, it should be documented somewhere.

See also:
#1221
#1368

Closes #1816
BurntSushi pushed a commit that referenced this issue Jun 1, 2021
This syntax does not exist in `git`, so it is not documented in `man
gitignore`. There is a question of whether it *should* exist, but as
long as it does, it should be documented somewhere.

See also:
#1221
#1368

Closes #1816
BurntSushi pushed a commit that referenced this issue Jun 1, 2021
This syntax does not exist in `git`, so it is not documented in `man
gitignore`. There is a question of whether it *should* exist, but as
long as it does, it should be documented somewhere.

See also:
#1221
#1368

Closes #1816
@gofri
Copy link
Contributor

gofri commented Apr 18, 2023

@BurntSushi

This one raises an issue for my application of ripgrep, so I'm willing to work on a PR for it if the following sounds ok to you:

  1. Adding literal_alternates : bool to GlobOptions. Parsing the Alternates token with respect to this option.
  2. For explicit globs add --glob-literal-alternates and --pre-glob-literal-alternates to ripgrep and set in the GlobOptions.
  3. For ignore files auto discovery, add --ignore-file-literal-alternates and a corresponding literal_alternates to GitignoreBuilder (similar to case_sensitive).

Please let me know if that'd be potentially merged.
Also, if it does, let me know if I missed use cases.


The main issue I see is create_gitignore() which wraps the functionality for directories and would need to be modified to also accept this new flag. That is a breaking change, so I guess that would have to go to the next semver (I can add a new function and replace the calls in ignore/src/dir.rs without breaking APIs, but that'd make the code a bit messy, so let me know what you think about that specifically)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug A bug.
Projects
None yet
Development

No branches or pull requests

5 participants