Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

linux: support simple ignore patterns #124

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

paul-marechal
Copy link
Contributor

This is a first pass at implementing some mechanism to filter paths.
Linux only.

Some pattern examples:

  • /absolute/path[/][*]
  • ./path/parts[/][*]
  • **/path/parts[/][*]
  • *path/parts[/][*]
  • path/parts[/][*]

The general idea is to do simple string matching and convert patterns
into path parts that can then be matched against a real path.

For example, **/foo gets converted to /foo/ and we will try to find
this subtring in the following paths:

  • /a/b/foo/c/ matches.
  • /a/b/foobar/c/ does not match.

In order for the pattern to match /a/b/foobar/c/ we need to write
**/foo* which will be converted to /foo. We now have:

  • /a/b/foo/c matches.
  • /a/b/foobar/c matches.

This is because we are not required to find / right after /foo.
Depending on if we use a / character or not, we can match path parts
strictly or not.

The rules for converting patterns are as follow:

  • If a pattern starts with **/ we replace this prefix by / and will
    match the string anywhere.
  • If a pattern starts with * we do not prefix it with / and will
    match the string anywhere.
  • If a pattern starts with ./ we replace this prefix by the directory
    being watched by the current watcher and will match the string from
    root.
  • If a pattern starts with / we will match the string from root.
  • If a pattern starts with none of the above, we prefix with / and
    match the string anywhere.

Then we apply trailing conversion logic:

  • If a pattern contains a * we remove this character plus everything
    that follows.
  • If a pattern ends with / we ensure there's only one trailing /.
  • If a pattern does not finish by either * or / we append /.

The final filters are triaged between two vectors: mRootFilters and
mFilters in order to match starting from the root or to match
anywhere, respectively.

Signed-off-by: Paul Maréchal paul.marechal@ericsson.com


Few notes:

The original goal was to add support for minimatch regexps, but using regexps in C++ is difficult. By that I mean that I made a working prototype using std::regex, but patterns weren't matched like the JS RegExp engine, and it was significantly slow. I tried using v8::RegExp but this lead me into a rabbit hole way too deep for me, the crux being creating a node Isolate for each thread that nsfw starts. I didn't manage to implement v8::RegExp.

My last resort was what you see here: Use "simple" matching algorithms to support few patterns.

This is a first pass at implementing some mechanism to filter paths.
Linux only.

Some pattern examples:

- `/absolute/path[/][*]`
- `./path/parts[/][*]`
- `**/path/parts[/][*]`
- `*path/parts[/][*]`
- `path/parts[/][*]`

The general idea is to do simple string matching and convert patterns
into path parts that can then be matched against a real path.

For example, `**/foo` gets converted to `/foo/` and we will try to find
this subtring in the following paths:

- `/a/b/foo/c/` matches.
- `/a/b/foobar/c/` does not match.

In order for the pattern to match `/a/b/foobar/c/` we need to write
`**/foo*` which will be converted to `/foo`. We now have:

- `/a/b/foo/c` matches.
- `/a/b/foobar/c` matches.

This is because we are not required to find `/` right after `/foo`.
Depending on if we use a `/` character or not, we can match path parts
strictly or not.

The rules for converting patterns are as follow:

- If a pattern starts with `**/` we replace this prefix by `/` and will
match the string anywhere.
- If a pattern starts with `*` we do not prefix it with `/` and will
match the string anywhere.
- If a pattern starts with `./` we replace this prefix by the directory
being watched by the current watcher and will match the string from
root.
- If a pattern starts with `/` we will match the string from root.
- If a pattern starts with none of the above, we prefix with `/` and
match the string anywhere.

Then we apply trailing conversion logic:

- If a pattern contains a `*` we remove this character plus everything
that follows.
- If a pattern ends with `/` we ensure there's only one trailing `/`.
- If a pattern does not finish by either `*` or `/` we append `/`.

The final filters are triaged between two vectors: `mRootFilters` and
`mFilters` in order to match starting from the root or to match
anywhere, respectively.

Signed-off-by: Paul Maréchal <paul.marechal@ericsson.com>
@paul-marechal
Copy link
Contributor Author

@implausible This one might be a bit more involving. Do you know someone that could review it, if you are interested in this feature? It is worth noting that I am not very familiar with C++, hence why I'd like feedback.

@implausible
Copy link
Contributor

I didn't manage to implement v8::RegExp.

This is not terribly surprising. I wouldn't worry about taking this approach.

if you are interested in this feature?

I am interested in this feature, and am not opposed to moving forward with it; however, part of my goal with NSFW was to make sure that the interface has the same capabilities across all platforms. I think primarily I'd like to see a solution that hits that level of coverage before we move forward. I just don't want to merge this and then be on the ropes to implement the rest of the feature set (I can get pretty busy with work on GitKraken).

@paul-marechal
Copy link
Contributor Author

[...] make sure that the interface has the same capabilities across all platforms.

Makes sense. I'll look into this before requiring you again, thanks for the feedback!

@marcdumais-work
Copy link

@marechal-p I suggest linking to the issue you're fixing, e.g. adding "Fixes #54" to your commit message. Maybe when you eventually update the PR.

@bpasero
Copy link

bpasero commented Aug 31, 2021

💯 💯 this would be so helpful for VSCode 🙏 . Any chance we can move forward with this?

Bottom line: VSCode has a setting to ignore paths but it can currently not be applied when NSFW is being used. On Linux this is esp. bad because it can lead to hundreds of file handles getting opened eventually leading to an error if the OS configured maximum of file handles is low.

If we can get official ignore support in nsfw I believe we can start thinking about making nsfw the only file watcher in VSCode across all platforms.

@paul-marechal
Copy link
Contributor Author

@bpasero As you can see on this PR I had implemented the support for this on Linux. The goal is to also support this on other OSs, but I was struggling with Windows to understand how paths works and support glob-like syntax there... Help is welcome!

@bpasero
Copy link

bpasero commented Aug 31, 2021

I see. Yeah Linux is probably the main OS of interest for us too, given the file handle limit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants