Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

I think the regexps are busted #32

Closed
phated opened this issue Jan 27, 2021 · 16 comments
Closed

I think the regexps are busted #32

phated opened this issue Jan 27, 2021 · 16 comments

Comments

@phated
Copy link
Member

phated commented Jan 27, 2021

I was investigating https://github.com/gulpjs/glob-parent/blob/main/index.js#L26-L29 in relation to the test suite and I think the pattern is completely wrong.

For example: 'path/\\[bar]' is matched by that pattern, but the "enclosure" ([bar]) doesn't contain any path separator.

My gut feeling is that the [\/]* pattern is supposed to be [\/]+ so that it matches only if there's one or more / inside the enclosure, instead of 0 or more. But I have no idea... maybe @paulmillr or @es128 have an idea 🤞

@phated
Copy link
Member Author

phated commented Jan 27, 2021

Also cc @mrmlnc in case you know.

@paulmillr
Copy link
Contributor

interesting. but not sure I understand fully. could you make a test case?

@phated
Copy link
Member Author

phated commented Jan 27, 2021

I think my problem in understanding is that we have test cases which are the correct behavior, but I think it's only a side effect of this (potential) mistake 😬 changing the regexp makes them fail

@paulmillr
Copy link
Contributor

this is function that gets glob's parent path

which is path in path/\\[bar]. or root in root/**

@phated
Copy link
Member Author

phated commented Jan 28, 2021

Well, the parent in path/\\[bar] is actually path/[bar] because the [ was escaped in the pattern. This is actually what the test tests, but I think getting that correct outcome is a side-effect of this (potential) bug. When I make the change, the test fails 🙈

@paulmillr
Copy link
Contributor

Ain't backslash escaped here? \\

@phated
Copy link
Member Author

phated commented Jan 28, 2021

I think that's the escape sequence? From the README:

// BAD
globParent('foo \\[bar]') // 'foo '
globParent('foo \\[bar]*') // 'foo '

// GOOD
globParent('./foo \\[bar]') // 'foo [bar]'
globParent('./foo \\[bar]*') // '.'

@es128
Copy link
Contributor

es128 commented Jan 28, 2021

@phated If applying your suggestion makes the test, which indicates the correct behavior, fail - then what are you chasing down?

I see what you’re saying relative to the comment that’s there, but maybe it’s the comment that’s wrong? In other words, perhaps this is deliberately meant to apply to more situations than just the ones the comment refers to.

@phated
Copy link
Member Author

phated commented Jan 28, 2021

@es128 so, I'm wondering if more logic needs to changed to correct this (if it needs to be corrected) and keep the tests passing. I'm specifically hunting this because of the snyk vuln that was published. They sent me a new regexp but it fails our test suite, which caused me to jump down the rabbit hole to investigate each of the regexp behaviors in this library.

@Trott
Copy link
Contributor

Trott commented Feb 3, 2021

I'm specifically hunting this because of the snyk vuln that was published. They sent me a new regexp but it fails our test suite,

This is surprising to me. I think I fixed it without breaking the test suite. (See #34.) I'm curious what the regex they sent you was.

@Trott
Copy link
Contributor

Trott commented Feb 4, 2021

So, everyone may already be clear on this first part, but just in case: The regexp has a no-doubt accidental no-op in the middle (see #34 (comment)). Once that somewhat verbose way of writing "anything at all, even nothing" is removed from the regular expression, it's a little easier to understand.

The regular expression matches an opening [ or {, followed by anything (or nothing) at all, followed by a closing ] or } that is at the end of the string.

So this ensures that 'path/\\[bar]' is treated the same as 'path/\\[bar]/' because the regexp matches the former and the code adds a slash to the string when that happens.

So the regexp really does nothing other than make sure that a path that ends with a ] or a } is treated as if it ended with ]/ or }/ if there is a matching opening brace.

Normally, this isn't very significant. But it has an effect when the opening brace is escaped because of the way path.posix.dirname works.

Is this a bug? If the [ is preceded by \\ (in a string), then it is treated as an escaped character by glob and is expected to be part of the file or directory name, right? So it should not be assumed to be a directory. It seems like if gp('bar') results in '.' then so should gp('\\[bar]') since [bar] is the filename here and there's no globbing since the [ is escaped.

Then there's the whole "what should it do with a / inside of [ ] or { }". Same for { }. As it says in the glob primer, a{/b/c,bcd} matches a/b/c but also matches abcd. So, what's the parent in that case?

To me, this seems like the old story of using a regular expression where a parser is needed. (And even in that case, I'm not sure this will work.) If it were up to me (and of course, it's not) I would land #34 and publish a patch fix so people won't get caught by a ReDoS attack if they are using this in a web server or something like that, then publish a breaking change that gets rid of the enclosures magic, or at least reduces/simplifies it greatly.

@phated
Copy link
Member Author

phated commented Feb 4, 2021

You touched on it a bit. The regexp isn't supposed to match [bar] and is only meant to match [bar/foo,something] because that would indicate that a glob contains a directory segment (the whole point of this match).

Matching [bar] means that there's a bug in the other logic as well, since you are only supposed to append the / to the path if it contains a slash inside the enclosure. So, we have a series of bugs that need to be patched because the tests are lying to us.

@Trott
Copy link
Contributor

Trott commented Feb 4, 2021

So, if I'm understanding correctly, the expected result for gp('path/\\[bar]') should be path but right now it's a bug that it returns path/\\[bar] instead because the \\ escapes the [ which means it's not a glob but an ordinary path element. So gp('path/bar') and gp('path/\\[bar]') should return the same thing.

Is that right?

@phated
Copy link
Member Author

phated commented Feb 4, 2021

I believe that's the correct interpretation, because \\[bar] would be "filename" (for lack of a better term) segment of that path.

Trott added a commit to Trott/glob-parent that referenced this issue Feb 4, 2021
This change fixes the regular expression denial of service
vulnerability.

This also fixes some incorrect tests that concealed a bug.

Fixes: gulpjs#32
Refs: https://app.snyk.io/vuln/SNYK-JS-GLOBPARENT-1016905
@Trott
Copy link
Contributor

Trott commented Feb 4, 2021

OK, cool, I've updated the PR now to fix the reg exp so that it does not have a ReDoS and now properly checks for a / in the brackets, and also to update tests that have incorrect results per the interpretation above.

Trott added a commit to Trott/glob-parent that referenced this issue Feb 10, 2021
This change fixes a regular expression denial of service
vulnerability.

Fixes: gulpjs#32
Refs: https://app.snyk.io/vuln/SNYK-JS-GLOBPARENT-1016905
Trott added a commit to Trott/glob-parent that referenced this issue Feb 10, 2021
This change fixes a regular expression denial of service
vulnerability.

Refs: gulpjs#32
Refs: https://app.snyk.io/vuln/SNYK-JS-GLOBPARENT-1016905
@Artimunor
Copy link

any idea about the timeframe when these pull requests will be accepted and a new version is released?

phated pushed a commit that referenced this issue Mar 6, 2021
This change fixes a regular expression denial of service
vulnerability.

Refs: #32
Refs: https://app.snyk.io/vuln/SNYK-JS-GLOBPARENT-1016905
Trott added a commit to Trott/glob-parent that referenced this issue Mar 7, 2021
This change fixes the regular expression denial of service
vulnerability.

This also fixes some incorrect tests that concealed a bug.

Fixes: gulpjs#32
Refs: https://app.snyk.io/vuln/SNYK-JS-GLOBPARENT-1016905
Trott added a commit to Trott/glob-parent that referenced this issue Mar 7, 2021
This change fixes the regular expression denial of service
vulnerability.

This also fixes some incorrect tests that concealed a bug.

Fixes: gulpjs#32
Refs: https://app.snyk.io/vuln/SNYK-JS-GLOBPARENT-1016905
Trott added a commit to Trott/glob-parent that referenced this issue Mar 7, 2021
Trott added a commit to Trott/glob-parent that referenced this issue Mar 7, 2021
@phated phated closed this as completed in 32f6d52 May 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants