-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add IncludePatterns and ExcludePatterns options for Copy #101
Add IncludePatterns and ExcludePatterns options for Copy #101
Conversation
Allow include patterns and exclude patterns to be specified, similarly to Walker. There is a bit of extra complexity to handle the case of a pattern like `a/*/c`. In this case, creating the directories `a` and `a/b` may need to be delayed until we encounter `a/b/c`.
Allow include and exclude patterns to be specified for the "copy" op, similarly to "local". Depends on tonistiigi/fsutil#101
Allow include and exclude patterns to be specified for the "copy" op, similarly to "local". Depends on tonistiigi/fsutil#101 Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
Allow include and exclude patterns to be specified for the "copy" op, similarly to "local". Depends on tonistiigi/fsutil#101 Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
Allow include and exclude patterns to be specified for the "copy" op, similarly to "local". Depends on tonistiigi/fsutil#101 Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
Allow include and exclude patterns to be specified for the "copy" op, similarly to "local". Depends on tonistiigi/fsutil#101 Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
prefix/match.go
Outdated
) | ||
|
||
func Match(pattern, name string) (bool, bool) { | ||
count := strings.Count(name, string(filepath.Separator)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unclear if filepath
is correct in here. Iiuc the old code only uses this function for local paths but I see this imported in buildkit/contenthash where all paths should be normalized to unix. At least the test should not pass in non-unix I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are right about this. I guess https://github.com/moby/buildkit/pull/2082/files#diff-364fe775b55b3eec6b3ea0859ab920f0981c61d377e47830a46de5ec04058c44R437 does not fail on Windows because there are no test cases that depend on parent dirs being included in the checksum. Not sure if there's a way to simulate a change in parent dir metadata in that test so we can capture that scenario.
copy/copy_test.go
Outdated
@@ -10,6 +10,7 @@ import ( | |||
|
|||
"github.com/containerd/continuity/fs/fstest" | |||
"github.com/pkg/errors" | |||
"github.com/stretchr/testify/assert" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this assert intentional? Maybe a comment in the check if true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I follow, do you prefer require
over assert
? I generally use assert
unless the failed assertion would put the test in a bad state where it doesn't make sense to continue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually use require
exclusively so it is a bit out of place and should have a comment so that it is not reverted in the future for consistency.
if !fi.IsDir() { | ||
if include { | ||
if err := c.createParentDirs(src, srcComponents, target, overwriteTargetMetadata); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this quite a significant added performance overhead to rerun these checks where most of the time directories exist? If true then I think we need some caching. As everything is sorted shouldn't take too much memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does add overhead, but only to the case where an include pattern is matched - so it won't regress performance of any existing cases.
Happy to add caching if you think it's worth the complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is needed. The number of syscalls per file is quite important for copy.
copy/copy.go
Outdated
return err | ||
} else if !overwriteTargetMetadata { | ||
} else if !overwriteTargetMetadata || !includeAll { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bit confused about the includeAll check in here as exclude patterns seem to be not affected by this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
created
will always be false if includeAll
is false (because in that case we delay creation of parent dirs). So this is shorter version of:
if !includeAll {
// directory may not have been created yet, so don't try to set its metadata
copyFileInfo = false
}
copy/copy.go
Outdated
pattern = strings.TrimSuffix(pattern, string(filepath.Separator)) | ||
} | ||
|
||
if ok, p := prefix.Match(pattern, path); ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do IncludePatterns/ExcludePatterns use different format/algorithm? Afaics only ExcludePatterns
support !
. Is there a difference of ExcludePatterns
and potentially doing !
in IncludePatterns
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to match the logic in
Line 87 in 8599091
if opt != nil { |
I think I'm following the same logic (it has separate handling for IncludePatterns
and ExcludePatterns
and doesn't appear to handle !
in IncludePatterns
), but maybe I'm missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I don't remember why it is like this. I guess for .dockerignore
is the only use-case atm that required !
and that is the excludes list. IncludePatterns
is not used at all anymore as well and FollowPaths
is used instead. I guess the FollowPaths
case where symlinks need to be followed does not apply here?
I think it is important that something like COPY --filter
can be implemented with this. I'm not quite sure if atm client would need to convert some of the patterns to a negative version. But I guess if we would only use one input then with current implementation depending on if user wants to mostly define include or exclude paths one of them would be much slower. Matching with the options of walker is not the most important here so if you think there is a better/simpler logic we can use instead lmk.
copy/copy.go
Outdated
return false, errors.Wrap(err, "failed to match excludepatterns") | ||
} | ||
if m { | ||
if fi.IsDir() && c.excludePatternMatcher.Exclusions() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for this PR but this looks non-optimal. I guess !
is rarely used in practice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modeled on this code from walker.go
:
Line 131 in 8599091
for _, pat := range pm.Patterns() { |
copy/copy.go
Outdated
@@ -109,7 +114,8 @@ func Copy(ctx context.Context, srcRoot, src, dstRoot, dst string, opts ...Opt) e | |||
if err != nil { | |||
return err | |||
} | |||
if err := c.copy(ctx, srcFollowed, dst, false); err != nil { | |||
includeAll := len(c.includePatterns) == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excludepatterns don't matter in here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
includeAll
means we can skip evaluating the include patterns. Usually it's used when a parent dir matches an include pattern. It doesn't skip evaluating exclude patterns.
Maybe skipIncludePatterns
would be a better name?
prefix/match.go
Outdated
"strings" | ||
) | ||
|
||
func Match(pattern, name string, slashSeparator bool) (bool, bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the return value is (bool, bool)
can we use named return values and a godoc for this function?
pattern = strings.TrimSuffix(pattern, string(filepath.Separator)) | ||
} | ||
|
||
if ok, p := prefix.Match(pattern, path, false); ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be simplified:
matched, partial = prefix.Match(pattern, path, false)
if matched && !partial {
break
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That could potentially overwrite a true
value of matched
with false
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah missed that.
copy/copy.go
Outdated
if err != nil { | ||
return false, errors.Wrap(err, "failed to match excludepatterns") | ||
} | ||
if m { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit, prefer:
if !m {
return false, nil
}
dirSlash...
Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
@tonistiigi: Any more feedback on this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As commented earlier, different types for include/exclude is a bit of a problem but don't have a specific alternative proposal atm.
Thanks! |
Allow include and exclude patterns to be specified for the "copy" op, similarly to "local". Depends on tonistiigi/fsutil#101 Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
Allow include and exclude patterns to be specified for the "copy" op, similarly to "local". Depends on tonistiigi/fsutil#101 Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
Allow include patterns and exclude patterns to be specified, similarly
to
Walker
.There is a bit of extra complexity to handle the case of a pattern like
a/*/c
. In this case, creating the directoriesa
anda/b
may needto be delayed until we encounter
a/b/c
.cc @hinshun