-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
filter: fix a bug with parent dirs #203
base: master
Are you sure you want to change the base?
Conversation
This seems to revert changes from #183 (d9cfebb) . From an initial look, the argumentation in #183 makes more sense to me. If |
re: "the function itself should also allow parents if it wants to allow a specific subpath" how would this work? The subpaths are requested from the buildkit server. The Map() function runs on the client, and doesn't know if the subpaths will be requested or not, right? Our current workaround is to ALWAYS include the directory if the Map() doesn't know if the subpaths will be requested. The bug downstream (tilt-dev/tilt#6393) is that this means there are sometimes empty folders lying around, which creates problems in python. |
Maybe that case should use |
honestly, poking around at this, I feel like the FilterOpt API doesn't make much sense anymore. it's vestigial from when Map() was a property of DirSource. Now that the API supports composable FS implementations, maybe we should just implement our own FS with the Map() semantics we need. I don't really understand why you like the current Map() semantics -- according to what you're saying, the only way to implement a valid Map() is to have Map() crawl the subdirectories itself, which means it's duplicating all the logic in Walk(). is there an example of a project that uses it how you think it should work that you could point me to? |
ah, poking around at the downstream Dagger issue, i have an idea! if a Map() returns SkipDir, that should be a "big hammer" that excludes everything underneath. If a subpath of the directory is included by a pattern, the subpath should be ignored. if a Map() returns Exclude, then that should be softer. If a subpath of the directory is included by a pattern, then the subpath should be included, and the parent backtracker should pick up the parent directory. i pushed up some tests to demo the new behavior, what do you think? |
2ea0aea
to
c9de5ba
Compare
Does I guess, the idea of this PR is to try and have the map's The change doesn't change the behavior of any existing valid map functions, but simply makes some previously invalid map functions valid (e.g. in the above, we'd get the parent TL;DR - I think the current state of this PR makes sense to me - but I think we definitely need to annotate the return values for how they behave, the current doc comments aren't very descriptive for this newer behavior. |
@@ -372,6 +372,8 @@ func (fs *filterFS) Walk(ctx context.Context, target string, fn gofs.WalkDirFunc | |||
return err | |||
} | |||
} | |||
|
|||
dir.calledFn = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👀 this feels unrelated, were we not setting this before? Was there a case where we were calling the walk target function multiple times? Kinda confused as to why this shows up just here (definitely worth fixing).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ya, i moved this line.
Before this pr, the logic went:
- check if the directory is skipped by includePatterns or excludePatterns
- set calledFn to true
- check if the directory is skipped by mapFn
- call fn
now, i only set calledFn to true directly before calling fn
if the Map() function excludes a directory, but includes a file inside that directory, then the walker must backtrack and include the directory. otherwise, buildkit gets confused and sends "changes out of order" errors. part of fixing tilt-dev/tilt#6393 Signed-off-by: Nick Santos <nick.santos@docker.com>
@jedevc i added some comments on the MapResult options to make the behavior a bit clearer. what do you think? |
@tonistiigi is this PR ok to merge? i'll need a maintainer to merge it for me 🙏 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that https://github.com/moby/buildkit/blob/v0.15.0-rc1/exporter/local/fs.go#L116 is wrong (or at least needs to be updated after this PR)?
@@ -197,7 +206,7 @@ func (fs *filterFS) Walk(ctx context.Context, target string, fn gofs.WalkDirFunc | |||
isDir = dirEntry.IsDir() | |||
} | |||
|
|||
if fs.includeMatcher != nil || fs.excludeMatcher != nil { | |||
if fs.includeMatcher != nil || fs.excludeMatcher != nil || fs.mapFn != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I'm not a big fan of is that everything that gets a map goes to this more complicated code-path to support the backtracking, while most of the time the map is really used to reset uid/gid or skip paths recursively.
Would be really ugly, but smth. like MapAllowBacktrack bool
would solve that.
I don't know how big issue this is in practice though. It might be that most of our requests set includematches/excludematcher anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is your concern more about performance or debuggability?
i'd be in favor of adding a much simpler FS implementation purely focused on the reset uid/gid case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Performance a bit maybe, but mostly keeping the default code path simpler and preventing potential stability issues.
re: https://github.com/moby/buildkit/blob/v0.15.0-rc1/exporter/local/fs.go#L116 -- i'm not really sure what the intended behavior of that code is, but i think you would want the backtracking there, no? |
I don't think so. If a directory causes a remap error then we shouldn't just add it (with wrong permissions) if it has some child under it. |
The current behavior also seems quite odd to me - if a directory is excluded here, then the files will still be added underneath under the current behavior: which would cause an error on transmission. Currently directories (with valid files underneath) in this case aren't excluded (at least in practice, or we'd see those weird transmission errors everywhere). I think the "right" thing to do in this case would be to use |
in theory, i feel like that code should be using SkipDir -- it'd be faster to skip the whole directory, rather than continuing to traverse it in practice...i couldn't even figure out how to get that codepath to trigger? i traced back to this 5 year old comment about it being an incomplete feature 👀 https://github.com/moby/buildkit/blob/e83d79a51fb49aeb921d8a2348ae14a58701c98c/cmd/buildkitd/config/config.go#L101, but i'm not sure if that comment is still true |
This is enabled by the |
this test demonstrates the bug called out in tonistiigi/fsutil#203 Currently, if a directory's UID can't be mapped to the host, buildkit will export an invalid tarball. Signed-off-by: Nick Santos <nick.santos@docker.com>
this test demonstrates the bug called out in tonistiigi/fsutil#203 Currently, if a directory's UID can't be mapped to the host, buildkit will export an invalid tarball. Signed-off-by: Nick Santos <nick.santos@docker.com>
sorry, have been a bit busy the last couple weeks and finally got back to this. i think the new behavior i'm introducing in this PR is the correct behavior for tarball exports. I opened a PR in buildkit that demonstrates the issue: |
this test demonstrates the bug called out in tonistiigi/fsutil#203 Currently, if a directory's UID can't be mapped to the host, buildkit will export an invalid tarball. Signed-off-by: Nick Santos <nick.santos@docker.com>
if the Map() function excludes a directory,
but includes a file inside that directory,
then the walker must backtrack and include
the directory.
otherwise, buildkit gets confused and
sends "changes out of order" errors.
part of fixing
tilt-dev/tilt#6393