-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sanitize output paths #2
Conversation
6fa1665
to
f41db12
Compare
This is still a work in progress but I'd like some feedback on my approach here, specifically the api additions, not the rest of the implementation yet. An issue related to symlinks has come up for Coming up with a good solution to this problem has proved difficult. What I've opted for is adding a feature where we can chose to mimic the original behaviour conditionally, by calling While this is not ideal, it gives Windows users the option to extract everything except symlinks and still succeed, without requiring special privledges. Below is an example of what I expect the ...
if windows {
extractor.SanitizePaths(true)
if !linkCreationRights {
log.Warn("Symlink creation privileges not held, see: http://placeholder.com/how-to-enable-symlink-creation-on-various-versions-of-Windows.html")
extractor.StubLinks(true)
}
}
extractor.Extract(reader)
... I'm also considered adding a
Looking for feedback on that, good idea or bad? |
Other approaches to consider:
|
I've changed it so that users can assign their own All of these are relative to the extraction target, currently a user may not sanitize to a path ouside of the extraction root. This could be changed if desired. I'm starting to incorporate this into Improvements here may be possible but so far this is the best approach I've come up with to deal with this. |
I want to clarify that when I say, "this could be changed" I mean the patch could be changed to allow this. As it is right now, we essentially have a hard limit where you may only extract files to the extraction-root, nowhere else. We could allow people to rewrite the entire path (extraction-root included), if they want to. |
This will influence changes to be made here https://github.com/ipfs/go-ipfs-api/blob/bf536f4bb70da7efc8935d17c1b66089b9505461/shell.go#L544 as well. |
Just a random snippet of anecdata, since I see some mentions of NTFS and interesting path names in the topmost description: What's supported by NTFS the-format-on-disk and what's actually allowed by Windows at various API levels can be divergent. E.g., you can't put colons in filenames in Windows Explorer. But you certainly can when mounting an NTFS drive in linux! And later on the Windows NTFS drivers will continue to deal with this surprisingly calmly. So that can add a whole layer of complexity to the logic or feature detection of what can and cannot fly on a filesystem. |
Does it handle forbidden file name with an extension? Names like |
Still debating on if we should allow sanitize functions to modify the entire input path or just the extraction base. This would mean a refactor where SanitizePathFunc func(pathComponents []string) (joinedComponents string) is changed to SanitizePathFunc func(path string) (sanitizedPath string, userDefined error) This would eliminate the need for a callback and allow for chaining functions. Users who wish to use the built in sanitizer, monitor changes, and/or use their own sanitizer on top, could do something like this: // edited 2018-05-08
extractor.Sanitize(true)
builtinSanitizer := extractor.SanitizePathFunc
extractor.SanitizePathFunc = func(path string) string {
builtinSanitizedPath, _ := builtinSanitizer(path)
if path != builtinSanitizedPath {
fmt.Printf("Builtin sanitizer changed path %s -> %s\n", path, mySanitizedPath)
}
mySanitizedPath, err := someLocalFunction(builtinSanitizedPath)
if err != nil {
return err // user defined error from local function that aborts extraction
}
if builtinSanitizedPath != mySanitizedPath {
fmt.Printf("Local sanitizer changed path %s -> %s\n", builtinSanitizedPath, mySanitizedPath)
}
}
extractor.Extract() Which seems better. However library users will have access to the entire path. This means they could redirect output to any path, not just children of the extraction root. Whether this is a benefit or detriment is what needs to be decided upon. I don't think there's any harm in allowing full path rewrites here, but I'm unsure about it. @warpfork
The current goal for the Windows implementation is to stay within the restrictions laid out here: https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx Even if we can circumvent these limitations for creation+IO, it may not be desirable, as other applications may have unexpected behaviour when supplied with these file paths. For non-Windows however, the interface here should allow people to implement/impose a less strict subset for NTFS on their platform if desired. Related anecdote, MacOS seems to also have restrictions around @Kubuxu
|
c34d2ce
to
f27e878
Compare
extractor.go
Outdated
elems = elems[1:] // remove original root | ||
// Sanitize toggles output sanitation, using built in sanitation functions | ||
// (Modify paths to be platform legal, symlinks may not escape extraction root) | ||
func (te *Extractor) Sanitize(toggle bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this input arg seems to be ignored
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now it should actually check it and unset the function pointers on false
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One small question, then LGTM. Why is windows so complicated?
53d07d8
to
ddacde9
Compare
@whyrusleeping
We're going to need another 30 days for that one. 8-) Edit: |
@djdv thanks :) I made you a collaborator, so if you go accept that invite, you should be able to merge this yourself (trying to put merge power in other non-me peoples hands as much as possible) |
This allows paths inside of an archive to be extracted even if their target-path would otherwise be illegal.
TODO:
This isn't true, I can create files with^
is allowed on Windows/NTFS but not Windows/FAT32, accounting for this will require detecting the target FS and switching censor-sets.^
in the filename on FAT, FAT32, and NTFS on Windows.I'm waiting for something upstream in order to handle
NUL
directory output, it's either going to be fixed upstream or handled specifically here.