proposal: os: safer file open functions #67002

Open
neild opened this issue Apr 23, 2024 · 87 comments

@neild
Contributor

neild commented Apr 23, 2024

Please see the updated proposal in #67002 (comment)


Directory traversal vulnerabilities are a common class of vulnerability, in which an attacker tricks a program into opening a file that it did not intend. These attacks often take the form of providing a relative pathname such as "../../../etc/passwd", which results in access outside an intended location. CVE-2024-3400 is a recent, real-world example of directory traversal leading to an actively exploited remote code execution vulnerability.

A related, but less commonly exploited, class of vulnerability involves unintended symlink traversal, in which an attacker creates a symbolic link in the filesystem and manipulates the target into following it.

I propose adding several new functions to the os package to aid in safely opening files with untrusted filename components and defending against symlink traversal.


It is very common for programs to open a file in a known location using an untrusted filename. Programs can avoid directory traversal attacks by first validating the filename with a function like filepath.IsLocal. Defending against symlink traversal is harder.
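
As a point of reference, the lexical check mentioned above can be sketched as follows. This is a minimal sketch, assuming Go 1.20 or later (where filepath.IsLocal is available); openLocal is a hypothetical helper and not part of the os package. It rejects "..", absolute paths, and the empty name, but it does nothing about symlinks inside dir.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// openLocal is a hypothetical helper (not part of the os package).
// It rejects names that lexically escape dir, then opens the joined path.
// It does NOT defend against symlinks that live inside dir.
func openLocal(dir, name string) (*os.File, error) {
	if !filepath.IsLocal(name) {
		return nil, fmt.Errorf("refusing to open non-local path %q", name)
	}
	return os.Open(filepath.Join(dir, name))
}

func main() {
	names := []string{"notes.txt", "sub/notes.txt", "../../../etc/passwd", "/etc/passwd", ""}
	for _, name := range names {
		f, err := openLocal(os.TempDir(), name)
		if err == nil {
			f.Close()
		}
		fmt.Printf("%q: local=%v err=%v\n", name, filepath.IsLocal(name), err)
	}
}
```

The local names may still fail with a "no such file" error if they do not exist under the temporary directory; the point is only that the non-local names are rejected before any open is attempted.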

I propose adding functions to open a file in a location:

package os

// OpenFileIn opens the named file in the named directory.
//
// If the file contains relative path components (..), no component may
// refer to a location outside the parent directory. The file may not be
// "", an absolute path, or (on Windows) a reserved device name such as "NUL".
// The file may refer to the directory itself (.).
//
// If any component of the named file references a symbolic link
// referencing a location out of the parent directory,
// OpenFileIn returns an error.
//
// OpenFileIn otherwise behaves like OpenFile.
func OpenFileIn(parent, name string, flag int, perm FileMode) (*File, error)

// CreateIn creates or truncates the named file in the named parent directory.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Create].
func CreateIn(parent, name string) (*File, error)

// OpenIn opens the named file in the named parent directory for reading.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Open].
func OpenIn(parent, name string) (*File, error)

The OpenFileIn, OpenIn, and CreateIn family of functions safely open a file within a given location, defending against directory traversal, symlinks to unexpected locations, and unexpected access to Windows device files.


All modern Unix systems that I know of provide an openat call, to open a file relative to an existing directory handle (FD). Windows provides an equivalent (NtCreateFile with ObjectAttributes including a RootDirectory). Of the supported Go ports, I believe only js and plan9 do not support openat or an equivalent.

I propose adding support for openat-like behavior to os.File:

package os

// OpenFile opens the named file in the directory associated with the file f.
//
// If the file contains relative path components (..), no component may
// refer to a location outside the parent directory. The file may not be
// "", an absolute path, or (on Windows) a reserved device name such as "NUL".
//
// If any component of the named file references a symbolic link
// referencing a location out of the parent directory,
// OpenFile returns an error.
func (f *File) OpenFile(name string, flag int, perm FileMode) (*File, error)

// Create creates or truncates the named file in
// the directory associated with the file f.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) Create(name string) (*File, error)

// Open opens the named file in the directory associated with the file f for reading.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) Open(name string) (*File, error)

Like the top-level CreateIn, OpenIn, and OpenFileIn, the methods defend against accessing files outside the given directory. This is unlike the default behavior of openat, which permits absolute paths, relative paths outside the root, and symlink traversal outside the root. (It corresponds to Linux's openat2 with the RESOLVE_BENEATH flag.)

A property of openat is that it follows a file across renames: If you open a directory, rename the directory, and use openat on the still-open FD, access is relative to the directory's new location. We cannot support this behavior on platforms which don't have openat or an equivalent (plan9 and js). We could fall back to operating purely on filenames, such that f.OpenIn(x) is equivalent to os.OpenIn(f.Name(), x). However, this seems potentially hazardous. I propose, therefore, that File.CreateIn, File.OpenIn, and File.OpenFileIn return an errors.ErrUnsupported error on these platforms.
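
The rename-following property can be observed directly with the raw system call. The following is a Linux-only sketch using syscall.Openat from the standard library; readAt and renameDemo are hypothetical names used for illustration. Note that this is plain openat(2), which still permits ".." and symlink escapes, so it shows only the "follows the directory handle" behavior, not the restricted semantics proposed here.

```go
//go:build linux

package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"runtime"
	"syscall"
)

// readAt (hypothetical helper) reads name relative to the open directory dir
// using openat(2). It resolves against the directory handle, not dir.Name().
func readAt(dir *os.File, name string) ([]byte, error) {
	fd, err := syscall.Openat(int(dir.Fd()), name, syscall.O_RDONLY|syscall.O_CLOEXEC, 0)
	runtime.KeepAlive(dir) // keep the directory descriptor open during the call
	if err != nil {
		return nil, &os.PathError{Op: "openat", Path: name, Err: err}
	}
	f := os.NewFile(uintptr(fd), name)
	defer f.Close()
	return io.ReadAll(f)
}

// renameDemo opens a directory, renames it, and then reads a file through
// the still-open handle.
func renameDemo() (string, error) {
	base, err := os.MkdirTemp("", "openat-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(base)

	oldDir := filepath.Join(base, "old")
	if err := os.Mkdir(oldDir, 0o755); err != nil {
		return "", err
	}
	if err := os.WriteFile(filepath.Join(oldDir, "f"), []byte("hello"), 0o644); err != nil {
		return "", err
	}
	dir, err := os.Open(oldDir)
	if err != nil {
		return "", err
	}
	defer dir.Close()

	if err := os.Rename(oldDir, filepath.Join(base, "new")); err != nil {
		return "", err
	}
	data, err := readAt(dir, "f") // still works: access follows the handle
	return string(data), err
}

func main() {
	data, err := renameDemo()
	fmt.Printf("data=%q err=%v\n", data, err)
}
```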


The above functions defend against symlink traversal that leads outside of the designated root directory. Some users may wish to defend against symlink traversal entirely. Many modern operating systems provide an easy way to disable symlink following: Linux has RESOLVE_NO_SYMLINKS, Darwin has O_NOFOLLOW_ANY, and some other platforms have equivalents.

I propose adding support for disabling symlink traversal to the os package:

const (
	// O_NOFOLLOW_ANY, when included in the flags passed to [OpenFile], [OpenFileIn],
	// or [File.OpenFile], disallows resolution of symbolic links anywhere in the
	// named file.
	//
	// O_NOFOLLOW_ANY affects the handling of symbolic links in all components
	// of the filename. (In contrast, the O_NOFOLLOW flag supported by many
	// platforms only affects resolution of the last path component.) 
	//
	// O_NOFOLLOW_ANY does not disallow symbolic links in the parent directory name
	// parameter of [OpenFileIn].
	//
	// O_NOFOLLOW_ANY does not affect traversal of hard links, Windows junctions,
	// or Plan 9 bind mounts.
	//
	// On platforms which support symbolic links but do not provide a way to
	// disable symbolic link traversal (GOOS=js), open functions return an error
	// if O_NOFOLLOW_ANY is provided.
	O_NOFOLLOW_ANY int = (some value)
)

O_NOFOLLOW_ANY may be passed to OpenFile, OpenFileIn, or File.OpenFile to disable symlink traversal in any component of the file name. For OpenFileIn, symlinks would still be permitted in the directory component.

On platforms which do not support the equivalent of O_NOFOLLOW_ANY/RESOLVE_NO_SYMLINKS natively, the os package will use successive openat calls with O_NOFOLLOW to emulate it. On platforms with no openat (plan9 and js), open operations will return an error when O_NOFOLLOW_ANY is specified.
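
The emulation described above might look roughly like the following. This is a Linux-only sketch under stated assumptions, not the implementation the os package would use: openNoFollowAny and noFollowAnyDemo are hypothetical names, the helper only opens for reading, and it rejects "." and ".." components instead of resolving them.

```go
//go:build linux

package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"runtime"
	"strings"
	"syscall"
)

// openNoFollowAny (hypothetical) opens name relative to dir for reading, one
// path component at a time, passing O_NOFOLLOW at every step so that no
// symbolic link is followed anywhere in name.
func openNoFollowAny(dir *os.File, name string) (*os.File, error) {
	defer runtime.KeepAlive(dir) // keep dir's descriptor valid while in use
	cur := int(dir.Fd())
	owned := false // whether cur is a descriptor this function must close
	parts := strings.Split(name, "/")
	for i, part := range parts {
		var fd int
		var err error
		if part == "" || part == "." || part == ".." {
			// Simplification for this sketch: reject rather than resolve.
			err = errors.New("unsupported path component")
		} else {
			flags := syscall.O_RDONLY | syscall.O_NOFOLLOW | syscall.O_CLOEXEC
			if i < len(parts)-1 {
				flags |= syscall.O_DIRECTORY // intermediate components must be directories
			}
			fd, err = syscall.Openat(cur, part, flags, 0)
		}
		if owned {
			syscall.Close(cur)
		}
		if err != nil {
			return nil, &os.PathError{Op: "openat", Path: name, Err: err}
		}
		cur, owned = fd, true
	}
	return os.NewFile(uintptr(cur), name), nil
}

// noFollowAnyDemo builds root/real/f and a symlink root/link -> real, then
// reports the errors from opening "real/f" and "link/f".
func noFollowAnyDemo() (plainErr, linkErr, setupErr error) {
	root, err := os.MkdirTemp("", "nofollow-any")
	if err != nil {
		return nil, nil, err
	}
	defer os.RemoveAll(root)
	if err := os.Mkdir(filepath.Join(root, "real"), 0o755); err != nil {
		return nil, nil, err
	}
	if err := os.WriteFile(filepath.Join(root, "real", "f"), []byte("data"), 0o644); err != nil {
		return nil, nil, err
	}
	if err := os.Symlink("real", filepath.Join(root, "link")); err != nil {
		return nil, nil, err
	}
	dir, err := os.Open(root)
	if err != nil {
		return nil, nil, err
	}
	defer dir.Close()

	try := func(name string) error {
		f, err := openNoFollowAny(dir, name)
		if err == nil {
			f.Close()
		}
		return err
	}
	return try("real/f"), try("link/f"), nil
}

func main() {
	plainErr, linkErr, setupErr := noFollowAnyDemo()
	fmt.Printf("real/f: %v\nlink/f: %v\nsetup: %v\n", plainErr, linkErr, setupErr)
}
```

Because each step is resolved relative to an already-open directory descriptor, a concurrent rename or symlink swap cannot redirect a component that has already been opened.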

@neild neild added the Proposal label Apr 23, 2024
@gopherbot gopherbot added this to the Proposal milestone Apr 23, 2024
@seankhliao
Member

seankhliao commented Apr 23, 2024

is this essentially https://pkg.go.dev/github.com/google/safeopen with Beneath -> In ?

that also has a ReadFile / WriteFile variant which I'd use more than the create version.

@neild
Contributor Author

neild commented Apr 23, 2024

The design of this proposal is influenced by github.com/google/safeopen, but differs in a few areas. (Sorry, I really should have mentioned safeopen as prior art.)

Of the three parts of this proposal:

  • os.OpenIn is essentially safeopen.OpenBeneath.
  • File.Open is a slightly more limited but safer version of openat, and has no equivalent in safeopen.
  • O_NOFOLLOW_ANY has no equivalent in safeopen.

ReadFileIn and WriteFileIn seem like a useful and logical extension of this proposal.

@dsnet
Member

dsnet commented Apr 24, 2024

Yes, please. When I was working on safe file operations, it turned out to be hard to do correctly without OS support.

Without O_NOFOLLOW, you have to slowly check every segment for symlinks before traversing into it. For the naive implementation, how do you protect against TOCTOU bugs? At the moment that you check some path segment and verify that it's not a symlink (or a safe one) and then proceed to descend into it, some other process (or goroutine) could have asynchronously changed the target.
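
To make the race concrete, here is a deliberately naive sketch of the check-then-descend approach described above. The name openCheckingEachSegment is hypothetical, and the code assumes a platform with symlink support for the demo. It is shown as an illustration of the problem, not as a recommended implementation: the comment marks the window in which another process could swap a checked directory for a symlink.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// openCheckingEachSegment (hypothetical, intentionally racy) walks name one
// segment at a time under root and refuses to proceed past a symlink.
// filepath.Clean resolves ".." lexically before the walk.
func openCheckingEachSegment(root, name string) (*os.File, error) {
	if !filepath.IsLocal(name) {
		return nil, fmt.Errorf("non-local path %q", name)
	}
	cur := root
	for _, part := range strings.Split(filepath.ToSlash(filepath.Clean(name)), "/") {
		cur = filepath.Join(cur, part)
		fi, err := os.Lstat(cur)
		if err != nil {
			return nil, err
		}
		if fi.Mode()&os.ModeSymlink != 0 {
			return nil, fmt.Errorf("%q is a symbolic link", cur)
		}
		// RACE WINDOW: between the Lstat above and the next path lookup,
		// another process can replace cur with a symlink. Nothing here
		// detects that.
	}
	return os.Open(cur) // re-resolves the whole path by name
}

// naiveDemo builds root/real/f and root/link -> real and reports the errors
// from opening "real/f" and "link/f".
func naiveDemo() (plainErr, linkErr, setupErr error) {
	root, err := os.MkdirTemp("", "naive-check")
	if err != nil {
		return nil, nil, err
	}
	defer os.RemoveAll(root)
	if err := os.Mkdir(filepath.Join(root, "real"), 0o755); err != nil {
		return nil, nil, err
	}
	if err := os.WriteFile(filepath.Join(root, "real", "f"), []byte("data"), 0o644); err != nil {
		return nil, nil, err
	}
	if err := os.Symlink("real", filepath.Join(root, "link")); err != nil {
		return nil, nil, err
	}
	try := func(name string) error {
		f, err := openCheckingEachSegment(root, name)
		if err == nil {
			f.Close()
		}
		return err
	}
	return try("real/f"), try("link/f"), nil
}

func main() {
	plainErr, linkErr, setupErr := naiveDemo()
	fmt.Printf("real/f: %v\nlink/f: %v\nsetup: %v\n", plainErr, linkErr, setupErr)
}
```

The check works when nothing changes concurrently, which is exactly why the bug is easy to miss in testing.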

@dsnet
Member

dsnet commented Apr 24, 2024

What, if any, changes would be made to "io/fs"? Ideally, there is a mirror of these APIs in that package.

@neild
Contributor Author

neild commented Apr 24, 2024

If we wanted to extend this proposal to io.FS, I believe the one addition would be:

package fs

// An OpenFile is a directory file whose entries may be opened with the Open method.
type OpenFile interface {
  File

  // Open opens the named file in the directory.
  //
  // When Open returns an error, it should be of type *PathError
  // with the Op field set to "openat", the Path field set to name,
  // and the Err field describing the problem.
  //
  // Open should reject attempts to open names that do not
  // satisfy ValidPath(name), returning a *PathError with Err set to
  // ErrInvalid or ErrNotExist.
  Open(name string) (File, error)
}

A more interesting question is os.DirFS. Currently, DirFS has two documented limitations: It follows symlinks out of the directory tree, and if the FS root is a relative path then it will be affected by later Chdir calls.

I don't think we can change DirFS's symlink-following behavior: It's documented, and it's a behavior that a user could reasonably depend on.

The interaction between DirFS and Chdir seems less likely to be something a user would depend on, but it is documented. I'm not sure if we can change it at this point, but perhaps.

Perhaps we should add a version of DirFS that opens the directory root at creation time (retaining a handle to it even if the current working directory changes or the root is renamed), and refuses to follow symlinks out of the root. I'm not sure if that should be part of this proposal or a separate one.
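
The documented symlink behavior of os.DirFS is easy to reproduce. The sketch below assumes a platform with symlink support; dirFSEscapeDemo is a hypothetical name. It places a file outside the FS root, adds a symlink to it inside the root, and reads it through the fs.FS.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// dirFSEscapeDemo (hypothetical) shows that os.DirFS follows a symlink that
// points outside the directory tree it is rooted at.
func dirFSEscapeDemo() (string, error) {
	base, err := os.MkdirTemp("", "dirfs-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(base)

	root := filepath.Join(base, "root")
	outside := filepath.Join(base, "outside")
	for _, dir := range []string{root, outside} {
		if err := os.Mkdir(dir, 0o755); err != nil {
			return "", err
		}
	}
	if err := os.WriteFile(filepath.Join(outside, "secret"), []byte("outside the root"), 0o600); err != nil {
		return "", err
	}
	// root/link points at a directory that is not under root.
	if err := os.Symlink(outside, filepath.Join(root, "link")); err != nil {
		return "", err
	}

	// "link/secret" is a valid fs path, so DirFS accepts it and the
	// operating system follows the symlink out of root.
	data, err := fs.ReadFile(os.DirFS(root), "link/secret")
	return string(data), err
}

func main() {
	data, err := dirFSEscapeDemo()
	fmt.Printf("data=%q err=%v\n", data, err)
}
```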

@adonovan
Member

Perhaps the new names should include an At suffix to make clear to casual readers of the API that these are not the usual open system calls. Either way, the three new methods should probably reference some shared section of documentation on the concept of the at-suffixed operations.

@neild
Contributor Author

neild commented Apr 26, 2024

I presume you mean the new method names? The functions have an "In" suffix.

We could also include the In suffix on the methods; I waffled on whether it belongs there or not:

func (f *File) OpenFileIn(name string, flag int, perm FileMode) (*File, error)
func (f *File) CreateIn(name string) (*File, error)
func (f *File) OpenIn(name string) (*File, error)

I'm trying to avoid the suffix "At" to make it clear that none of these calls are precisely openat. openat(2) permits escaping from the root directory via absolute or relative paths, and doesn't do anything about symlink traversal. (Linux has openat2(2), which is quite configurable. The proposed *In functions are essentially openat2 with the RESOLVE_BENEATH flag.)

@adonovan
Member

Fair enough. Should the fs.OpenFile.Open method also be named OpenIn?

@neild
Contributor Author

neild commented Apr 26, 2024

Should the fs.OpenFile.Open method also be named OpenIn?

Probably, for consistency.

@rsc
Contributor

rsc commented May 29, 2024

Are we missing RemoveIn?

@neild
Contributor Author

neild commented May 29, 2024

We should probably have RemoveIn as well:

// RemoveIn removes the named file or (empty) directory.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Remove].
func RemoveIn(parent, name string) error

// RemoveIn removes the named file or (empty) directory
// in the directory associated with the file f.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) RemoveIn(name string) error

Perhaps also RemoveAllIn?

@rsc
Contributor

rsc commented May 30, 2024

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@bjorndm

bjorndm commented May 30, 2024

Maybe it would be better if the parent was an fs.FS? Seems more widely applicable, if somewhat more complex.

@qmuntal
Contributor

qmuntal commented May 31, 2024

func RemoveIn(parent, name string) error

Note that Windows does not provide (AFAIK) an unlinkat counterpart. It will have to be emulated doing something like:

func RemoveIn(parent, name string) error {
  f, err := os.OpenIn(parent, name)
  if err != nil {
    return err
  }
  return syscall.SetFileInformationByHandle(f.Fd(), syscall.FileDispositionInfo, ...)
}

@hherman1

hherman1 commented May 31, 2024

As a user, when should I use os.Open vs os.OpenIn? Should I continue to default to os.Open, and only use OpenIn when I am actively avoiding a security issue, or should my default be OpenIn now?

@neild
Contributor Author

neild commented Jun 5, 2024

Maybe it would be better if the parent was an fs.FS?

The os package file functions operate on the local filesystem. fs.FS is an abstraction over a filesystem; it sits atop the os package functions, not under them.

If we want to add support for OpenIn on fs.FS filesystems, we would want something like #67002 (comment). We could add that to this proposal if we want, but for now I'm keeping this proposal focused on the os package.

Note that Windows does not provide (AFAIK) an unlinkat counterpart.

I think that's fine. This proposal requires varying degrees of implementation depending on platform already. (Linux has the very nice openat2 with RESOLVE_BENEATH; platforms without an equivalent are going to require us to do more work to produce equivalent behavior.)

If it's not possible to emulate unlinkat on Windows, that might be a problem, but it sounds like it should be possible.

As a user, when should I use os.Open vs os.OpenIn?

You should use OpenIn when you want to open a file within a directory.

I don't know how to give comprehensive guidance on when to use one vs. the other; the two functions behave differently and you should use the one that suits your specific purposes. If you're writing a command-line tool that accepts an input filename from the user, you probably want to use os.Open. If you're writing a tool that decompresses an archive, you probably want to use os.OpenIn to ensure that the output doesn't escape from the destination directory.

@bjorndm

bjorndm commented Jun 5, 2024

The FS OpenFile looks good, yes. Somehow I skipped that comment, sorry.

@magical
Contributor

magical commented Jun 7, 2024

A property of openat is that it follows a file across renames: If you open a directory, rename the directory, and use openat on the still-open FD, access is relative to the directory's new location. We cannot support this behavior on platforms which don't have openat or an equivalent (plan9 and js). We could fall back to operating purely on filenames, such that f.OpenIn(x) is equivalent to os.OpenIn(f.Name(), x). However, this seems potentially hazardous. I propose, therefore, that File.CreateIn, File.OpenIn, and File.OpenFileIn return an errors.ErrUnsupported error on these platforms.

I don't really understand this. What is a program supposed to do if File.Open returns ErrUnsupported? Either it can give up and report an error to the user, meaning that the program simply doesn't work on plan9 or js, or it can implement the fallback manually:

f, err := parent.Open(filename)
if errors.Is(err, errors.ErrUnsupported) {
   f, err = os.OpenIn(parent.Name(), filename)
   // or even: os.Open(filepath.Join(parent.Name(), filename))
}

which is exactly the "hazardous" behaviour you say you're trying to avoid. If the underlying platform truly has no equivalent to openat, though, then there's no other reasonable fallback. Returning ErrUnsupported is just creating more work for developers for no tangible benefit.

I think this could be addressed perfectly well in the docs by saying that some platforms (linux, windows, etc) provide extra guarantees around renamed files, and that others (plan9 and js) do not.

@neild
Contributor Author

neild commented Jun 13, 2024

If the underlying platform truly has no equivalent to openat, though, then there's no other reasonable fallback.

The question is whether this is a reasonable fallback or not.

In the case of os.OpenIn, I think it's reasonable to fall back to a less-secure implementation. Lacking openat, we can statically validate the untrusted filename component for unintended traversal (os.OpenIn(dir, "../escapes")), and we can test for symlinks on the path, but we remain vulnerable to TOCTOU attacks. TOCTOU symlink attacks, where an attacker creates a symlink on the path while we're in the process of validating it, are an edge case and I think it's okay for us to support os.OpenIn on platforms where we can't defend against them (plan9 and js).

In the case of os.File.OpenIn, however, there are valid operations that we simply can't support without openat or the equivalent. With openat, you can open a directory, rename or even delete it, and then continue to access files in that directory. There's no way to simulate this with operations on the directory's filename.

Perhaps it's okay to say that os.File.OpenIn behaves differently on plan9 and js, and that users who need the ability to follow a directory across renames/deletes are responsible for not trying to do so on those platforms. Returning an error is the more conservative choice.

I note also that if you don't need the openat behavior of following a directory across renames, you don't need to use os.File.OpenIn at all--you can just always use os.OpenIn.

@CAFxX
Contributor

CAFxX commented Jun 24, 2024

I don't know how to give comprehensive guidance on when to use one vs. the other; the two functions behave differently and you should use the one that suits your specific purposes. If you're writing a command-line tool that accepts an input filename from the user, you probably want to use os.Open. If you're writing a tool that decompresses an archive, you probably want to use os.OpenIn to ensure that the output doesn't escape from the destination directory.

I would recommend adding guidance to the documentation of the non-*In variants, calling out that the *In variants exist and are recommended for cases where directory escape is not desirable.

@neild
Contributor Author

neild commented Jul 22, 2024

Updated proposal, with comments on various changes arising from above discussion and working on implementation.

The OpenFileIn, CreateIn, and OpenIn functions are unchanged from the original proposal:

package os

// OpenFileIn opens the named file in the named directory.
//
// If the file contains relative path components (..), no component may
// refer to a location outside the parent directory. The file may not be
// "", an absolute path, or (on Windows) a reserved device name such as "NUL".
// The file may refer to the directory itself (.).
//
// If any component of the named file references a symbolic link
// referencing a location out of the parent directory,
// OpenFileIn returns an error.
//
// OpenFileIn otherwise behaves like OpenFile.
func OpenFileIn(parent, name string, flag int, perm FileMode) (*File, error)

// CreateIn creates or truncates the named file in the named parent directory.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Create].
func CreateIn(parent, name string) (*File, error)

// OpenIn opens the named file in the named parent directory for reading.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Open].
func OpenIn(parent, name string) (*File, error)

The File methods now all have an In suffix: File.OpenFileIn, File.CreateIn, File.OpenIn. This is clearer overall: For example, f.CreateIn creates a file in the directory f, it doesn't create f. This also resolves an ambiguity between File.Stat and File.StatIn (see below).

package os

// OpenFileIn opens the named file in the directory associated with the file f.
//
// If the file contains relative path components (..), no component may
// refer to a location outside the parent directory. The file may not be
// "", an absolute path, or (on Windows) a reserved device name such as "NUL".
//
// If any component of the named file references a symbolic link
// referencing a location out of the parent directory,
// OpenFileIn returns an error.
func (f *File) OpenFileIn(name string, flag int, perm FileMode) (*File, error)

// CreateIn creates or truncates the named file in
// the directory associated with the file f.
// It applies the same constraints on files as [File.OpenFileIn].
func (f *File) CreateIn(name string) (*File, error)

// OpenIn opens the named file in the directory associated with the file f for reading.
// It applies the same constraints on files as [File.OpenFileIn].
func (f *File) OpenIn(name string) (*File, error)

To the above, we add MkdirIn, RemoveIn, and StatIn functions and methods. Creating directories and removing files are fundamental operations, and there's no reason to leave them out. DirFSIn (see below) provides a traversal-resistant Stat, so StatIn is included here as well.

Open question: Should we add LstatIn as well? How about SymlinkIn? RenameIn? ReadFileIn and WriteFileIn? On one hand, I don't want to let this proposal get out of hand with an endless array of new functions; on the other hand, some of these do seem useful. I'd appreciate the proposal committee's thoughts on where we should draw the line with this proposal.

package os

// MkdirIn creates a new directory in the named parent directory
// with the specified name and permission bits (before umask).
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Mkdir].
func MkdirIn(parent, name string, perm FileMode) error

// MkdirIn creates a new directory in the directory associated with the file f.
// It applies the same constraints on files as [File.OpenFileIn].
func (f *File) MkdirIn(name string, perm FileMode) error

// RemoveIn removes the named file or (empty) directory.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Remove].
func RemoveIn(parent, name string) error

// RemoveIn removes the named file or (empty) directory
// in the directory associated with the file f.
// It applies the same constraints on files as [File.OpenFileIn].
func (f *File) RemoveIn(name string) error

// StatIn returns a FileInfo describing the named file in the named parent directory.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Stat].
func StatIn(parent, name string) (FileInfo, error)

// StatIn returns a FileInfo describing the named file in the directory associated with the file f.
// It applies the same constraints on files as [File.OpenFileIn].
func (f *File) StatIn(name string) (FileInfo, error)

We add os.DirFSIn, a traversal-safe version of os.DirFS.

Open question: DirFSIn or DirInFS? I prefer DirFSIn--"a directory filesystem in (root)", but internal discussion suggested DirInFS might be better.

For the moment, we do not add any new optional interfaces to io/fs, such as fs.OpenFile (see #67002 (comment)).

There are many existing APIs, both in and out of the standard library, that operate on an io/fs.FS. Providing a traversal-resistant FS implementation is a simpler and more effective approach to hardening programs than requiring every API which operates on an FS to check for and use an OpenFile method.

Open question: It seems likely to me that we're going to want more variations on DirFS in the future. For example, it seems reasonable to want an FS that disallows symlink traversal entirely (essentially passing O_NOFOLLOW_ANY to every file open). Therefore, I think DirFSIn should either accept an options struct to allow for future customization, or should return a concrete type with customization methods (for example, fs := os.DirFSIn("root"); fs.SetFollowSymlinks(false)). The following returns a concrete type.

package os

// DirFSIn returns a filesystem for the tree of files rooted at the directory dir.
// The directory dir must not be "".
//
// Open calls will resolve symbolic links, but return an error if any link points outside the directory dir.
//
// The returned filesystem implements [io/fs.FS], [io/fs.StatFS], [io/fs.ReadFileFS], and [io/fs.ReadDirFS].
func DirFSIn(dir string) *FS

type FS struct{}
func (fsys *FS) Open(name string) (fs.File, error)
func (fsys *FS) Stat(name string) (FileInfo, error)
func (fsys *FS) ReadFile(name string) ([]byte, error)
func (fsys *FS) ReadDir(name string) ([]fs.DirEntry, error)

The O_NOFOLLOW_ANY open flag remains unchanged.

Open question: Should we add os.O_NOFOLLOW? I only realized while implementing this proposal that it doesn't exist already. (Existing code which uses the flag uses syscall.O_NOFOLLOW.) On one hand, if we're supporting a portable O_NOFOLLOW_ANY, perhaps we should support a portable O_NOFOLLOW as well. On the other hand, O_NOFOLLOW can be dangerously surprising, since it only prevents symlink resolution in the final filename component, so perhaps we should stick to the more robust form.

const (
	// O_NOFOLLOW_ANY, when included in the flags passed to [OpenFile], [OpenFileIn],
	// or [File.OpenFileIn], disallows resolution of symbolic links anywhere in the
	// named file.
	//
	// O_NOFOLLOW_ANY affects the handling of symbolic links in all components
	// of the filename. (In contrast, the O_NOFOLLOW flag supported by many
	// platforms only affects resolution of the last path component.) 
	//
	// O_NOFOLLOW_ANY does not disallow symbolic links in the parent directory name
	// parameter of [OpenFileIn].
	//
	// O_NOFOLLOW_ANY does not affect traversal of hard links, Windows junctions,
	// or Plan 9 bind mounts.
	//
	// On platforms which support symbolic links but do not provide a way to
	// disable symbolic link traversal (GOOS=js), open functions return an error
	// if O_NOFOLLOW_ANY is provided.
	O_NOFOLLOW_ANY int = (some value)
)
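
The "last path component only" behavior of O_NOFOLLOW mentioned above can be checked with the existing syscall.O_NOFOLLOW flag. This is a sketch for Unix-like systems; openNoFollow and lastComponentDemo are hypothetical names. A symlink in the final position is rejected, while a symlink earlier in the path is followed silently.

```go
//go:build unix

package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// openNoFollow (hypothetical) opens path read-only with O_NOFOLLOW and
// returns only the error, closing the file on success.
func openNoFollow(path string) error {
	f, err := os.OpenFile(path, os.O_RDONLY|syscall.O_NOFOLLOW, 0)
	if err == nil {
		f.Close()
	}
	return err
}

// lastComponentDemo builds:
//
//	root/real/f              regular file
//	root/filelink -> real/f  symlink in the final position
//	root/dirlink  -> real    symlink in an intermediate position
func lastComponentDemo() (finalErr, middleErr, setupErr error) {
	root, err := os.MkdirTemp("", "nofollow-demo")
	if err != nil {
		return nil, nil, err
	}
	defer os.RemoveAll(root)
	if err := os.Mkdir(filepath.Join(root, "real"), 0o755); err != nil {
		return nil, nil, err
	}
	if err := os.WriteFile(filepath.Join(root, "real", "f"), []byte("data"), 0o644); err != nil {
		return nil, nil, err
	}
	if err := os.Symlink("real/f", filepath.Join(root, "filelink")); err != nil {
		return nil, nil, err
	}
	if err := os.Symlink("real", filepath.Join(root, "dirlink")); err != nil {
		return nil, nil, err
	}

	finalErr = openNoFollow(filepath.Join(root, "filelink"))      // symlink is the last component
	middleErr = openNoFollow(filepath.Join(root, "dirlink", "f")) // symlink is an earlier component
	return finalErr, middleErr, nil
}

func main() {
	finalErr, middleErr, setupErr := lastComponentDemo()
	fmt.Printf("filelink: %v\ndirlink/f: %v\nsetup: %v\n", finalErr, middleErr, setupErr)
}
```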

Open question: How should we handle .. relative path components in filenames?

Consider the following directory tree:

  • a/b is a directory.
  • s is a symlink to a/b.
  • f, a/f, and a/b/f are files.

On the Unix command line, if we cat s/../f, we print the contents of the file a/f.

If we open the current directory and openat(curfd, "s/../f"), we also open a/f.

The safeopen package cleans filenames prior to opening a file, so safeopen.OpenBeneath(".", "s/../f") opens the file f. The safeopen package also forbids symlink traversal entirely, so safeopen.OpenBeneath(".", "s/f") returns an error rather than opening a/b/f.
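
The difference between kernel resolution and lexical cleaning for this tree can be reproduced on a Unix-like system with the sketch below. dotDotDemo is a hypothetical name, and each file's contents are set to its own path so the result shows which file was opened. The path is built by string concatenation on purpose, because filepath.Join would clean it lexically.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// dotDotDemo (hypothetical) builds the tree described above and reads
// "s/../f" two ways: as-is (resolved by the operating system) and after
// lexical cleaning with filepath.Clean.
func dotDotDemo() (kernel, lexical string, err error) {
	root, err := os.MkdirTemp("", "dotdot-demo")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(root)

	if err := os.MkdirAll(filepath.Join(root, "a", "b"), 0o755); err != nil {
		return "", "", err
	}
	// Each file contains its own relative path.
	for _, name := range []string{"f", "a/f", "a/b/f"} {
		if err := os.WriteFile(filepath.Join(root, name), []byte(name), 0o644); err != nil {
			return "", "", err
		}
	}
	if err := os.Symlink("a/b", filepath.Join(root, "s")); err != nil {
		return "", "", err
	}

	// Concatenate rather than Join so that "s/../f" reaches the OS intact.
	k, err := os.ReadFile(root + "/s/../f")
	if err != nil {
		return "", "", err
	}
	l, err := os.ReadFile(root + "/" + filepath.Clean("s/../f"))
	if err != nil {
		return "", "", err
	}
	return string(k), string(l), nil
}

func main() {
	kernel, lexical, err := dotDotDemo()
	fmt.Printf("kernel resolution opened %q, lexical cleaning opened %q, err=%v\n", kernel, lexical, err)
}
```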

On Windows, things are confusing (and I'm still trying to understand what's going on under the hood): Using NtCreateFile to open a file in "." (the rough equivalent of Unix's openat):

  • s/../f opens a/f.
  • a/b/../f is an error.

It appears that NtCreateFile will resolve .. path components only if a symlink appears somewhere in the path. This is weird enough that I feel like I must be missing something.

The question is: What should os.OpenIn(".", "s/../f") do in this case? Options I see include:

  • Resolve the symlink s and the relative path component .., and open a/f. This matches Unix openat behavior.
  • Clean the path prior to opening, performing lexical resolution of .. components, and open f. This matches the safeopen package's behavior. I don't like this option, as it defines a new set of nonstandard filesystem semantics (pathnames are lexically resolved prior to opening).
  • Disallow relative path components and return an error.
  • Disallow symlink resolution and return an error when attempting to open s.
  • Disallow both relative path components and symlink resolution.

My current inclination is the first option above: Permit both symlinks and .. path components, and resolve each step of the path in sequence. (So s/../f opens a/f in the above example.) This may be a bit tricky to implement on Windows, but it should be possible.

I can, however, see a good argument for disallowing . and .. relative path components. This simplifies the implementation, there are few if any real-world cases where resolving paths like s/../f is necessary, and users can lexically clean paths with filepath.Clean if desired.


Open question: How should we handle platforms without openat or an equivalent, namely GOOS=plan9 and GOOS=js?

GOOS=js does not permit implementing OpenIn in a fashion free of TOCTOU races (swapping a directory component with a symlink elsewhere on the filesystem). I believe Plan 9 doesn't have symlinks; if that's the case, TOCTOU races are not a concern on it.

GOOS=js and GOOS=plan9 do not permit implementing File.OpenIn correctly. Opening a directory as f, renaming or deleting that directory, and then using f.OpenIn should act on the original directory. Without openat or an equivalent, we have no way to follow the directory handle and the best we can do is act on the original directory path.

I've argued above for supporting OpenIn on these platforms and not supporting File.OpenIn. I think that I've been convinced by arguments above that it's better to support as much of the API as possible, even if platform limitations prevent supporting all of it. I therefore propose that on js and plan9, f.OpenIn("path") behaves equivalently to os.OpenIn(f.Name(), "path").
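
For comparison, the limitation of a name-based fallback can be seen with today's os package. This is a sketch assuming a Unix-like system, where an open directory can be renamed; staleNameDemo is a hypothetical name. After the rename, a lookup built from f.Name() points at a path that no longer exists.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// staleNameDemo (hypothetical) opens a directory, renames it, and then tries
// to open a file using the directory's recorded name instead of its handle.
func staleNameDemo() (lookupErr, setupErr error) {
	base, err := os.MkdirTemp("", "stale-name")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(base)

	oldDir := filepath.Join(base, "old")
	if err := os.Mkdir(oldDir, 0o755); err != nil {
		return nil, err
	}
	if err := os.WriteFile(filepath.Join(oldDir, "f"), []byte("hello"), 0o644); err != nil {
		return nil, err
	}
	dir, err := os.Open(oldDir)
	if err != nil {
		return nil, err
	}
	defer dir.Close()

	if err := os.Rename(oldDir, filepath.Join(base, "new")); err != nil {
		return nil, err
	}

	// dir.Name() still reports the path used at open time, so this lookup
	// misses the renamed directory.
	f, err := os.Open(filepath.Join(dir.Name(), "f"))
	if err == nil {
		f.Close()
	}
	return err, nil
}

func main() {
	lookupErr, setupErr := staleNameDemo()
	fmt.Printf("lookup by name after rename: %v (setup: %v)\n", lookupErr, setupErr)
}
```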

@bjorndm

bjorndm commented Jul 22, 2024

This is a rather extensive API. Perhaps a separate package from os would be better? Maybe os/in?

@neild neild mentioned this issue Jul 23, 2024
@rsc
Contributor

rsc commented Jul 24, 2024

On a very minor note, Plan 9 can be considered to implement O_NOFOLLOW_ANY because there are no symlinks on Plan 9 at all.

@rsc
Contributor

rsc commented Jul 24, 2024

More generally, I understand the motivation here, but the amount of new API is a bit daunting. I think we need to keep thinking about reducing the total amount of API. It seems like there needs to be some type representing the constrained file system. For this message, let's call it a Dir. It would be defined like:

// A Dir represents a root directory in the file system.
// Methods on a Dir can only access files and directories inside that root directory.
// Methods on Dir are safe to be used from multiple goroutines simultaneously.
// After Close is called, methods on Dir return errors.
type Dir struct {
   ...
}

func OpenDir(name string) (*Dir, error)

func (*Dir) FS() fs.FS
func (*Dir) OpenFile
func (*Dir) Create
func (*Dir) Open
func (*Dir) OpenDir
func (*Dir) Mkdir
func (*Dir) Remove
func (*Dir) MkdirAll
func (*Dir) RemoveAll
func (*Dir) Close

All the top-level convenience things like os.OpenIn can be left out. Code can use OpenDir followed by the operation it wants.

That at least feels like a more manageable amount of API.

I have been thinking for a while and have not come up with a name I like more than Dir. It's certainly not perfect, and OpenDir would need a doc comment explaining that it's not opendir(3), but it's not bad.
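
A type along these lines later shipped in the standard library as os.Root. The sketch below assumes a Go 1.24 or newer toolchain and a platform with symlink support; rootDemo is a hypothetical name. It is an illustration of the "open a root, then operate inside it" shape, not a statement about the exact API sketched in this comment.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// rootDemo (hypothetical) opens a root directory and attempts three
// operations through it: creating a file inside the root, opening a ".."
// path, and opening a symlink that points outside the root.
func rootDemo() (createErr, dotDotErr, linkErr, setupErr error) {
	base, err := os.MkdirTemp("", "root-demo")
	if err != nil {
		return nil, nil, nil, err
	}
	defer os.RemoveAll(base)

	rootDir := filepath.Join(base, "root")
	if err := os.Mkdir(rootDir, 0o755); err != nil {
		return nil, nil, nil, err
	}
	if err := os.WriteFile(filepath.Join(base, "secret"), []byte("x"), 0o600); err != nil {
		return nil, nil, nil, err
	}
	// root/link points at ../secret, which is outside the root.
	if err := os.Symlink("../secret", filepath.Join(rootDir, "link")); err != nil {
		return nil, nil, nil, err
	}

	root, err := os.OpenRoot(rootDir) // requires Go 1.24+
	if err != nil {
		return nil, nil, nil, err
	}
	defer root.Close()

	// try closes the file on success and returns only the error.
	try := func(f *os.File, err error) error {
		if err == nil {
			f.Close()
		}
		return err
	}
	createErr = try(root.Create("ok.txt"))
	dotDotErr = try(root.Open("../secret"))
	linkErr = try(root.Open("link"))
	return createErr, dotDotErr, linkErr, nil
}

func main() {
	createErr, dotDotErr, linkErr, setupErr := rootDemo()
	fmt.Printf("create ok.txt: %v\nopen ../secret: %v\nopen link: %v\nsetup: %v\n",
		createErr, dotDotErr, linkErr, setupErr)
}
```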

@mateusz834
Member

@rsc With this approach it would be also nice to define some globals, like CWD.
Zig also has a similar abstraction in the std, also called Dir https://ziglang.org/documentation/master/std/#std.fs.Dir.

@mateusz834
Member

Also

// Methods on Dir are safe to be used from multiple goroutines simultaneously.

Is this true for Close?

@rsc
Contributor

rsc commented Jul 24, 2024

@mateusz834 Sure, Close can be called from multiple goroutines simultaneously. Same thing is true of os.File.

@cyphar

cyphar commented Oct 4, 2024

A few comments:

  • It seems strange to have a Truncate primitive (when ftruncate(2) exists) while not having a link(2) primitive (do other operating systems not have ftruncate(2)?). I guess not all operating systems have hardlinks?

  • This is also somewhat Linux-specific but one other thing to consider is whether O_NOFOLLOW behaviour is something we want to support and how the API for that should look. Being able to reference a symlink using an O_PATH is quite useful on Linux, and emulating it is a little ugly (though possible, to be fair).

  • I also am not sure about (*Root).OpenRoot. If the security model is that the contents of a Root are untrusted then it is absolutely not safe to create a Root from an attacker-controlled directory. Even in-kernel implementations of RESOLVE_BENEATH are not completely safe against an attacker tricking you into resolving outside the root by moving the root during resolution. Surely this can be added later if there is actual evidence someone wants this (mis)feature?

  • Regarding MkdirAll -- one fairly annoying aspect of this is dealing with dangling symlinks. Naive userspace resolvers act differently to in-kernel resolvers when dealing with dangling symlinks in the MkdirAll case, and implementing this correctly is somewhat complicated (see the symlink stack code in filepath-securejoin and libpathrs). If you ever want to support in-kernel resolvers it is necessary to ensure that you don't permit MkdirAll with dangling symlinks, which requires that kind of symlink stack emulation. It's up to you if you feel this kind of complexity belongs in the Go stdlib.

  • I also have some misgivings about GOOS=js. If the goal of this is to be safe against symlink attacks, then surely the best policy is to return an error (preferably at compile-time) on systems where we can't ensure that?

  • As for RemoveAll, os.RemoveAll is now safe on Linux but not yet on Windows (see os: RemoveAll susceptible to symlink race #52745) so calling into the os.RemoveAll implementation on the internal Root file descriptor is probably the simplest way of implementing this.

@neild
Contributor Author

neild commented Oct 4, 2024

It seems strange to have a Truncate primitive (when ftruncate(2) exists) while not having a link(2) primitive (do other operating systems not have ftruncate(2)?).

I think this is an oversight. The list includes every top-level os function which operates on files. We have os.Link, so we should have Root.Link as well:

func (*Root) Link

I agree that Root.Truncate doesn't seem very necessary when File.Truncate exists, but supporting every file operation on a Root is simpler than picking and choosing the "necessary" ones.

This is also somewhat Linux-specific but one other thing to consider is whether O_NOFOLLOW behaviour is something we want to support and how the API for that should look.

Root.OpenFile can be used with O_NOFOLLOW, just like the top-level OpenFile:

r, err := os.OpenRoot("directory")
f, err := r.OpenFile("a/b/c", os.O_NOFOLLOW, 0) // a and a/b might still be symlinks, c must not be
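
To make the "a and a/b might still be symlinks, c must not be" semantics concrete, here is a small sketch using the existing top-level os.OpenFile. It assumes a Unix system and takes the flag from syscall.O_NOFOLLOW (as far as I know the os package does not export an O_NOFOLLOW constant of its own). The helper names tryOpen and demoNofollow are made up for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// tryOpen reports whether path can be opened read-only with O_NOFOLLOW set.
func tryOpen(path string) bool {
	f, err := os.OpenFile(path, os.O_RDONLY|syscall.O_NOFOLLOW, 0)
	if err != nil {
		return false
	}
	f.Close()
	return true
}

// demoNofollow builds a tree with one directory symlink and one file symlink
// and reports whether each path can be opened with O_NOFOLLOW.
func demoNofollow() (viaDirLink, finalLink bool, err error) {
	tmp, err := os.MkdirTemp("", "nofollow")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(tmp)

	realDir := filepath.Join(tmp, "real")
	if err := os.Mkdir(realDir, 0o755); err != nil {
		return false, false, err
	}
	if err := os.WriteFile(filepath.Join(realDir, "c"), []byte("data"), 0o644); err != nil {
		return false, false, err
	}
	// a -> real: a symlink in the middle of the path.
	if err := os.Symlink("real", filepath.Join(tmp, "a")); err != nil {
		return false, false, err
	}
	// real/clink -> c: a symlink as the final path component.
	if err := os.Symlink("c", filepath.Join(realDir, "clink")); err != nil {
		return false, false, err
	}

	viaDirLink = tryOpen(filepath.Join(tmp, "a", "c"))
	finalLink = tryOpen(filepath.Join(tmp, "a", "clink"))
	return viaDirLink, finalLink, nil
}

func main() {
	viaDirLink, finalLink, err := demoNofollow()
	fmt.Println("open through directory symlink succeeded:", viaDirLink)
	fmt.Println("open of final-component symlink succeeded:", finalLink)
	fmt.Println("setup error:", err)
}
```

The intermediate symlink is followed and only the last component is refused, which is why a "do not follow any symlinks" mode would need something beyond O_NOFOLLOW.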

If we want to offer alternate ways of opening files in the future, we can add methods to Root to change its behavior. For example:

// This is not part of the current proposal.
// It is an example of something we might propose in the future.
r, err := os.OpenRoot("directory")
r.SetNofollowAll(true) // do not follow any symlinks

I also am not sure about (*Root).OpenRoot. If the security model is that the contents of a Root are untrusted then it is absolutely not safe to create a Root from an attacker-controlled directory. Even in-kernel implementations of RESOLVE_BENEATH are not completely safe against an attacker tricking you into resolving outside the root by moving the root during resolution. Surely this can be added later if there is actual evidence someone wants this (mis)feature?

I don't understand the attack vector here. Can you explain in more detail?

A Root prohibits escapes via path traversal and symlinks. If you Root.OpenRoot a subdirectory, the new Root prohibits escapes from that subdirectory. Moving the directory doesn't change that.

Root.OpenRoot is essentially open with O_PATH, and is useful for efficiently implementing functions which traverse a directory tree such as os.RemoveAll.
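
A minimal sketch of the "moving the directory doesn't change that" point. It assumes a toolchain in which os.Root has shipped (Go 1.24 or later) and a platform where a Root is backed by a directory file descriptor (so not js or plan9). The function name readAfterRename is hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// readAfterRename opens a Root on a directory, renames that directory,
// and then reads a file through the Root.
func readAfterRename() (string, error) {
	tmp, err := os.MkdirTemp("", "root-rename")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(tmp)

	oldDir := filepath.Join(tmp, "old")
	if err := os.Mkdir(oldDir, 0o755); err != nil {
		return "", err
	}
	if err := os.WriteFile(filepath.Join(oldDir, "hello.txt"), []byte("hello"), 0o644); err != nil {
		return "", err
	}

	root, err := os.OpenRoot(oldDir)
	if err != nil {
		return "", err
	}
	defer root.Close()

	// Move the directory away from the name the Root was opened with.
	if err := os.Rename(oldDir, filepath.Join(tmp, "new")); err != nil {
		return "", err
	}

	// The Root still refers to the original directory, wherever it now lives.
	f, err := root.Open("hello.txt")
	if err != nil {
		return "", err
	}
	defer f.Close()
	b, err := io.ReadAll(f)
	return string(b), err
}

func main() {
	s, err := readAfterRename()
	fmt.Printf("contents=%q err=%v\n", s, err)
}
```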

Regarding MkdirAll -- one fairly annoying aspect of this is dealing with dangling symlinks. Naive userspace resolvers act differently to in-kernel resolvers when dealing with dangling symlinks in the MkdirAll case, and implementing this correctly is somewhat complicated (see the symlink stack code in filepath-securejoin and libpathrs). If you ever want to support in-kernel resolvers it is necessary to ensure that you don't permit MkdirAll with dangling symlinks, which requires that kind of symlink stack emulation. It's up to you if you feel this kind of complexity belongs in the Go stdlib.

I'm afraid I don't understand the concern here. There is no syscall (so far as I know) that implements MkdirAll, so whatever behavior we have here will be in user space.

Is a "naive userspace resolver" one which doesn't implement symlink traversal correctly? If so, I do not believe the path resolution in https://go.dev/cl/612136 is naive.

I also have some misgivings about GOOS=js. If the goal of this is to be safe against symlink attacks, then surely the best policy is to return an error (preferably at compile-time) on systems where we can't ensure that?

Root defends against three classes of attacks:

  1. Path name traversal, where a filename like "../../../etc/passwd" escapes from the intended root.
  2. Static symlink traversal, where an attacker instructs a program to create a link out of a root and then traverses it. For example, an attacker might provide a tar archive containing a symlink which the victim then extracts and traverses.
  3. Dynamic attacks, where an attacker is actively modifying the filesystem to exploit TOCTOU races in the victim.

Without openat or some equivalent, we can still defend against path name traversal and symlink traversal, but we are vulnerable to TOCTOU races in symlink traversal.
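
As an illustration of what a purely path-based defense looks like, here is a sketch that rejects lexical traversal with filepath.IsLocal and static symlink escapes with filepath.EvalSymlinks, and then opens the resolved path. It is not the proposal's implementation; openBeneath and demoOpenBeneath are hypothetical names. The comment marks the window in which a concurrent attacker could still win a race.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// openBeneath opens name inside root using only path-based checks.
func openBeneath(root, name string) (*os.File, error) {
	// 1. Path name traversal: reject "..", absolute paths, and the like.
	if !filepath.IsLocal(name) {
		return nil, fmt.Errorf("%q escapes the root lexically", name)
	}
	// 2. Static symlink traversal: resolve links and compare against the root.
	realRoot, err := filepath.EvalSymlinks(root)
	if err != nil {
		return nil, err
	}
	realPath, err := filepath.EvalSymlinks(filepath.Join(realRoot, name))
	if err != nil {
		return nil, err
	}
	rel, err := filepath.Rel(realRoot, realPath)
	if err != nil || !filepath.IsLocal(rel) {
		return nil, fmt.Errorf("%q escapes the root through a symlink", name)
	}
	// 3. TOCTOU window: a component of realPath could be replaced by a
	// symlink between the check above and the open below.
	return os.Open(realPath)
}

// demoOpenBeneath reports whether a plain file opens and whether a ".."
// path and a symlink escape are both blocked.
func demoOpenBeneath() (plainOK, dotdotBlocked, symlinkBlocked bool, err error) {
	tmp, err := os.MkdirTemp("", "beneath")
	if err != nil {
		return false, false, false, err
	}
	defer os.RemoveAll(tmp)

	root := filepath.Join(tmp, "root")
	outside := filepath.Join(tmp, "outside")
	for _, d := range []string{root, outside} {
		if err := os.Mkdir(d, 0o755); err != nil {
			return false, false, false, err
		}
	}
	if err := os.WriteFile(filepath.Join(root, "ok.txt"), []byte("ok"), 0o644); err != nil {
		return false, false, false, err
	}
	if err := os.WriteFile(filepath.Join(outside, "secret.txt"), []byte("s"), 0o644); err != nil {
		return false, false, false, err
	}
	if err := os.Symlink("../outside", filepath.Join(root, "link")); err != nil {
		return false, false, false, err
	}

	try := func(name string) bool {
		f, err := openBeneath(root, name)
		if err != nil {
			return false
		}
		f.Close()
		return true
	}
	return try("ok.txt"), !try("../outside/secret.txt"), !try("link/secret.txt"), nil
}

func main() {
	fmt.Println(demoOpenBeneath())
}
```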

There is benefit in defending against traversal. Of the real vulnerabilities that I've seen that Root might defend against, I think a strict majority have been in cases where TOCTOU was not a concern. For example, CVE-2024-3400 was a real and significant vulnerability that didn't involve symlinks at all.

In my initial version of this proposal, I proposed that the new API return an error on GOOS=js. Subsequent discussion convinced me that this was a mistake, and that we should implement as much of the new API as is feasible on each platform. Not supporting os.Root on some platforms will give users a reason not to use it at all.

As for RemoveAll, os.RemoveAll is now safe on Linux but not yet on Windows (see os: RemoveAll susceptible to symlink race #52745) so calling into the os.RemoveAll implementation on the internal Root file descriptor is probably the simplest way of implementing this.

Yes, os.RemoveAll and os.Root.RemoveAll will share a common implementation, and we will fix #52745 (except for GOOS=js) as part of this proposal.

@gopherbot
Contributor

Change https://go.dev/cl/617378 mentions this issue: syscall, internal/syscall/unix: add Openat support for wasip1

@gopherbot
Contributor

Change https://go.dev/cl/617376 mentions this issue: internal/syscall/unix: add Mkdirat and Readlinkat

@gopherbot
Contributor

Change https://go.dev/cl/617377 mentions this issue: internal/syscall/windows: add NtCreateFile

gopherbot pushed a commit that referenced this issue Oct 7, 2024
Mostly copied from x/sys/windows.

This adds various related types and functions,
but the purpose is to give access to NtCreateFile,
which can be used as an equivalent to openat.

For #67002

Change-Id: I04e6f630445a55c2000c9c323ce8dcdc7fc0d0e0
Reviewed-on: https://go-review.googlesource.com/c/go/+/617377
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Quim Muntal <quimmuntal@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
gopherbot pushed a commit that referenced this issue Oct 7, 2024
For #67002

Change-Id: I460e02db33799c145c296bcf0668fa555199036e
Reviewed-on: https://go-review.googlesource.com/c/go/+/617376
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
gopherbot pushed a commit that referenced this issue Oct 7, 2024
The syscall package is mostly frozen, but wasip1 file syscall
support was added to syscall and the Open and Openat
implementations overlap. Implement Openat in syscall for
overall simplicity.

We already have syscall.Openat for some platforms, so this
doesn't add any new functions to syscall.

For #67002

Change-Id: Ia34b12ef11fc7a3b7832e07b3546a760c23efe5b
Reviewed-on: https://go-review.googlesource.com/c/go/+/617378
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
@cyphar

cyphar commented Oct 9, 2024

I'm afraid I don't understand the concern here. There is no syscall (so far as I know) that implements MkdirAll, so whatever behavior we have here will be in user space.

Is a "naive userspace resolver" one which doesn't implement symlink traversal correctly? If so, I do not believe the path resolution in https://go.dev/cl/612136 is naive.

Explaining this before you see the issue in practice is a little complicated, but I'll try my best:

The way I implemented MkdirAll was to resolve as many components as possible and then mkdir the remaining components (this is the obvious way of doing it, so I suspect you plan to implement it that way too). In filepath-securejoin this is done with partialLookupInRoot, and in libpathrs this is resolvers::opath::resolve_partial.

The problem is that if the userspace symlink resolver just prepends symlinks to the "remaining path" (which is what your current resolver does), if you hit a dangling symlink the userspace implementation will have behaviour that won't match an openat2-based implementation. For a concrete example:

% mkdir -p tmp/a tmp/b/c
% ln -s ../b/doesnotexist/foo/bar tmp/a/foo
% mkdir -p tmp/a/foo/baz

The final mkdir -p will fail if you implement an openat2 based resolver because resolving tmp/a/foo will fail with ENOENT but it exists so MkdirAll will have to fail with EEXIST or something similar. However, a userspace resolver that doesn't track whether or not you are in the middle of symlink resolution would return a partial lookup of "existingpath=tmp/b remainingpath=/doesnotexist/foo/bar/baz" and so you would end up creating the directory tmp/b/doesnotexist/foo/bar/baz.
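
For reference, the same layout can be reproduced from Go. This sketch uses the existing os.MkdirAll (not the proposed Root.MkdirAll) to show the outcome that a resolver should match: the call returns an error and nothing is created under tmp/b. The function name danglingMkdirAll is made up for the example, and the exact error value is left unchecked because it may differ between platforms.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// danglingMkdirAll recreates the three shell commands above in a temporary
// directory. It returns the error from the final MkdirAll and whether
// tmp/b/doesnotexist was created as a side effect.
func danglingMkdirAll() (mkdirErr error, targetCreated bool, err error) {
	tmp, err := os.MkdirTemp("", "dangling")
	if err != nil {
		return nil, false, err
	}
	defer os.RemoveAll(tmp)

	// % mkdir -p tmp/a tmp/b/c
	for _, d := range []string{"a", "b/c"} {
		if err := os.MkdirAll(filepath.Join(tmp, d), 0o755); err != nil {
			return nil, false, err
		}
	}
	// % ln -s ../b/doesnotexist/foo/bar tmp/a/foo
	if err := os.Symlink("../b/doesnotexist/foo/bar", filepath.Join(tmp, "a", "foo")); err != nil {
		return nil, false, err
	}
	// % mkdir -p tmp/a/foo/baz
	mkdirErr = os.MkdirAll(filepath.Join(tmp, "a", "foo", "baz"), 0o755)

	// A resolver that splices the symlink target into the remaining path
	// would have created this directory.
	_, statErr := os.Lstat(filepath.Join(tmp, "b", "doesnotexist"))
	return mkdirErr, statErr == nil, nil
}

func main() {
	mkdirErr, created, err := danglingMkdirAll()
	fmt.Println("MkdirAll error:", mkdirErr)
	fmt.Println("tmp/b/doesnotexist created:", created)
	fmt.Println("setup error:", err)
}
```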

The solution I ended up with was creating a "symlink stack" that tracks whether or not the userspace resolver is still in the middle of symlink resolution. The PRs and code I linked to show how I did it in case having a reference might help. https://github.com/cyphar/filepath-securejoin/blob/main/lookup_linux.go is the "non naive" resolver I ended up with as a result.

I only mentioned it because it will make your resolver a fair bit more complicated, and I was wondering whether it makes sense to have this somewhat ugly thing in the stdlib in the first pass of this API. But then again, os.MkdirAll is kind of important to support (that's why I added it to libpathrs and filepath-securejoin) so I guess it's unavoidable.

I don't understand the attack vector here. Can you explain in more detail?

A Root prohibits escapes via path traversal and symlinks. If you Root.OpenRoot a subdirectory, the new Root prohibits escapes from that subdirectory. Moving the directory doesn't change that.

Root.OpenRoot is essentially open with O_PATH, and is useful for efficiently implementing functions which traverse a directory tree such as os.RemoveAll.

The attacks are similar to chroot-based attacks. When working on RESOLVE_BENEATH there were several attacks I had to find solutions for, and while I think that the current implementation is reasonable (it will abort if there is a rename or mount on the system and there is a special path_is_under check before returning), it is entirely possible for there to be some other attack we haven't considered.

The rename attack I was talking about is that the attacker can move the root of the resolution in a way that .. never explicitly crosses the original root point and can thus escape a-la chroot. The practical attack would look like:

  1. Open root /foo/bar (/foo is attacker-controlled).
  2. Try to resolve a/../../b.
  3. Attacker moves /foo/bar/a to /foo/a after you walk into a.
  4. The .. escapes /foo/bar.

Now, your retry-based implementation is not immediately vulnerable to this particular attack, but that doesn't mean there isn't some other similar attack that we just can't think of at the moment.

To paraphrase Jann Horn's comments on openat2, we really should not encourage users to try to do scoped resolution inside an attacker-controlled directory. Making this an explicit part of a stdlib API that can never be removed seems somewhat ill-advised to me. Maybe we can add this later if someone actually needs it?

Of the real vulnerabilities that I've seen that Root might defend against, I think a strict majority have been in cases where TOCTOU was not a concern.

I guess this depends on the projects you work on. My experience is that while non-TOCTOU cases do happen a fair bit [1], a lot of programs are run by users in contexts where the TOCTOU case becomes important at some point in the future.

You're quite right that securing the more egregious cases is worth doing, but I just can't shake the concern that having slightly different security semantics on different platforms is something that is going to bite people, even if it is documented.

1: One of my first patches against Docker a decade ago was fixing a non-TOCTOU ../../../../ bug in moby/moby#5720 and a non-racing symlink bug in moby/moby#8000 and I've fixed several similar issues in runc over the years (though in the past few years we've been dealing with TOCTOUs, hence why I started working on openat2 and libpathrs).

@neild
Contributor Author

neild commented Oct 9, 2024

However, a userspace resolver that doesn't track whether or not you are in the middle of symlink resolution would return a partial lookup of "existingpath=tmp/b remainingpath=/doesnotexist/foo/bar/baz" and so you would end up creating the directory tmp/b/doesnotexist/foo/bar/baz.

It sounds like you're describing an implementation of Root.MkdirAll that's something like:

  • Convert the input path into a new path with all symlinks resolved.
  • os.MkdirAll this path.

That seems obviously incorrect: Mkdir on a dangling symlink is supposed to fail, but this follows symlinks. It also doesn't seem any simpler than a correct implementation.

A simple and (I believe) correct implementation would be to take the current os.MkdirAll and convert it to call Root.Mkdir rather than os.Mkdir. This implementation is:

  • If the input path has a parent directory, recursively MkdirAll the parent.
  • Mkdir the input path.

Within a Root, this can be fairly inefficient, because each Mkdir may traverse the entire directory tree. A more efficient approach can walk the hierarchy only once:

  • Let cur be the root.
  • For each component p in the input path:
    • cur.Mkdir(p)
    • Replace cur with cur.OpenRoot(p).

(We then need a bit of complexity to handle MkdirAll("a/../b"), but it's not too bad.)

I sketched an implementation of this approach in https://go.dev/cl/619076.
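
For readers who want to see the shape of the walk-once approach without opening the CL, here is a rough sketch. It is not the code from the CL. It assumes a toolchain where os.Root has shipped (Go 1.24 or later), and it deliberately refuses ".." components instead of handling them. It also makes no attempt to handle symlinks whose targets contain "..". The names mkdirAllWalk and demoMkdirAllWalk are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// mkdirAllWalk creates each component of path beneath root, descending one
// level at a time so that every component is looked up only once.
func mkdirAllWalk(root *os.Root, path string, perm os.FileMode) error {
	cur := root
	// Close the last intermediate Root, but never the caller's Root.
	defer func() {
		if cur != root {
			cur.Close()
		}
	}()
	for _, p := range strings.Split(filepath.ToSlash(path), "/") {
		if p == "" || p == "." {
			continue
		}
		if p == ".." {
			return fmt.Errorf("mkdirAllWalk: %q: .. is not handled by this sketch", path)
		}
		// cur.Mkdir(p): an already existing entry is fine at this point.
		if err := cur.Mkdir(p, perm); err != nil && !errors.Is(err, fs.ErrExist) {
			return err
		}
		// Replace cur with cur.OpenRoot(p); this fails if p is not a directory.
		next, err := cur.OpenRoot(p)
		if err != nil {
			return err
		}
		if cur != root {
			cur.Close()
		}
		cur = next
	}
	return nil
}

// demoMkdirAllWalk creates a/b/c twice in a temporary Root and reports
// whether the final directory exists.
func demoMkdirAllWalk() (bool, error) {
	tmp, err := os.MkdirTemp("", "mkdirall-walk")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(tmp)

	root, err := os.OpenRoot(tmp)
	if err != nil {
		return false, err
	}
	defer root.Close()

	for i := 0; i < 2; i++ { // the second pass exercises the "already exists" path
		if err := mkdirAllWalk(root, "a/b/c", 0o755); err != nil {
			return false, err
		}
	}
	fi, err := os.Stat(filepath.Join(tmp, "a", "b", "c"))
	if err != nil {
		return false, err
	}
	return fi.IsDir(), nil
}

func main() {
	fmt.Println(demoMkdirAllWalk())
}
```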

@neild
Contributor Author

neild commented Oct 9, 2024

To paraphrase Jann Horn's comments on openat2, we really should not encourage users to try to do scoped resolution inside an attacker-controlled directory. Making this an explicit part of a stdlib API that can never be removed seems somewhat ill-advised to me. Maybe we can add this later if someone actually needs it?

Root.OpenRoot is useful for efficient path-walking, since it lets you avoid unnecessary redundant lookups. It's also just convenient for many tasks. For example, the MkdirAll implementation I linked above (https://go.dev/cl/619076) uses successive OpenRoot calls to walk down a path.

It is, of course, possible--probable, even--that we're going to discover attacks that we haven't accounted for. I don't see, however, how Root.OpenRoot is any more likely to be vulnerable than Root.Open. It's exactly the same operation; both open a file. And once you have the file opened, a Root created with Root.OpenRoot is no different from one created with os.OpenRoot.

@cyphar

cyphar commented Oct 9, 2024

It sounds like you're describing an implementation of Root.MkdirAll that's something like:

Convert the input path into a new path with all symlinks resolved.
os.MkdirAll this path.

That seems obviously incorrect: Mkdir on a dangling symlink is supposed to fail, but this follows symlinks. It also doesn't seem any simpler than a correct implementation.

The second step is done with a very restricted MkdirAll that doesn't allow .. or symlinks, so it's not vulnerable to attacks. But I now see why you want Root.OpenRoot.

A simple and (I believe) correct implementation would be to take the current os.MkdirAll and convert it to call Root.Mkdir rather than os.Mkdir. This implementation is:

If the input path has a parent directory, recursively MkdirAll the parent.
Mkdir the input path.

That does work but now you're looking at O(n^2) complexity for a path that is potentially attacker-controlled and is not restricted by PATH_MAX.

Also (and to be fair -- this is a little esoteric and might not be a real problem, but) an operation like MkdirAll("a/b/../c/../d/../e/../f/../g/../") will create all of the directories (a/b, a/c, a/d). Sure, you're creating directories anyway but depending on the particular setup, you could imagine this making it easier to do a DoS by creating directories with lots of entries in a single directory (most filesystems struggle with a single directory containing millions of dentries, and this could be used as a way of amplifying such an attack).

Within a Root, this can be fairly inefficient, because each Mkdir may traverse the entire directory tree. A more efficient approach can walk the hierarchy only once:

Unless I'm missing something this also won't work for symlinks that contain .. (or symlinks to symlinks that contain .., etc etc) and that will be harder to handle than the trivial a/../b in the path argument case.

(FWIW this approach also just doesn't work at all for RESOLVE_IN_ROOT but that's not the usecase you have so w/e.)

@cyphar

cyphar commented Oct 9, 2024

I don't see, however, how Root.OpenRoot is any more likely to be vulnerable than Root.Open. It's exactly the same operation; both open a file. And once you have the file opened, a Root created with Root.OpenRoot is no different from one created with os.OpenRoot.

The difference is that Root.OpenRoot is operating on an attacker-controlled directory, while os.OpenRoot is almost always going to be called on a path in an administrator-controlled directory (if you call it on a path in an attacker-controlled directory they can trick you into operating on any host path).

This means an attacker can do rename operations (among other things) on the root itself for Roots created with Root.OpenRoot while they cannot do that on Roots created with os.OpenRoot.

Root.OpenRoot is useful for efficient path-walking, since it lets you avoid unnecessary redundant lookups. It's also just convenient for many tasks.

How about making it private until someone shows up that has a strong usecase for why it should be part of the API in a way that encourages users to do something that is potentially less safe?

@gopherbot
Contributor

Change https://go.dev/cl/619435 mentions this issue: internal/syscall/windows: add Openat

@neild
Contributor Author

neild commented Oct 10, 2024

Unless I'm missing something this also won't work for symlinks that contain .. (or symlinks to symlinks that contain .., etc etc) and that will be harder to handle than the trivial a/../b in the path argument case.

Good point (that'll teach me to put together an example CL without writing tests). The real implementation will need a bit more complexity to handle that case. (I suspect the real implementation will involve generalizing doInRoot from https://go.dev/cl/612136 a bit, since it already handles the necessary bookkeeping.)

How about making it private until someone shows up that has a strong usecase for why it should be part of the API in a way that encourages users to do something that is potentially less safe?

I still don't see how Root.OpenRoot is less safe.

A Root, on platforms with openat, is essentially a file descriptor. Root.Open and Root.OpenRoot both open a file descriptor within a Root; the only difference is the type returned (*File or *Root).

I don't see a scenario where Root.Open and Root.OpenRoot don't share the same set of vulnerabilities: If one can escape the root in some scenario, so can the other. So adding OpenRoot doesn't increase the attack surface. And once a Root is created, it doesn't matter whether it was created by os.OpenRoot or Root.OpenRoot; in both cases, it's just a file descriptor.

gopherbot pushed a commit that referenced this issue Oct 11, 2024
Windows versions of openat and mkdirat,
implemented using NtCreateFile.

For #67002

Change-Id: If43b1c1069733e5c45f7d45a69699fec30187308
Reviewed-on: https://go-review.googlesource.com/c/go/+/619435
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Quim Muntal <quimmuntal@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
@gopherbot
Contributor

Change https://go.dev/cl/620157 mentions this issue: os: use relative paths in a test dir in TestOpenError

gopherbot pushed a commit that referenced this issue Oct 15, 2024
Refactor TestOpenError to use relative paths in test cases,
in preparation for extending it to test os.Root.

Use a test temporary directory instead of system directory
with presumed-known contents.

Move the testcase type and case definitions inline with the test.

For #67002

Change-Id: Idc53dd9fcecf763d3e4eb3b4643032e3003d7ef4
Reviewed-on: https://go-review.googlesource.com/c/go/+/620157
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
@cyphar

cyphar commented Oct 16, 2024

A Root, on platforms with openat, is essentially a file descriptor. Root.Open and Root.OpenRoot both open a file descriptor within a Root; the only difference is the type returned (*File or *Root).

The difference is that by design Root.OpenRoot is creating a file descriptor that an attacker can rename. This ability can lead to attacks in other domains -- there is an infamous chroot breakout that relies on this. I don't see a breakout attack at the moment, I'm just suggesting taking a more cautious approach since you can't remove an API from stdlib.

@TBBle

TBBle commented Oct 17, 2024

Late to the discussion, but I thought I'd point out that the hcsshim project has an implementation of a similar thing for Windows, used when extracting container images to ensure reparse points cannot be used to break out of the container image's data directory.

Because in that case following a reparse point is always invalid, their version is simpler and rejects any reparse points it encounters, so it does not need to resolve or validate the resulting path for traversal issues.

Its public operations are: OpenRoot, OpenRelative, LinkRelative, RemoveRelative, RemoveAllRelative, MkdirRelative, MkdirAllRelative, LstatRelative, and EnsureNotReparsePointRelative. (That last one is a simple helper, roughly "Could I OpenRelative this path or children thereof successfully?")

One specific thing worth calling out is that LinkRelative is actually func LinkRelative(oldname string, oldroot *os.File, newname string, newroot *os.File), i.e. it can be used to create hardlinks between two different Roots, in this case between the currently-extracted container image's data directory, and a parent container image's data directory.

Apart from that and OpenRelative taking Windows-specific flags (it wraps NtCreateFile pretty loosely) the API seems to match the proposal here.

That said, I'm not sure if the hcsshim code could be reimplemented on top of this new feature because of the requirement to reject any reparse points encountered, rather than resolve them.

@gopherbot
Contributor

Change https://go.dev/cl/620576 mentions this issue: internal/syscall/windows: set write access when O_TRUNC is used

gopherbot pushed a commit that referenced this issue Oct 21, 2024
When O_TRUNC is set, Openat ends up calling syscall.Ftruncate, which
needs write access. Make sure write access is not removed when O_TRUNC
and O_APPEND are both set.

Updates #67002.

Change-Id: Iccc470b7be3c62144318d6a707057504f3b74c97
Reviewed-on: https://go-review.googlesource.com/c/go/+/620576
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Damien Neil <dneil@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
@aclements
Member

For consistency with #49580, should the Readlink method be ReadLink? Or maybe this is an argument it should be Readlink in #49580.

For the record, os.Root has a somewhat uncomfortable relationship with the io/fs package, though I think it should have little to no effect on this proposal. Looking back, at some point before #67002 (comment), it looks like we resolved some of this tension by adding an FS() fs.FS method, but I didn't see any reasoning.

There seem to be two questions here: Should os.Root implement fs.FS? And should there be an io/fs interface for file systems that support OpenRoot?

os.Root could implement fs.FS, but unfortunately because Go doesn't support covariant method satisfaction, this would require the methods on os.Root to return interface types rather than concrete types. That would in turn narrow them to the read-only APIs provided by fs, unless the caller performed a cumbersome dynamic type assertion.

I think we could add an fs interface supporting OpenRoot. Akin to the other interfaces in fs, OpenRoot would return an fs.FS. os.Root couldn't implement this interface for the reason given above, though the result of (*Root).FS could.

@neild
Contributor Author

neild commented Oct 21, 2024

I think we should have Root.Readlink/os.Readlink for consistency.

If we want Root to be consistent with io/fs.ReadLinkFS, then we should make os.Readlink consistent as well. Which I guess is either add os.ReadLink and deprecate os.Readlink, or use Readlink in io/fs. I think using Readlink everywhere is the least amount of ecosystem churn, so I'd lean that way.

As you say, os.Root can't implement io/fs.FS (unless we want to take a detour into adding covariant method satisfaction). The Root.FS method provides a traversal-resistant alternative to os.DirFS.

It might make sense to have an io/fs interface for OpenRoot, but I think that can be a separate proposal. There are many other operations (mostly writes) supported by a Root that don't have an io/fs equivalent.

@neild
Contributor Author

neild commented Oct 22, 2024

Should Root have a ReadDir method? The current proposal (#67002 (comment)) doesn't have one.

You can currently list the contents of a Root with Root.Open(".") to get the directory file, and then File.ReadDir. A Root.ReadDir method would be more convenient and efficient.

If we do have a Root.ReadDir, should it match the signature of io/fs.ReadDirFS.ReadDir? (I'm guessing yes.)

func (r *Root) ReadDir(name string) ([]DirEntry, error)

@aclements
Member

We can add (*Root).Link. ReadDir is another question, but that can easily be a minor follow-up if necessary and as @neild pointed out, you do have access to File.ReadDir.

There's been a fair amount of discussion since I checked for comments. Any further comments?

@ianthehat

Presumably you can just use fs.ReadDir(root.FS(), name) which seems short enough already to not need more API.
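
To compare the two ways of listing a Root mentioned in this part of the thread, here is a small sketch: fs.ReadDir on root.FS(), and Root.Open(".") followed by File.ReadDir. It assumes a toolchain where os.Root has shipped (Go 1.24 or later); listRoot is a hypothetical helper and the file names are arbitrary.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
)

// listRoot lists a temporary Root's contents in two ways and returns both
// name lists in sorted order.
func listRoot() (viaFS, viaOpen []string, err error) {
	tmp, err := os.MkdirTemp("", "root-readdir")
	if err != nil {
		return nil, nil, err
	}
	defer os.RemoveAll(tmp)
	for _, name := range []string{"b.txt", "a.txt"} {
		if err := os.WriteFile(filepath.Join(tmp, name), nil, 0o644); err != nil {
			return nil, nil, err
		}
	}

	root, err := os.OpenRoot(tmp)
	if err != nil {
		return nil, nil, err
	}
	defer root.Close()

	// Option 1: go through the fs.FS view; fs.ReadDir sorts by file name.
	entries, err := fs.ReadDir(root.FS(), ".")
	if err != nil {
		return nil, nil, err
	}
	for _, e := range entries {
		viaFS = append(viaFS, e.Name())
	}

	// Option 2: open the directory itself and read it as a file.
	dir, err := root.Open(".")
	if err != nil {
		return nil, nil, err
	}
	defer dir.Close()
	dirEntries, err := dir.ReadDir(-1)
	if err != nil {
		return nil, nil, err
	}
	for _, e := range dirEntries {
		viaOpen = append(viaOpen, e.Name())
	}
	sort.Strings(viaOpen) // File.ReadDir returns entries in directory order

	return viaFS, viaOpen, nil
}

func main() {
	fmt.Println(listRoot())
}
```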

@aclements
Member

Have all remaining concerns about this proposal been addressed?

The proposal is:

package os

// Root represents a directory.
//
// Methods on Root can only access files and directories within that directory.
// If any component of a file name passed to a method of Root references a location
// outside the root, the method returns an error.
// File names may reference the directory itself (.).
//
// File names may contain symbolic links, but symbolic links may not
// reference a location outside the root.
// Symbolic links must not be absolute.
//
// Methods on Root do not prohibit traversal of filesystem boundaries,
// Linux bind mounts, /proc special files, or access to Unix device files.
//
// Methods on Root are safe to be used from multiple goroutines simultaneously.
//
// On most platforms, creating a Root opens a file descriptor or handle referencing
// the directory. If the directory is moved, methods on Root reference the original
// directory.
//
// Root's behavior differs on some platforms:
//
//   - When GOOS=windows, file names may not reference Windows reserved device names
//     such as NUL and COM1.
//   - When GOOS=js, Root is vulnerable to TOCTOU (time-of-check-time-of-use)
//     attacks in symlink validation, and cannot ensure that operations will not
//     escape the root.
//   - When GOOS=plan9 or GOOS=js, Root does not track directories across renames.
//     On these platforms, a Root references a directory name, not a file descriptor.
type Root struct { ... }

func OpenRoot(dir string) (*Root, error)
func (*Root) FS() fs.FS
func (*Root) OpenFile
func (*Root) Create
func (*Root) Open
func (*Root) OpenRoot
func (*Root) Close
func (*Root) Mkdir
func (*Root) Remove
func (*Root) MkdirAll
func (*Root) RemoveAll
func (*Root) Chmod
func (*Root) Chown
func (*Root) Chtimes
func (*Root) Lchown
func (*Root) Lstat
func (*Root) Readlink
func (*Root) Rename
func (*Root) Stat
func (*Root) Symlink
func (*Root) Link
func (*Root) Truncate

func OpenInRoot(dir, name string) (*File, error) {
   r, err := OpenRoot(dir)
   if err != nil { return nil, err }
   return r.Open(name)
}
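
A usage sketch for the convenience function, assuming a toolchain where os.OpenInRoot has shipped (Go 1.24 or later). The directory layout and the helper name demoOpenInRoot are invented for the example. Only the presence of an error is checked, not its text.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// demoOpenInRoot reports whether a file inside the root opens and whether
// a "../" path that points outside the root is refused.
func demoOpenInRoot() (insideOK, escapeBlocked bool, err error) {
	tmp, err := os.MkdirTemp("", "openinroot")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(tmp)

	root := filepath.Join(tmp, "root")
	if err := os.Mkdir(root, 0o755); err != nil {
		return false, false, err
	}
	if err := os.WriteFile(filepath.Join(root, "in.txt"), []byte("in"), 0o644); err != nil {
		return false, false, err
	}
	// A file that exists on disk but lives outside the root.
	if err := os.WriteFile(filepath.Join(tmp, "secret.txt"), []byte("s"), 0o644); err != nil {
		return false, false, err
	}

	if f, err := os.OpenInRoot(root, "in.txt"); err == nil {
		insideOK = true
		f.Close()
	}
	if f, err := os.OpenInRoot(root, "../secret.txt"); err != nil {
		escapeBlocked = true
	} else {
		f.Close()
	}
	return insideOK, escapeBlocked, nil
}

func main() {
	fmt.Println(demoOpenInRoot())
}
```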
