Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Callable supplied to shutil.rmtree(onexc=...) should accept a single argument #103218

Closed
barneygale opened this issue Apr 3, 2023 · 4 comments
Closed
Labels
3.12 bugs and security fixes type-bug An unexpected behavior, bug, or error

Comments

@barneygale
Copy link
Contributor

barneygale commented Apr 3, 2023

The onexc argument was added two weeks ago to replace onerror: d51a6dc

Though it improves the third argument (replacing a 3-tuple with the exception object), I think we should also consider two further improvements while we're in the business of replacing onerror, as the opportunity may not come again for a long time:

  1. The first argument, function, provides an unreasonably deep insight into the internals of rmtree(). The implementation already bends the truth, for example supplying os.path.islink when a junction check fails. Error reporting can already be achieved by printing or re-raising the exception. It's not used in the standard library (e.g. by tempfile). What legitimate use cases are there? I propose we remove it!
  2. The second argument, path, is already be available via exc.filename. This is noted in os.walk()'s documentation of its onerror argument. Also propose we remove it.

If we make onexc accept a single argument, it would align with the onerror argument to os.walk() and os.fwalk(), and provide more relevant (/less redundant) information.

@barneygale barneygale added the type-bug An unexpected behavior, bug, or error label Apr 3, 2023
@barneygale barneygale added the 3.12 bugs and security fixes label Apr 3, 2023
@eryksun
Copy link
Contributor

eryksun commented Apr 3, 2023

The function argument enables the error handler to resolve the error and possibly retry the operation, or a related operation, in order to allow rmtree() to continue.

The implementation already bends the truth, for example supplying os.path.islink when a junction check fails.

To be strictly correct, we could eliminate the _rmtree_islink() call in the top level of shutil.rmtree():

        try:
            if os.path.islink(path):
                # symlinks to directories are forbidden, see bug #1669
                raise OSError("Cannot call rmtree on a symbolic link")
        except OSError as err:
            onexc(os.path.islink, path, err)
            # can't continue even if onexc hook returns
            return
        try:
            if os.path.isjunction(path):
                raise OSError("Cannot call rmtree on a junction")
        except OSError as err:
            onexc(os.path.isjunction, path, err)
            return

In either case, a good way to resolve the error is to delete the symlink or junction reparse point. The upside of calling the error handler with os.path.islink for a junction is that older code might handle the error appropriately via os.unlink(). OTOH, older code would probably just log an error due to os.path.isjunction, and give up.

The second argument, path, is already be available via exc.filename.

Note that the existing OSError call in the above code doesn't include the filename. That would have to be fixed. Also, when _use_fd_functions is true, there are a couple of cases that handle a generic Exception, which could be TypeError or OverflowError (e.g. dir_fd is too big).

@barneygale
Copy link
Contributor Author

The function argument enables the error handler to resolve the error and possibly retry the operation, or a related operation, in order to allow rmtree() to continue.

When is the exception (and file path) not enough to determine that? tempfile uses rmtree() and supplies its own error handler that resets permissions, but it doesn't look at function.

In either case, a good way to resolve the error is to delete the symlink or junction reparse point. The upside of calling the error handler with os.path.islink for a junction is that older code might handle the error appropriately via os.unlink(). OTOH, older code would probably just log an error due to os.path.isjunction, and give up.

That kind of illustrates why I dislike this: it makes the internal implementation part of the function's public interface. We have an opportunity to remove this feature while we're adjusting the error handling in rmtree(), and personally I don't see a good reason to keep it.

@eryksun
Copy link
Contributor

eryksun commented Apr 4, 2023

When is the exception (and file path) not enough to determine that? tempfile uses rmtree() and supplies its own error handler that resets permissions, but it doesn't look at function.

Granted, but I wouldn't hold up tempfile as a shining example. On Windows, its error handler can blow up with a recursion error (see #79325) due to a sharing violation, or any other permission error on Windows that can't be fixed within the limits of the standard library, since the standard library has no support at all for managing file permissions on Windows. I posted a minor modification that avoids the recursion error, but Serhiy never followed up. I don't recall why I added resetperms_funcs. I don't think it's necessary, or even for the better.

Possibly the strongest argument in favor of doing away with the function parameter is that what the path refers to could change before the error handler executes. Or a file could be added to a directory that was empty, wherein a failed rmdir() now has to be handled via rmtree().

@barneygale
Copy link
Contributor Author

barneygale commented May 27, 2023

3.12 beta 1 is out, so there's no longer an opportunity to revise the argument. Closing this ticket!

@barneygale barneygale closed this as not planned Won't fix, can't repro, duplicate, stale May 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 bugs and security fixes type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

2 participants