Please make sync more robust when encountering inaccessible directories or files #1040

Closed
willbprog127 opened this issue Aug 12, 2024 · 6 comments · Fixed by Backblaze/b2-sdk-python#508

Comments

@willbprog127

Greetings,

While using b2-windows version 4.1.0 on Windows 11 (x86_64), the program dies if it encounters directories or files it doesn't have access to. This is undesirable and renders the program almost useless for some use cases. Instead of dying, it should just skip the directory or file that it can't access and move on to the next one.

If this lack of robustness is by design, can you please add a command-line option to disable it?

Command-line (run in terminal as an administrator):

b2-windows sync --skip-newer --threads 1 --no-progress "C:\Users" b2:bucket/parentfolder/Users

Here is the sample error output:

Traceback (most recent call last):
  File "b2\_internal\b2v4\__main__.py", line 13, in <module>
  File "b2\_internal\console_tool.py", line 5529, in main
  File "b2\_internal\console_tool.py", line 5402, in run_command
  File "b2\_internal\console_tool.py", line 1070, in run
  File "b2\_internal\console_tool.py", line 3165, in _run
  File "b2sdk\_internal\sync\sync.py", line 214, in sync_folders
  File "b2sdk\_internal\sync\sync.py", line 263, in _make_folder_sync_actions
  File "b2sdk\_internal\scan\scan.py", line 48, in zip_folders
  File "b2sdk\_internal\scan\folder.py", line 152, in all_files
  File "b2sdk\_internal\scan\folder.py", line 258, in _walk_relative_paths
  File "b2sdk\_internal\scan\folder.py", line 258, in _walk_relative_paths
  File "b2sdk\_internal\scan\folder.py", line 239, in _walk_relative_paths
  File "pathlib.py", line 1056, in iterdir
PermissionError: [WinError 5] Access is denied: '\\\\?\\C:\\Users\\All Users\\Application Data'
[12500] Failed to execute script '__main__' due to unhandled exception!

Then the program exits instead of moving on to the next directory.

While setting --exclude-dir-regex will help in some instances, the command-line tool should be robust enough to skip inaccessible directories or files even if no excludes are specified.

Thanks. 👍

@willbprog127
Author

Since the command-line tool does not appear to be working right, I tried doing this with b2sdk and it's doing the exact same thing...

Traceback (most recent call last):
  File "C:\Users\Will\Projects\b2-sdk-python-backup\backup-and-b2-sdk-laptop.py", line 204, in <module>
    syncer.sync_folders(src_obj, dest_obj, now_millis, None, encryption_settings_provider=encrypt_provider)
  File "C:\Program Files\Python312\Lib\site-packages\b2sdk\_internal\sync\sync.py", line 214, in sync_folders
    for action in self._make_folder_sync_actions(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\site-packages\b2sdk\_internal\sync\sync.py", line 263, in _make_folder_sync_actions
    for source_path, dest_path in zip_folders(
                                  ^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\site-packages\b2sdk\_internal\scan\scan.py", line 48, in zip_folders
    current_a = next(iter_a, None)
                ^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\site-packages\b2sdk\_internal\scan\folder.py", line 152, in all_files
    yield from sorted(local_paths, key=lambda lp: lp.relative_path)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\site-packages\b2sdk\_internal\scan\folder.py", line 258, in _walk_relative_paths
    yield from self._walk_relative_paths(
  File "C:\Program Files\Python312\Lib\site-packages\b2sdk\_internal\scan\folder.py", line 258, in _walk_relative_paths
    yield from self._walk_relative_paths(
  File "C:\Program Files\Python312\Lib\site-packages\b2sdk\_internal\scan\folder.py", line 258, in _walk_relative_paths
    yield from self._walk_relative_paths(
  File "C:\Program Files\Python312\Lib\site-packages\b2sdk\_internal\scan\folder.py", line 239, in _walk_relative_paths
    for local_path in sorted(local_dir.iterdir()):
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\pathlib.py", line 1056, in iterdir
    for name in os.listdir(self):
                ^^^^^^^^^^^^^^^^
PermissionError: [WinError 5] Access is denied: '\\\\?\\C:\\Users\\Guy\\AppData\\Local\\Application Data'
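
For illustration only, here is a minimal sketch of the skip-and-continue behavior requested above. It is not the b2sdk implementation: `walk_files`, `_list_dir`, and `demo` are hypothetical names, and the "access denied" condition is simulated with an injected listing function so the example runs on any platform. The idea is simply to catch `PermissionError` around the directory listing (the `iterdir()` call that raises in the traceback), remember the directory, and move on.

```python
import tempfile
from pathlib import Path
from typing import Callable, Iterator, List, Tuple


def _list_dir(directory: Path) -> List[Path]:
    """Default listing: sorted directory entries, as in the traceback above."""
    return sorted(directory.iterdir())


def walk_files(
    root: Path,
    skipped: List[Path],
    list_dir: Callable[[Path], List[Path]] = _list_dir,
) -> Iterator[Path]:
    """Yield files under root; record unreadable directories instead of raising."""
    try:
        entries = list_dir(root)
    except PermissionError:
        # Hypothetical policy: remember the directory and keep going.
        skipped.append(root)
        return
    for entry in entries:
        if entry.is_dir():
            yield from walk_files(entry, skipped, list_dir)
        else:
            yield entry


def demo() -> Tuple[List[str], List[str]]:
    """Build a small tree and simulate one inaccessible directory."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "open").mkdir()
        (root / "open" / "a.txt").write_text("a")
        (root / "locked").mkdir()
        (root / "locked" / "secret.txt").write_text("s")
        (root / "b.txt").write_text("b")

        def deny_locked(directory: Path) -> List[Path]:
            # Simulated failure standing in for "[WinError 5] Access is denied".
            if directory.name == "locked":
                raise PermissionError(5, "Access is denied", str(directory))
            return _list_dir(directory)

        skipped: List[Path] = []
        found = [
            p.relative_to(root).as_posix()
            for p in walk_files(root, skipped, deny_locked)
        ]
        return found, [p.name for p in skipped]


if __name__ == "__main__":
    print(demo())
```

Collecting the skipped directories in a list, rather than silently dropping them, leaves the caller free to decide whether a partial scan should count as a failure.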

@ppolewicz
Collaborator

Hey,

It is a rare requirement; it didn't really come up often (if ever) in the first six or so years since sync was implemented. I agree that it might be useful in some use cases to skip over files which are not accessible, though by default it should fail the job, as a job completing with nothing done whatsoever (because it didn't have access to anything, so it skipped everything) and a success return code would be very confusing.

Do you have arbitrary files/directories that you have no permissions to in the directory that you are trying to back up? Because if there is a fixed list, you can use --exclude-dir-regex and it'd be safer (as explained above).

@willbprog127
Author

> though by default it should fail the job as a job completing with nothing done whatsoever (because it didn't have access to anything, so skipped anything)

Unfortunately this will not work in my case. As shown in the command-line above, it is supposed to sync all of C:\Users, so some folders are accessible while others are not. Shouldn't sync try to do everything it possibly can before giving up? I am thinking of apps like 7-Zip that will just log an error then continue -- 7-Zip's mode of failure is robust compared to how b2 sync is behaving. A command-line option to enable skipping errors would allow maximum utility for all users.

> Do you have arbitrary files/directories that you have no permissions to in the directory that you are trying to back up? Because if there is a fixed list, you can use --exclude-dir-regex and it'd be safer (as explained above).

My plan was to deploy this on many of my customer computers, so fiddling with excludes would be time-consuming and not very flexible because of the various setups.

I guess I will look at moving my customers off of b2 and come up with some other solution. Some customers have been waiting for this to be resolved and are growing impatient with me.

Thank you for replying Paweł! 👍

@willbprog127
Author

@mjurbanski-reef Thank you so much! 😄

@mjurbanski-reef
Contributor

@willbprog127 Please note that even if you feel the B2 CLI/SDK is not working as you wish and are considering switching services, you can also use S3-compatible tools with B2 through https://www.backblaze.com/docs/cloud-storage-s3-compatible-api. We encourage use of the native CLI and SDK, but the option is there.

That being said, we investigated this some more and found it to be a regression. Hence, a bugfix to the SDK has been released in b2sdk==2.5.1, which will be included in the next CLI release.
The expected behavior is that the synchronization will continue, but warnings will be printed to stdout.
In the b2v4 CLI (which the b2 binary is an alias of), the exit code will also be set to 1 whenever permission problems are encountered.
The b2v3 binary will keep the 0 exit code regardless of these warnings.
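
As a rough sketch of the exit-code policy described above (not the actual CLI code), the function below prints a warning per inaccessible path and returns 1 only when strict mode is on. The name `finish_sync`, the `strict_exit_code` parameter, and the warning wording are all assumptions made for illustration; `strict_exit_code=True` stands in for the b2v4 behavior and `False` for b2v3.

```python
from typing import Iterable


def finish_sync(skipped: Iterable[str], strict_exit_code: bool) -> int:
    """Report skipped paths on stdout and choose a process exit code."""
    warnings = list(skipped)
    for path in warnings:
        # Illustrative message only; the real tool's wording may differ.
        print(f"WARNING: {path} could not be accessed")
    # Strict mode (b2v4-like): any permission problem yields exit code 1.
    # Lenient mode (b2v3-like): warnings never change the exit code.
    return 1 if warnings and strict_exit_code else 0
```

A script wrapping the sync could pass the returned value to `sys.exit()` so that schedulers and monitoring can distinguish a complete run from a partial one.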

@willbprog127
Author

@mjurbanski-reef Thanks again. I will investigate further when I have some time. I appreciate the attention to this! 👍
