(recreate_)cmdline encoding / binary requirement #7246
Comments
As much as you expect surrogate-escaped filenames, right? The source paths for data to put into an archive is part of the cmdline (also with borg2, right?), so it could theoretically contain non-UTF-8 and thus surrogate escapes.
Couldn't you have a middle-ground, with shlex.quote'd strings? You could decode the surrogates as |
Yes, you can end up with a path which contains non-UTF-8:
I can imagine this happening when interacting with the storage backing samba shares, for instance. To be clear: I have no use-case for that, just being exact here.
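To make the non-UTF-8 case concrete, here is a minimal sketch. The path and the Latin-1 encoded byte are made up for illustration; on POSIX, Python decodes `sys.argv` with the filesystem encoding and the `surrogateescape` error handler, which the explicit `decode` call below imitates for UTF-8.

```python
# A path as raw bytes (hypothetical example): 0xE9 is Latin-1 "é", not valid UTF-8.
raw = b"/srv/share/caf\xe9.txt"

# surrogateescape maps each undecodable byte to a lone surrogate (0xE9 -> U+DCE9).
text = raw.decode("utf-8", errors="surrogateescape")
print(ascii(text))  # '/srv/share/caf\udce9.txt'

# Such a string is not valid Unicode text, so a strict encode fails ...
try:
    text.encode("utf-8")
except UnicodeEncodeError as exc:
    print("strict encode fails:", exc.reason)

# ... but the original bytes can still be recovered without loss.
assert text.encode("utf-8", errors="surrogateescape") == raw
```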
@horazont ok, so I guess the code as in the PR is ok now. It stores shlex.join(sys.argv) as a surrogate-escaped (s-e) str and later, when generating JSON or screen output, it removes the s-e and adds a _b64 key to the dict that has the base64 encoding of the bytestring.
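A rough sketch of the approach described in this comment, not the PR's actual code. The helper names `remove_surrogates` and `command_line_info` and the key names `command_line` / `command_line_b64` are assumptions made for illustration.

```python
import base64
import json
import shlex


def remove_surrogates(s, errors="replace"):
    """Return s as valid Unicode; lone surrogates become '?' (assumed behavior)."""
    return s.encode("utf-8", errors).decode("utf-8")


def command_line_info(argv):
    """Hypothetical helper: one display string plus a lossless base64 copy."""
    # Stored form: a single shell-quoted string that may contain surrogate escapes.
    command_line = shlex.join(argv)
    # The exact original bytes, recovered via the surrogateescape handler.
    raw = command_line.encode("utf-8", errors="surrogateescape")
    return {
        "command_line": remove_surrogates(command_line),
        "command_line_b64": base64.b64encode(raw).decode("ascii"),
    }


info = command_line_info(["borg", "create", "/srv/caf\udce9"])
print(json.dumps(info))
```

Because the whole command line is one string, a single base64 value is enough here; no per-element list is needed.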
ArchiveItem.cmdline list-of-str -> .command_line str, fixes #7246
borg1: `archive.cmdline` is basically a copy of the `sys.argv` list, and when accessing it, borg1 even expects surrogate escapes in each list element. When outputting the cmdline to screen or to JSON, borg1 removes the surrogate escapes. It also uses `shlex.quote` on each list element and joins the list elements together with blanks in between.

borg2: I'd like to simplify that, just not sure how.
The reason why I want to simplify it is that if we really must have a surrogate-escaped (s-e) string, then the corresponding JSON would need to have `cmdline` and `cmdline_b64` keys (with lists as values) to correctly represent this information without data loss. But this seems overkill. And I could not even base64-encode the whole cmdline in one go; it would have to be a list with base64-encoded elements...

So:

- [...] `sys.argv` anyway?
- [...] `archive.cmdline` and if yes, could we just remove or replace them at `borg transfer` time?
- [...] `cmdline` just be one simple, valid unicode string (without s-e) and not a `sys.argv` list copy?

The same issues apply to `archive.recreate_cmdline` (`borg recreate` uses this).

This is related to #7232 and #6151.
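For comparison, the lossless list-based JSON that the issue calls overkill could look roughly like the sketch below. The function name `lossless_cmdline_json` and the sample arguments are hypothetical; only the `cmdline` and `cmdline_b64` key names come from the text above.

```python
import base64
import json


def lossless_cmdline_json(argv):
    """Hypothetical: represent an argv list with possible surrogate escapes."""
    return {
        # Human-readable copy; lone surrogates are replaced with '?'.
        "cmdline": [a.encode("utf-8", "replace").decode("utf-8") for a in argv],
        # Exact bytes of each element, base64-encoded one by one.
        "cmdline_b64": [
            base64.b64encode(a.encode("utf-8", "surrogateescape")).decode("ascii")
            for a in argv
        ],
    }


argv = ["borg", "create", "repo::arch", "/srv/caf\udce9"]
doc = lossless_cmdline_json(argv)
print(json.dumps(doc, indent=2))

# Round trip: the per-element base64 list restores the original argv exactly.
restored = [
    base64.b64decode(b).decode("utf-8", "surrogateescape")
    for b in doc["cmdline_b64"]
]
assert restored == argv
```

Two parallel lists are needed because the element boundaries of `sys.argv` have to survive; that is the extra complexity a single string would avoid.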