Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(un)zstd deletes files it shouldn't #1082

Closed
pdknsk opened this issue Mar 30, 2018 · 9 comments
Closed

(un)zstd deletes files it shouldn't #1082

pdknsk opened this issue Mar 30, 2018 · 9 comments
Assignees
Labels

Comments

@pdknsk
Copy link

pdknsk commented Mar 30, 2018

$ touch FILE
$ unzstd -f FILE.zst
zstd: FILE.zst is not a regular file -- ignored 
$ ls FILE
ls: cannot access FILE: No such file or directory

This happens with any suffix zstd knows.

@Cyan4973
Copy link
Contributor

Cyan4973 commented Mar 30, 2018

Indeed, thanks for reporting @pdknsk .

This is related to the use of -f, which tells zstd to overwrite any destination file without asking.
Otherwise, a prompt zstd: FILE already exists; overwrite (y/N) ? would require user input.

This situation happens because zstd opens the destination file first, and the source(s) after.
The intention is to make it possible to pipe multiple sources into a single stdout, as closing/opening multiple times stdout doesn't work.
Since -f tells zstd to erase destination without questioning, it does that first, and only then realizes that source file does not exist.

We'll have to reverse the logic to avoid this issue.

@Cyan4973 Cyan4973 added the bug label Mar 30, 2018
@pdknsk
Copy link
Author

pdknsk commented Mar 30, 2018

I only used -f for brevity of the example. It also happens without it, except that you get prompted first, as you mentioned. Since the prompt is misleading and doesn't warn that the file is to be deleted, it's not much help in this case.

@pdknsk
Copy link
Author

pdknsk commented Mar 30, 2018

Another scenario when the file shouldn't be deleted, caused by the same problem.

$ touch FILE FILE.zst
$ unzstd -f FILE.zst
zstd: FILE.zst: unexpected end of file 
$ ls FILE
ls: cannot access FILE: No such file or directory

@pdknsk pdknsk changed the title unzstd deletes files it shouldn't (un)zstd deletes files it shouldn't Mar 30, 2018
@pdknsk
Copy link
Author

pdknsk commented Mar 30, 2018

Also for zstd.

$ touch FILE.zst
$ zstd -f FILE
zstd: FILE is not a regular file -- ignored
$ ls FILE.zst
ls: cannot access FILE.zst: No such file or directory

@Cyan4973
Copy link
Contributor

All these scenarios have a same root cause, opening destination file before source file.

However, scenario 2 is bit more complex, because source file exists, but is malformed.
Thus, not overwriting destination file requires both that source file exist and that it is decodable (at least at the beginning).

That would delay opening destination file to way later in the decoding process, something that the current framework doesn't allow easily.

Therefore, I'll concentrate first on fixing cases 1 and 3.

@eyherabh
Copy link

eyherabh commented Apr 2, 2018

Would there be anything wrong with creating temporary files and moving (-f) them afterwards or deleting them upon error? Seems to me that it solve all the issues, but is there something I am missing? slow in some systems? portability? Thanks for the advice.

@Cyan4973
Copy link
Contributor

Cyan4973 commented Apr 6, 2018

I guess it would work @eyherabh .

On the negative side, it's a little bit of extra work for the file system.
I guess in most cases it wouldn't be noticeable, except maybe when there are thousands of small files to process quickly.

More importantly, during operation, you would have both FILE and tmp present on storage, with tmp growing, and then FILE entry being erased when tmp is completed and changes name. But if user is a bit low on storage, and does not expect FILE being effectively present twice, that can make a difference (especially if FILE is large).

Also, note that gzip, xz, compress and such do not guarantee recovering the previous file if the newly produced file with same name does not complete (compress or decompress) properly. In which case, the old file is lost.
But that's in essence what -f means : authorization to overwrite.

@Cyan4973
Copy link
Contributor

Cyan4973 commented Apr 6, 2018

I only used -f for brevity of the example. It also happens without it, except that you get prompted first, as you mentioned. Since the prompt is misleading and doesn't warn that the file is to be deleted, it's not much help in this case.

I'm surprised by this statement @pdknsk .
I made a simple verification test :

$ touch FILE
$ unzstd FILE.zst
zstd: FILE already exists; overwrite (y/N) ? n
    not overwritten
$ ls FILE
FILE

The original file is still there.

@pdknsk
Copy link
Author

pdknsk commented Apr 7, 2018

​Yes, I meant when the user confirms to overwrite (delete in this case).

@Cyan4973 Cyan4973 self-assigned this Sep 26, 2018
Cyan4973 added a commit that referenced this issue Oct 1, 2018
when source file does not exist (#1082)
Cyan4973 added a commit that referenced this issue Oct 2, 2018
@Cyan4973 Cyan4973 mentioned this issue Oct 2, 2018
Cyan4973 added a commit that referenced this issue Oct 2, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants