Being able to use destination folders that already exist #8

Open
HugoGranstrom opened this issue Feb 9, 2021 · 2 comments

Comments

@HugoGranstrom

It would be a nice feature to be able to choose a destination folder that already exists. For example:

someDir/
	- myZip.zip

And extract it like this:

someDir/
	- myZip.zip
	- the content of myZip.zip

The reason for this is the inconvenience of having to nest the destination in a new folder like this (plus the zip file may itself contain a folder as the top element):

someDir/
	- myZip.zip
	- myUnzippedFolder/
		- the content of myZip.zip

If you don't want to deal with overwriting files that already exist, you could perhaps require the user to clean away all old copies before extracting, and raise an exception if you stumble upon a clash.

What do you think? 😄

@guzba
Owner

guzba commented May 28, 2021

Hey, so I did think about this a while ago and realized I never put my thoughts down here as a reply.

Currently, when extracting, a temporary directory is created and all of the files are written there first. Then this directory is moved to the final directory path.

This way, if something fails, there isn't a random half-finished extraction polluting the directory. It will either fully work or fully fail.

This requires the destination path not existing at the start.
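To make the approach described above concrete, here is a minimal sketch in Python. It is not the library's actual code: the function name `extract_all_or_nothing` is hypothetical, and placing the temporary directory next to the destination is an assumption made so that the final rename stays on one filesystem.

```python
import os
import shutil
import tempfile
import zipfile


def extract_all_or_nothing(zip_path, dest):
    """Extract zip_path to dest, which must not exist yet (hypothetical helper)."""
    dest = os.path.abspath(dest)
    if os.path.exists(dest):
        raise FileExistsError(f"destination already exists: {dest}")

    # Assumption: the temp dir lives next to dest, so the rename below
    # does not have to cross a filesystem boundary.
    parent = os.path.dirname(dest)
    tmp = tempfile.mkdtemp(prefix=".extract-", dir=parent)
    try:
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall(tmp)
        # One rename publishes the whole tree; dest never holds a partial result.
        os.rename(tmp, dest)
    except BaseException:
        # Any failure: remove the temp dir so nothing is left behind.
        shutil.rmtree(tmp, ignore_errors=True)
        raise
```

The single rename at the end is what makes the result all-or-nothing, and it is also why the destination cannot already contain other files.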

Alternatively, to allow the path to already exist, I could either:

  1. Just write the files directly to the target directory, and if it fails partway through, that's just how it goes. There are some files there but not all of them. It's the developer's job to handle that.

  2. Write everything to the tmp dir to ensure the extraction worked, but then instead of atomically moving the entire directory to the destination, I do this one file at a time. It's probably fine, but could still end up in a strange failure where not everything is in the correct place.
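Option 2 could look roughly like the following Python sketch. The name `extract_then_move_each` is hypothetical, and raising on a name clash is just one possible policy chosen for illustration.

```python
import os
import shutil
import tempfile
import zipfile


def extract_then_move_each(zip_path, dest):
    """Extract to a temp dir, then move files into dest one by one (sketch)."""
    os.makedirs(dest, exist_ok=True)
    moved = []
    with tempfile.TemporaryDirectory() as tmp:
        # Step 1: a corrupt or unreadable archive fails here, before dest is touched.
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall(tmp)
        # Step 2: not atomic. An error in this loop leaves dest partially populated.
        for root, _dirs, files in os.walk(tmp):
            rel_dir = os.path.relpath(root, tmp)
            target_dir = os.path.normpath(os.path.join(dest, rel_dir))
            os.makedirs(target_dir, exist_ok=True)
            for name in files:
                target = os.path.join(target_dir, name)
                if os.path.exists(target):
                    raise FileExistsError(f"already exists: {target}")
                shutil.move(os.path.join(root, name), target)
                moved.append(target)
    return moved
```

The strange failure mentioned in option 2 is visible in step 2: if the loop stops on the third file, the first two have already been moved and stay in the destination.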

In either of these cases, what happens if a file already exists with the same name as the file being uncompressed? Does it overwrite or fail? This question is avoided entirely by ensuring a clean extraction.

This is all important because zip archives are not required to have a root folder, as mentioned above. Having one just happens to be fairly common, but it cannot be counted on.

Let me know what you think of these possible problems.

@HugoGranstrom
Author

Hi, nice to hear your thoughts on this 😄

I agree that the way you are doing it right now is very safe and robust, just not as flexible as one could want. I also agree that we shouldn't pollute the destination upon failure, so option 2 seems the most interesting to me. What possible problems do you see with moving the files one by one from the temp dir to the dest? It feels like moving files should be a pretty safe thing to do? 🤔

On the matter of existing files, I think it's good to give the option to the user. If they want to overwrite (opt-in), then overwrite. If they don't want to (default), raise a catchable exception if an existing file is found. Then the user has the responsibility to clean it up before trying again. How to handle it, or rather when, is a more difficult question. Do we check up front that none of the zip's entries already exist in the destination, or do we check while copying the files? The first would reduce pollution, but it may be a bit less efficient to loop through the zip's contents twice?
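For the check-first variant, a rough Python sketch might look like this. The helper name `find_clashes` is hypothetical, and whether the library exposes an equivalent entry listing is an assumption. With Python's `zipfile`, listing entries reads only the archive's central directory, so nothing is decompressed during the check.

```python
import os
import zipfile


def find_clashes(zip_path, dest):
    """Return archive entries whose target path already exists under dest (sketch)."""
    clashes = []
    with zipfile.ZipFile(zip_path) as zf:
        # infolist() comes from the central directory; no file data is read.
        for info in zf.infolist():
            if info.is_dir():
                continue  # an existing directory can simply be reused
            # Zip entry names use "/" as separator. Note: this sketch does not
            # sanitize names such as "../x"; a real implementation should.
            target = os.path.join(dest, *info.filename.split("/"))
            if os.path.lexists(target):
                clashes.append(info.filename)
    return clashes
```

A caller could raise if the returned list is non-empty and overwriting was not requested, before writing anything to the destination.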

So my personal wishlist, and I may be greedy 😜, is the following solution:

  • Extract to temp dir and copy it to dest dir.
  • The user chooses whether to overwrite or raise exception upon pre-existing files.
  • The user is responsible for cleaning up the dest dir in case of exception.

How does this sound?

2 participants