-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cloud implementation of shutil.copy and shutil.copytree #125
Conversation
A few thoughts:
|
I will move the implementation as methods of For the cloud-to-cloud, I was planning indeed to implement differently the case with the same client. And about this, do you have a preferred approach of checking whether the client of |
I think actually that checking for the same instance is the right way to go, i.e., |
cloudpathlib/cloudshutil.py
Outdated
""" | ||
if not src.is_file(): | ||
raise ValueError('src must be a file.') | ||
if isinstance(src, CloudPath) & isinstance(dst, CloudPath): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One little note that all these should be and
not &
Codecov Report
@@ Coverage Diff @@
## master #125 +/- ##
========================================
- Coverage 91.8% 89.8% -2.0%
========================================
Files 19 19
Lines 1050 1076 +26
========================================
+ Hits 964 967 +3
- Misses 86 109 +23
|
) | ||
temp_file = self._local | ||
temp_file.parent.mkdir(parents=True) | ||
self.download_to(temp_file) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of download to a temp_file
, you can do self.fspath
which will use the cached version if it is downloaded and download it to the cache if not. So the lines below can just use that. For example destination.upload_from(self.fspath)
raise ValueError( | ||
f"Origin path {self} must be a directory. To copy a single file use the method copy." | ||
) | ||
if not destination.is_dir(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check might require that destination
already exists, but I don't think that should be a requirement. Maybe use destination.exists() and destination.is_file()
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. is_dir
will return False
if destination
does not exist. And, shutil.copytree
will create the destination directory, so we should mirror that behavior.
Hi @genziano, seems like this is heading in the right direction—excited to get it merged! Anything that we can help with in terms of implementation questions or writing tests? |
Hi @pjbull! I have unfortunately been busy these last weeks, but next week I will work on finishing it up, and hopefully picking up some new issue. Unrelated to this PR, I have a question: do you guys have some examples of usage with Azure blob storage? I'm not sure about how to create paths pointing to azure blob accounts. I'd also be happy to include some examples if needed! |
Is something like this what you mean or also including the authentication/credential flow (which is in the docs) from cloudpathlib import CloudPath, AzureBlobPath
# generic version starting with az:// is dispatched to AzureBlobPath
az_path = CloudPath("az://container/directory/file.txt")
# explicit version
az_path = AzureBlobPath("az://container/directory/file.txt") |
Yes thanks, exactly that. I didn't figure out that the account name was just part of the connection string. |
Hello! |
* Cloud implementation of shutil.copy and shutil.copytree (#125) * Draft of cloudpathlib.copy, that replicates shutil.copy * Import only PathLike from os * Reuse _local_to_cloud_copy for cached files * try to use the intended cached _local version * Update cloudshutil.py * Implemented the CloudPath methods: upload_from, copy, copytree * Add docstrings * Fix indentation * fix unnecessary imports * Fix typing issues * Allow CloudPath.copy to a directory as well * Add remove_src option to enable copy * update copy, copytree, upload_from * Fixes for bugs from writing tests * Tests for copy, copytree, upload * Mypy checks * Add sleeps to ensure write times in tests * Update functions table * extra sleep time for live tests * Address code review comments * Add chagelog updates for release Co-authored-by: Alessandro Gentile <alemaudit@gmail.com>
Closes #108 and closes #58.
Work in progress.
Three methods of
CloudPath
implemented:upload_from
, to upload a local file to a cloud file path;copy
, to copy a cloud file to a cloud file or directory path;copytree
, to copy a cloud directory path to another cloud directory path recursively.Still to implement
copy
differently, which avoids download / upload when the client is the same.(I will take care of the tests after we're set with the changes.)