-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for complex file_id #156
base: main
Are you sure you want to change the base?
Conversation
storage_name, file_id = path.split("/", 1) | ||
if storage_name in cls._storages: | ||
return storage_name, file_id | ||
# the trying complex storage name | ||
return tuple(path.rsplit("/", 1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This only supports two cases:
- complex file_id (e.g
folder/file
) and simple storage name - complex storage name (e.g
storage/folder
) and simple file_id
We could add support for both complex storage_name (storage/folder
) and file_id (subfolder/file
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for replying. Initially, I tried adding support for complex storage_name and file_id at the same time, but the challenge I ran into was that using /
as a separator was insufficient to easily identify where the storage part ended and the file part began. At the time, I could not come up with a simple solution that would also be straightforward to the users. So I settled upon the simplest or
approach that covers the case I'm interested in.
To give more context on the challenge of supporting complex storage_name and complex file_id in the same path, consider this example storage/folder_a/folder_b/folder_c/file
. The parts [folder_a
, folder_b
, folder_c
] can belong to either the storage, the file, or both (in different combinations, where complexity increases with the number of sub-folders).
One way to solve it is to introduce a special path separator that indicates that the left part is a storage and the right part is a file, something like //
so users would use storage/folder_a // folder_b/folder_c/file
. It's a simple approach, but it breaks backward compatibility.
Another approach is similar to the current solution: test the combination of sub-paths against storage and return the first matching. However, this will require adding complex logic to rule out possible conflicts, not to mention getting the time complexity into at least O(N) space.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The O(N) complexity is not bad assuming that N<= 10 for most cases. What I am worried about is whether this is supported by Apache libcloud.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I am worried about is whether this is supported by Apache libcloud.
I don't see why libcloud would not support it, as far as the storage name is detected correctly. The tests are passing, and, in my fork, I have it running successfully against S3 as a storage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, in that case feel free to update and include some tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hey @enkidulan , Thanks for the PR. This sounds good and completes the feature in #154.
It's my first opportunity to contribute to it, so I want to let you know that I appreciate your work.
Thank you, it means a lot.
First of all, many thanks for creating and maintaining this library. It's my first opportunity to contribute to it, so I want to let you know that I appreciate your work.
The changes in this PR are somewhat similar to the logic introduced in #149 and #154 that adds support for complex store names.
Complex store names are a great feature, but they don't cover cases where content on storage is grouped by any of the arbitrary properties (like
user_id,
post_id,
or both). This PR adds support for complex file_id, which allows for a more flexible way of organizing files.Let me know what you think, I'm happy to add more test coverage or update the PR as you see fit.