Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for complex file_id #156

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

enkidulan
Copy link

@enkidulan enkidulan commented Aug 7, 2024

First of all, many thanks for creating and maintaining this library. It's my first opportunity to contribute to it, so I want to let you know that I appreciate your work.

The changes in this PR are somewhat similar to the logic introduced in #149 and #154 that adds support for complex store names.

Complex store names are a great feature, but they don't cover cases where content on storage is grouped by any of the arbitrary properties (like user_id, post_id, or both). This PR adds support for complex file_id, which allows for a more flexible way of organizing files.

Let me know what you think, I'm happy to add more test coverage or update the PR as you see fit.

Comment on lines +160 to +164
storage_name, file_id = path.split("/", 1)
if storage_name in cls._storages:
return storage_name, file_id
# the trying complex storage name
return tuple(path.rsplit("/", 1))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only supports two cases:

  • complex file_id (e.g folder/file) and simple storage name
  • complex storage name (e.g storage/folder) and simple file_id

We could add support for both complex storage_name (storage/folder) and file_id (subfolder/file)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for replying. Initially, I tried adding support for complex storage_name and file_id at the same time, but the challenge I ran into was that using / as a separator was insufficient to easily identify where the storage part ended and the file part began. At the time, I could not come up with a simple solution that would also be straightforward to the users. So I settled upon the simplest or approach that covers the case I'm interested in.

To give more context on the challenge of supporting complex storage_name and complex file_id in the same path, consider this example storage/folder_a/folder_b/folder_c/file. The parts [folder_a, folder_b, folder_c] can belong to either the storage, the file, or both (in different combinations, where complexity increases with the number of sub-folders).
One way to solve it is to introduce a special path separator that indicates that the left part is a storage and the right part is a file, something like // so users would use storage/folder_a // folder_b/folder_c/file. It's a simple approach, but it breaks backward compatibility.
Another approach is similar to the current solution: test the combination of sub-paths against storage and return the first matching. However, this will require adding complex logic to rule out possible conflicts, not to mention getting the time complexity into at least O(N) space.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The O(N) complexity is not bad assuming that N<= 10 for most cases. What I am worried about is whether this is supported by Apache libcloud.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I am worried about is whether this is supported by Apache libcloud.

I don't see why libcloud would not support it, as far as the storage name is detected correctly. The tests are passing, and, in my fork, I have it running successfully against S3 as a storage.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, in that case feel free to update and include some tests

Copy link
Owner

@jowilf jowilf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey @enkidulan , Thanks for the PR. This sounds good and completes the feature in #154.

It's my first opportunity to contribute to it, so I want to let you know that I appreciate your work.

Thank you, it means a lot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants