Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the Id parameter to recognize spaces #3388

Closed
ishank011 opened this issue Mar 24, 2022 · 25 comments
Closed

Use the Id parameter to recognize spaces #3388

ishank011 opened this issue Mar 24, 2022 · 25 comments

Comments

@ishank011
Copy link
Contributor

if baseURL != nil {
// TODO read from StorageSpace ... needs Opaque for now
// TODO how do we build the url?
// for now: read from request
webDavURL := baseURL.String() + space.Root.StorageId
drive.Root.WebDavUrl = &webDavURL
}

Why aren't we using space.Id anywhere? Just recognizing a space by a single ID is wrong as there can be clashes across storage providers, and not all providers will be using UUIDs.

@ishank011
Copy link
Contributor Author

@butonic @C0rby @micbar

@ishank011
Copy link
Contributor Author

Also, even if we want to use a single ID across all providers, it should be the OpaqueId, not StorageId

@micbar
Copy link
Contributor

micbar commented Mar 24, 2022

@ishank011 The StorageID is the spaceID and must be unique

@micbar
Copy link
Contributor

micbar commented Mar 24, 2022

The resourceID is always StorageID!OpaqueID

@ishank011
Copy link
Contributor Author

The resourceID is always StorageID!OpaqueID

I think you mean Id, right? The Root object should correspond to the reva resource IDs.

@micbar
Copy link
Contributor

micbar commented Mar 24, 2022

  1. What is a storage Space?

A storage space is a Folder on a storage provider which is marked as a space Root. It is used as a root reference for relative paths inside a root. A Single Storage Provider can host unlimited numbers of storage Spaces

  1. What is a spaceID

The spaceID is the ID of the RootFolder of a space. This folder has a ResourceID. The ResourceID of a SpaceRoot always consists the same StorageID and OpaqueID. That means on a Space Root StorageID == OpaqueID. Therefore the space id only uses the ID once. For example a spaceID could look like f7fbf8c8-139b-4376-b307-cf0a8c2d0d9c!f7fbf8c8-139b-4376-b307-cf0a8c2d0d9c. On a space both IDs are equal so we only need to use one.

@butonic
Copy link
Member

butonic commented Mar 24, 2022

  1. An id in the graph API is an opaque string
  2. We construct it by concatenating the CS3 {storageid}!{opaqueid}.
  3. When dealing with a space root the storageid and opaqueid are actually the same.
  4. To shorten the id we allow omitting the opaqueid
  5. Shares that are represented as a space have storageid that identifies the space and an opaqueid that together identify the shared resource.
  6. The spaceid always follows the same pattern: {storageid}!{opaqueid} of the root resource id.

@butonic
Copy link
Member

butonic commented Mar 24, 2022

@ishank011 we should probably have a quick call to clarify things?

@labkode
Copy link
Contributor

labkode commented Mar 28, 2022

@butonic calendar invite sent

@micbar
Copy link
Contributor

micbar commented Mar 28, 2022

@labkode @ishank011 Please add more context to the question. From my side, all questions are anwered.

@micbar
Copy link
Contributor

micbar commented Mar 28, 2022

All Storage providers need to provide UUIDs. That is a hard requirement. We cannot rely on Inodes or something which is not a "logical unique identifier". SpaceIDs must survive cross storage moves.

@ishank011
Copy link
Contributor Author

@micbar migration scenarios should be handled separately in case of conflicts IMO, given the frequency of such events happening compared to how much we use space lookups.

In our deployment, counting each share as a space, we have 500k+ spaces and for each metadata call, retrieving all of those and checking that there's an ID match is not a good idea at all. Also, given how heterogenous our drivers are, implementing this functionality in all of those adds a pretty big overhead.

@micbar
Copy link
Contributor

micbar commented Mar 28, 2022

So you consider shares as spaces? IMO this is different from our view. We consider Storage Spaces and Shares as different entities. Shares are represented on the drives endpoint as Space type "mountpoint".

A Share is

  1. A grant on the resource of the share owner
  2. And a mount point on the share receivers storage.

Please check out #3365

You can see how we work with compound IDs on the grant and the mountpoint.

@ishank011
Copy link
Contributor Author

Even if you have a separate category for shares, the scalability issue still remains.

@micbar
Copy link
Contributor

micbar commented Mar 28, 2022

What do you suggest?

@labkode
Copy link
Contributor

labkode commented Mar 29, 2022

@micbar

See diagram below.

The current implementation does not rely on SpaceID (storageID + opaqueID) to find storages, but solely on the storageID, that is not enough to route a request (broken label in the diagram).

Also, as @ishank011 explained, the way matching is done to identify a space in a storage provider is inefficient.

A spaceID created as StorageID+OpaqueID gives two benefits:
a) provides routing information based on the storageID and
b) provides global uniqueness based on OpaqueID

So the implementation for routing should take into account this layout to avoid this self-inflicted ineficiency, and in case the spaceID does not contain the StorageID, then you fallback to ask every single storage provider to find the space (current implementation).

Regarding the shares, we need to discuss it, either they are spaces and treat it as such or they are a different entity and they are managed in a different way. We need to understand the inner mechanics for implementation in EOS.

spaces-ishank

@micbar
Copy link
Contributor

micbar commented Mar 29, 2022

@labkode @ishank011 The current implementation uses the spaceID

if baseURL != nil {
// TODO read from StorageSpace ... needs Opaque for now
// TODO how do we build the url?
// for now: read from request
webDavURL := baseURL.String() + spaceID
drive.Root.WebDavUrl = &webDavURL
}

@ishank011
Copy link
Contributor Author

@micbar any update after our discussion on how the spaces registry works?

@micbar
Copy link
Contributor

micbar commented Apr 7, 2022

yes, @butonic created new tickets, let me check

@micbar
Copy link
Contributor

micbar commented Apr 7, 2022

this is the one about the registry #3431

@micbar
Copy link
Contributor

micbar commented Apr 7, 2022

And there is a description which @butonic created one year ago about the resourceIDs https://owncloud.dev/extensions/storage/proposedchanges/#space-providers

We just didn't implement it yet.

But we concluded that we would need the storageproviderID also and switch to the static registry also on the edge branch.

IMO this would unblock everything.

@ishank011
Copy link
Contributor Author

@micbar sounds really good. Thanks a lot!

@stale
Copy link

stale bot commented Jun 10, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 10 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status:Stale label Jun 10, 2022
@micbar
Copy link
Contributor

micbar commented Jun 14, 2022

@butonic has this been fixed already?

@stale stale bot removed the Status:Stale label Jun 14, 2022
@stale
Copy link

stale bot commented Aug 13, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 10 days if no further activity occurs. Thank you for your contributions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants