-
Notifications
You must be signed in to change notification settings - Fork 113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize path based space operations and regex based registry #2862
Conversation
af3eb6e
to
8b3f759
Compare
SpaceType: "+mountpoint", | ||
}, | ||
}, | ||
} | ||
|
||
spaces, err = r.findStorageSpaceOnProvider(ctx, p.Address, filters, unrestricted) | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ishank011 seems like the nextcloud server mock needs to adapt. |
|
||
// Prepend the mount path of the storage provider if the request was for an absolute reference | ||
if ref != nil && !utils.IsRelativeReference(ref) { | ||
ref.Path = path.Join(mountPath, ref.Path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this makes storage providers aware of their mount point. which we deliberatly removed because shares inevitabley create multiple valid paths for the same resource. But this path must not be exposed to prevent share receivers from getting to know how someoneelse organized his files ...
With spaces I tried to make mountpoint
and grant
type spaces distinguishable. We are currently looking at the /dav/spaces
endpoint to make it possible to use a full resuorceid, including storageproviderid, spaceid and nodeid to represent the root from which to further navigate down the tree with the relative path that follows the spaceid.
That will unify how shares as well as spaces are accessed using the /dav/spaces
endpoint. This became clear when we were fleshing out the share jail in ocis web and the desktop client. The web client has to discover the granted resources because it needs to use the same fileid to start collaborative editing sessions. Mountpoints actually represent different resources, which is why the mountpoint
spaces in the /me/drives
endpoint also carry the remoteItem
property which holds the fileid for the actual shared resource, aka the grant
space.
I think one clarification is that grants cannot have an absolute path, all you can do is expose the root of the grant and then navigate relative to it. You must not be able to escape that jail.
For path based access we can take that relative reference to the sharesstorage provider, which will create a path for it.
But when accessing a normal storage provider, the storage space registry is the only place that should know the path prefix for that storage provider. The gateway can then wrap and unwrap paths. What the storage provider is responsible for is a path from his conceptual root to a storage space:
/users/e/einstein/relative/path
consists of
{storage_path}/{space_path}/{relative_path}
where
{storage_path}
is/users
, managed by the registy, wrapped an unwrapped by the gateway{space_path}
ise/einstein
, an alias for the personal space of einstein, managed by the storage provider{relative_path}
,relative/path
the path to a resource relative to einsteins personal space
Now this could be configured with regexes, so that /users/[a-k]
is routed to a specific storage provider. the registry would still use the mount_point
property for that space and cut off only /users/
, the storage provider would then have to navigate to the space using e/einstein
. It would alse prefix the relatuve path with the space_path
and the gateway would prefix /users/
again.
Or the mount_point
includes the regex, then the space_path
that reaches the storage provider becomes just einstein
and the gateway prefixes the request with what it cut off before forwarding the request to the storage provider.
It might just be terminology, but the storage provider must not know about the mount_point
. It cannot, because different registries might have different ideas about what path leads to a certain storage provider.
Another example: you only have one storage provider, but you want to have a /users
and a /projects
namespace, where /users
holds all personal spaces and /projects
all project spaces. As long as the storage provider is oblivious of where it is mounted you can configure a providers with two types of spaces, each with their own mount_point. That is currently not implemented, but it certainly makes sense to allow more than one type of space per storage provider and also mount them at a different location.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just my current braindump, I need to think about this further ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this path must not be exposed to prevent share receivers from getting to know how someoneelse organized his files
But this is exactly what we want to do. Just leave it empty and it'll work how you want it to. I could move this to the EOS driver but this fits better in the storageprovider layer IMO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, as a feature it makes sense to have a way to enable disclosing the paths 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ishank011 If all you want to achive is being able to shard the user directory, why can you not implement that logic in the registry, with the gateway doing the path wrapping and unwrapping? Is the problem that the storage provider will then only see a relative path inside a space, eg:
- gateway receives
/users/e/einstein/some/path
- registry matches
/users/e
with the/users/[a-k]
mount_point regex - registry cuts off the matched prefix for the provider
/users/e
and returns a provider with the mount_point set to/users/e
- gateway cuts off the mount_point from the path, which becomes
einstein/some/path
- storage driver has to looks up the space for
einstein
, lets call it thespace_path
- the driver can navigate to
some/path
, relative to einsteins home
now this does not involve any id lookups ... this would be purely path based resource access ... without the storage provider having to know the mount_point?
At this point the storage provider still has to know about the space_path, maybe using a template like {.Spaceid}
or {.Owner.Id}
. Which is why the space registy can be configured with a path_template
. It serves the same purpose, but now all path logic is in the registry.
IMO the code complexity would decrease if we use the path / storageproviderid to let the registry look up the storage space once and then always continue with relative references.
We should have a call to clarify the how to proceed ...
Kudos, SonarCloud Quality Gate passed! 0 Bugs No Coverage information |
resp.StorageSpace.Id.OpaqueId = storagespace.FormatStorageID(s.conf.MountID, resp.StorageSpace.Id.GetOpaqueId()) | ||
resp.StorageSpace.Root.StorageId = storagespace.FormatStorageID(s.conf.MountID, resp.StorageSpace.Root.GetStorageId()) | ||
// The storage driver might set the mount ID by itself, in which case skip this step | ||
if mountID, _ := storagespace.SplitStorageID(resp.StorageSpace.Id.GetOpaqueId()); mountID == "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no such thing as a mount id in the cs3 api. The Storage Space ID consists of {provider id}${space id}(!{opaque id})
, which is split into {storage provider id}
and the {space id}(!{opaque id})
. We should name this providerID.
however ... why would the driver know anything about provider ids? When we update the cs3 api we will make the resource id contain three properties: providerid, spaceid any opaqueid. This will get rid of any id wrapping / unwrapping ...but why would you make the driver aware of the provider id? it is just a routing information. ... hmmm the only use case I see for that is when the driver can host multiple providers ... which we might want to support when dealing with migrating spaces between providers ... but for now this seems linke watering the boundaries to me.
one more thing: ocis web started using /dav/spaces requests ... as does the client. all requests will become relative ... |
@ishank011 I set this to draft. Please let us know your thoughts after evaluating current web whch uses the /dav/spaces endpoint and only makes relative requests |
@butonic the web client behavior is based on a capability called How do you plan to support it in the future? With this set to false, path-based approaches work and these changes would again make sense |
Incorporated in #2919 |
fixes #2859