-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move versioning and trashbin functionalities to actual interfaces on the storage level. #12274
Comments
This is the right way to move forward. So we can have different behaviors depending on the storage backend. 👍 |
Yes, makes sense. Should the versioning system be interchangeable or always specific to a storage ?
This could provide more flexibility and choice about how to store the version, but might also make everything too complex to maintain. |
So this means we'll finally have the opportunity to try moving the trash+versions inside the external storage itself, inside a ".trash" and ".versions" hidden folders. This also means a lot of the specialized encryption code required for trash/version can be thrown away, and we can simple use the "View" and let the encryption code take care of the keys in all cases. |
I would tie the versioning impl to the storage. In case multiple implementations are possible this might be set up using some configuration. But lets keep this simple for now. Regarding encryption: we have to consider making encryption a storage wrapper - see icewinds proposal regarding sync performance improvements which is also necessary for engagements in the area of chunking. What about a small storage hack week soon? |
From what I understand, moving encryption to a storage wrapper isn't something trivial and might need a lot of thinking and changes. But maybe experimenting with it first (POC) is the way to go. @schiesbn didn't seem to be convinced though. Storage wrappers have another drawback that might need addressing: the order of execution is dependent on the order in which the wrappers are registered. |
Just to make this clear, I'm not against moving encryption from the proxies to the storage wrapper. I just think that it is not something which should be done quickly and we should first think about all possible implications. I don't want to throw out matured and quite stable code that easy. As for moving the trash and versioning to the storage. Generally I like the idea. But also in this case we should first think about all possible implications. For example how will the trash bin retrieve the deleted files in the future? Right now we just have to look into one folder, find all files and can navigate in sub folders, etc. If the deleted files are distributed across the storages we would need to collect them from all storages and create a virtual centralized view for it. Also we need to think about how additional data is handled, e.g. versions and encryption keys also need to be moved to the trash and restored again. I don't like the idea that the storage implementation should handle all this stuff. All doable, but it is not a small task and needs some thoughts first. Just saying this in order to have a feeling about the dimension of this changes. I'm still not sure if this centralized view on the trash bin is the best idea. Ideally we would keep the deleted files where they are and just hide them. This way we wouldn't need to move around anything. Neither the files nor versions, encryption keys, etc. The trash bin button wouldn't lead to a centralized view of all deleted files but would be a toggle to show/hide the deleted files in the current folder. I think this would be the much more promising future for the trash bin from a ownCloud user point of view. |
I think that might lead to confusion when using external storage such as a mounted windows home directory. |
Yes, that's a common argument. On the other hand people also open their windows shares and other network drives regularly with their desktop file manager. These desktop file manager also put metadata into this directories (Thumbnails, some create a .trash folder, afaik MacOS creates .DS files,...). People don't care about it, so why should they care about ownCloud specific files? At the end it boils down to the question how well we want to integrate this storages into ownCloud. Imho the best integration is achieved if we use the storage how it fits best for the ownCloud user experience. Other do the same, as shown with the file manager example. |
Just a little clarification related to "leaving version handling to the storage implementation". I think it's not 100% accurate. I'd say it's up to the storage implementation to choose what implementation to use. Core can still provide the VersionHandler instances (CentralVersionHandler that stores to data dir, StorageVersionHandler that stores directly on storage, etc) but it's up to the storages' |
That's not what I said. But what's happens if you delete a file on a storage with his own trash handler. How will we handle the encryption keys for example? Either the storage handler needs to handle the encryption keys (as said, I wouldn't like the idea) or we create a hook and let the encryption app handle it. But what does the second solution mean? The encryption app would move the keys to a generic place and need to store additional meta information where the real file is. You would also need this mapping if you want to show previews in the trash bin for example. We would increase the dependency on the database for this metadata. Honestly I'm quite happy that currently we go in the opposite direction, reducing the dependency on the database and and make the key<->file mapping easier. Also what would happen if the storage disappears? The deleted files would be gone but all the meta data and the keys would still be there. As said, this is all solvable. But please don't underestimate (1) the amount of work, (2) the risk to replace stable and mature code with not so stable and not so mature code and (3) the thoughts we need to put into it before we move into one direction. |
@schiesbn I think we can decouple encryption from versions/trashbin as discussed yesterday. Version handlers should simply use the First step would be to come up with a list of steps / checklist of what we need to be careful with, then address all the points. |
If encryption were to use file ids to store it's keys instead of paths there would be no need to encryption to do any work for handling encryption keys while moving files in and out the trashbin. |
Since OC7 we have trashbin + versions info in the oc_filecache, so in theory it should be possible to switch to using file ids. The idea we discussed yesterday was to have the encryption app work at the "data/$user" level instead of "data/$user/files". It means that "files_trashbin" and "files_versions" would be considered like regular folders when working with files. So moving a file to the trashbin from "data/user/files/afile.txt" to "data//user/files_trashbin/afile.txt.d123456" wouldn't be any different than moving a file from "data/user/files/afile.txt" to "data/user/files/subdir/afile.txt". No special handling needed for keys. |
One thing we shall keep in mind is that there are actually systems which no longer use our file cache table because they implement their own cache classes to retrieve any metadata information from the storage itself. In such an implementation versions, trashbin, sharing and storage itself (maybe encryption as well ???) are actually (re)implemented for exactly this storage. At the end we should not make any assumption about the internal operations of these modules. In such an environment ownCloud is basically only visualizing the data and taking care about syncronization. So maybe we have to think about our implementations as of today (sharing, versions, trashbin and encryption) to be the implementation of these interfaces for the regular and local posix based filesystem. |
If files are only moved yes, at the moment we copy them so the file ID in the trash bin is not the same as the file ID of the original file. Also what happens if a admin enforced a file cache re-scan? Will the file IDs change? Also in case of bugs it is really hard to debug it and to find the cause of the problem if the keys are only named by the file id. Any bug e.g. if the file ID changes because of a bug in the file scanner or a key gets attached to the wrong file ID would be really hard or even impossible to resolve which would lead to data lose. I'm not sure if it is a good idea to drop human readable key names in favor of file ids even if I see the obvious advantages. |
In the long run we need to find a way to make file ids stable. There are other features that already rely on it like sharing and in the future metadata/favorites/tags. |
Sure, but bugs will always exists. Losing a share because of a bug is one thing, data lose is a different thing. I would like to maintain as few dependencies as possible for file encryption. Ideally you can copy your data folder to a different computer and run a local PHP script to decrypt your files or copy it to a different ownCloud and it works. Both will not be the case if we depend on file IDs. Think about the encryption bugs you debugged lately and imagine that instead of human readable paths and filenames all keys would be in one folder with names like: 1223243.key, 343647.user1.shareKey, 436743.key,.... This would be a nightmare. |
Note: @MTRichards has mentionned this requirement for external storage as well in this ticket so I added it there as bullet point: #12216 (it gathers multiple requirements/improvements for external storage) |
@schiesbn mind posting your WIP PR here where you're doing the decoupling of the encryption app ? |
There is no PR. That's already how it works today. Today all you need for encryption is stored on file system level and you can easily see which key belongs to which file. You can also copy the complete data folder to a new ownCloud no matter of encryption was enabled or not and it will continue to work on the new ownCloud. This wouldn't be the case if we would depend on the file cache ID. The same way we decoupled the trash bin from the database from OC6 to OC7. |
Another point just come to my mind. If every storage can implement his own trash bin and versioning and use the functionality provided by the storage back-end. What would this mean for example for the expire function for versions? Would we apply our implementation to all versions? What if the storage back-end also expire versions in some way? Would we use the expire method of the back-end, ours, both or can we chose at all? Wouldn't this lead to a quite inconsistent behavior and become highly confusing for the user? The user shouldn't care about the back-end and experience a consistent behavior across all folder. |
This is an interesting point. And yes, you are correct. The external storage provider would replace our versioning and our trashbin for those files and folders that it owns. This would mean that if you erase a file in sharepoint, and it is in the sharepoint trash it would also appear in the ownCloud trash. In theory anyway. Then when ownCloud deleted trash, it would follow the external storage provider trash rules for those files and folders impacted there. BUT, to be clear, I am not asking this to be done for SharePoint right now. I am just making sure that we have a path to this sort of capability over time. |
We seemed to agree today that we'd rather have better integration with the remote storages (ex: Sharepoint, Dropbox) for things regarding trash and versions instead of trying to have consistent behavior between storages. The
As a next step we can identify how these interfaces will look like. |
@labkode as discussed |
For trashbin: The way I see it is that by default every storage would have a hidden ".trashbin" folder (external storage, received shares, etc). |
Hmm, also before we can decide on a "trashbin Webdav API" for public access, we need to sort out whether we still want this "per storage" trashbin because it would influence the format of the endpoint as it might need an extra "storage id" argument. |
@PVince81 the trashbin functionality should be delegated to the storage as an interface to implement, as you mentioned, one could use the default implementation in its own storage. My ideal storage would be the one that can handle itself data, metadata, versions and trashbin without relying on hooks from a top level app like it is the case with the current files_trashbin and files_versions app. What do you think? |
@labkode my concern at the moment is mostly about the UI and the current app. Also what about external storages like SMB, do we store files in a hidden folder ".trash" there ? If not, we'd need a fallback like "use the default trashbin in the data folder" instead. |
Regarding versions, there are even more concerns. This would be even more complex when moving files between storages, because moving files also carry their versions. It means that versions would need to be converted from the local format to the Dropbox format for example. This is too complex. So maybe we should restrict the spec to "allow storages to store trashed files and versions in a different location" instead of "allow storages to do whatever they want with trashed files and versions" ? Basically having a more specific use case. Would be good to get more information about the intended use case for this extensibility.
|
|
Another possibility is to make it possible to have a custom trash/versions handler that can be specified in config.php, basically an alternative backend. The frontend would stay the same. The backend would not be storage dependent, but if needed the backend itself could decide to behave differently based on storage implementations. Or simply write a new trashbin/versions app that implement the same APIs if the frontend is not important ? |
The spec might have changed a bit, can you update ? |
As of today we are facing storage backends (e.g. SWIFT) which come with their own versioning/trash functionality.
We need to have the possibility to respect these functions and give developers the proper handle to implement/integrate these mechanisms.
Proposed Steps:
@schiesbn @icewind1991 @PVince81 @karlitschek please share your comments and ideas - THX
The text was updated successfully, but these errors were encountered: