Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement high level file locking #11804

Closed
PVince81 opened this issue Oct 28, 2014 · 67 comments
Closed

Implement high level file locking #11804

PVince81 opened this issue Oct 28, 2014 · 67 comments

Comments

@PVince81
Copy link
Contributor

Currently the files_locking app only works on storage level, which is not enough to guarantee atomicity of operations. This is because we have hooks and proxies and apps are able to perform additional filesystem operations.

A high-level file locking mechanism should be implemented to guarantee atomicity of file operations on a higher level.

For example, this is what happens when renaming an encrypted file:

  1. User renames file
  2. System calls $view->rename()
  3. File is renamed on the storage
  4. Encryption keys are renamed
  5. Version files are renamed

Since files_locking only operates on storage level, only step 3) is locked.

If after step 3) someone renamed the new file to yet another name, they will get a "share key not found" error.

The high-level locking mechanism would work as follows:

  1. User renames files
  2. System calls $view->rename()
  3. A lock is set on source and target file names
  4. File is renamed on the storage
  5. Encryption keys are renamed
  6. Versions are renamed
  7. Lock is released

Also note that files_locking does not lock folders.

There's also the open question whether parent folders should be locked and whether folder locking is feasible at all.

Here is an example of what kind of race conditions happen due to lack of high-level locking: #11795
(rename a folder while the sync client is uploading files into it)

@craigpg @karlitschek @MTRichards @DeepDiver1975 @icewind1991 @schiesbn @bantu

@PVince81
Copy link
Contributor Author

To discuss:

  • how to lock (DB, zookeeper, temporary files, etc)
  • what to lock: files, target files
  • whether to lock folders when doing an operation directly on them
  • whether to lock parent folders when doing an operation on the children

@DeepDiver1975
Copy link
Member

  • expose these high level locks via webdav - has impact on sync client
  • visualize locks on web ui ?

@PVince81
Copy link
Contributor Author

  • proper timeout handling: locks should not stay forever

@DeepDiver1975
Copy link
Member

note regarding timeout: webdav lock support timeouts as well - we need to keep these values in sync

@PVince81
Copy link
Contributor Author

The locking also needs to be multi-level, which means that an operation on level 1 (user renames folder) could also cause additional locks to be created.

One example is that whenever the user accesses the storage, sometimes we need to adjust the mount points in case of conflicts. Adjusting the mount points would itself be a sub-transaction.

Maybe we should rather talk about transactions and provide methods "beginTransaction($files)" with a list of files to lock. Then "endTransaction($files)" to end it (or use a $handle).

Let's say renaming a file consists of calling "file_exists($target)" then "rename($source, $target)".
The rename.php from the ajax call would then call:

$handle = $view->beginTransaction(array($source, $target));
if (!$view->file_exists($target)) { // for the sake of the example
   $view->rename($source, $target);
}
$view->endTransaction($handle);

Inside of $view->rename() there will be another call to $view->beginTransaction(array($source, $target)) that increments the lock count for these files. This is in case $view->rename() is called alone.

@PVince81
Copy link
Contributor Author

PVince81 commented Nov 2, 2014

Another case appeared on my own server today: double deletion over WebDAV: #11910

@PVince81
Copy link
Contributor Author

Some ideas:

  • implement file locking on view level using file paths from the user's point of view (with mount points)
  • for shared folders, always lock the paths of the owner
  • save locks in a DB table or memcache (interchangeable by abstracting the "lock provider")
  • lock provider can be instantiated by Storage classes (in case we could even use flock on some
    storages)
  • no difference between read/write locks

@karlitschek
Copy link
Contributor

related #11734

@PVince81
Copy link
Contributor Author

  • locking provider
  • plug it into the code (View)
  • return proper locking codes to the client (through exceptions ?)
    • Sabre exception (return HTTP code 423)
  • special cases for direct storage access (file PUT)

@PVince81
Copy link
Contributor Author

  • scanner needs to lock too

@PVince81
Copy link
Contributor Author

  • unit test: acquire lock directly, then call $view->operation to ensure that it also acquires the lock

@PVince81
Copy link
Contributor Author

@PVince81
Copy link
Contributor Author

  • refresh lock for long running operations ? (touch operation): set lock timeout value high enough to match PHP execution time (goal is to avoid dead processes to lock forever)

@PVince81
Copy link
Contributor Author

  • release lock on shutdown handler, if possible

@PVince81
Copy link
Contributor Author

  • config.php switch to "Enable EXPERIMENTAL locking"

@PVince81
Copy link
Contributor Author

  • allow locking outside of "files" (like files_trashbin, thumbnails), but probably not "files_encryption" and "cache"

@DeepDiver1975
Copy link
Member

@jnfrmarks @PVince81 do we have smashbox tests which are failing as of today because of missing locking? If not we shall implement them asap. THX

@PVince81
Copy link
Contributor Author

testConcurrentDirMove might be one, but fails only randomly on some envs.

If we had a way to slow down the server (or dynamically add breakpoints with XDebug 😉) it could then become easier to reproduce such issues consistently.

@PVince81
Copy link
Contributor Author

There is another fundamental issue that we'll eventually need to think of, related to APIs: so far we ask every app to use the public API, which is the node based files API (OC::$server->getUserFolder()). That API uses the View. So far so good, this means app will benefit from locking too.

The trouble is that now we're changing the trashbin and versions app to use the Storage instead of the Node API. This is because using the node API would lock several times. It starts to feel like in the end the Storage API will need to be made public eventually... which is not really what we wanted.

Originally I thought we could have a method lockPath() in the node API that apps can call to tell ownCloud when they start working with a file, and releasePath($path). Every subsequent calls on lockPath() within the same PHP request would not double lock the same path, but just do as shared locks do: increment a counter. I think it is still possible to add this a bit later once the groundwork for locking is done (and also discuss whether we want this).

@PVince81
Copy link
Contributor Author

Well this would also mean that apps like trashbin and versions should be able to work with the public node API too in the end (and not use the private View or Storage API)

@PVince81
Copy link
Contributor Author

  • think about all the places where we want to explicitly "lock before a big bunch of operations" and "release lock afterwards".
    • file upload: we need to bundle everything from "precondition check" to "final rename" inside a single transaction (currently the precondition check is done early but there is no guarantee that it is still true at the time of the rename)

@PVince81
Copy link
Contributor Author

We can probably discuss the individual cases in separate tickets once the main work is merged.

@PVince81
Copy link
Contributor Author

Note for testers: to enable this feature you need to:

  1. set "filelocking.enabled" to "true" in config.php
  2. set 'memcache.locking' => '\OC\Memcache\Redis' in config.php
  3. Run a local redis server

@icewind1991
Copy link
Contributor

redis coming later

Redis already works if you configure redis as your memcache backend

@PVince81
Copy link
Contributor Author

Raised owncloud/smashbox#94 about writing a special app that could help recreate the race conditions more consistently.

@PVince81
Copy link
Contributor Author

@PVince81
Copy link
Contributor Author

  • more user friendly error messages in the web UI when a file is locked

@PVince81
Copy link
Contributor Author

  • also lock properly for system-wide external storage mount points (to be discussed)
    This is once again an opportunity to mention my idea of getting rid of "applicable" for external storages but using the "sharing" approach instead (admin mounts ext storage locally, shares with applicable users/groups), to avoid requiring custom code specific to ext storage and simply reuse the sharing one: System external storage mounts as shares instead #11182

@PVince81
Copy link
Contributor Author

@PVince81
Copy link
Contributor Author

PVince81 commented Jun 1, 2015

More issues and test steps documented here: #16664

@carlaschroder
Copy link

Please review docs owncloud-archive/documentation#1182

@PVince81
Copy link
Contributor Author

@icewind1991 @DeepDiver1975 what about "short waiting time when acquiring before abort" ?
Currently it seems to directly acquire a lock or fail directly.

In the files_locking app we had a usleep().
But anyway, this is more about tweaking locking performance.

@PVince81
Copy link
Contributor Author

raised wait-retry as #17016

@PVince81
Copy link
Contributor Author

Most of this has either been implemented already or raised as separate tickets.

We can either close it or keep it to keep an overview (but then remove the severity tags?)

@PVince81
Copy link
Contributor Author

There are tickets for the remaining issues, tasks => closing

@lock lock bot locked as resolved and limited conversation to collaborators Aug 11, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests