Data loss on rename of a 49 GB folder #13391
Comments
To clarify: the rename happened through the web UI |
503 ? That's "service unavailable" and shouldn't even trigger an actual rename. |
You said you didn't have any other storages than home:: and the root, so this excludes the case of an unavailable external storage. |
@PVince81 Yes. There is no external storage. Regarding the 503: it's the only rename request and the folder definitely gets renamed. Could that be caused by the timeout? |
Not sure. Do you think php-fpm would decide to send 503 by itself when a timeout occurs ? If the server was not available / maintenance mode, our Sabre plugin should kick in very early and prevent any file operations. |
You could check the owncloud.log from around the time it happened (mind the timezone/utc differences) |
I noticed that a rename caused a lot of database queries (600 for a folder with 507 files in it). It needs to update the path of all elements. I guess this caused the timeout and PHP-FPM will kill the process once the timeout is hit. https://blackfire.io/profiles/a33715d9-0191-4f9a-ad4a-2f3166d71584/graph |
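To make the cost described above concrete, here is a minimal sketch comparing one UPDATE per cache entry with a single prefix-rewriting UPDATE. The `filecache` table is a simplified stand-in rather than ownCloud's real schema, the target name `Photos` and the helper names are hypothetical, and the statement counts are those of this toy model, not the 600 queries measured in the profile.

```python
import sqlite3


def make_cache(n_files):
    """Build an in-memory stand-in for a file cache: one folder plus n_files children."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE filecache (fileid INTEGER PRIMARY KEY, path TEXT)")
    db.execute("INSERT INTO filecache (path) VALUES ('files/Bilder')")
    db.executemany(
        "INSERT INTO filecache (path) VALUES (?)",
        [(f"files/Bilder/img{i}.jpg",) for i in range(n_files)],
    )
    return db


def rename_per_row(db, old, new):
    """Rewrite each affected path with its own UPDATE; returns the number of UPDATEs issued."""
    rows = db.execute(
        "SELECT fileid, path FROM filecache WHERE path = ? OR path LIKE ?",
        (old, old + "/%"),
    ).fetchall()
    for fileid, path in rows:
        db.execute(
            "UPDATE filecache SET path = ? WHERE fileid = ?",
            (new + path[len(old):], fileid),
        )
    return len(rows)


def rename_bulk(db, old, new):
    """Rewrite the prefix of all affected paths in one statement; returns rows changed."""
    cur = db.execute(
        "UPDATE filecache SET path = ? || substr(path, ?) "
        "WHERE path = ? OR path LIKE ?",
        (new, len(old) + 1, old, old + "/%"),
    )
    return cur.rowcount
```

With 507 children, `rename_per_row` issues 508 UPDATE statements, while `rename_bulk` changes the same 508 rows in a single statement. The sketch ignores LIKE wildcards in folder names and any other columns a real cache would have to keep in sync.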
@icewind1991 What is the reason to store the full path? Isn't knowing the parent enough to generate the full path? |
@DeepDiver1975 @karlitschek I would rate this a bit higher. I talked to @icewind1991 and he would like to come up with a partial improvement, but reducing the load (especially the SQL queries) the way it was done for the delete operation isn't possible for 8.0 (#13394). On the one hand this would require bigger changes, but on the other hand the current behaviour will cause critical problems (and even data loss) when renaming folders with many children. Is this rated a showstopper or not? |
Yes, but most queries we do are by path |
@icewind1991 I guess it scales better if you simply traverse the file tree. And you can cache this too. The current approach doesn't scale in any direction. :( |
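The parent-only idea from this exchange can be sketched as follows. This is only an illustration of the trade-off being discussed, not how ownCloud stores its cache; `Node` and `full_path` are hypothetical names.

```python
class Node:
    """A cache entry that stores only its own name and a reference to its parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def full_path(node):
    """Derive the full path by walking up the parent chain to the root."""
    parts = []
    while node is not None:
        parts.append(node.name)
        node = node.parent
    return "/".join(reversed(parts))


# Renaming a folder touches exactly one entry; children pick up the new name
# the next time their path is derived.
root = Node("files")
folder = Node("Bilder", root)
image = Node("img0.jpg", folder)
folder.name = "Photos"
print(full_path(image))  # files/Photos/img0.jpg
```

The rename becomes a single write regardless of folder size, but every lookup by path now needs one step per path segment, which is the cost behind the objection that most queries are made by path.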
I still don't understand why renaming a simple folder in place could run into a timeout (not even moving it to another location) |
@PVince81 No. Have a look at the |
Ah right... the DB update :-/ |
A ticket should be either technical debt or a bug. |
@MorrisJobke have you been able to find any more clues ? |
@PVince81 It's simply just the massive amount of DB updates. And the executing process got killed before it can finish this task. Nothing we can change for now :( |
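A minimal sketch of why a killed process is dangerous here, assuming the path updates run as independent statements. The `filecache` table is a simplified stand-in, the `Killed` exception stands in for the worker being terminated, and `Photos` is a hypothetical target name. It only illustrates the half-updated state; it does not reproduce the exact symptom reported in this issue, where the entries vanished from the database.

```python
import sqlite3


class Killed(Exception):
    """Stands in for the worker process being terminated mid-rename."""


def make_cache(n_files):
    # isolation_level=None puts the connection in autocommit mode, so
    # transactions only exist where BEGIN/COMMIT are issued explicitly.
    db = sqlite3.connect(":memory:", isolation_level=None)
    db.execute("CREATE TABLE filecache (fileid INTEGER PRIMARY KEY, path TEXT)")
    db.execute("INSERT INTO filecache (path) VALUES ('files/Bilder')")
    db.executemany(
        "INSERT INTO filecache (path) VALUES (?)",
        [(f"files/Bilder/img{i}.jpg",) for i in range(n_files)],
    )
    return db


def count_prefix(db, prefix):
    """Count entries that are the folder itself or live below it."""
    return db.execute(
        "SELECT COUNT(*) FROM filecache WHERE path = ? OR path LIKE ?",
        (prefix, prefix + "/%"),
    ).fetchone()[0]


def rename_interrupted(db, old, new, die_after, atomic):
    """Rename row by row, but 'die' after die_after rows have been updated."""
    rows = db.execute(
        "SELECT fileid, path FROM filecache WHERE path = ? OR path LIKE ?",
        (old, old + "/%"),
    ).fetchall()
    if atomic:
        db.execute("BEGIN")
    try:
        for done, (fileid, path) in enumerate(rows):
            if done == die_after:
                raise Killed()
            db.execute(
                "UPDATE filecache SET path = ? WHERE fileid = ?",
                (new + path[len(old):], fileid),
            )
        if atomic:
            db.execute("COMMIT")
    except Killed:
        # A really killed process cannot run this line; the database server
        # would discard the uncommitted transaction when the connection drops.
        if atomic:
            db.execute("ROLLBACK")
```

With a folder of 10 files (11 entries) and `die_after=4`, the non-atomic run leaves 4 entries under the new name and 7 under the old one, while the atomic run leaves all 11 under the old name. A transaction keeps the cache consistent with itself, but not necessarily with the filesystem if the folder on disk was already moved.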
This is really a showstopper bug (as you can see in #10711). Our manager is considering dropping ownCloud because of all these rename and sync problems that never get fixed. I hope that you can fix all of these problems, because for us right now, oC can't be used in production. Thanks! |
@DeepDiver1975 I've set this to 8.1, this should definitely be looked into. It might take some time to debug because this bug is difficult to reproduce consistently. I suspect that the part that handles renames will need to be rewritten to use a different approach, either by using part folders #13756 or updating the cache for each file one by one, as proposed here #13775 instead of doing a bulk update at the end. |
? You can't reproduce this? But the rename takes ages for you too, doesn't it? |
The few times I tried I couldn't reproduce the issue. At least in my case there was no data loss / deletion from the sync client. |
To reproduce this, either the rename operation needs to take longer than one sync cycle, which means the sync client would try to access an inconsistent DB state, or the rename must run into a PHP timeout where the PHP process gets killed (php-fpm case). Maybe case 1 can be simulated by adding a few sleep() operations in the code to slow down renaming. |
@icewind1991 it didn't work, still happening. How about the hasUpdated approach you suggested ? |
Work in progress here #16963, searching for alternative approaches to lock the cache/scanner |
@PVince81 @icewind1991 Thanks for this! You all rock :) |
Hi, can it be that #15702 is related to this? Thanks! |
Not necessarily. This ticket here is about files randomly disappearing, it is not consistent. |
If you have a test instance where you can test 8.1, you could enable file locking, see https://doc.owncloud.org/server/8.1/admin_manual/configuration_files/files_locking_experimental.html |
Ah indeed, I responded too fast and only noticed the difference afterwards. I will build a test setup with 8.1 to experiment with the file locking and post some feedback on the other thread. Thanks. |
So will the file locking solution also work when you rename a large folder and undo that rename within a few seconds? How would that work out? |
If you undo the rename while the operation is still in progress you will get a message like "folder is currently busy" and will need to try again later. If done through the sync client, the sync client will automatically retry later. |
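The behaviour described above (fail fast with "busy", let the caller retry later) can be sketched as follows. This is a toy in-process model, not ownCloud's locking implementation; `FolderLocks`, `FolderBusy`, and `rename_with_retry` are hypothetical names.

```python
import threading


class FolderBusy(Exception):
    """Raised instead of waiting when another operation holds the folder."""


class FolderLocks:
    """Tracks which folder paths currently have an operation in progress."""

    def __init__(self):
        self._guard = threading.Lock()
        self._held = set()

    def acquire(self, path):
        with self._guard:
            if path in self._held:
                raise FolderBusy(f"{path} is currently busy")
            self._held.add(path)

    def release(self, path):
        with self._guard:
            self._held.discard(path)


def rename_with_retry(locks, path, do_rename, attempts=3, on_busy=lambda attempt: None):
    """Try the rename up to `attempts` times; returns True once it ran."""
    for attempt in range(attempts):
        try:
            locks.acquire(path)
        except FolderBusy:
            on_busy(attempt)  # a client would typically wait here before retrying
            continue
        try:
            do_rename()
            return True
        finally:
            locks.release(path)
    return False
```

A web UI would surface `FolderBusy` to the user directly, while a sync client would wrap the call the way `rename_with_retry` does and pass something like a sleep as `on_busy`.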
On another note, @icewind1991 had a POC fix that should accelerate renaming of database entries: #13956 |
Excuse me for intruding, but it is not clear to me if the problem still happens or not. |
It just happened on my server. We are still using ownCloud 8.0. Will upgrading to 9 solve the problem? |
@alantygel yes, because OC 9 has a locking mechanism to avoid this kind of race condition |
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
I accidentally renamed a folder on my production instance:
Notes
Anyone who wants to help me dig through the debris is welcome.
Access log:
The rename:
The access log filtered for the folder Bilder:
Nothing special in php-fpm.log or apache error log.
The folder was successfully renamed (in the database and in the filesystem), but the database entries for its contents are gone (the files are still there in the filesystem). Only a forced rescan with the occ command line tool was able to get them back into the database. Browsing the folder in the web UI didn't trigger an update of the file cache.
! For the user (without admin rights) it's not possible to get back the data from the server. It's simply not shown.
I will try to investigate further and try to reproduce.
cc @karlitschek @DeepDiver1975 FYI this could become a showstopper soon - I opened this ticket to document my progress