client removes file on server if it could not create VFS placeholder file #3444
Comments
@isdnfan can you upload logs here https://cloud.nextcloud.com/s/botn4JSYfMR83xt |
Hi @allexzander thank you for your time. I uploaded the debug archive nc-vfs-camera.zip and Nextcloud_sync.log. From the latter it looks like the sync cycle at 10:07 failed to create a lot of placeholder files
and then something happens at 2021-06-14T10:08:22, most likely the deletion, but I can't easily see it from this log.
In total the problem happened 2 times on this day: first at 10:xx, then I restored the files around 13:00 and they got deleted again an hour later, ~14:xx |
@isdnfan Thank you. We have received your logs. Are you able to create files manually in the location mentioned in logs? Can those files be synced if you disable the VFS mode and use same location for your local sync folder? |
As I stated above:
But I think there was some general problem with VFS, as I was unable to access files that were not hydrated at the time (error "0x8007016A The cloud file provider is not running."). After a reboot, access to the non-hydrated files worked again. I have now switched the folder back to "free up local space" and the files were replaced by placeholders within seconds. UPDATE: #3452 and #3447 look related to me. I could imagine that I killed VFS by sending the client to sleep, or maybe I even killed the client when it was unresponsive for a long time. |
This bug report did not receive an update in the last 4 weeks. Please take a look again and update the issue with new details, otherwise the issue will be automatically closed in 2 weeks. Thank you! |
For me the issue has not recurred, but as far as I can see the root cause was never found, so the problem could recur. |
@Discostu36 If you happen to have this debug log still around, would be nice if you upload it to https://cloud.nextcloud.com/s/ozbSCx5wGDrtRGQ |
@isdnfan It could've been improved by fixing numerous other bugs with VFS between 3.2.2 and 3.3.0. |
I deleted it some days ago, but might still be in trash, will have a look this evening. |
@Discostu36 Sorry for not paying attention to it earlier. Somehow, I've missed your reply. If the file is not there, you may also want to try the latest 3.3.0 version to see if the issue is still there. https://nextcloud.com/install/#install-clients |
@allexzander it's true I have the feeling new version works better in terms of performance and stability. But I didn't see any change addressing three underlying problems we see here:
The first issue is hard: I don't know if there is a good solution. It is easy to monitor file changes while the client is running, but what should happen if the user removes local files while the client is stopped? Prefer the server, prefer the client, or raise a conflict? I would prefer manual resolution in this case. Additional interaction is bad in terms of user experience, but it gives the user a chance to avoid data loss (depending on the server trashbin setting, a file may be removed completely). For reference: MS OneDrive asks for additional confirmation before removing files in the cloud if the user removes a lot of files locally. The second issue is easier to handle in my eyes:
I have no idea about the third one: in my eyes there is no reason to sync/touch the whole directory if only one file has changed. Maybe there is a reason, but even then the client should become more fail-safe: if there is some local problem (VFS problem, local hardware issue, exhausted local storage, permissions problem), the client should stop syncing until the problem is resolved (or, even better, only upload new files to the cloud). I really appreciate your feedback (at least as documentation of how it is expected to work). |
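The OneDrive-style confirmation mentioned above could look roughly like the following. This is only an illustrative sketch, not code from the Nextcloud client: the function name, the `DeletionDecision` type and both thresholds are hypothetical and chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class DeletionDecision:
    proceed: bool   # True if the deletions may be sent to the server without asking
    reason: str     # human-readable explanation for the log or a dialog


def review_pending_deletions(pending_deletes, known_files,
                             max_count=100, max_fraction=0.25):
    """Decide whether queued server-side deletions need user confirmation.

    pending_deletes: paths the client is about to delete on the server
    known_files:     number of files the client believes are in the sync folder
    max_count / max_fraction: hypothetical thresholds, not real client settings
    """
    count = len(pending_deletes)
    if count == 0:
        return DeletionDecision(True, "nothing to delete")
    fraction = count / max(known_files, 1)
    if count > max_count or fraction > max_fraction:
        return DeletionDecision(
            False,
            f"{count} deletions ({fraction:.0%} of synced files) need confirmation",
        )
    return DeletionDecision(True, "below thresholds")
```

With such a check, a sync run that suddenly wants to delete hundreds of files would pause and ask, while everyday single-file deletions would pass through unchanged.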
Do you have any information that could help us understand why the files could not be created? The idea is that, to improve reliability, we can always make it try again, but that would only partially solve your problem. After all, you want your files? |
@mgallien I don't get your point. I have no idea what caused the problem. I remember that at the time the client was unstable: it was eating CPU, created huge local DB files, and actions in the UI lagged. I'm sure I killed the client multiple times; additionally, I might have interrupted some operation when I suspended the PC in the middle of it. As a result VFS was broken (placeholder file creation). The rest was fine: the client successfully downloaded all the files from the affected folder after I changed it to "make available locally". New client versions look more stable, so the issue might not happen anymore (or less frequently). But the issue uncovers some facts about the sync process that could be improved to make the client safer. In particular, the fact that the client doesn't remember that its local state is unhealthy, and for this reason removes files from the server, is really bad and worth attention (as is the question of why the whole folder is touched when only one file changes). I'm happy to discuss the logic of the sync process; maybe I don't understand something. The issue that happened here, described more generically:
in my eyes the logic must be different
in other words a client must not delete files on the server until it is confident the local state is healthy and holds a full copy of user files. otherwise it should replicate server state because the server is the only instance which knows what happened in the time the client didn't run/didn't properly sync. |
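The rule proposed above ("do not delete on the server unless the local state is known to be healthy") can be written down as a small sketch. This is an assumption-laden illustration, not the client's actual sync logic: `plan_server_deletions` and all of its parameters are hypothetical names.

```python
def plan_server_deletions(journal, local_files, failed_local_writes, full_scan_ok):
    """Return the set of paths that are safe to delete on the server.

    journal:             paths recorded as synced in the previous cycle
    local_files:         paths currently present locally (hydrated or placeholder)
    failed_local_writes: paths whose local file/placeholder creation failed
    full_scan_ok:        True only if local discovery finished without errors
    """
    # If the local state could not be verified, propagate no deletions at all.
    if not full_scan_ok:
        return set()
    # A path is a deletion candidate only if it was synced before and is gone now.
    missing = set(journal) - set(local_files)
    # Paths that are missing because the client failed to create them locally
    # were never deleted by the user, so they must stay on the server.
    return missing - set(failed_local_writes)
```

The important design choice is that failures are remembered across sync runs: a file that is absent locally because placeholder creation failed is treated differently from a file the user removed.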
I'm currently experiencing the same problem. Two days ago, the client (3.3.2 on Windows 10, "wincfapi" used for virtual files) deleted 12.000 files, ~55GB in total. Support for virtual files is now disabled in the client. The client (that deleted the files) is installed on a new laptop. At first, the sync with virtual files seemed work fine (no DELETE calls in the web server log), so it was not an issue with the initial setup of the client. The log excerpt is from today, prior to disabling virtual files. |
thank you @ImanuelBertrand showing the issue exists on new versions as well. Do you have any idea what might be the root cause why placeholder files failed to create? did you recognize any atypical pattern, any issues (maybe with other programs)? something interesting in the windows eventlogs? |
This is not true if a server was migrated, then the client knows better what was added/removed while the server was down. However I agree that avoiding data loss is the main objective. Thus files should be kept on the server if one cannot be 100% sure deleting them was triggered by the user. Adding files or keeping them by mistake is a lot less of a problem compared to silently deleting files. |
I remember nothing atypical. 2021-09-03 23:25:15 +0200: EL: System booted The entries prefixed with EL are from the Windows event log. |
there are two points in your reply
my point is that any errors in the placeholder files creation will block sync and I guess that is not what you want |
The way I see it, blocking the sync would be preferable to deleting files. Of course that distinction is only relevant as long as the issue is not found and fixed. |
this is exactly what I suggest (at least as long as the client works as it does now), because not stopping the sync results in files being deleted on a subsequent run.
exactly: if you stop syncing and give the user good hint where/how to provide a bug report chances are higher you get the reports and data you are looking for. silently going forward and removing files results in
I completely agree with @ImanuelBertrand: a hard fail is better than continuing somehow and causing data loss (or at least a lot of restore work). I think we all agree this is a really bad situation which should never happen, but it does happen. Three users managed to identify the problem and report it in the same bug report within a short time of VFS being released. I suggested some mitigations; I have no idea whether these are suitable or complete nonsense, as I didn't receive any response. |
svenb1234
I only partially agree. In general we must consider the server the most stable part. The scenario you describe only works as long as only one client is involved. What if you have 5 clients? Should each client move/add/change files only because its local data differs from the server? What if you restore the server for some reason? There are a lot of moving parts in the system, but in general the sync is built around the server, which must be the "root of trust". If the server has crashed or has been restored, the admin should perform some action (maybe there is a way to automate it) to inform the clients that a new full sync is required. The other way round, the client should never take priority over the server. It must only perform actions when it knows the action is intended, e.g. remove a file only after a successful sync cycle (full sync on start). There are definitely ways to improve the sync, like keeping a journal (like a database transaction log) so one doesn't have to rewind all the history, but in general every endpoint must ensure it doesn't work on an invalid/incomplete data set. |
This just bit me too. Deleted close to 40k files from my server before I noticed. |
I believe a similar issue exists with VFS disabled, possibly caused by long file paths with more than 260 characters. There are also reports of the client deleting files from the server if the client's disk runs out of free space. I think the underlying issue is that the client doesn't remember if it fails to create a file locally. Ignoring the issue and continuing as if the file was successfully created is just asking for trouble. |
Recently noticed #3731: the client stops the sync cycle if a file is blocked by an AV program. All good (despite the fact that it crashes). Something similar must happen for other problems: if the sync fails for some reason, inform the user and stop until the problem is fixed. |
Happened to me as well after enabling VFS. We went back and disabled VFS again since this never happened before. nextcloud: 21.0.5 |
another problem #4016 which could be avoided by measures I suggested before
which must not be the case!
|
Client 3.4.1/windows, server 22.2.0, this problem still happening with VFS. |
Had the same problem within my organisation, 6800 files deleted. |
@all we have integrated a fix that should solve this issue |
Assuming you want feedback: my files were in the middle of the deletion when I upgraded to 3.4.2, and I see the deletion just resumes. Environment: latest Nextcloud 22 server, Windows 10, desktop client 3.4.2, VFS active. What do you mean by the issue is fixed? |
@tob123 If you are using the same sync folder with the new 3.4.2 client, then, indeed, the deletion may continue, as the folder is still in that state that files are removed from it and the desktop client will also delete them on the server. |
I could not find this hint in the release notes. How is John Doe supposed to know this, let alone fix it? Shouldn't the update take care of solving the issue? |
@allexzander thanks for the tip. I wanted to see whether I could restore from Nextcloud's trashbin instead of from classical backups. It's in progress now. Based on that, here are some steps I thought would be useful to share, although it's by no means a smooth experience yet to restore this way. Summary: Step 1: use curl to see what is in the trashbin.
this will create output similar to:
Step 2: filter what you want to restore. You can use tools such as yq or jq to get the files you want. For the next step, the reference is needed as given in the XML as href (in the example above: /remote.php/dav/trashbin/demo/trash/Nextcloud%20intro.mp4.d1643722857). Have a list of files ready in a file named "hrefs".
Step 3: restore using curl. Recommendation: use screen, VNC or something else that makes sure you do not lose your session while the following runs.
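Since the original commands are not shown here, the three steps can be sketched in Python instead of curl. This is a rough sketch under assumptions, not the commands from the comment: it assumes the server is installed at the root of the host, that step 1 was a WebDAV PROPFIND on /remote.php/dav/trashbin/USER/trash, and that a restore is a WebDAV MOVE of the trash item into /remote.php/dav/trashbin/USER/restore/ (my understanding of the Nextcloud trashbin API; please verify against the server documentation). The host cloud.example.com and all function names are hypothetical, and authentication is left out.

```python
import urllib.request
import xml.etree.ElementTree as ET

DAV_NS = {"d": "DAV:"}


def trash_hrefs(multistatus_xml, user):
    """Step 2: extract the hrefs of trashed items from a PROPFIND response body."""
    root = ET.fromstring(multistatus_xml)
    prefix = f"/remote.php/dav/trashbin/{user}/trash/"
    hrefs = []
    for node in root.findall("d:response/d:href", DAV_NS):
        href = (node.text or "").strip()
        # Skip the trash collection itself, keep only the items inside it.
        if href.startswith(prefix) and href != prefix:
            hrefs.append(href)
    return hrefs


def restore_destination(base_url, user, href):
    """Build the Destination URL inside the (assumed) restore collection."""
    name = href.rstrip("/").rsplit("/", 1)[-1]
    return f"{base_url}/remote.php/dav/trashbin/{user}/restore/{name}"


def build_restore_request(base_url, user, href):
    """Step 3: prepare (but do not send) the MOVE request for one trashed item."""
    return urllib.request.Request(
        base_url + href,
        method="MOVE",
        headers={"Destination": restore_destination(base_url, user, href)},
    )
```

The request objects would still have to be sent with credentials (for example an app password), one per href, which is where the long runtime mentioned in the thread comes from.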
|
@tob123 whoah this is really nice !!!! |
thx for feedback. i could restore all i wanted this way (~ 35000 files). advantage seems that it is quite consistent (item that got removed gets pulled out of trashbin to original location). it comes at a cost in terms of performance and time to find what you want to have restored. using screen is recommended when you execute step 3. will add that |
some more feedback on client 3.4.2: from my end it runs stable now. some remarks / concerns still left. see below. remarks:
For remark 3 : should i create a new issue / bug ? Tobias. |
I thought 3.4.2 was stable as I haven't had an issue in a long time. I randomly checked my taskbar after launching Windows only to find TWO instances of Nextcloud running. A few seconds after the other one starts showing the same errors. I forcefully close it only to find 705 files in the trashbin (luckily they weren't E2EE files). The client was trying to "update" thousands of files, deleting them for good. If I didn't randomly open the taskbar I would have lost thousands of files. |
We have the same issue. More than 5000 files deleted/moved to the NC trash bin. We have found that turning the client PC off and on triggers it somehow. First it shows an error that it can't update VFS data, and then it starts deleting whole folders with hundreds of documents. |
@george2asenov . i encountered similar issues but the behavior is healthy now (using nextcloud latest 22 or 23 AND desktop version 3.4.4). I agree the issue is (or at least has been) severe. Can you share what versions you are running (including the users that had the issue?) |
I've got two problems with VFS (I'm not absolutely sure if they are related to this issue, I can create a new issue if not):
|
Nextcloud Hub II (23.0.3) this is the one user that faced the issue. But it is enough when the files are important documents. |
We have seen this exact issue with NC 24.0.5 and client ver 3.6.2. We have up to 60 clients sync'ing, with a typical load of 32. |
Just had the same problem: a whole directory of music files completely erased. Is it possible that 2 years later this problem is still not addressed? |
we've just had thousands of files from multiple groups deleted by a single user without them doing anything, presumably from the same problem. restoring wasn't too bad thanks to trashbin:restore and being able to know the deletion date through the last modification time of the emptied directories. however this was a very stressful moment, as we've had multiple users reporting that all of the data from their collective was gone. we have many users so it is hard to know which ones use Windows on VFS mode, and even harder to make sure they update the client. We're also not sure that it is going to fix the problem. Our server version is 28.0.2, client version is unknown at this time (we're not even sure which user caused this yet, and it is very hard to investigate). |
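The trick mentioned above, using the last modification time of the emptied directories to date the deletion, can be automated roughly like this. It is a minimal sketch that assumes direct filesystem access to the affected user's files; the function name is hypothetical, and directories that were empty for other reasons will show up as well.

```python
import os
from datetime import datetime, timezone


def emptied_directories(root):
    """Return (path, mtime) for every empty directory under root, oldest first.

    A directory's mtime changes when entries are removed from it, so the mtime
    of a directory that is now empty approximates when its last entry vanished.
    """
    result = []
    for dirpath, dirnames, filenames in os.walk(root):
        if not dirnames and not filenames:
            mtime = datetime.fromtimestamp(os.path.getmtime(dirpath), tz=timezone.utc)
            result.append((dirpath, mtime))
    return sorted(result, key=lambda item: item[1])
```

A cluster of empty directories with nearly identical timestamps is a good hint for the deletion date to look for in the trashbin.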
you can find the user causing the issue from activity app or audit log.
would be great if you can collect client logs.
|
I think you should update the CLIENT from version 3.4.1 (which is from Dec 20, 2021 and now over 2 years old), not the server. How should the server distinguish between intentional removals and ones caused by a bug in the client? Additionally, you can refer to the hint @pierreozoux wrote about denying such old clients access to your server
As long as you cannot reproduce the behavior on a new client version, the issue seems to be an issue with your local administration. |
I believe I've just run into this issue; correct me if this is unrelated. The client is saying "Access denied" when syncing a few folders, which I'm assuming is something I broke recently, but it's then deleting all of those folders on the server. I went to look for a video today and noticed that several folders were missing (>70GB!), and I had to recover them from the server's trash bin. |
This sounds like the same issue to me, or at least a similar one. Do you have VFS enabled? |
How to use GitHub
I found some other issues where the client deleted files (like #1433) but my issue looks like specific VFS problem.
Expected behaviour
client should not remove server file if creation of local placeholder file fails (maybe mark local file as "dirty" or fallback to full file download)
Actual behaviour
For some reason the client could not create placeholder files. I have no idea why it even tried to create them: the majority of the affected files already existed and had synced successfully months before (at least the client reported a successful sync). As a result the client folder had no files anymore, and a subsequent sync removed all the files on the server side. Fortunately this affected only a few hundred files, and I could recover them from the server trashbin.
But the problem still existed: the client complained that it could not create placeholder files, and removed the files again!
At the same time, access to existing placeholder files (with the blue cloud icon) was not possible; the error was "0x8007016A The cloud file provider is not running." This error is often reported for OneDrive, and a computer restart is the recommended solution. After a client restart I can access placeholder files again. I have no clue how to find out whether some process or service crashed; the event logs don't say anything.
Steps to reproduce
no idea.
Maybe this is related: a short time before, I added a huge folder with my photo archive (600GB, >50k files). My impression is that it synced successfully (with VFS), but it might have introduced some performance problems in the client. The client feels totally overloaded: a click on settings or on the properties of any folder results in minutes of unresponsive UI, with the client using a lot of CPU.
The file was initially removed by the client and I recovered it from the server trashbin. The removed files didn't reside in this huge folder but in another one, InstantUpload, and in 2-3 others (completely random in my eyes). After this happened 2 times I switched the specific folders to "Always available locally" and the client successfully downloaded all the files.
Client configuration
Client version: Version 3.2.2stable-Win64 (build 20210527)
Operating system: Win 10 1909
OS language: EN
Installation path of client:
Server configuration
Nextcloud version: 21.0.2 (docker/apache)
Storage backend (external storage): mysql
Logs
I have a debug archive and Nextcloud_sync.log from the problematic period, but I'm not willing to upload the complete files for privacy reasons (debug archives are up to 25MB, with an extracted log of 300MB). Please advise how to find and extract and
Client logfile:
Since 3.1: Under the "General" settings, you can click on "Create Debug Archive ..." to pick the location of where the desktop client will export the logs and the database to a zip file.
On previous releases: Via the command line:
nextcloud --logdebug --logwindow
or
nextcloud --logdebug --logfile log.txt
(See also https://docs.nextcloud.com/desktop/3.0/troubleshooting.html#log-files)
Web server error log:
Server logfile: nextcloud log (data/nextcloud.log):