Owncloud deletes files & tries to sync again (with little success) #6677
It seems that the ownCloud client does not see that the files are on the server and thus believes they were removed. This may happen if a bug in the server makes the files disappear (for example because some external storage is temporarily unavailable, and an empty file list is reported to the client instead of an error). The blocked connection is another weird problem, as we normally have timeouts to avoid this kind of issue. Please provide the log, as it may contain information about both issues.
Hi @ogoffart - thanks for the answer. As to your questions:
Other info
I only downloaded the first 3.8 GB before it aborted. Every hour, the sync aborts because of a 401 error, most likely caused by the expiration of the session cookie. And since the discovery phase appears to take more than one hour, the sync never progresses. I recommend you uncheck the new folders that have too many sub-folders, so the discovery phase will be quicker and the sync can proceed. There is no evidence in the log of files being removed, so the logs can't answer this question. My suspicion is that the server failed to show some files; the client then thought they had been removed on the server, so it removed the files locally. What is strange is that the SURFdrive client should have shown exactly the same behavior. Also, there should normally be no 401, as we authenticate every request.
Hi @ogoffart - thanks for your analysis. A couple of points:
Is there any way to make the discovery phase less painful (i.e. less slow)? This is, in general, an issue that might be worth looking into; I am often waiting super long for a sync to show up. Two more thoughts:
They got removed, now they are considered new :-)
Yes, we are working on that. But it won't be there before 2.6 or later.
That's interesting. There should normally be no difference. Have you selected the option to ask for permission before downloading new folders of a given size? Can you try to disable it? Maybe this slows the sync a bit, as it needs to query the size.
It could also be that every time I connected a system, it just so happened to be a version that wasn't causing any trouble...? Too little data to tell, really; it could be either. As to your suggestions, I will try again tonight when I get home.
FYI @tomneedham from ownCloud will get in contact with the SURF guys to try to track this down from the server side.
Ah @guruz @tomneedham good to know, I was planning to drop the SURFers a line as well, but then I'll leave that to you.
@Phlos You can ping them as well. Point them to this issue…
@michaelstingl I e-mailed my institute, which seems to be the only way for mere mortals to get in touch with the SURFdrive ppl. Let's see.... ;)
Hi @Phlos , Strange problem! You are experiencing this issue since the latest 2.4.2 client version, if I understand you correctly? Thank you for registering this issue; we will check on our side what is going on.
Hi @T0mWz Yes, although the problem has happened before. Not in the past couple of months, though. As said above, something with the server connection seems to be going wrong (at hourly intervals)...
Update:
Within my directory tree, only directories within my bigger directory "SYNTHETICS" were deleted, while a similarly large directory "DATA" was untouched. Hypothesising about the cause of the deletion: could it be that if the connection breaks during the discovery phase, everything that has not yet been discovered is somehow branded as "doesn't exist on the server" and is then deleted?
Hi all, I checked the logs on our side to see what's happening here. At that time there were two clients online: this Linux 2.4.2 client and another 2.4.1 Linux client, which did not do much.
After this 401 error, the client logs in again around 00:38:50.
Hi @T0mWz, should I do or check something? I'm not quite sure from what you write. You write
but it is important to note that it is not the specific folder, but the specific time (every hour) that is causing the trouble. Around the 31st of July it was at every ~[hour]:36 (see my message above); now on the 3rd of August it is at every ~[hour]:39. It seems to shift by 0-5 s every time. Whether it is… PS: if you want to insert multi-line output you can put it between triple back quotes ``` above and below
Hi @Phlos , Today I compared your real data and the metadata. ownCloud works on the basis of a database containing the metadata of your files, and yours does not seem to be completely in sync. I started a command to bring them back in sync, but because you have a lot of files, this will take a while. Tomorrow I will check whether everything is back in sync; from then on we'll have to see if your client behaves normally again.
Hi @Phlos , The metadata of your files is back in sync. Would you be able to check whether the problems are solved for you?
@T0mWz No, unfortunately the problem persists :( Hourly errors as always. @ogoffart unchecking loads of directories didn't bring a solution either. Now, if I grep for "ERROR" in the log, I get a new error type ("Premature end of document"), and the hourly schedule is a bit distorted every time this happens. See below.
"Aborted by the user" is caused by the 401 error. (When we get a 401 we abort the request. We are not supposed to get 401 unless the password is wrong, so we'll let the UI retry a connection in order to show the password dialog) "Premature end of document." Means that the PROPFIND reply was not a valid XML document, likely the server truncated the reply. |
Sorry @Phlos , you're right. It happened by coincidence that I twice saw the 401 error occur on the same file / folder. @ogoffart , it seems that the client receives a 401 error when the oAuth token has expired, after which it does a POST to refresh the oAuth token. See Apache log below;
@Phlos , I see that you also have sync-client version 2.4.1 running in another location. Does that client version also show this behavior?
The 2.4.1 client (which is installed on my main workstation at work) has no issues. |
So the problem is that the initial discovery takes more than one hour, which is longer than the oauth token expiration, and the sync client can't cope with that: it has to cancel the sync and refresh the oauth token. In theory we should refresh the oauth token without restarting the sync (the idea is sketched below), but this is a bit tricky with the current client architecture. The new discovery algorithm (2.6) will also allow doing the propagation and the discovery at the same time, which will let the next sync restart where we left off, without spending another hour re-discovering.
I was hoping it would.
There are almost no differences from 2.4.2.
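As a rough illustration of the "refresh the oauth token without restarting the sync" idea: a toy sketch only, not the actual client architecture; performRequest and refreshAccessToken are hypothetical stand-ins for the real HTTP and OAuth2 machinery. On a 401 the job is retried with a fresh token instead of tearing down the whole sync (and its hour of discovery):

```cpp
#include <iostream>
#include <queue>
#include <string>

struct Job { std::string url; };

// Stubs simulating a token that expires mid-sync (hypothetical).
static int requestCount = 0;
int performRequest(const Job &job, const std::string &token)
{
    ++requestCount;
    if (requestCount == 3 && token == "stale")
        return 401; // token expired while the queue was being worked through
    std::cout << "GET " << job.url << " (" << token << ") -> 200\n";
    return 200;
}
std::string refreshAccessToken() { return "fresh"; }

// On 401, refresh the token and retry the same job instead of aborting
// the sync and re-running the hour-long discovery from scratch.
void runQueue(std::queue<Job> jobs, std::string token)
{
    while (!jobs.empty()) {
        if (performRequest(jobs.front(), token) == 401) {
            token = refreshAccessToken();
            continue; // retry the same job with the new token
        }
        jobs.pop();
    }
}

int main()
{
    std::queue<Job> jobs;
    for (const char *p : { "/a", "/b", "/c", "/d" })
        jobs.push({ p });
    runQueue(std::move(jobs), "stale");
    return 0;
}
```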
Hi @ogoffart --
I will try and update you with the results... I've now unchecked a 51 GB dir; if that doesn't do the trick I don't know what will :-P -- let's hope the deletion is not futile!
Hm, interesting.
What do you mean by this? Not sure what oauth is.
There were definitely always the same number of directories. The only significant difference is that the laptop (the problematic machine w/ 2.4.2) is only used intermittently (i.e. there may be several weeks where I don't use it a lot), whereas the workstation is only turned off briefly every now and then and is therefore kept mostly in sync. Also, the laptop was probably not in sync at the time its client was updated to 2.4.2. Can this be related?
Hi @ogoffart , Thanks for your reply. For SURFdrive, we have configured multiple back-ends. ownCloud sync clients from 2.4.0 onwards authenticate through oAuth against SURFdrive; older clients still use the Shibboleth single sign-on method to log in. That is why I would expect the same behavior with the 2.4.1 client. @Phlos, on your other system with the 2.4.1 client, do you synchronize all files or just a selection of folders/files?
My workstation (with OC 2.4.1) syncs everything.
Note: at 08-18 16:25:5 a sync failed with the error "Free space on disk is less than 48 MB". Maybe this somehow influenced the db? Even though the sync finished successfully several times afterwards.
@ckamm I doubt it, because the disk space issue was only relevant that day. All the other times files got deleted, there was plenty of space. (On 08-18, I had temporarily rsynced a rather large subdirectory to my homedir, i.e. where it was safe from deletion ;) )
Interesting: before the 05:26 deletion, there are two PROPFINDs from the server:
All the previous PROPFINDs have the same size, but the last one before the deletion has a considerably larger response: 14829 vs 501 bytes. This could lead us down the thought path that an incorrect server response is triggering the client to delete the files.
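If anyone wants to double-check the server responses independently of the sync client, a small probe that issues the same kind of depth-1 PROPFIND and prints the response size would make such jumps visible across runs. A sketch using Qt; the URL, folder, and token are placeholders, not the real SURFdrive values:

```cpp
#include <QBuffer>
#include <QCoreApplication>
#include <QEventLoop>
#include <QNetworkAccessManager>
#include <QNetworkReply>
#include <QNetworkRequest>
#include <QDebug>

int main(int argc, char *argv[])
{
    QCoreApplication app(argc, argv);

    // Placeholder endpoint and token -- substitute real values.
    QNetworkRequest req(QUrl("https://example.org/remote.php/webdav/SYNTHETICS/"));
    req.setRawHeader("Authorization", "Bearer <access-token>");
    req.setRawHeader("Depth", "1");
    req.setHeader(QNetworkRequest::ContentTypeHeader, "application/xml");

    // Minimal PROPFIND body asking for all properties.
    QByteArray body =
        "<?xml version=\"1.0\"?><d:propfind xmlns:d=\"DAV:\"><d:allprop/></d:propfind>";
    QBuffer buf(&body);
    buf.open(QIODevice::ReadOnly);

    QNetworkAccessManager nam;
    QNetworkReply *reply = nam.sendCustomRequest(req, "PROPFIND", &buf);

    QEventLoop loop;
    QObject::connect(reply, &QNetworkReply::finished, &loop, &QEventLoop::quit);
    loop.exec();

    // A size that suddenly differs from previous runs (501 vs 14829 bytes
    // above) would point at an inconsistent listing from the server.
    qDebug() << "status:" << reply->attribute(QNetworkRequest::HttpStatusCodeAttribute).toInt()
             << "bytes:" << reply->readAll().size();
    reply->deleteLater();
    return 0;
}
```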
I was still thinking about this issue, and wondering if there were changes on your side, @Phlos. @owncloud guys: currently we capture 401 responses to another page due to unauthorized user logins. @Phlos, would you perhaps be able to try again, to see if it makes a difference? Super thanks for your help!
@T0mWz sorry for the late reaction! I haven't had any more deleterious experiences recently. I strongly pruned my data, though, and only have subsets of my directory tree synced to my laptop, not the whole thing. At least for now, it seems to work, but I'm holding my breath continuously... ;) Could there be a relation with directory size / number of files / depth/breadth of tree?
Something weird just happened again as I was looking at it. Unfortunately, no logging took place, but it was the directory… @T0mWz, can you see anything in the server logs around 15:19:16?
@Phlos About logging: if you're using 2.5.0, definitely enable the new persistent logging (the F12 option).
That way detailed logs will be written all the time and you'll have a couple of hours to make a copy of them when something strange-looking happens. You only need to do this once; it's a new feature to make it easier to have logs for rare, hard-to-reproduce issues like this.
I opened #6814 to track the oauth2 and timeout issue. (So far no progress has been made for a month regarding the reason why the directories got deleted.)
Update: after purging my tree, the problem didn't show up for a while. However, the tree has been growing again since, and last week (of course just before I wanted to take the laptop along with me on a trip) it decided to do the big ol' delete again. Potentially, having a big tree not only results in the one-hour discovery cutoff (which inhibits re-download, as per #6814 ) but also somehow results in deletion. Related? @ckamm I tried the F12 option, but unfortunately the owncloud log dir defaults to my /tmp/, which is in my root partition, which is extremely full as it is; I will have to start it up manually. Maybe something to change, i.e. let the user decide on the log dir location? I'll now tune down my selective sync in hopes of building up the tree again..
Hi @Phlos , Painful to hear. :( During the upgrade at the beginning of October, we upgraded to the latest OC version and the latest oAuth version, assuming that this could possibly help in combination with the latest OC sync-client. However, this reset the AccessToken lifetime back to the default 1 hour. I will restore this to a longer period again, so hopefully the sync client has some more breathing room to complete a sync. However, this can not be infinite; I would like to start with an increase to 2 hours. @ogoffart, I think it's a hard subject, but is there a way that a client can better deal with an expired oAuth AccessToken?
@T0mWz Solution in #6814 and #6819 . You could test with the latest daily builds from https://download.owncloud.com/desktop/daily/?C=M;O=D
It's happened again: near-complete deletion of my largest directory with the most elaborate tree. I was working on my laptop yesterday after not having used it for about a week. It then proceeded to download most of the stuff in this one directory while I was working in another. This other directory keeps syncing fine (I have in between worked on the same files on another machine). [This is an adapted message, because I thought the deletion had taken place today.]
@Phlos Logs are very interesting! mail at ckamm de.
The issue I see in the log is similar to what was observed before:
I'll continue to look for potential causes. |
The issue I see is that…
This could fix a problem where the client incorrectly decides to delete local data. Previously any sqlite3_step() return value that wasn't SQLITE_ROW would be interpreted as "there's no more data here". Thus an sqlite error at a bad time could cause the remote discovery to fail to read an unchanged subtree from the database. These files would then be deleted locally. With this change sqlite errors from sqlite3_step are detected and logged. For the particular case of SyncJournalDb::getFilesBelowPath() the error will now be propagated and the sync run will fail instead of performing spurious deletes. Note that many other database functions still don't distinguish not-found from error cases. Most of them won't have as severe effects on affected sync runs though.
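For readers following along, the fix described above boils down to checking the result code of sqlite3_step() instead of treating anything that isn't SQLITE_ROW as end-of-data. A minimal sketch of the corrected pattern against a throwaway in-memory database (the table and query are illustrative, not the client's real journal schema):

```cpp
#include <sqlite3.h>
#include <cstdio>

// Returns false on an sqlite error, so callers can fail the sync run
// instead of concluding "no rows -> files gone -> delete locally".
static bool readRows(sqlite3 *db, sqlite3_stmt *stmt)
{
    int rc;
    while ((rc = sqlite3_step(stmt)) == SQLITE_ROW)
        printf("path: %s\n", (const char *)sqlite3_column_text(stmt, 0));

    // SQLITE_DONE is the only legitimate end-of-data result; anything else
    // (SQLITE_BUSY, SQLITE_IOERR, ...) is an error that must be propagated.
    if (rc != SQLITE_DONE) {
        fprintf(stderr, "sqlite3_step failed: %s\n", sqlite3_errmsg(db));
        return false;
    }
    return true;
}

int main()
{
    sqlite3 *db = nullptr;
    sqlite3_open(":memory:", &db);
    sqlite3_exec(db,
                 "CREATE TABLE metadata(path TEXT);"
                 "INSERT INTO metadata VALUES('SYNTHETICS/a/b');",
                 nullptr, nullptr, nullptr);

    sqlite3_stmt *stmt = nullptr;
    sqlite3_prepare_v2(db, "SELECT path FROM metadata", -1, &stmt, nullptr);
    const bool ok = readRows(db, stmt);
    sqlite3_finalize(stmt);
    sqlite3_close(db);
    return ok ? 0 : 1;
}
```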
Not tested with 2.5.4. It appears hard to reproduce a setup locally that would trigger the issue. Not tested here. Okayish
It just happened to me on Windows with 2.5.4. A folder was still syncing; I renamed it, and all the other folders got deleted, with only this one left. Hopping onto the beta channel now, hoping for the better.
I have a 250 GB storage allowance through a university system, which I use to sync my files between the several computers I work on. The university system supplies a tailored client for Windows, but for Linux I use ownCloud. I noticed in the last few days that ownCloud is deleting files from a local computer and then trying to redownload them. This is an issue I have had several times in the past, and it somehow keeps popping up. [Edit: that it started happening again is possibly related to the fact that I updated to 2.4.2 on 19 July.] I have only experienced this on my Linux machines, never with Windows (although it should be noted that I sync only a rather small subset of my stuff to the Windows OS).
Expected behaviour
I have the OwnCloud client on my laptop. It syncs my files. If I am offline for a while, it syncs everything as soon as I'm back online.
Actual behaviour
I have the OwnCloud client on my laptop. It deletes files that I had on the local machine and on the server from the local machine, then tries to redownload them.
The sync for redownloading is not without problems:
Consequently, I wait for hours to get a sync that runs for maybe 30 minutes, after which it stops working, I restart everything, etc.
Steps to reproduce
Since this is not a very straightforward issue, I'm afraid there are no straightforward steps either. I did the following, though:
Server configuration
Unfortunately, I do not know what specs the university system has.
Client configuration
Client version: 2.4.2
Operating system: Ubuntu 16.04
OS language: English (United States)
Qt version used by client package (Linux only, see also Settings dialog):
Not sure, but the settings dialog says "Built from Git revision d6e975 on Jul 19 10:54:56 using Qt 5.6.2 OpenSSL 1.0.2g 1 Mar 2016"
Client package (From ownCloud or distro) (Linux only): I'm not sure what is meant by this question.
Installation path of client: /opt/ownCloud/
Logs
I had no logs up until now -- using the instructions below I have now started logging what is going on. I'll leave my laptop at work and connected overnight, and come back tomorrow to see what the dawn brings...
Other comments:
This issue might be related to #3102, #6282 and #6322, but since my issue seems to be subtly different (and since I don't understand half of those conversations), I hesitantly file this as a new issue. Apologies if this is not the way forward.