
"Internal Server Error" #9324

Closed
tachyon-ops opened this issue Jul 1, 2014 · 50 comments
@tachyon-ops

So, I've come a long way, debugging and checking out problems and issues.

My client is uploading some-file.avi, which is 49 MB. To do so, it chunks the file.

When it's finished, the Windows client log (thanks to this https://forum.owncloud.org/viewtopic.php?f=14&t=17807 ) has this error:

07-01 09:40:06:438 syncengine.cpp:631 void Mirall::SyncEngine::slotJobCompleted(const Mirall::SyncFileItem&) "001_TUTORIALS/Tutorial_02_Organization_and_export_an_Obj_from_Nurbs_to_Poly.avi" 6 "Das Element ist aufgrund vorheriger Fehler nicht synchronisiert: Error downloading https://owncloud_place/remote.php/webdav/some-file.avi-chunking-955182228-5-0 - server replied: Internal Server Error"

This file exists in data/admin/cache, not in the data/admin/files/some-file.avi...

Certificates are ruled out, given this log line:
ccount.cpp:358 Certs are already known and trusted, Errors are not valid.

The Linux clients are more robust than the Windows one (far fewer errors appear). So at first I thought it was some client error. Now I'm more inclined to think that either the client doesn't ask to move the file from cache to files, or the server is blocking something.

Server logs:
owncloud.log has nothing (stays clean while the client bumps into this error)
httpd/error_log just has some certs misconfiguration (I even edited sabredav to set verify to 0 to rule this out)
httpd/suphp_log [Mon Jun 30 14:58:16 2014] [info] Executing "/home/admin/owncloud_place/status.php" as UID 7797, GID 7797
I use php-fpm...
php-fpm/error.log [01-Jul-2014 06:48:32] ERROR: failed to ptrace(PEEKDATA) pid 17787: Input/output error (5) (the hour should be between 8 and 9, not 6)

And... I guess I have something in /var/log/php-fpm/slow.log

[01-Jul-2014 08:27:08] [pool admin] pid 30439
script_filename = /home/admin/owncloud_place/remote.php
[0x00007f8a88f7c900] fread() /home/admin/owncloud_place/lib/private/helper.php:561
[0x00007f8a88f7c0c8] streamCopy() /home/admin/owncloud_place/lib/private/files/view.php:358
[0x00007f8a88f7bcf0] file_put_contents() /home/admin/owncloud_place/lib/private/cache/file.php:74
[0x00007f8a88f7baa8] set() /home/admin/owncloud_place/lib/private/filechunking.php:50
[0x00007f8a88f7ae60] store() /home/admin/owncloud_place/lib/private/connector/sabre/file.php:215
[0x00007f8a88f79ea0] createFileChunked() /home/admin/owncloud_place/lib/private/connector/sabre/file.php:70
[0x00007f8a88f797d0] put() /home/admin/owncloud_place/lib/private/connector/sabre/directory.php:76
[0x00007f8a88f791c8] createFile() /home/admin/owncloud_place/3rdparty/sabre/dav/lib/Sabre/DAV/Server.php:1647
[0x00007f8a88f78688] createFile() /home/admin/owncloud_place/3rdparty/sabre/dav/lib/Sabre/DAV/Server.php:900
[0x00007fffb78b3370] httpPut() unknown:0
[0x00007f8a88f782f0] call_user_func() /home/admin/owncloud_place/3rdparty/sabre/dav/lib/Sabre/DAV/Server.php:474
[0x00007f8a88f773f8] invokeMethod() /home/admin/owncloud_place/3rdparty/sabre/dav/lib/Sabre/DAV/Server.php:214
[0x00007f8a88f76b28] exec() /home/admin/owncloud_place/apps/files/appinfo/remote.php:61
[0x00007f8a88f76278] +++ dump failed

So... sabredav - php-fpm issue?

@PVince81
Contributor

PVince81 commented Jul 1, 2014

Possibly. That's a really weird I/O error, as if the chunk files could not be read properly.
I'm not familiar with php5-fpm, maybe @bantu can help ?

Thing is, the sync client only uploads the last chunk; it's the server that knows it's the last one and then puts the chunks together into the target location (inside "$datadir/$user/files/targetfile"), so it's probably not related to the client.
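For readers unfamiliar with the mechanism: the assembly step described above can be sketched roughly as follows (a simplified illustration based on the chunk naming visible in the log above, i.e. <file>-chunking-<transferid>-<count>-<index>; this is not ownCloud's actual code):

```python
import os

def assemble(cache_dir, target_path, name, transfer_id, total):
    """Concatenate uploaded chunks into the target file once all are present.

    Chunk file names follow the pattern seen in the log:
    "<name>-chunking-<transfer_id>-<total>-<index>".
    """
    chunks = [os.path.join(cache_dir, f"{name}-chunking-{transfer_id}-{total}-{i}")
              for i in range(total)]
    if not all(os.path.exists(c) for c in chunks):
        return False  # not all chunks uploaded yet; keep waiting
    with open(target_path, "wb") as out:
        for c in chunks:
            with open(c, "rb") as f:
                out.write(f.read())
    for c in chunks:
        os.remove(c)  # chunks are deleted after a successful fusion
    return True
```

If the server process is killed mid-assembly (e.g. by a timeout), the chunks would stay behind in the cache folder, which is consistent with what the reporter observed.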

@PVince81 PVince81 added the Bug label Jul 1, 2014
@tachyon-ops
Author

Thanks @PVince81
Well, I can assert that this is happening for me on OC 6.0.3, OC 6.0.4 and OC7 (which I'm debugging right now)

Apache/2.2.27 (CentOS)
CentOS 6.4 x64
php 5.4.29
System Linux ******.com 2.6.32-431.20.3.el6.x86_64 #1 SMP Thu Jun 19 21:14:45 UTC 2014 x86_64
Server API FPM/FastCGI
php-fpm active

Environment with Kloxo-mr (sort of just like webadmin or ispconfig, has some specific environment things, but that's all)

I am willing to help understand this (and learn meanwhile)... I've exhausted my resources.

http://pastebin.com/9drbfUfR for /var/log/php-fpm/slow.log

@PVince81
Contributor

PVince81 commented Jul 1, 2014

As discussed on IRC, we found out that:

  1. All chunks are first uploaded and then they just stay in the cache folder. They are never deleted.
  2. When retrying sync, new chunks are uploaded.
  3. With the Windows client the issue happens almost 100% of the time, but with the Linux client only about 5% of the time

@nmpribeiro as discussed, please try with bigger files (600 MB) to see if you can reproduce it more consistently, even with the Linux client.

I suspect there might be a race condition of some sorts.
It could also be caused by the environment or any kind of settings that would disallow reading of the chunk files after upload.

@dragotin
Contributor

dragotin commented Jul 1, 2014

This is the key message in the logfile: Das Element ist aufgrund vorheriger Fehler nicht synchronisiert: ...

It means that the file is ignored in sync because it is blacklisted. And it is blacklisted because internal server errors happened with it during sync. That happened three times, and then the file got blacklisted and ignored, so as not to bother the user with error messages.

To try a clean sync again, the user can click on 'retry sync' in the Activity tab of the client's settings dialog.

@tachyon-ops
Author

"The element is not synchronized due to previous errors:"
sure enough, but I've done retry sync over and over. There is something else here, namely why it got those Internal Server Errors in the first place.

We'll see with that 600 MB file in a moment.

@dragotin
Contributor

dragotin commented Jul 1, 2014

yes, internal server errors are a problem of course. That needs to be tracked down and fixed. Usually with internal server errors there is an entry in the apache error log.

@tachyon-ops
Author

56 chunks for a file...
all uploaded smoothly to cache so far.

and it got into the cloud!
now... either that's some crazy odd moment, or those files I was testing with have an issue.

@tachyon-ops
Author

no entry; perhaps because it's being managed by php-fpm?

@PVince81
Contributor

PVince81 commented Jul 1, 2014

Which sync client did you use this time?
Did the chunks delete themselves after fusion?

@tachyon-ops
Author

@PVince81 Yes, the chunk files were gone after the file went to its place inside the admin/files/test folder

Windows client 1.6.1

I'll only be able to test linux later.

testing those problematic files again

@tachyon-ops
Author

avi files of about 30-50 MB are giving the error...

@PVince81
Contributor

PVince81 commented Jul 1, 2014

@nmpribeiro did you put the error messages through a translator?
It seems you had German in some places; maybe the weird message is due to translation?

Talking about this one:

php-fpm/error.log [01-Jul-2014 06:48:32] ERROR: failed to ptrace(PEEKDATA) pid 17787: Input/output error (5) (the hour should be between 8 and 9, not 6)

@tachyon-ops
Author

@PVince81 it was only the client in German. Everything else is English (the whole server)

@PVince81
Contributor

PVince81 commented Jul 1, 2014

Okay, so it's mostly for files between 30 and 50 MB?

Chunk size is 10 MB, so I suspect it shouldn't happen for files <= 10 MB.
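For reference, the arithmetic behind that: a file needs ceil(size / chunk_size) upload requests, so files at or under one chunk never exercise the chunk-assembly code path (a quick sketch, assuming the 10 MB default chunk size mentioned above):

```python
import math

CHUNK_SIZE = 10 * 1024 * 1024  # 10 MB, the client's default chunk size per this thread

def chunk_count(file_size_bytes):
    """Number of PUT requests the sync client needs for a file of this size."""
    return max(1, math.ceil(file_size_bytes / CHUNK_SIZE))

print(chunk_count(8 * 1024 * 1024))    # 8 MB  -> 1 (single request, no assembly)
print(chunk_count(49 * 1024 * 1024))   # 49 MB -> 5
```

The 49 MB file from the original report comes out at 5 chunks, which matches the "-5-0" suffix in the chunk name logged above.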

@tachyon-ops
Author

you're right. That much I've already debugged.

@PVince81
Contributor

PVince81 commented Jul 1, 2014

@bantu @josh4trunks @Kondou-ger ever heard of something similar with php5-fpm ?

@tachyon-ops
Author

switched to suphp and it seems to be working so far.
If any error happens in the next few days, I'll confirm it's a php5-fpm error in my environment.

@tachyon-ops
Author

Confirmed. I am now trying to replicate the problem in a VM to share.

@tachyon-ops
Author

I've set up a machine with kloxo-mr 6.5.1b, trying to replicate my original setup problem.
php54m, with php type: php-fpm
I'm trying to test my VM against the host... and they transfer 1.5 GB in the blink of an eye

The only difference between them (besides the possibility of different packages or confs that are very hard to trace) is in fact the transfer speed, which does not match the real-case scenario.

I will hence save this VM, and since my problematic machine is a production one, if this issue re-appears somewhere, I'll be more than glad to provide any conf details from the problematic machine and check them against my working VM. Might find something there. These packages do seem to be very unstable at the moment; it might have been fixed in some update. However, I don't recall any update happening for php or php-fpm on my problematic machine this week.

Meanwhile, I'll close and assume this was a difficult-to-trace issue that I do not have more time to keep investigating.

Thank you all, mates, especially @PVince81

@tachyon-ops
Author

owncloud/client#1899 (comment)

Someone else is also pinpointing php-fpm... (by the way, I really need to operate with php-fpm...)

@tachyon-ops tachyon-ops reopened this Jul 5, 2014
@bantu

bantu commented Jul 5, 2014

Probably not too useful, but that error message seems to be generated here. https://github.com/php/php-src/blob/PHP-5.4/sapi/fpm/fpm/fpm_trace_ptrace.c#L75

@PVince81
Contributor

PVince81 commented Jul 7, 2014

@butonic php-fpm timeout stuff has been noticed here as well; it produces strange behavior.
In case you can help clarify it 😄

@tachyon-ops
Author

further tests seem to pinpoint the fact that max_execution_time cannot be higher than the idle-timeout, and request_terminate_timeout must sit between these two.
(credits here: http://foaa.de/old-blog/2010/11/php-apache-and-fastcgi-a-comprehensive-overview/trackback/index.html )

However, there's still a bunch of errors (still frequent: about 25 errors in 30).
So I've also changed my php-fpm config from pm = ondemand to pm = dynamic (credits from here: http://www.if-not-true-then-false.com/2011/nginx-and-php-fpm-configuration-and-optimizing-tips-and-tricks/ )
As I am writing this (man, I hope it will not break as I post this), I just saw 2 "Operation canceled" entries in the activity queue (out of 50), which I am hoping were from my reboot.
Nevertheless it's much more efficient.

I also have a VM with this setup to share to anyone who wants (PM me)!

This must be debugged further between ownCloud and php-fpm.
There must be a way to avoid these errors and make it even more efficient.

A chunking option should exist in the server's ownCloud configuration for administrators to test/prevent these issues.

@tachyon-ops
Author

Anyone who has been having these issues: check if you have php-fpm, then try changing it to pm = dynamic (see the post above).

For me, it was (it seems) the solution! (though it's still something for the developers to understand)
If more people with the issue solve it through this config change, then the issue is pinpointed.

@PVince81 , thanks for your help.

_edit: sporadic errors still occur, though now it's bearable (I know the files will get into the cloud)
*_edit: it's erratic. No solution whatsoever. Same server config, different connection stability.

@tachyon-ops
Author

max_execution_time = 30 (in /etc/php.ini)
IdleTimeout = 120 (in /home/apache/etc/conf.d/fcgid.conf)
pm = dynamic
request_terminate_timeout = 120s (in /etc/php-fpm.d/admin.conf)

I would love to test a chunking size other than 10 MB; unfortunately,
c:\Program Files (x86)\ownCloud>SET OWNCLOUD_CHUNK_SIZE=2MB
c:\Program Files (x86)\ownCloud>echo %OWNCLOUD_CHUNK_SIZE%
2MB
and then executing owncloud.exe is not working for me.
I would like to see a chunking setting or flag to be able to dissect this issue even further (I think 10 MB takes too long for the php-fpm processes in certain configs like mine)
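One guess as to why the override above had no effect: the client may expect a raw byte count rather than a suffixed value like "2MB" (this is an assumption; I haven't verified how the client parses the variable). On Linux, with the unit conversion spelled out:

```shell
# Assumption: OWNCLOUD_CHUNK_SIZE is parsed as a plain byte count, not "2MB".
# 2 MB = 2 * 1024 * 1024 bytes
export OWNCLOUD_CHUNK_SIZE=$((2 * 1024 * 1024))
echo "$OWNCLOUD_CHUNK_SIZE"   # prints 2097152
# then launch the client from this same shell so it inherits the variable
```

The Windows cmd equivalent would be SET OWNCLOUD_CHUNK_SIZE=2097152 before running owncloud.exe from the same console.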

@tachyon-ops
Author

it's erratic. No solution whatsoever. Same server config, different connection stability behaviours.

It's become fuzzy again.

@butonic
Member

butonic commented Jul 11, 2014

On my development machine I need to be able to debug requests without php-fpm killing the process under my nose. In my /etc/php5/fpm/pool.d/jfd.conf I have among other settings:

;request_terminate_timeout = 0
pm = dynamic
pm.process_idle_timeout = 300s

And in my /etc/php5/fpm/php.ini I have among other settings:

max_execution_time = 300
max_input_time = 60
default_socket_timeout = 60

From the page you cited:

You could adjust the idle-timeout of mod_fastcgi to a very high value and this problem will not occur – but if it’s really required to restart the processes from the PM, you’ll have a problem. This is where FPM comes in. It provides a config option called request_terminate_timeout, which enforces termination of the PHP processes for sure. What you can do now, is set the max_execution_time to a reasonable value, then adjust request_terminate_timeout to something longer, which cannot be overwritten by the user and then set the FastCGI idle-timeout right above.

Your max_execution_time seems to be 30 seconds ... which I find quite low when trying to upload larger files with a small bandwidth. 5 min, as I have configured it, may be OK on a development machine, but seems excessive for a production machine. Try raising it to 120s, just like your fpm request_terminate_timeout.
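The ordering rule from the cited article can be written down as a quick sanity check (a sketch; the numbers below are the ones reported in this thread, not recommendations):

```python
def timeouts_sane(max_execution_time, request_terminate_timeout, idle_timeout):
    """Check the ordering the cited article recommends:
    php max_execution_time <= fpm request_terminate_timeout <= fastcgi idle-timeout.
    All values in seconds."""
    return max_execution_time <= request_terminate_timeout <= idle_timeout

# The configuration reported earlier in this thread:
print(timeouts_sane(30, 120, 120))    # True: 30 <= 120 <= 120
# Raising max_execution_time to match request_terminate_timeout, as suggested:
print(timeouts_sane(120, 120, 120))   # True
# A broken setup: php promises more time than fpm will actually allow
print(timeouts_sane(300, 120, 120))   # False
```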

@PVince81 to shed some light on the problem with php-fpm: ownCloud tries to raise the max_execution_time in lib/base.php:

                //try to configure php to enable big file uploads.
                //this doesn't work always depending on the webserver and php configuration.
                //Let's try to overwrite some defaults anyways

                //try to set the maximum execution time to 60min
                @set_time_limit(3600);
                @ini_set('max_execution_time', 3600);
                @ini_set('max_input_time', 3600);

                //try to set the maximum filesize to 10G
                @ini_set('upload_max_filesize', '10G');
                @ini_set('post_max_size', '10G');
                @ini_set('file_uploads', '50');

Unfortunately, changing these does not affect the php-fpm request_terminate_timeout setting, and the php process will be killed before the php max execution time has been reached. Nothing we can do here other than give a recommendation for how it should be configured. Maybe 3600 seconds = 1h?

@nmpribeiro request_terminate_timeout should always be a little longer than, or just as long as, PHP's max_execution_time.

@butonic
Member

butonic commented Jul 11, 2014

hm, a 1h timeout seems very excessive. @danimo @dragotin have you done any testing of how long it takes to upload a 10 MB chunk?

@tachyon-ops
Author

@butonic yes... you definitely shed some light on this whole issue.

With DSL it's not hard to bump into time limitations: I would say I have seen uploads ranging from 30 kbps to 255 kbps (ISPs always think we just download... ):

A 10 MB file (10 MB = 10,485,760 bytes = 83,886,080 bits) on a 30 kbps connection (30 kbps = 30,720 bit/s) requires about 2730 seconds.

It would be good to have some sort of chunk-size option in the GUI, so anyone who has these problems could change it (or, if a sync gets X internal server errors, drop chunking).

I have been trying to build a package for Windows so I can make a little tweak to the chunking code and share the experience. No success so far.
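Spelling the same arithmetic out for both ends of the reported speed range (a sketch using the 1 kbps = 1024 bit/s convention from the calculation above):

```python
def upload_seconds(size_bytes, kbps):
    """Seconds to push size_bytes over a kbps kilobit-per-second uplink
    (using 1 kbps = 1024 bit/s, as in the calculation above)."""
    bits = size_bytes * 8
    return bits / (kbps * 1024)

chunk = 10 * 1024 * 1024  # one 10 MB chunk
print(round(upload_seconds(chunk, 30)))   # ~2731 s: far beyond a 30 s or even 120 s limit
print(round(upload_seconds(chunk, 255)))  # ~321 s: still over typical php-fpm timeouts
```

Either way, a single 10 MB chunk on a slow uplink easily outlives the max_execution_time and request_terminate_timeout values discussed above.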

@MatthewVance

On Windows (built from revision 40d765), I've also had issues uploading relatively large files.

Since yesterday, I have not been able to upload a 230 MB mp4 file. I've tried on three different networks (one of which was wired). The client showed the file being uploaded. After a few MB were uploaded, the sync failed and started over. The Sync Activity log shows the "Action" status as "Operation cancelled". This upload issue appears to have caused additional errors with other files (per the Activity log). Once the file was manually moved to a different folder in Windows Explorer, uploads started working again and the files which appeared to have been affected uploaded correctly.

This morning, I encountered this issue again with a 65 MB zip file. The activity log shows the zip file having an "operation canceled" status on several occasions. It also shows other files which failed to upload. Again, once the problem file was removed, uploads continued without issue.

@butonic
Member

butonic commented Jul 29, 2014

@MatthewVance can you post related sections of the log files on the server and the client?

@MatthewVance

@butonic Someone else manages the server so I am unable to post those log files. As for the client, here's a link to the relevant info from the client log: https://gist.github.com/anonymous/01a340330221f9dbed24

Please note the 65 MB file finally uploaded after I moved it into a new folder within ownCloud.

@AmazingSpanoMan

I am also having this issue with three files out of a couple thousand that were being synced. I tried deleting them and re-copying them into the folder, and I've gone in and deleted the chunks via FTP on my hosting server. What I noticed is that with two of the three files, the client stalls every 10 MB or so and then continues the upload. I also noticed that even though the two files are 95 MB combined, the client says that it uploaded "100 MB of 95 MB" before stalling for a time and giving the error. (See below)

[screenshot: owncloud error]

What other information can I provide? I was, alternatively, able to upload these files for safekeeping via the web UI upload feature with no issues. My only concern is that even though these images are in the folder they should be in, on both the computer and server side, it seems the client can't figure out that the file is already there and still tries to upload the files every so often (of course erroring out and eventually ignoring these files).

Thanks in advance!

@AmazingSpanoMan

@butonic I looked for jfd.conf but can't locate it anywhere on my hosting server. I used the server search tool as well as manually looking in my "etc" folder on my server space.

I'm going to mess with the php.ini file again and look at your settings to see if it helps. I will report back.

@AmazingSpanoMan

Nope still getting the same error. :/

@tachyon-ops
Author

@AmazingSpanoMan can you tell me your internet speed please?

@butonic
Member

butonic commented Aug 13, 2014

@AmazingSpanoMan sorry, it seems I confused you with jfd.conf. php-fpm uses a separate config file for every worker pool. The one I am using to serve files from my home directory is /etc/php5/fpm/pool.d/jfd.conf. The default pool is in /etc/php5/fpm/pool.d/www.conf.

@AmazingSpanoMan

@nmpribeiro I have 100 Mbps download and ~12 Mbps upload. I am currently using both Dropbox and Google Drive while I work the kinks out of the ownCloud install, and have no problem with these files. Thanks!

@AmazingSpanoMan

Here is a new error on a single large file.

[screenshot: owncloud issues]

@tachyon-ops
Author

My bet is that your internet is slow enough that, while sending a 10 MB chunk, it bumps into the php (php-fpm) execution limit. I have built a 5 MB chunk mirall client, but haven't tested it yet.
When I have time to spare, I'll post my findings (meanwhile, I urge the mirall developers to add a chunk-size UI option, with drop-down values perhaps?).

If smaller chunks give no error, then we will have completely pinpointed the error (I'm almost 90% certain this is the issue). So it's rather the server configuration plus the mirall default setting (10 MB chunking) that makes slow connections troublesome (mine fluctuates a lot).

@AmazingSpanoMan

Yeah, it seems that anything under 10 MB has no problem. It is these large files, and only with the desktop client. If I go to the web UI I can upload them no problem. The only issue with that is the desktop client doesn't see the file and will still attempt to upload the files on my computer to the server.

Please keep us posted!

@dajusc

dajusc commented Oct 29, 2014

Same problem here (oC: 7.0.2, client: win 1.6.4). Any progress?

@plastikschnitzer

Same problem here, OC 7.0.4, clients 1.7.0 on Linux and Mac. As far as I can see, the problem is related to the speed of the internet connection:
If I am uploading files over a connection with 384 kbit/s upstream, the error message described above appears for all files larger than 10 MB (Error downloading ... server replied: internal server error).
If I upload the same files over a 2000 kbit/s upstream connection, everything works fine!

@dajusc

dajusc commented Feb 3, 2015

I don't have the problem any more since I moved from the external webspace (Strato) to my own home server (clean install, Turnkey oC in a VM).

@plastikschnitzer

I also no longer have the problem, as I upgraded the hosting and now have a max_execution_time and max_input_time of 3600 seconds instead of the previous 180 seconds (in php.ini). Now all uploads work fine, no matter the bandwidth or how big the files are.

@DeepDiver1975
Member

looks like these issues can be solved by proper configuration - did we document this properly? @MorrisJobke can you please double check? THX

@MorrisJobke
Contributor

@DeepDiver1975 There is an open issue for that: owncloud-archive/documentation#702

@MorrisJobke
Contributor

I also hope that this issue is fixed by #10992

@MorrisJobke
Contributor

.user.ini can be read by the php-fpm pool, which can then pick up the php.ini config values from there.

Feel free to reopen if I'm wrong

@tachyon-ops
Author

Sorry for the long delay. I switched from kloxo to vestacp, and from OC7 to OC8. Much better. So far, no more errors related to this issue. (I haven't touched the 10 MB chunking option either...)
