Long-running PHP processes: LDAP timeout #4831
Comments
I do agree with the possible explanation, as I had some doubts when people reported issues with the nextant background process not ending while using LDAP.
I did some code research, and the cause of this seems to be the LDAP error handling here:
Looks like an exception is thrown, but nobody is willing to handle it.
The LDAP code at this point is a little hard to read and understand for someone not working with it on a daily basis. I had to match the numeric error codes like "-1" against the LDAP header files. As a side note: it would be nice to have those constants mentioned in the source for better readability.
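As a rough sketch of what that error handling looks like once the numeric codes get names (illustrative placeholders only, not the actual user_ldap code; the values come from the OpenLDAP ldap.h header):

```php
<?php
// Hedged sketch, not the real LDAP.php: give the numeric libldap client
// error codes names instead of comparing against bare literals like -1.
final class LdapClientError {
    const SERVER_DOWN = -1;  // "Can't contact LDAP server" (ldap.h)
    const TIMEOUT     = -5;  // client-side timeout (ldap.h)
}

// Illustrative check after an LDAP call; the thrown exception is the one
// described above as "thrown, but nobody handles it".
function assertLdapResult($connection, $result)
{
    if ($result !== false) {
        return;
    }
    $errno = ldap_errno($connection);
    if ($errno === LdapClientError::SERVER_DOWN || $errno === LdapClientError::TIMEOUT) {
        throw new \RuntimeException(
            'LDAP server unavailable: ' . ldap_error($connection) . " ($errno)"
        );
    }
}
```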
This could potentially result in other bugs when paged results are being used: if those are restarted, it's possible to run into a timeout again and the fun starts over. Perhaps it makes more sense to set/request a higher timeout from the LDAP server when a background task is running (a rough sketch of the client-side options follows below). Alternatively, can the background job be split into smaller chunks?
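A hedged sketch of what raising the client-side timeouts could look like; the host and bind DN are made-up placeholders, and these libldap options only affect the client's side of the connection, not a server-imposed idle timeout:

```php
<?php
// Sketch only: raise the libldap client timeouts before a long background run.
// Hostname and credentials are hypothetical placeholders.
$conn = ldap_connect('ldaps://dc.example.org');
ldap_set_option($conn, LDAP_OPT_PROTOCOL_VERSION, 3);
ldap_set_option($conn, LDAP_OPT_NETWORK_TIMEOUT, 600);  // connect/read timeout in seconds
ldap_bind($conn, 'cn=reader,dc=example,dc=org', 'secret');
```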
The handling of long-running paged results may require special treatment, that's right. I'm not sure whether splitting a background job into smaller chunks will help: as far as I observed, the LDAP connection during "nextant:index" is opened (and used?) at the very beginning, then a long stretch with no LDAP activity follows, and I guess that's what makes the LDAP server close the connection. So splitting the background job into smaller chunks would only improve the situation if every chunk were processed by a new process (with a new LDAP connection that won't hit the timeout). I also thought about changing some LDAP connection related settings, however most of the options I found seem to be OpenLDAP specific (and therefore won't help with other LDAP implementations like my Samba DC LDAP).
In the end I still think re-connecting is the best solution, even if it may require some extra handling (e.g. for paged results). How do other user backends handle situations like this?
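As a rough illustration of the re-connect idea (not a proposed patch): catch the "server down" condition, re-establish the connection once, and retry the call. ldapReconnect() is a hypothetical helper, and a real fix would also need to restart any paged-results state mentioned above:

```php
<?php
// Hedged sketch of "re-connect and retry once"; ldapReconnect() is a
// hypothetical helper that re-runs ldap_connect()/ldap_bind().
function searchWithReconnect(&$conn, $base, $filter)
{
    $result = @ldap_search($conn, $base, $filter);
    if ($result === false && ldap_errno($conn) === -1) {  // -1: server down
        $conn = ldapReconnect();
        $result = ldap_search($conn, $base, $filter);
    }
    return $result === false ? array() : ldap_get_entries($conn, $result);
}
```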
I could totally catch the exception.
@daita you don't. Try to set $resource to null before https://github.com/nextcloud/server/blob/master/apps/user_ldap/lib/LDAP.php#L333. Not sure whether it works, and as stated above this might create more issues without further adjustments. For you it's probably okay since, as I interpret it, you don't interact much with LDAP.
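A minimal sketch of that suggestion (the function, resource handling and exception class are illustrative placeholders, not the actual LDAP.php code):

```php
<?php
// Hedged sketch of "set $resource to null before the throw": forget the
// dead handle so a later call can rebuild the connection instead of
// reusing a broken one.
function handleLdapError(&$resource)
{
    if (ldap_errno($resource) === -1) {   // -1: server down
        $resource = null;                 // the suggestion from this comment
        throw new \RuntimeException('LDAP server not available');
    }
}
```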
I've been running a similar test overnight: Instead of setting $resource to null I replaced the exception with a simple debug warning:
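A minimal sketch of that kind of change, with error_log() standing in for whatever logging call the backend actually uses:

```php
<?php
// Hedged sketch: log the "server down" condition instead of throwing.
// error_log() is a stand-in for the real logging facility.
function warnOnLdapError($resource)
{
    if (ldap_errno($resource) === -1) {  // -1: server down
        // was: throw <ServerNotAvailable-style exception>;
        error_log('LDAP error -1 (server down) ignored for this long-running job');
    }
}
```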
"occ nextant:index" now finishes without (obvious) error.
This would fail really, really hard in several other places. Been there, not pleasant. ;)
I'm totally aware of that. That's why I hope for a clean solution ;-)
Potential fix in #5104
Backported to stable12 in #5210
Bug initially reported to nextant (nextant issue #175), but could be caused by a more global user_ldap problem.
For long-running jobs, some LDAP servers (a Samba domain controller in my case) seem to kill the LDAP connection after some time. This connection is not re-established by nextcloud/user_ldap when required.
Possible explanation: during normal operation most PHP processes of Nextcloud run under the Apache configuration with a relatively short runtime, typically 30s, which is most likely not enough to run into any LDAP timeouts. However, Nextcloud's cron.php (when started by system cron) runs with the PHP CLI configuration (unlimited runtime); in this case a re-connect to LDAP may be required.
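To illustrate the runtime difference (a plain-PHP check, not something from the original report): the CLI SAPI defaults to an unlimited max_execution_time, while a typical web SAPI configuration uses around 30 seconds.

```php
<?php
// Quick illustration of the SAPI difference described above.
// Under the CLI SAPI, max_execution_time defaults to 0 (unlimited);
// a typical Apache/FPM setup uses something like 30 seconds.
echo PHP_SAPI, ': max_execution_time=', ini_get('max_execution_time'), PHP_EOL;
// e.g. "cli: max_execution_time=0" when run by system cron
```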
As apps are (most likely?) unaware of the actual user backend, the problem cannot be solved by the app developer, but needs to be handled in the user backend itself.
Steps to reproduce
Expected behaviour
Job should finish normally.
Actual behaviour
Job breaks after indexing all files (which takes days to finish!)
On next run, the indexing starts all over again ...
Server configuration
Operating system:
Ubuntu 16.04
Web server:
Apache 2.4.18
Database:
Postgres 9.5
PHP version:
7.0
Nextcloud version: (see Nextcloud admin page)
11.0.3
Updated from an older Nextcloud/ownCloud or fresh install:
Updated