Cron job not running after crashed once #23054
Comments
Hi @QuentinFarizonAfrimarket. Thank you for your report.
Please make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, please, add a comment to the issue:
For more details, please, review the Magento Contributor Assistant documentation. @QuentinFarizonAfrimarket do you confirm that you were able to reproduce the issue on vanilla Magento instance following steps to reproduce?
|
I agree, this definitely needs fixing. Just FYI: there has recently been a proposal to review the current cron implementation and improve its stability: magento/architecture#171 |
Hello @hostep, thank you! I think I found a reproducible scenario that caused the issue on my system:
Consequences Workaround : Ideas : |
@hostep Furthermore, I don't think the database lock (created in ProcessCronQueueObserver::lockGroup) is ever released if the process crashes without having a chance to unlock it. So the job may never run again. What do you think? |
This is not entirely true, since it doesn't clean up jobs with status magento2/app/code/Magento/Cron/Observer/ProcessCronQueueObserver.php Lines 496 to 499 in 48d8d43
As far as I know,
That's a good point, the lifetime was already lowered a lot in 244a2e9, but for the indexing group, it probably makes sense to lower the default failure history even further.
I'm not exactly sure what happens to a lock acquired on a MySQL server when the PHP code suddenly stops. I would think that the connection to MySQL then closes and the lock is released automatically, but I'm not sure about this... Btw: as far as I know, the locking is not used to manage the statuses of the jobs themselves. |
Thanks @hostep. Indeed, you're right: running jobs are never cleaned. I think it would make sense to clear them after $historyFailure or max($historyFailure, $historySuccess). I think this requires a quick fix in the 2.2 and 2.3 release lines, until your proposal for hardening crons is implemented. What do you think? Also, from my observations the MySQL lock does seem to be released, so it shouldn't be an additional issue (needs confirmation). |
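To make the idea above concrete, here is a minimal sketch in Python with an in-memory SQLite table. It is not Magento's code: the function name `fail_stale_running_jobs`, the lifetime values, and the choice to mark stale rows as `error` are all assumptions made for illustration. Only the table and column names (`cron_schedule`, `status`, `messages`, `executed_at`) follow the real schema.

```python
import sqlite3
from datetime import datetime, timedelta


def fail_stale_running_jobs(conn, now, max_age_minutes):
    """Mark 'running' rows that started more than max_age_minutes ago as 'error'.

    Hypothetical helper: it illustrates the proposal, not the actual fix.
    """
    cutoff = (now - timedelta(minutes=max_age_minutes)).strftime("%Y-%m-%d %H:%M:%S")
    cur = conn.execute(
        "UPDATE cron_schedule SET status = 'error', messages = ? "
        "WHERE status = 'running' AND executed_at < ?",
        ("Marked as failed: exceeded maximum running time", cutoff),
    )
    conn.commit()
    return cur.rowcount


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cron_schedule ("
        "schedule_id INTEGER PRIMARY KEY, job_code TEXT, status TEXT, "
        "messages TEXT, executed_at TEXT)"
    )
    now = datetime(2019, 6, 1, 12, 0, 0)
    rows = [
        ("indexer_update_all_views", "running", "2019-06-01 01:00:00"),  # crashed long ago
        ("indexer_update_all_views", "running", "2019-06-01 11:59:00"),  # plausibly still alive
        ("indexer_update_all_views", "pending", None),
    ]
    conn.executemany(
        "INSERT INTO cron_schedule (job_code, status, executed_at) VALUES (?, ?, ?)",
        rows,
    )
    # Assumed lifetimes in minutes; the real values come from the cron group config.
    history_failure, history_success = 600, 60
    changed = fail_stale_running_jobs(conn, now, max(history_failure, history_success))
    statuses = [r[0] for r in conn.execute(
        "SELECT status FROM cron_schedule ORDER BY schedule_id")]
    return changed, statuses
```

Calling `demo()` changes only the first row: the job that started eleven hours before `now` is older than the 600-minute cutoff, while the recent `running` row and the `pending` row are left alone.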
@magento give me 2.2.8 instance |
Hi @QuentinFarizonAfrimarket. Thank you for your request. I'm working on Magento 2.2.8-develop instance for you |
@magento give me 2.2.8 instance |
Hi @QuentinFarizonAfrimarket. Thank you for your request. I'm working on Magento 2.2.8 instance for you |
Hello @hostep, I have another issue: individual mview updates (like catalog_product_flat) are stuck in 'working' status in the mview_state table. I suspect that this modification, 5927a75, must also be applied to https://github.com/magento/magento2/blob/2.3-develop/lib/internal/Magento/Framework/Mview/View.php#L303 I can push a PR for both issues if you agree. |
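The general pattern at stake here, restoring a view's status whenever its update fails, can be sketched as follows. This is an illustrative Python stand-in, not the code in View.php: `ViewState`, `update_view`, and `failing_update` are hypothetical names, and the exact behavior of the referenced commit is not reproduced.

```python
class ViewState:
    """Hypothetical stand-in for one row of the mview state table."""

    def __init__(self, status="idle"):
        self.status = status


def update_view(state, do_update):
    """Run do_update() with the state marked 'working'.

    On any failure the previous status is restored before re-raising,
    so the view is not left in 'working' forever.
    """
    previous = state.status
    state.status = "working"
    try:
        do_update()
    except BaseException:
        # Catch as broadly as the language allows, then re-raise.
        state.status = previous
        raise
    state.status = "idle"


def failing_update():
    raise RuntimeError("connection lost")


def demo_restore():
    state = ViewState()
    try:
        update_view(state, failing_update)
    except RuntimeError:
        pass
    return state.status
```

A handler like this only helps when the process survives long enough to run it. If the process is killed outright, nothing restores the status, which is why a time-based cleanup of stale rows may still be needed.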
Agreed. There should be some way for Magento to detect failed cron jobs and not assume they are still running. But I'm not sure if the existing lifetime fields are the right solution here. Maybe a new field should be added for a new lifetime called 'max duration of running job' or something like that?
The change to catching a It would be nice to have some feedback from some experts in here. @dmanners: I spoke to you at Imagine 2018 about this particular problem (among some other problems), but it still hasn't been properly fixed. Can you try to pull in some experts from Magento to see how this problem can be solved? |
Hi @M-A-X-I-M. Thank you for working on this issue.
|
Yes agreed, but it would be easy to push as a first fix. I thought about |
Cool, feel free to create a PR with this suggestion! It will most likely get approved. |
Even with my patch (#23125) I still have issues: no cron is currently running and nothing is stuck in cron_schedule, but some mviews are stuck in 'working'.
I have no idea how this can happen; I'm open to suggestions. |
@QuentinFarizonAfrimarket Hi! |
Hello, yes, I have applied the fix #23077 and recreated the triggers: #23079 (comment). It certainly helps, but we still have issues, also with catalog_product_flat (the temporary index table seems to disappear during indexation). I'm not sure how this is related to mviews getting stuck, since that is supposed to be caught by this PR. |
@hostep I've noticed in my own environment that any jobs that are running and never complete are marked as missed. A method that clears missed jobs after a determined lifetime would also solve this issue. |
@Ctucker9233: reading the code I'm not seeing how that can happen A job is only set to magento2/app/code/Magento/Cron/Observer/ProcessCronQueueObserver.php Lines 294 to 297 in 8e16abc
This So you shouldn't see And there is already some code to cleanup
|
@hostep I miscommunicated. I didn't mean that I am seeing running jobs switch their status to missed. What I meant was that, in my environment at least, a lot of missed jobs seem to accumulate, and that can also cause the cron to crash or deadlock. The code that looks like it's meant to clean up missed jobs also seems not to be functioning in my case. |
The PR in #28007 should fix this. It clears up things stuck in running and mitigates deadlocks so the cleanups can finish and stop the cron_schedule table growing exponentially. |
Hi @QuentinFarizon. Thank you for your report.
The fix will be available with the upcoming 2.4.3 release. |
Preconditions (*)
Magento EE 2.2.8
Crontab configured as per the documentation
Steps to reproduce (*)
Expected result (*)
Actual result (*)
Table cron_schedule is filled with pending jobs, and no job for indexer_update_all_views is run (no output in var/log/cron.log, no status update in the cron_schedule table).
Logs :
var/log/cron.log (last success + the error message)
=> And then no more logs about indexer_update_all_views, even though other jobs from the index group run correctly and output success in var/log/cron.log
The database recovered a minute later and the query was OK