Message queue consumer never exiting when --max-messages specified. #17951

Closed
deefco opened this issue Sep 6, 2018 · 31 comments
Labels
Component: Cron Issue: Clear Description Gate 2 Passed. Manual verification of the issue description passed Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed Issue: Format is valid Gate 1 Passed. Automatic verification of issue format passed Issue: Ready for Work Gate 4. Acknowledged. Issue is added to backlog and ready for development Progress: done Reproduced on 2.3.x The issue has been reproduced on latest 2.3 release Reproduced on 2.4.x The issue has been reproduced on latest 2.4-develop branch Triage: Dev.Experience Issue related to Developer Experience and needs help with Triage to Confirm or Reject it

Comments

@deefco
Contributor

deefco commented Sep 6, 2018

Preconditions

  1. Magento version 2.3.x & 2.4-develop

Steps to reproduce

  1. Create new consumer, subscribe it to empty topic
  2. Start consumer using bin/magento queue:consumers:start ConsumerName --max-messages 1

Expected result

  1. Consumer tries to get 1 message from empty queue, exits.

Actual result

  1. Consumer keeps listening forever, never exits.

Additional information

This behaviour is caused by the following commit: https://github.com/magento-partners/magento2ee/commit/19721b54aae0f18632b58486a3af544d37e5a866#diff-9876e6dc23c7c514b028b967d58aa0b0

Unfortunately, that commit isn't public because the message queue was developed for EE. It changed the CallbackInvoker class so that the code path taken when --max-messages is supplied enters an endless loop instead of exiting.

The only information supplied is this: "MAGETWO-57177: [Critical][OMS] Max messages makes the consumer die after existing messages are consumed", even though that is the expected behaviour according to the developer documentation.

Can anyone please explain the reason for this change? I'm opening the issue here because the message queue is being moved to Community Edition in 2.3.0.

@magento-engcom-team magento-engcom-team added the Issue: Format is valid Gate 1 Passed. Automatic verification of issue format passed label Sep 6, 2018
@magento-engcom-team
Contributor

magento-engcom-team commented Sep 6, 2018

Hi @deefco. Thank you for your report.
To help us process this issue please make sure that you provided the following information:

  • Summary of the issue
  • Information on your environment
  • Steps to reproduce
  • Expected and actual results

Please make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, please, add a comment to the issue:

@magento-engcom-team give me {$VERSION} instance

where {$VERSION} is version tags (starting from 2.2.0+) or develop branches (2.2-develop +).
For more details, please, review the Magento Contributor Assistant documentation.

@deefco do you confirm that you were able to reproduce the issue on a vanilla Magento instance following the steps to reproduce?

  • yes
  • no

@ghost ghost self-assigned this Sep 6, 2018
@ghost ghost added Component: Cron Issue: Clear Description Gate 2 Passed. Manual verification of the issue description passed Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed Reproduced on 2.3.x The issue has been reproduced on latest 2.3 release Issue: Ready for Work Gate 4. Acknowledged. Issue is added to backlog and ready for development labels Sep 6, 2018
@ghost

ghost commented Sep 6, 2018

@deefco, thank you for your report.
We've acknowledged the issue and added to our backlog.

@ghost ghost removed their assignment Sep 6, 2018
@okorshenko okorshenko self-assigned this Oct 6, 2018
@magento-engcom-team
Contributor

Hi @okorshenko. Thank you for working on this issue.
In order to make sure that issue has enough information and ready for development, please read and check the following instruction: 👇

  • 1. Verify that issue has all the required information. (Preconditions, Steps to reproduce, Expected result, Actual result).

    If the issue has a valid description, the label Issue: Format is valid will be added to the issue automatically. Please edit the issue description if needed, until the label Issue: Format is valid appears.

  • 2. Verify that issue has a meaningful description and provides enough information to reproduce the issue. If the report is valid, add Issue: Clear Description label to the issue by yourself.

  • 3. Add Component: XXXXX label(s) to the ticket, indicating the components it may be related to.

  • 4. Verify that the issue is reproducible on 2.3-develop branch

    - Add the comment @magento-engcom-team give me 2.3-develop instance to deploy test instance on Magento infrastructure.
    - If the issue is reproducible on 2.3-develop branch, please, add the label Reproduced on 2.3.x.
    - If the issue is not reproducible, add your comment that issue is not reproducible and close the issue and stop verification process here!

  • 5. Verify that the issue is reproducible on 2.2-develop branch.

    - Add the comment @magento-engcom-team give me 2.2-develop instance to deploy test instance on Magento infrastructure.
    - If the issue is reproducible on 2.2-develop branch, please add the label Reproduced on 2.2.x

  • 6. Add label Issue: Confirmed once verification is complete.

  • 7. Make sure that automatic system confirms that report has been added to the backlog.

@okorshenko okorshenko assigned okorshenko and unassigned okorshenko Oct 6, 2018
@magento-engcom-team
Contributor

Hi @okorshenko. Thank you for working on this issue.
Looks like this issue is already verified and confirmed. But if you want to validate it one more time, please go through the following instructions:

  • 1. Add/Edit Component: XXXXX label(s) to the ticket, indicating the components it may be related to.

  • 2. Verify that the issue is reproducible on 2.3-develop branch

    - Add the comment @magento-engcom-team give me 2.3-develop instance to deploy test instance on Magento infrastructure.
    - If the issue is reproducible on 2.3-develop branch, please, add the label Reproduced on 2.3.x.
    - If the issue is not reproducible, add your comment that issue is not reproducible and close the issue and stop verification process here!

  • 3. Verify that the issue is reproducible on 2.2-develop branch.

    - Add the comment @magento-engcom-team give me 2.2-develop instance to deploy test instance on Magento infrastructure.
    - If the issue is reproducible on 2.2-develop branch, please add the label Reproduced on 2.2.x

  • 4. If the issue is not relevant or is not reproducible any more, feel free to close it.

@okorshenko okorshenko removed their assignment Oct 6, 2018
@geerlingguy

geerlingguy commented Apr 4, 2019

Is there any update on this? We are trying to run Magento crons using CronJobs in Kubernetes, but once workers are spawned (and literally never die unless they work on the number of messages defined in max-messages), the Kubernetes CronJob will never exit.

We are also able to replicate this running docker exec and kubectl exec when running magento crons.

(On Magento 2.3.0 and 2.3.1)

@CajuCLC

CajuCLC commented Apr 4, 2019

Is there any update on this? We are seeing the same issue with Kubernetes. Instead of just exiting the consumer if there is no message, Magento waits for messages. So in theory you might have a consumer running forever on a dev environment even if you set max-messages to 1.
And if the consumer is running when the cron runs, it hangs there forever.

@RedAtRareCandy

RedAtRareCandy commented Jul 4, 2019

I never had this issue until I upgraded from 2.3.1 to the new 2.3.2. All of the PIDs on my server are bin/magento queue:consumers:start and end with --max-messages=10000.

Because they never exit, lfd (Login Failure Daemon) sends me emails on the hour saying that they're using excessive resources (i.e. process time). They simply live too long, and lfd keeps letting me know about it.

Is there a work-around? Would not like to disable lfd's notification of course.
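
One possible stopgap, not proposed by anyone in this thread, is to put a wall-clock limit on the consumer with GNU coreutils `timeout`, which exits with status 124 when it has to stop the command. The sketch below is only that: `run_bounded` is a hypothetical helper name, and it assumes GNU `timeout` is installed.

```shell
# Run a command with a wall-clock limit (seconds).
# run_bounded is a hypothetical helper; it assumes GNU coreutils `timeout`.
# Usage: run_bounded SECONDS COMMAND [ARGS...]
run_bounded() {
    local limit="$1"
    shift
    local status=0
    # timeout sends SIGTERM to the command once the limit is reached
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        # 124 is timeout's own exit status for "time limit reached"
        echo "stopped after ${limit}s: $*" >&2
    fi
    return "$status"
}
```

With the consumer name from the original report, a call could look like `run_bounded 60 bin/magento queue:consumers:start ConsumerName --max-messages 1`. Note that a consumer stopped this way may be interrupted while it is handling a message.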

@arnoudhgz
Contributor

Also only have this issue since upgrading to Magento 2.3.2, related issue: #23540

@jeffekg

jeffekg commented Aug 1, 2019

Same issue here, on EE

@Finbayern

Same here, upgraded from 2.3.1 to 2.3.2 and started receiving loads of "Excessive resources used" warnings from lfd, all from magento queue:consumers never exiting.
Reverted back to 2.3.1 for now

@hostep
Contributor

hostep commented Aug 16, 2019

Hi folks

I've sent in some proposals for improving these message queue consumer processes so that they waste fewer resources: magento/community-features#180
Feel free to leave a comment over there if you think this is a good idea or if somebody has a better idea let me know!

@hubertus2017

hubertus2017 commented Sep 18, 2019

Same problem since 2.3.2.

More and more "consumers" get started, don't end, and finally overload the server:

[Screenshot: 19-09-_2019_10-42-31]

@gabriel3iri

Any update on this @magento-engcom-team ?

@dunagan5887

@magento-engcom-team I'm also experiencing this issue and am wondering if this issue has been given any priority or consideration?

@jordanvector

EE merchant here with the same issue. This may be a bit of a dirty workaround, but we wrote a quick bash script that runs every 5 minutes, looks for cron jobs that are consumers, and kills them. It has worked OK and kept the consumers running so far.

Script looks like this

#!/bin/bash
ps -ef | grep consumers | grep -v grep | awk '{print $2}' | xargs kill

@tdm4

tdm4 commented Nov 22, 2019

@jordanvector pkill -f queue:consumers might work better than 4 pipes ;)
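
Both one-liners above kill every matching process, including consumers that have only just started. A slightly more selective variant could filter by process age first. This is a sketch under assumptions: `stale_consumer_pids` is a hypothetical helper, and the suggested `ps` invocation relies on the `etimes` (elapsed seconds) field of Linux procps-ng.

```shell
# Print the PIDs of consumer processes older than MAX_AGE seconds.
# Reads lines of "PID ELAPSED_SECONDS COMMAND..." on stdin.
# stale_consumer_pids is a hypothetical helper, not part of Magento.
stale_consumer_pids() {
    local max_age="$1"
    # "+ 0" forces numeric comparison; the regex is matched against the whole line
    awk -v max="$max_age" '($2 + 0) > (max + 0) && /queue:consumers:start/ { print $1 }'
}
```

It could be fed from `ps -eo pid=,etimes=,args= | stale_consumer_pids 300 | xargs -r kill`, where `-r` (GNU xargs) skips the `kill` when nothing matches. The 300-second threshold is an arbitrary example value.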

@RaphaelBronsveld

Is there any update on this?

When we have the consumers enabled, our cron jobs stop working entirely. Anyone else have the same issue? (Magento 2.3.3)

Currently using
'cron_consumers_runner' => [ 'cron_run' => false ]
as a temporary "fix". We'll manually run the consumers when we need them.

@ilnytskyi
Contributor

ilnytskyi commented Dec 11, 2019

We added a patch: once dequeue has returned no message more than 10 times, the job stops.
Additionally, we modified the crontab to run consumers separately from the Magento cron.
It helps until a better solution is found :)

Index: vendor/magento/framework-message-queue/CallbackInvoker.php
<+>UTF-8
===================================================================
--- a/vendor/magento/framework-message-queue/CallbackInvoker.php	(date 1574846505000)
+++ b/vendor/magento/framework-message-queue/CallbackInvoker.php	(date 1574846505000)
@@ -21,9 +21,20 @@
      */
     public function invoke(QueueInterface $queue, $maxNumberOfMessages, $callback)
     {
+        $noMessages = 0;
         for ($i = $maxNumberOfMessages; $i > 0; $i--) {
             do {
                 $message = $queue->dequeue();
+                if ($message === null) {
+                    $noMessages++;
+                    if ($noMessages > 10) {
+                        exit;
+                    }
+                }
             } while ($message === null && (sleep(1) === 0));
             $callback($message);
         }

UPD: 2020-02-06
Since 2.3.4, consumers have the consumers-wait-for-messages flag. Try using it when you add your consumer to the crontab.
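
To make the two behaviours discussed in this thread concrete, here is a toy shell model of a consumer loop. It is not Magento code: the "queue" is a plain file with one message per line, and `consume`, its arguments, and the empty-poll limit are all hypothetical. With the wait argument set to 0 it exits as soon as the queue is empty. With 1 it keeps polling, capped by an empty-poll limit in the spirit of the patch above.

```shell
# Toy model of a consumer loop; not Magento code.
# Usage: consume QUEUE_FILE MAX_MESSAGES WAIT_FOR_MESSAGES [MAX_EMPTY_POLLS]
consume() {
    local queue="$1" max="$2" wait="$3" max_empty="${4:-10}"
    local processed=0 empty=0 message
    while [ "$processed" -lt "$max" ]; do
        if [ -s "$queue" ]; then
            # Dequeue: take the first line, then drop it from the file
            message=$(head -n 1 "$queue")
            tail -n +2 "$queue" > "$queue.tmp" && mv "$queue.tmp" "$queue"
            echo "processed: $message"
            processed=$((processed + 1))
            continue
        fi
        # Queue is empty: leave at once unless we were asked to wait
        [ "$wait" -eq 0 ] && break
        # Waiting mode: give up once the empty-poll limit is exceeded
        empty=$((empty + 1))
        [ "$empty" -gt "$max_empty" ] && break
        sleep 1
    done
    echo "exiting after $processed message(s)"
}
```

Without the empty-poll limit, the waiting mode on an empty queue would loop forever, which is the behaviour reported in this issue.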

@magento-engcom-team
Contributor

✅ Confirmed by @engcom-Charlie
Thank you for verifying the issue. Based on the provided information internal tickets MC-30139 were created

Issue Available: @engcom-Charlie, You will be automatically unassigned. Contributors/Maintainers can claim this issue to continue. To reclaim and continue work, reassign the ticket to yourself.

@magento-engcom-team magento-engcom-team added the Issue: Ready for Work Gate 4. Acknowledged. Issue is added to backlog and ready for development label Dec 30, 2019
@engcom-Charlie engcom-Charlie self-assigned this Dec 30, 2019
@engcom-Charlie engcom-Charlie added Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed and removed Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed Issue: Ready for Work Gate 4. Acknowledged. Issue is added to backlog and ready for development labels Dec 30, 2019
@ghost ghost unassigned engcom-Charlie Dec 30, 2019
@sprankhub
Member

I just had this issue and did a little research. I had this entry in my app/etc/env.php:

'queue' => [
    'consumers_wait_for_messages' => 1
]

It is described in the DevDocs:

Specifies whether consumers should continue polling for messages if the number of processed messages is less than the max-messages value. The default value of 0 prevents stuck deployments caused by long delays in message queue processing. Set the value to 1 to allow consumers to wait for messages.

I switched the value to 0 and this solved the issue. So to be honest, I think this is not a bug, but a configuration issue.

@ioweb-gr
Contributor

It seems this project helps mitigate the situation until this is properly solved / documented.

https://github.com/magemojo/m2-ce-cron

My server load dropped from 12 to 4 after installing.

@cadencelabs-master

I wrote a post about this, but I wanted to note that I believe there is an issue in the documentation. If you look at this file: vendor/magento/framework-message-queue/CallbackInvoker.php

You will see that Magento 2 actually makes the default for this setting 1 (contrary to the documentation)

    /**
     * Checks if consumers should wait for message from the queue
     *
     * @return bool
     */
    private function isWaitingNextMessage(): bool
    {
        return $this->deploymentConfig->get('queue/consumers_wait_for_messages', 1) === 1;
    }

--- I had to force the setting to be 0 by adding the below to env.php:

'queue' => [
    'consumers_wait_for_messages' => 0
],

After doing that, my other cron processes started running again and I no longer saw the parent cron job as "stuck"
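
Since the method quoted above falls back to 1 when the key is missing, it can help to check what a given env.php actually sets. The following is a rough text-level sketch, not a PHP parser: `wait_flag` is a hypothetical helper, and it assumes the key is written on one line in the `'consumers_wait_for_messages' => N` form shown in this thread.

```shell
# Print the value of consumers_wait_for_messages found in an env.php file,
# or "unset" when the key does not appear.
# wait_flag is a hypothetical helper; it matches text, it does not evaluate PHP.
wait_flag() {
    local env_file="$1" value
    value=$(sed -n "s/.*'consumers_wait_for_messages'[[:space:]]*=>[[:space:]]*\([0-9][0-9]*\).*/\1/p" "$env_file" | head -n 1)
    echo "${value:-unset}"
}
```

For example, `wait_flag app/etc/env.php` printing `unset` would mean that the fallback in the quoted method applies.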

Full article: https://www.cadence-labs.com/2020/03/magento-2-stuck-or-long-running-cron-after-upgrade-to-2-3-x/

@m0n1ker

m0n1ker commented May 14, 2020

@cadencelabs-master Thank you! This is definitely the problem, and the new default causes serious issues for instances hosted in Docker.

At a minimum, the documentation for the message queues (https://devdocs.magento.com/guides/v2.3/config-guide/mq/manage-message-queues.html#start-message-queue-consumers) needs to be updated to reflect the default value actually being 1 in Magento 2.3.x.

While I'm glad to see the consumers-wait-for-messages flag, it doesn't make any sense to me why the default value for this would be 1 when the workers are launched as a part of cron:run?

@sdzhepa sdzhepa added the Triage: Dev.Experience Issue related to Developer Experience and needs help with Triage to Confirm or Reject it label Sep 22, 2020
@gabrieldagama
Contributor

Since the behavior described looks like the desired behavior, and we can achieve the expected result with the configuration flag consumers-wait-for-messages, I will be closing this issue for now.

@jeff-matthews
Contributor

jeff-matthews commented Oct 7, 2020

We've updated the 2.3.x and 2.4.x devdocs accordingly. See PR magento/devdocs#8010.
