-
Notifications
You must be signed in to change notification settings - Fork 9.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
2.3 Customer module Recurring setup script performance problems. #19469
Comments
Hi @vbuck. Thank you for your report.
Please make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, please, add a comment to the issue:
where @vbuck do you confirm that you was able to reproduce the issue on vanilla Magento instance following steps to reproduce?
|
Hi @engcom-backlog-nazar. Thank you for working on this issue.
|
Hi @vbuck , thank you for your report. Please follow these guidelines for proper tracking of your issue. You can report Commerce-related issues in one of two ways: GitHub is intended for Magento Open Source users to report on issues related to Open Source only. There are no account management services associated with GitHub. |
@engcom-backlog-nazar Understood. Admittedly this issue may be a bit misdirected. However, my reason for starting a discussion in the Open Source forums was two-fold:
That said, you may keep this issue closed and I will forward to the partner portal. |
Reopening, to verify with vanilla CE instance and 500k customer accounts. |
@engcom-backlog-nazar Thank you for verifying the issue. Based on the provided information internal tickets |
Hello @vbuck I see that all internal tickets related to this issue were closed. Please, feel free to reopen or create a new one if issue still exists or was not fully fixed Thank you for feedback and collaboration |
@sdzhepa Is there any status update to what happened this ticket? You state you "suppose the issue has been resolved" but I don't see anything related to that within this thread of comments/replies. Also the tags of I'm using 2.3 EE and I'm seeing >1hr update times because Magento is reindexing the Is there a reason this needs to happen? It seems like this should not happen during the update. The purpose of the update script is to install/update/modify schema between version. The actions seem exclusive and separate from each other. Can anyone elaborate to why this reindex is needed during the update? And if it is in fact needed, Can anyone elaborate on what we can do to enhance the performance of it? |
@sdzhepa @ishakhsuvarov @magento-engcom-team Any updates or info on this? Do I need to create a new ticket for this? |
I did't find any solution regarding this issue. |
Hi @dambrogia Hi @allamsettiramesh i'm reopen this as this was not fixed. |
Hi @engcom-backlog-nazar thank you for re-opening this issue. If it's not imperative that the reindex needs to happen on every I would be glad to help out creating a PR for removing the recurring data script if necessary. |
@dambrogia, thank you for this note. I got 3m for |
Hi @vbuck, @Nazar65, @o-iegorov. Thank you for your report and collaboration! The issue was fixed by Magento team. The fix was delivered into The fix will be available with the upcoming |
The issue was about a reindex in a recurring script and the solution has a reindex in a recurring script. |
@slackerzz i didn’t tested it, but seems like in most cases it will not do reindex, so performance issue is resolved. Don’t you think so? |
If the customer grid index is invalid it will perform a reindex during |
Did you rested this new solution? Does customer grid index always invalid?
I believe it should be invalid only in case if some attribute was added
…On Mon, 9 Mar 2020 at 13:04, Lorenzo Stramaccia ***@***.***> wrote:
If the customer grid index is invalid it will perform a reindex during
setup:upgrade and the store will be in maintenance for minutes during
deploy.
If this is the Magento solution i will update my patch to remove the new
RecurringData script.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#19469?email_source=notifications&email_token=AAOJOUKNGC5ABH4YZSDHJFDRGTENTA5CNFSM4GHJ3ES2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEOGUVWI#issuecomment-596462297>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAOJOUOZD5A3LIN67IAZI4DRGTENTANCNFSM4GHJ3ESQ>
.
|
@slackerzz Natural state of customer grid index is valid. Reindex will be performed only in case if it is invalid that seems reasonable. If you have enableb cron (that also natural for prod environments) reindex wil be performed in bacground for invalid index, so case when you will perform setup:upgrade with invalid index is very rare and for this case reindex will be performed (that is ok for that case) |
@o-iegorov I'm nearly certain this portion of your recent statement is factually incorrect. The logic in this recurring data upgrade (the "fixed" version in 2.4) doesn't account for schedule vs realtime separately, it simply calls Calling |
I agree with @davidalger, an extra condition in the if should be added to prevent a synchronous reindex action when the indexer is set to But this is a mini optimisation, the chances of having an invalid customer grid indexer while running |
@davidalger @hostep I do agree with you that it's not perfect solution, it could be improved, but it significantly improves situation when no customer attributes were affected. Feel free to create Pull Request with improvement based on your suggestions. |
@davidalger I din't say nothing about indexer mode. Moreover, customer grid indexer doesn't support update by schedule - https://docs.magento.com/m2/ce/user_guide/system/index-management.html It's recall reindex when state is invalid, that's correct behavior. When indexer is in invalid sate magento cron job will perform full reindex indexer with any kind of update mode. |
@hostep please refer my prev comment - customer grid indexer doesn't support update by schedule |
@o-iegorov Where do you see customer grid index only supports index on save? I have this index running in production today in schedule mode. This is an index using the materialized view patterns in M2, and it would make little sense for it to only support On Save.
This may be correct, but what I'm saying is that what the upgrade does is call @ihor-sviziev I don't have time at the moment to create a PR to further enhance this on 2.4 (unfortunately). On the 2.3 project that highlighted this for me with a 40 minute grid reindex, I'm simply deleting the Recurring Data script from the customer module as a workaround. I'm mainly posting for the sake of others as I read the original comment to infer something regarding the asynchronicity of the indexer as it relates to the setup upgrade routine. |
@davidalger Please read carefully provided link: It's works just because it's invalidated and reindexed in background by cron job. There are no mview processor for this indexer just dummy. Upgrade call reindexAll in case when indexer is invalid, that a rare case for production. In case when invalidation was performed by some setup script (during adding some attribute for example) reindex is should be performed. But not every setup upgrade. Reindex itself in this indexer is very tricky - for example it's creates related database tables and cannot be replaced with just invalidation. But if you know cases when current solution may be improved - please create related PR. This fix also delivered to 2.3-develop and will be part of the 2.3.6 release |
@o-iegorov Very interesting. I missed that note on the page. Thanks for the followup. Also g2k regarding 2.3.6 release. 👍 |
2.3.6?! October? |
Too bad that it takes 6 months for a confirmed and fixed issue to be released. Not to speak of the 17 months it took to actually fix it. In case you use composer this is a quick way to patch your install.
|
@MichaelThessel your solution makes it appear as though you are committing the I do agree that the lack of prioritization on this problem is a shame. If we didn't have a way to publish diffs on GitHub publicly maybe that would have forced an earlier release of the fix. Anyway, for those who don't commit Alternative Patch Method
|
@vbuck Thanks for pointing this out. I wasn't aware of the possibility to patch with composer. I went down the route you suggested and it works great. In case someone wants to implement this and has their Magento core modules in vendor here is the patch with the paths corrected: https://gist.github.com/MichaelThessel/0b0cf69dd20326491115413adf7a94b9 |
Still a problem in 2.3.5 btw. |
@LiamTrioTech: have you read this comment and this comment? It says it is fixed in 2.4.0 and will be fixed in 2.3.6 as well. |
Hi, we've recently run into this issue within a Magento installation with 1100K customers. I've been investigating, and this is what I found, just in case it is useful for someone. I know this issue is related with setup:upgrade performance related with customer_grid indexer, and this comment is about customer_grid indexer inner performance, but since it affects also setup:upgrade when reindexing all, I thought it would make sense to post it here. About this comment:
Although it's true it has only a dummy mview, indexer does not get invalidated and reindexed by cron job, but synchronously upon Customer and Customer Address save, at \Magento\Customer\Model\Customer::reindex and \Magento\Customer\Model\Address::reindex, respectively. Index only gets invalidated when customer attribute is added and used in grid / modified and used in grid changed / deleted and used in grid, so a full reindex is needed to rebuilt the grid table properly. At https://support.magento.com/hc/en-us/articles/360025481892-New-customer-records-are-not-displayed-in-the-Customers-grid-after-importing-them-from-CSV it says customer_grid index is not supported by "Update by schedule" due to performance reasons, but it does not specify any detail. Digging a little deeper, we arrive soon at https://github.com/magento/magento2/blob/2.4-develop/app/code/Magento/Customer/Model/Indexer/Source.php, the data source provider for customer grid data. It provides an iterator to supply data to be indexed:
Benchmarking this method, we found that at each step, execution time increases a bit. After many steps, time elapsed at each step can be increased even by 10x. Taking a quick look at the code shows the issue here. At each step, the same query is performed to retrieve data, with different sql LIMIT offset values. Having LIMIT [offset,] row_count, assuming a batch size of 10000, consecutive queries would look something like (very simplified):
Mysql starts building query results, and returns them as soon as it has the needed number of them. It is easy for the first query, but for the last one, it has to generate internally (due to joins, ordering, etc) the offset + 10000 results, to return only the last 10000, discarding the offset results. In short:
A real example, using a query generated by the indexer, note the offset and the elapsed time: 1.7 ms vs 20.4 s is a huge difference. Our solution looks like this:
Explained:
Result for us is as steps get executed, execution time for each of them remains almost the same. This may not make a difference for small reindexing, but it really does for databases with large customer tables. Also, we've implemented the mview system and "Update by schedule" for this indexer separately; we'll check how that works, and find what performance issues are those which weren't explained at Magento page. I'll let you know if I find something new about that. |
@adrian-martinez-interactiv4 amazing job! This is really huge step forward! @o-iegorov could you review following comment #19469 (comment)? Could you bring us more info which performance issues it were causing when used update by schedule for customer grid index? maybe we as a community could fix it? |
Does this fix will be realease in the 2.3 branch ? |
Any news for this ? It's a huge thing, I've got performance issue because of this on many large projects |
We also have this issue at Zadig & Voltaire. It's just crazy to recreate the whole customer grid flat table at every deploy we do every day. |
From what I read this is solved in versions 2.3.6 and 2.4.0. If you're running versions below these, I'd recommend upgrading. |
So when the customer grid in admin needs a reindex, the whole world should wait? I've read the thread but still, why not to schedule the indexer so that it can later be run asynchronously instead of executing it directly in the upgrade script? The upgrade itself takes seconds to get executed but all those Recurring.php take dozens of seconds or sometimes minutes. I agree that some stuff must be checked to keep the DB consistent but does it make sense, that the shop is in maintenance mode only because something in a grid or in a sales report has changed? Whatever. |
When running
bin/magento setup:upgrade
for a Magento CE 2.3.x installation(or just use Magento Open source), there is an unexpected delay in the recurring setup script execution on theMagento_Customer
module(every time when you runbin/magento setup:upgrade
) . This is more pronounced on a large data set (>500K customers).References
Preconditions (*)
Steps to reproduce (*)
bin/magento setup:upgrade
Repeat these steps and you will notice, since there is a recurring upgrade script, that it runs every time.
Expected result (*)
Actual result (*)
After ending of update you can run again
bin/magento setup:upgrade
and you will meet this problem again.I am not sure of the need/reason to run a recurring upgrade, but from the reference posted at the top of this issue it's clear the intent to is to handle reindexing on upgrades. This seems unwise and gives room for abusing recurring upgrade scripts with patch-like behavior or long-running processes which can delay deployment times.
Do you have any background regarding the nature of the change?
The text was updated successfully, but these errors were encountered: