2.3 Customer module Recurring setup script performance problems. #19469

Closed
vbuck opened this issue Nov 29, 2018 · 59 comments
Labels
Component: Customer Fixed in 2.2.x The issue has been fixed in 2.2 release line Fixed in 2.4.x The issue has been fixed in 2.4-develop branch Issue: Clear Description Gate 2 Passed. Manual verification of the issue description passed Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed Issue: Format is valid Gate 1 Passed. Automatic verification of issue format passed Issue: Ready for Work Gate 4. Acknowledged. Issue is added to backlog and ready for development Reproduced on 2.3.x The issue has been reproduced on latest 2.3 release

Comments

@vbuck

vbuck commented Nov 29, 2018

When running bin/magento setup:upgrade on a Magento CE 2.3.x installation (Magento Open Source), there is an unexpected delay in the recurring setup script execution for the Magento_Customer module, and it happens every time you run bin/magento setup:upgrade. The delay is more pronounced on a large data set (>500K customers).

References

Preconditions (*)

  1. Magento CE 2.2.x (or 2.3.x) -> 2.3.x upgraded codebase (pre DB upgrade)
  2. A large customer database (>500K records).

Steps to reproduce (*)

  1. After codebase upgrade, proceed to run bin/magento setup:upgrade
  2. Observe execution delay on process step:
Module 'Magento_Customer':
Running data recurring...

Repeat these steps and you will notice, since there is a recurring upgrade script, that it runs every time.

Expected result (*)

  1. No recurring data scripts run, or they are more performant.

Actual result (*)

  1. Recurring data scripts run with each attempt to upgrade the DB.

After the upgrade finishes, you can run bin/magento setup:upgrade again and you will hit the same problem.
I am not sure of the need/reason to run a recurring upgrade, but from the reference posted at the top of this issue it's clear the intent is to handle reindexing on upgrades. This seems unwise and gives room for abusing recurring upgrade scripts with patch-like behavior or long-running processes which can delay deployment times.

Do you have any background regarding the nature of the change?

@magento-engcom-team magento-engcom-team added the Issue: Format is valid Gate 1 Passed. Automatic verification of issue format passed label Nov 29, 2018
@magento-engcom-team
Contributor

Hi @vbuck. Thank you for your report.
To help us process this issue please make sure that you provided the following information:

  • Summary of the issue
  • Information on your environment
  • Steps to reproduce
  • Expected and actual results

Please make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, please, add a comment to the issue:

@magento-engcom-team give me $VERSION instance

where $VERSION is version tags (starting from 2.2.0+) or develop branches (for example: 2.3-develop).
For more details, please, review the Magento Contributor Assistant documentation.

@vbuck do you confirm that you were able to reproduce the issue on a vanilla Magento instance following the steps to reproduce?

  • yes
  • no

@ghost ghost self-assigned this Nov 30, 2018
@magento-engcom-team
Contributor

magento-engcom-team commented Nov 30, 2018

Hi @engcom-backlog-nazar. Thank you for working on this issue.
In order to make sure that the issue has enough information and is ready for development, please read and check the following instructions: 👇

  • 1. Verify that issue has all the required information. (Preconditions, Steps to reproduce, Expected result, Actual result).

    If the issue has a valid description, the label Issue: Format is valid will be added to the issue automatically. Please edit the issue description if needed, until the label Issue: Format is valid appears.

  • 2. Verify that issue has a meaningful description and provides enough information to reproduce the issue. If the report is valid, add Issue: Clear Description label to the issue by yourself.

  • 3. Add Component: XXXXX label(s) to the ticket, indicating the components it may be related to.

  • 4. Verify that the issue is reproducible on 2.3-develop branch

    - Add the comment @magento-engcom-team give me 2.3-develop instance to deploy test instance on Magento infrastructure.
    - If the issue is reproducible on 2.3-develop branch, please, add the label Reproduced on 2.3.x.
    - If the issue is not reproducible, add your comment that issue is not reproducible and close the issue and stop verification process here!

  • 5. Verify that the issue is reproducible on 2.2-develop branch.

    - Add the comment @magento-engcom-team give me 2.2-develop instance to deploy test instance on Magento infrastructure.
    - If the issue is reproducible on 2.2-develop branch, please add the label Reproduced on 2.2.x

  • 6. Add label Issue: Confirmed once verification is complete.

  • 7. Make sure that automatic system confirms that report has been added to the backlog.

@ghost ghost added the Issue: Clear Description Gate 2 Passed. Manual verification of the issue description passed label Nov 30, 2018
@ghost

ghost commented Nov 30, 2018

Hi @vbuck , thank you for your report. Please follow these guidelines for proper tracking of your issue. You can report Commerce-related issues in one of two ways:
You can use the Support portal associated with your account
or
If you are a Partner reporting on behalf of a merchant, use the Partner portal.

GitHub is intended for Magento Open Source users to report on issues related to Open Source only. There are no account management services associated with GitHub.

@ghost ghost closed this as completed Nov 30, 2018
@vbuck
Author

vbuck commented Nov 30, 2018

@engcom-backlog-nazar Understood. Admittedly this issue may be a bit misdirected. However, my reason for starting a discussion in the Open Source forums was two-fold:

  • I found origin of this issue in the 2.3-develop branch, so it affects Open Source
  • I thought there might be better traction here from the community

That said, you may keep this issue closed and I will forward to the partner portal.

@ishakhsuvarov ishakhsuvarov reopened this Dec 4, 2018
@ishakhsuvarov
Contributor

Reopening, to verify with vanilla CE instance and 500k customer accounts.

@ghost ghost added Component: Customer Reproduced on 2.3.x The issue has been reproduced on latest 2.3 release Fixed in 2.2.x The issue has been fixed in 2.2 release line Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed labels Dec 5, 2018
@magento-engcom-team
Contributor

@engcom-backlog-nazar Thank you for verifying the issue. Based on the provided information, internal ticket MAGETWO-96971 was created.

@magento-engcom-team magento-engcom-team added the Issue: Ready for Work Gate 4. Acknowledged. Issue is added to backlog and ready for development label Dec 5, 2018
@ghost ghost removed their assignment Dec 5, 2018
@sdzhepa
Contributor

sdzhepa commented Dec 27, 2018

Hello @vbuck

I see that all internal tickets related to this issue were closed.
And I suppose that issue has been resolved also.

Please, feel free to reopen or create a new one if issue still exists or was not fully fixed

Thank you for feedback and collaboration

@sdzhepa sdzhepa closed this as completed Dec 27, 2018
@dambrogia

@sdzhepa Is there any status update on what happened to this ticket? You state you "suppose the issue has been resolved" but I don't see anything related to that within this thread of comments/replies. Also the tags of Reproduced on 2.3.x and Fixed in 2.2.x are quite conflicting (added on Dec 5). Tagging the issue as Ready for Work on Dec 5 and then removing the assignment on Dec 5 without any reference as to what changed is also quite confusing if this "has been resolved".

I'm using 2.3 EE and I'm seeing >1hr update times because Magento is reindexing the customer_grid index on every bin/magento setup:upgrade statement (occurs in the Magento_Customer module).

Is there a reason this needs to happen? It seems like this should not happen during the update.

The purpose of the update script is to install/update/modify schema between versions.
The purpose of the indexers is to enhance lookups.

The actions seem exclusive and separate from each other. Can anyone elaborate on why this reindex is needed during the update? And if it is in fact needed, can anyone elaborate on what we can do to enhance its performance?

@dambrogia

@sdzhepa @ishakhsuvarov @magento-engcom-team

Any updates or info on this? Do I need to create a new ticket for this?

@allamsettiramesh

I didn't find any solution for this issue.

@ghost

ghost commented Jan 24, 2019

Hi @dambrogia, hi @allamsettiramesh, I'm reopening this as it was not fixed.

@dambrogia

Hi @engcom-backlog-nazar thank you for re-opening this issue.

If it's not imperative that the reindex needs to happen on every setup:upgrade command, can we remove it? I also think it would be helpful to know why/when it is appropriate to reindex the users and what the thought process was behind reindexing them on every setup:upgrade.

I would be glad to help out creating a PR for removing the recurring data script if necessary.

@andkirby

andkirby commented May 7, 2019

@dambrogia, thank you for this note. I got 3m for magento setup:upgrade just for removing a module!
M2 so fast! =)

@magento-engcom-team magento-engcom-team removed the Reproduced on 2.3.x The issue has been reproduced on latest 2.3 release label Jun 20, 2019
@magento-engcom-team magento-engcom-team added the Fixed in 2.4.x The issue has been fixed in 2.4-develop branch label Mar 7, 2020
@magento-engcom-team
Contributor

Hi @vbuck, @Nazar65, @o-iegorov.

Thank you for your report and collaboration!

The issue was fixed by Magento team. The fix was delivered into magento/magento2:2.4-develop branch(es).
Related commit(s):

The fix will be available with the upcoming 2.4.0 release.

@slackerzz
Member

The issue was about a reindex in a recurring script and the solution has a reindex in a recurring script.

@ihor-sviziev
Contributor

@slackerzz I haven't tested it, but it seems that in most cases it will not reindex, so the performance issue is resolved. Don't you think so?

@slackerzz
Member

If the customer grid index is invalid, it will perform a reindex during setup:upgrade and the store will be in maintenance mode for minutes during deploy.
If this is the Magento solution, I will update my patch to remove the new RecurringData script.

@ihor-sviziev
Contributor

ihor-sviziev commented Mar 9, 2020 via email

@o-iegorov
Contributor

@slackerzz The natural state of the customer grid index is valid. A reindex will be performed only if the index is invalid, which seems reasonable. If you have cron enabled (which is also natural for prod environments), an invalid index will be reindexed in the background, so running setup:upgrade with an invalid index is very rare, and in that case a reindex will be performed (which is OK for that case).

@davidalger
Member

If you have cron enabled (which is also natural for prod environments), an invalid index will be reindexed in the background, so running setup:upgrade with an invalid index is very rare, and in that case a reindex will be performed (which is OK for that case).

@o-iegorov I'm nearly certain this portion of your recent statement is factually incorrect. The logic in this recurring data upgrade (the "fixed" version in 2.4) doesn't account for schedule vs realtime separately, it simply calls reindexAll when it believes the indexer should be run:
0fd8a51#diff-0ac6816ed3ec11b7a9c59731fae99d4bR43-R44

Calling reindexAll in the above will simply result in the entire index being rebuilt regardless of the indexer mode. So the penalty still exists, and it would not (although I have not tested it) result in an asynchronous execution of the indexer. Unless I'm missing some other fundamental change in 2.4 codebase regarding indexers.

@hostep
Contributor

hostep commented Mar 29, 2020

I agree with @davidalger, an extra condition in the if should be added to prevent a synchronous reindex action when the indexer is set to schedule mode. Because it's not necessary and it will be picked up asynchronously anyways.

But this is a mini optimisation, the chances of having an invalid customer grid indexer while running bin/magento setup:upgrade aren't that big probably.
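
The extra condition suggested here can be sketched as a small decision function. This is a hypothetical Python model, not Magento code: `IndexerState`, `is_invalid`, `is_scheduled` and `should_reindex_during_upgrade` are illustrative names, and it assumes the indexer in question actually supports schedule mode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexerState:
    """Hypothetical snapshot of an indexer; not a Magento class."""
    is_invalid: bool    # the index is flagged as needing a full rebuild
    is_scheduled: bool  # the indexer runs in "Update by schedule" mode


def should_reindex_during_upgrade(state: IndexerState) -> bool:
    """Return True only when a synchronous rebuild cannot be deferred.

    A valid index needs no work. An invalid index in schedule mode is
    assumed to be picked up by cron later, so the upgrade skips it.
    """
    return state.is_invalid and not state.is_scheduled
```

Under this model, only an invalid index in on-save mode triggers a synchronous rebuild during setup:upgrade; every other combination leaves the upgrade untouched.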

@ihor-sviziev
Contributor

@davidalger @hostep I agree that it's not a perfect solution and it could be improved, but it significantly improves the situation when no customer attributes were affected. Feel free to create a Pull Request with an improvement based on your suggestions.

@o-iegorov
Contributor

@davidalger I didn't say anything about indexer mode. Moreover, the customer grid indexer doesn't support update by schedule - https://docs.magento.com/m2/ce/user_guide/system/index-management.html The script calls reindex when the state is invalid, which is correct behavior. When the indexer is in an invalid state, the Magento cron job will perform a full reindex regardless of the update mode.

@o-iegorov
Contributor

@hostep please refer to my previous comment: the customer grid indexer doesn't support update by schedule.

@davidalger
Member

@o-iegorov Where do you see customer grid index only supports index on save? I have this index running in production today in schedule mode. This is an index using the materialized view patterns in M2, and it would make little sense for it to only support On Save.

When the indexer is in an invalid state, the Magento cron job will perform a full reindex.

This may be correct, but what I'm saying is that what the upgrade does is call reindexAll which does not mark the index as invalid, it runs the index synchronously. So IF the upgrade runs while the indexer is in an invalid state, or should the indexer be marked invalid by say an upgrade routine that's adding an attribute to the customer grid, the index will still run synchronously during the upgrade rather than simply letting the cron run the reindex to cleanup the invalid state.

@ihor-sviziev I don't have time at the moment to create a PR to further enhance this on 2.4 (unfortunately). On the 2.3 project that highlighted this for me with a 40 minute grid reindex, I'm simply deleting the Recurring Data script from the customer module as a workaround. I'm mainly posting for the sake of others as I read the original comment to infer something regarding the asynchronicity of the indexer as it relates to the setup upgrade routine.

@o-iegorov
Contributor

o-iegorov commented Mar 30, 2020

@davidalger Please read carefully provided link:
[image: screenshot from the linked page]

It works only because it's invalidated and reindexed in the background by a cron job. There is no mview processor for this indexer, just a dummy.

The upgrade calls reindexAll only when the indexer is invalid, which is a rare case in production. When the invalidation was caused by some setup script (for example, while adding an attribute), a reindex should be performed, but not on every setup upgrade. The reindex itself is very tricky in this indexer: for example, it creates related database tables, so it cannot be replaced with just an invalidation. But if you know of cases where the current solution can be improved, please create a related PR.

This fix was also delivered to 2.3-develop and will be part of the 2.3.6 release.

@davidalger
Member

@o-iegorov Very interesting. I missed that note on the page. Thanks for the followup. Also g2k regarding 2.3.6 release. 👍

@dan-ding

2.3.6?! October?

@MichaelThessel

Too bad that it takes 6 months for a confirmed and fixed issue to be released. Not to speak of the 17 months it took to actually fix it. In case you use composer this is a quick way to patch your install.

cd vendor/magento/module-customer
curl -S https://github.com/magento/magento2/commit/0fd8a5146cdf4e524150e68f89085d90f0d42be3.diff | patch -p5 
curl -S https://github.com/magento/magento2/commit/436d0ae410101e526ac9326483788153de507f26.diff | patch -p5 

@vbuck
Author

vbuck commented Apr 30, 2020

@MichaelThessel your solution makes it appear as though you are committing the vendor directory on your VCS, or else applying these as part of a deployment pipeline step.

I do agree that the lack of prioritization on this problem is a shame. If we didn't have a way to publish diffs on GitHub publicly maybe that would have forced an earlier release of the fix.

Anyway, for those who don't commit vendor to VCS (like me), I would suggest converting Michael's steps to fit your specifications, i.e.:

Alternative Patch Method

  1. Fetch diffs as per commits described here: 2.3 Customer module Recurring setup script performance problems. #19469 (comment)
  2. Commit as patch files on your VCS
  3. If using Composer, follow cweagans method as per: https://devdocs.magento.com/guides/v2.3/comp-mgr/patching.html
  4. If using Magento Cloud, place patch files into m2-hotfixes, as per: https://devdocs.magento.com/cloud/project/project-patch.html

@MichaelThessel

MichaelThessel commented May 1, 2020

@vbuck Thanks for pointing this out. I wasn't aware of the possibility to patch with composer. I went down the route you suggested and it works great. In case someone wants to implement this and has their Magento core modules in vendor here is the patch with the paths corrected:

https://gist.github.com/MichaelThessel/0b0cf69dd20326491115413adf7a94b9

@LiamTrioTech

Still a problem in 2.3.5 btw.
Upgrade to 2.4.x?

@hostep
Contributor

hostep commented Aug 21, 2020

@LiamTrioTech: have you read this comment and this comment? It says it is fixed in 2.4.0 and will be fixed in 2.3.6 as well.

@adrian-martinez-interactiv4
Contributor

Hi, we've recently run into this issue within a Magento installation with 1100K customers. I've been investigating, and this is what I found, just in case it is useful for someone.

I know this issue is about setup:upgrade performance in relation to the customer_grid indexer, and this comment is about the customer_grid indexer's inner performance, but since that also affects setup:upgrade when reindexing everything, I thought it would make sense to post it here.

About this comment:

It works only because it's invalidated and reindexed in the background by a cron job. There is no mview processor for this indexer, just a dummy.

Although it's true that it has only a dummy mview, the indexer does not get invalidated and reindexed by the cron job; it reindexes synchronously on Customer and Customer Address save, at \Magento\Customer\Model\Customer::reindex and \Magento\Customer\Model\Address::reindex, respectively. The index only gets invalidated when a customer attribute that is used in the grid is added or deleted, or when an attribute's used-in-grid setting is changed, so a full reindex is needed to rebuild the grid table properly.

At https://support.magento.com/hc/en-us/articles/360025481892-New-customer-records-are-not-displayed-in-the-Customers-grid-after-importing-them-from-CSV it says customer_grid index is not supported by "Update by schedule" due to performance reasons, but it does not specify any detail.

Digging a little deeper, we arrive soon at https://github.com/magento/magento2/blob/2.4-develop/app/code/Magento/Customer/Model/Indexer/Source.php, the data source provider for customer grid data. It provides an iterator to supply data to be indexed:

    /**
     * Retrieve an iterator
     *
     * @return Traversable
     */
    public function getIterator()
    {
        $this->customerCollection->setPageSize($this->batchSize);
        $lastPage = $this->customerCollection->getLastPageNumber();
        $pageNumber = 1;
        do {
            $this->customerCollection->clear();
            $this->customerCollection->setCurPage($pageNumber);
            foreach ($this->customerCollection->getItems() as $key => $value) {
                yield $key => $value;
            }
            $pageNumber++;
        } while ($pageNumber <= $lastPage);
    }

Benchmarking this method, we found that at each step, execution time increases a bit. After many steps, time elapsed at each step can be increased even by 10x. Taking a quick look at the code shows the issue here.

At each step, the same query is performed to retrieve data, with different sql LIMIT offset values. Having LIMIT [offset,] row_count, assuming a batch size of 10000, consecutive queries would look something like (very simplified):

  • SELECT * FROM huge_table LIMIT 0, 10000
  • SELECT * FROM huge_table LIMIT 10000, 10000
  • SELECT * FROM huge_table LIMIT 20000, 10000
  • (...)
  • SELECT * FROM huge_table LIMIT 1080000, 10000
  • SELECT * FROM huge_table LIMIT 1090000, 10000

MySQL starts building the query results and returns them as soon as it has the needed number of rows. That is easy for the first query, but for the last one it has to generate internally (due to joins, ordering, etc.) offset + 10000 results, only to return the last 10000 and discard the rows covered by the offset. In short:

  • Step 1: MySQL generates 10000 results, returns 10000 results.
  • Step 2: MySQL generates 20000 results, returns 10000 results.
  • Step 3: MySQL generates 30000 results, returns 10000 results.
    (...)
  • Step 109: MySQL generates 1090000 results, returns 10000 results.
  • Step 110: MySQL generates 1100000 results, returns 10000 results.
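
The cost pattern in the steps above can be worked out with a few lines of arithmetic. The sketch below is in Python for brevity and uses the simplified model from the list (step k generates k × batch size rows); `rows_generated_with_offset` is a hypothetical helper name, and the 1,100,000 and 10,000 figures are the round numbers used in this comment.

```python
import math


def rows_generated_with_offset(total_rows: int, batch_size: int) -> int:
    """Total rows the database builds across all OFFSET-paginated queries.

    Simplified model: the query for page k must build offset + batch_size
    rows, i.e. k * batch_size, capped at the table size.
    """
    pages = math.ceil(total_rows / batch_size)
    return sum(min(page * batch_size, total_rows) for page in range(1, pages + 1))


total = rows_generated_with_offset(1_100_000, 10_000)
print(total)               # rows built with OFFSET pagination
print(total / 1_100_000)   # work relative to reading each row once
```

With these numbers the model gives 61,050,000 generated rows, about 55.5 times the 1,100,000 rows that are actually returned. Real query plans will differ, so treat this as an estimate of the growth pattern rather than a measurement.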

A real example, using a query generated by the indexer, note the offset and the elapsed time:
[Screenshot 2020-08-25 at 5.21.01]
[Screenshot 2020-08-25 at 5.22.16]

1.7 ms vs 20.4 s is a huge difference. Our solution looks like this:

    /**
     * Retrieve an iterator
     *
     * @return Traversable
     */
    public function getIterator()
    {
        $customerIdLastPage = ceil($this->count() / $this->customerIdsBatchSize);

        if (0 < $customerIdLastPage) {
            $customerCollection = clone $this->customerCollection;
            $customerIdPageNumber = 0;

            do {
                $customerIds = $this->customerCollection->getAllIds($this->customerIdsBatchSize, $customerIdPageNumber * $this->customerIdsBatchSize);

                foreach (array_chunk($customerIds, $this->batchSize) as $customerIdsChunk) {
                    $customerCollection->clear();
                    $customerCollection->resetData();
                    $customerCollection->getSelect()->reset(\Magento\Framework\DB\Select::WHERE);
                    $customerCollection->addFieldToFilter($this->getIdFieldName(), ['in' => array_map('intval', $customerIdsChunk)]);

                    foreach ($customerCollection->getItems() as $key => $value) {
                        yield $key => $value;
                    }
                }

                $customerIdPageNumber++;

            } while ($customerIdPageNumber < $customerIdLastPage);
        }
    }

Explained:

  • We take advantage of $this->customerCollection already having its filters applied to retrieve the customer ids that the reindex should cover.
  • We split the customer id retrieval into chunks. We have set customerIdsBatchSize to 100000 to avoid retrieving 1100K customer ids at once.
  • For each chunk of customer ids, we split it again into chunks of batch size, to return data in batches of that size.
  • Once we know the customer ids we have to get data for, we can remove the collection's WHERE part and replace it with the customer ids to be processed. This is possible because the filters were already applied when retrieving the customer ids.
  • Using the customer id in the WHERE clause lets MySQL use the column index to search faster, and avoids generating unneeded results due to the offset.
  • Outside of this code, we have reduced the batch size from 10000 to 100, to avoid locking tables for a long time (in this case we prefer to query more often, now that the query issue is solved), and to generate fewer customer items at the same time (10000 => 100) to try to reduce memory usage (it also seems to help with execution speed, but I'm not 100% sure of this). This is optional.

The result for us is that, as the steps get executed, the execution time of each one remains almost the same. This may not make a difference for small reindexes, but it really does for databases with large customer tables.
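
The two-level chunking described above can be sketched outside Magento. The example below is a minimal Python illustration using the standard-library sqlite3 module rather than MySQL or Magento collections; the table `customer_demo`, its columns, and the function names are assumptions made for the demo, not part of the original patch.

```python
import sqlite3
from typing import Iterator, Tuple


def build_demo_db(row_count: int) -> sqlite3.Connection:
    """Create an in-memory table standing in for the customer table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer_demo (entity_id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany(
        "INSERT INTO customer_demo (entity_id, email) VALUES (?, ?)",
        ((i, f"user{i}@example.com") for i in range(1, row_count + 1)),
    )
    return conn


def iter_customers(
    conn: sqlite3.Connection, id_batch_size: int = 1000, batch_size: int = 100
) -> Iterator[Tuple[int, str]]:
    """Yield every row, loading full rows by primary key instead of by OFFSET."""
    offset = 0
    while True:
        # Level 1: fetch only ids. OFFSET is still used here, but on a
        # narrow id-only query, mirroring getAllIds(limit, offset).
        ids = [
            row[0]
            for row in conn.execute(
                "SELECT entity_id FROM customer_demo ORDER BY entity_id LIMIT ? OFFSET ?",
                (id_batch_size, offset),
            )
        ]
        if not ids:
            break
        # Level 2: load full rows for small id chunks via WHERE ... IN (...).
        for start in range(0, len(ids), batch_size):
            chunk = ids[start:start + batch_size]
            placeholders = ",".join("?" for _ in chunk)
            rows = conn.execute(
                "SELECT entity_id, email FROM customer_demo "
                f"WHERE entity_id IN ({placeholders}) ORDER BY entity_id",
                chunk,
            ).fetchall()
            yield from rows
        offset += id_batch_size
```

For instance, `list(iter_customers(build_demo_db(2500)))` walks all 2500 demo rows in id order, issuing the wide query only with primary-key filters. The batch sizes here are small demo defaults, not the 100000 and 100 values mentioned in the comment.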

Also, we've implemented the mview system and "Update by schedule" for this indexer separately; we'll check how that works and try to find the performance issues that weren't explained on the Magento page. I'll let you know if I find something new about that.

@ihor-sviziev
Contributor

@adrian-martinez-interactiv4 amazing job! This is a really huge step forward!

@o-iegorov could you review the following comment #19469 (comment)? Could you give us more info on which performance issues were caused by using update by schedule for the customer grid index? Maybe we as a community could fix them?

@xxorax

xxorax commented Sep 7, 2020

Will this fix be released in the 2.3 branch?
It just needs the patch applied, I guess.

@ageffray

ageffray commented Dec 4, 2020

Any news on this? It's a huge thing; I've got performance issues because of this on many large projects.

@jonathanribas

We also have this issue at Zadig & Voltaire.

It's just crazy to recreate the whole customer grid flat table at every deploy we do every day.

@fredden
Member

fredden commented Dec 4, 2020

From what I read this is solved in versions 2.3.6 and 2.4.0. If you're running versions below these, I'd recommend upgrading.

@xpoback
Contributor

xpoback commented Mar 7, 2021

So when the customer grid in admin needs a reindex, the whole world should wait?
Some data in an admin grid might look outdated so let's make the downtime of the whole shop longer?

I've read the thread but still, why not schedule the indexer so that it can later be run asynchronously instead of executing it directly in the upgrade script? The upgrade itself takes seconds to execute, but all those Recurring.php scripts take dozens of seconds or sometimes minutes. I agree that some stuff must be checked to keep the DB consistent, but does it make sense that the shop is in maintenance mode only because something in a grid or in a sales report has changed?

Whatever.
