Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Price indexer slow for configurable BaseStockStatusSelectProcessor #33763

Closed
1 of 5 tasks
webtekindo opened this issue Aug 12, 2021 · 20 comments
Closed
1 of 5 tasks

Price indexer slow for configurable BaseStockStatusSelectProcessor #33763

webtekindo opened this issue Aug 12, 2021 · 20 comments
Assignees
Labels
Issue: needs update Additional information is require, waiting for response Triage: Dev.Experience Issue related to Developer Experience and needs help with Triage to Confirm or Reject it

Comments

@webtekindo
Copy link

webtekindo commented Aug 12, 2021

Preconditions (*)

  1. Magento 2.4.2
  2. Catalog with lot of configurable products

Steps to reproduce (*)

  1. Reindex catalog_product_price

Expected result (*)

  1. Fast

Actual result (*)

  1. Slow (in my case it never ends)

While monitoring the queries while reindexing, I notice these queries took long time:

SELECT le.entity_id, i.customer_group_id, i.website_id, MIN( final_price ), MAX( final_price ), MIN( tier_price ) FROM catalog_product_index_price_replica AS i INNER JOIN catalog_product_super_link AS l ON l.product_id = i.entity_id INNER JOIN catalog_product_entity AS le ON le.entity_id = l.parent_id INNER JOIN cataloginventory_stock_item AS si ON si.product_id = l.product_id INNER JOIN cataloginventory_stock_item AS si_parent ON si_parent.product_id = l.parent_id WHERE ( si.is_in_stock = 1 ) OR ( si_parent.is_in_stock = 0 ) AND ( le.entity_id IN ( 171648, 171655 )) GROUP BY le.entity_id, customer_group_id, website_id

Notice the where condition:

( si.is_in_stock = 1 ) OR ( si_parent.is_in_stock = 0 ) AND ( le.entity_id IN ( 171648, 171655 ))

This is very slow, it returns (in my case 70.000+ products) the same products over and over for all the configurable product that this query is executed for.

It should be:

(( si.is_in_stock = 1 ) OR ( si_parent.is_in_stock = 0 )) AND ( le.entity_id IN ( 171648, 171655 ))

With this solution the result are only about the mention configurable product in entity_id.

Looks like the issue is in BaseStockStatusSelectProcessor

We have 300.000+ products and around 30.000+ configurable products. With this fix we can reindex in 9 minutes, without it we never manage to reindex until success.


Please provide Severity assessment for the Issue as Reporter. This information will help during Confirmation and Issue triage processes.

  • Severity: S0 - Affects critical data or functionality and leaves users without workaround.
  • Severity: S1 - Affects critical data or functionality and forces users to employ a workaround.
  • Severity: S2 - Affects non-critical data or functionality and forces users to employ a workaround.
  • Severity: S3 - Affects non-critical data or functionality and does not force users to employ a workaround.
  • Severity: S4 - Affects aesthetics, professional look and feel, “quality” or “usability”.
@m2-assistant
Copy link

m2-assistant bot commented Aug 12, 2021

Hi @webtekindo. Thank you for your report.
To help us process this issue please make sure that you provided the following information:

  • Summary of the issue
  • Information on your environment
  • Steps to reproduce
  • Expected and actual results

Please make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, please, add a comment to the issue:

@magento give me 2.4-develop instance - upcoming 2.4.x release

For more details, please, review the Magento Contributor Assistant documentation.

Please, add a comment to assign the issue: @magento I am working on this


⚠️ According to the Magento Contribution requirements, all issues must go through the Community Contributions Triage process. Community Contributions Triage is a public meeting.

🕙 You can find the schedule on the Magento Community Calendar page.

📞 The triage of issues happens in the queue order. If you want to speed up the delivery of your contribution, please join the Community Contributions Triage session to discuss the appropriate ticket.

🎥 You can find the recording of the previous Community Contributions Triage on the Magento Youtube Channel

✏️ Feel free to post questions/proposals/feedback related to the Community Contributions Triage process to the corresponding Slack Channel

@m2-assistant
Copy link

m2-assistant bot commented Aug 23, 2021

Hi @engcom-Lima. Thank you for working on this issue.
In order to make sure that issue has enough information and ready for development, please read and check the following instruction: 👇

  • 1. Verify that issue has all the required information. (Preconditions, Steps to reproduce, Expected result, Actual result).

    DetailsIf the issue has a valid description, the label Issue: Format is valid will be added to the issue automatically. Please, edit issue description if needed, until label Issue: Format is valid appears.

  • 2. Verify that issue has a meaningful description and provides enough information to reproduce the issue. If the report is valid, add Issue: Clear Description label to the issue by yourself.

  • 3. Add Component: XXXXX label(s) to the ticket, indicating the components it may be related to.

  • 4. Verify that the issue is reproducible on 2.4-develop branch

    Details- Add the comment @magento give me 2.4-develop instance to deploy test instance on Magento infrastructure.
    - If the issue is reproducible on 2.4-develop branch, please, add the label Reproduced on 2.4.x.
    - If the issue is not reproducible, add your comment that issue is not reproducible and close the issue and stop verification process here!

  • 5. Add label Issue: Confirmed once verification is complete.

  • 6. Make sure that automatic system confirms that report has been added to the backlog.

@engcom-Lima
Copy link
Contributor

Hi @webtekindo,

Since the number of configurable products on 2.4 develop instance with sample data is not as much so wanted to check with you if there is any other way we can reproduce this on Magento 2.4-develop instance ?

Here is what we tried as of now:

  1. We reindexed catalog_product_price with sample data and reindexing worked fine.

So please check on 2.4 develop instance if you are getting the same issue with your data ?

@engcom-Lima engcom-Lima added the Issue: needs update Additional information is require, waiting for response label Aug 23, 2021
@engcom-Lima engcom-Lima added the Issue: Cannot Reproduce Cannot reproduce the issue on the latest `2.4-develop` branch label Aug 23, 2021
@webtekindo
Copy link
Author

webtekindo commented Aug 24, 2021

@engcom-Lima if you run the index this will generate queries like this:

SELECT le.entity_id, i.customer_group_id, i.website_id, MIN( final_price ), MAX( final_price ), MIN( tier_price ) FROM catalog_product_index_price_replica AS i INNER JOIN catalog_product_super_link AS l ON l.product_id = i.entity_id INNER JOIN catalog_product_entity AS le ON le.entity_id = l.parent_id INNER JOIN cataloginventory_stock_item AS si ON si.product_id = l.product_id INNER JOIN cataloginventory_stock_item AS si_parent ON si_parent.product_id = l.parent_id WHERE ( si.is_in_stock = 1 ) OR ( si_parent.is_in_stock = 0 ) AND ( le.entity_id IN ( 171648, 171655 )) GROUP BY le.entity_id, customer_group_id, website_id

Where entity_id is from a configurable product. Just replace the entity_id by one configurable product from sample data, and execute the query , you will notice the result is incorrect. It will return lot of simple product not related to the configurable choosen.

See my result: https://prnt.sc/1qghqbu (these are not related to my configurable product)

It should be:

SELECT le.entity_id, i.customer_group_id, i.website_id, MIN( final_price ), MAX( final_price ), MIN( tier_price ) FROM catalog_product_index_price_replica AS i INNER JOIN catalog_product_super_link AS l ON l.product_id = i.entity_id INNER JOIN catalog_product_entity AS le ON le.entity_id = l.parent_id INNER JOIN cataloginventory_stock_item AS si ON si.product_id = l.product_id INNER JOIN cataloginventory_stock_item AS si_parent ON si_parent.product_id = l.parent_id WHERE (( si.is_in_stock = 1 ) OR ( si_parent.is_in_stock = 0 )) AND ( le.entity_id IN ( 171648, 171655 )) GROUP BY le.entity_id, customer_group_id, website_id

See my result: https://prnt.sc/1qghqz1 (these are correctly related to my configurable product)

The AND operator have higher precedence than the OR in MySQL.

This will improve indexing by a lot.

Hope this is clear.

@engcom-Lima
Copy link
Contributor

Hi @webtekindo,

Is your issue anywhere related to #22516 ? Because it seems like duplicate and it was fixed. Please check.

Also please advise how much time your server is taking and also about your server configurations.

Thanks

@webtekindo
Copy link
Author

webtekindo commented Sep 8, 2021

@engcom-Lima it's not related, we don't use MSI.

You can try edit:

https://github.com/magento/magento2/blob/2.4-develop/app/code/Magento/ConfigurableProduct/Model/ResourceModel/Product/Indexer/Price/BaseStockStatusSelectProcessor.php#L66

And "echo" the query

Reindex price you will see something like this:

SELECT le.entity_id, i.customer_group_id, i.website_id, MIN( final_price ), MAX( final_price ), MIN( tier_price ) FROM catalog_product_index_price_replica AS i INNER JOIN catalog_product_super_link AS l ON l.product_id = i.entity_id INNER JOIN catalog_product_entity AS le ON le.entity_id = l.parent_id INNER JOIN cataloginventory_stock_item AS si ON si.product_id = l.product_id INNER JOIN cataloginventory_stock_item AS si_parent ON si_parent.product_id = l.parent_id WHERE ( si.is_in_stock = 1 ) OR ( si_parent.is_in_stock = 0 ) AND ( le.entity_id IN ( 171648 )) GROUP BY le.entity_id, customer_group_id, website_id

This query is not correct as I mentionned above. Just try that query in your database and you will quickly understand why it's not correct.

( si.is_in_stock = 1 ) OR ( si_parent.is_in_stock = 0 ) AND ( le.entity_id IN ( 171648 )

in MySQL means

( si.is_in_stock = 1 ) OR ( ( si_parent.is_in_stock = 0 ) AND ( le.entity_id IN ( 171648 ) )

and it should be:

(( si.is_in_stock = 1 ) OR ( si_parent.is_in_stock = 0 )) AND ( le.entity_id IN ( 171648 )

@engcom-Lima
Copy link
Contributor

Hi @webtekindo,

The issue which I referred also didn't have MSI installed. And I tried reindexing the price by printing the query which you referred in last comment but not able to get the same query as you have provided.

Anyways, in order to do some further analysis please provide the below details also:

  1. Your server configurations.
  2. How much time your server spent on reindexing 30000+ configurable products ?

@webtekindo
Copy link
Author

webtekindo commented Sep 14, 2021

Hi @engcom-Lima,

Very strange you don't get the same query as it's literally what is standing here:

https://github.com/magento/magento2/blob/2.4-develop/app/code/Magento/ConfigurableProduct/Model/ResourceModel/Product/Indexer/Price/BaseStockStatusSelectProcessor.php#L63

That function add some join to the query, and if Magento add more Join to the same query or custom, then we will have a situation where the AND operator have higher precedence than the OR in MySQL, so this should be fixed anyway even not reproducible in default Magento. But in our case I don't think we have something overiding the default logic of the indexer.

Did you enable show out of stock?

Our server configuration is 32 CPU cores and 42G of RAM, we have more than 30.000 configurable product, but note that some of our configurable product have more than 30.000 variants in it (which is not a classic situation).

Without the fix implemented it takes forever, I don't know how long because it was never ending for sure more than 4 hours. With the fix implemented it takes around 10 minutes.

@engcom-Lima
Copy link
Contributor

Hi @webtekindo,

Thank you for your input but I won't be able to confirm this issue until I reproduce this on my system.

Please check below screenshot of the query which I am getting after I am printing the query in BaseStockStatusSelectProcessor class as you suggested. In the below screenshot you can see I am not getting AND condition as you are getting.

Screenshot 2021-09-22 at 3 15 17 PM

Please let me know if you did anything differently or I missed anything ? Also can you please confirm if you are getting this issue on 2.4-develop version also or getting on 2.4.2 only ?

Thanks

@engcom-Lima
Copy link
Contributor

Hi @webtekindo,

We have noticed that this issue has not been updated for some time. Since we are not able to reproduce it, we are closing it. Please raise a fresh ticket or reopen this ticket if you need more assistance on this.

Regards

@steven-hoffman-jomashop
Copy link

Hi @engcom-Lima,

We see this issue on 2.4.3:

INSERT INTO `catalog_product_index_price_cfg_opt_temp`
SELECT `le`.`entity_id`, `i`.`customer_group_id`, `i`.`website_id`, MIN(final_price), MAX(final_price), MIN(tier_price)
FROM `catalog_product_index_price` AS `i`
INNER JOIN `catalog_product_super_link` AS `l` ON l.product_id = i.entity_id
INNER JOIN `catalog_product_entity` AS `le` ON le.row_id = l.parent_id AND (le.created_in <= '1637600400' AND le.updated_in > '1637600400')
INNER JOIN `cataloginventory_stock_item` AS `si` ON si.product_id = l.product_id
INNER JOIN `cataloginventory_stock_item` AS `si_parent` ON si_parent.product_id = l.parent_id
WHERE (si.is_in_stock = 1) OR (si_parent.is_in_stock = 0) AND (le.entity_id IN (502243, 251917, 251927, 691473, 691475, 691476, 691507, 691505))
GROUP BY `le`.`entity_id`,`customer_group_id`,`website_id` 
ON DUPLICATE KEY UPDATE `min_price` = VALUES(`min_price`), `max_price` = VALUES(`max_price`), `tier_price` = VALUES(`tier_price`)

You can see that the WHERE clause is WHERE X OR Y AND Z; as per mysql docs AND is higher precedence then OR.
Thus the above WHERE clause WHERE X OR Y AND Z is really: WHERE X OR (Y AND Z) which is not correct.

You can see that \Magento\ConfigurableProduct\Model\ResourceModel\Product\Indexer\Price\Configurable::fillTemporaryOptionsTable adds a where clause on the le.entity_id.

Also, you can see that the $this->baseSelectProcessor->process($select) call is made before the where adds the le.entity_id filter. (Line 214 vs line 229).
You should be able to 'reproduce' by echoing the select after line 230; though I am not 100% of the sequence needed to hit line 229. (Full reindex vs partial).
(You definitely need a configurable with options)


Also, in general may I recomend that you use the dev:query-log:enable command to track all query made to the DB?

@steven-hoffman-jomashop

@engcom-Lima, How can we re-open this issue?

@engcom-Echo engcom-Echo reopened this Jan 4, 2022
@m2-community-project m2-community-project bot removed Issue: Cannot Reproduce Cannot reproduce the issue on the latest `2.4-develop` branch Issue: needs update Additional information is require, waiting for response labels Jan 4, 2022
@m2-assistant
Copy link

m2-assistant bot commented Jan 4, 2022

Hi @engcom-Echo. Thank you for working on this issue.
In order to make sure that issue has enough information and ready for development, please read and check the following instruction: 👇

    1. Verify that issue has all the required information. (Preconditions, Steps to reproduce, Expected result, Actual result).
      DetailsIf the issue has a valid description, the label Issue: Format is valid will be added to the issue automatically. Please, edit issue description if needed, until label Issue: Format is valid appears.
    1. Verify that issue has a meaningful description and provides enough information to reproduce the issue. If the report is valid, add Issue: Clear Description label to the issue by yourself.
    1. Add Component: XXXXX label(s) to the ticket, indicating the components it may be related to.
    1. Verify that the issue is reproducible on 2.4-develop branch
      Details- Add the comment @magento give me 2.4-develop instance to deploy test instance on Magento infrastructure.
      - If the issue is reproducible on 2.4-develop branch, please, add the label Reproduced on 2.4.x.
      - If the issue is not reproducible, add your comment that issue is not reproducible and close the issue and stop verification process here!

@engcom-Echo
Copy link
Contributor

Hi @steven-hoffman-jomashop,

Thank you for providing this info.

Based on it, I enabled db logging and did reindexing after enabling 'Display Out of Stock Products' from Admin Panel. I checked all queries that were made to db in logs for Magento 2.4-develop and the query that I got is below:

INSERT INTO `catalog_product_index_price_cfg_opt_temp` SELECT `le`.`entity_id`, `i`.`customer_group_id`, `i`.`website_id`, 
MIN(final_price), MAX(final_price), MIN(tier_price) FROM `catalog_product_index_price_replica` AS `i`
INNER JOIN `catalog_product_super_link` AS `l` ON l.product_id = i.entity_id
INNER JOIN `catalog_product_entity` AS `le` ON le.entity_id = l.parent_id
INNER JOIN `cataloginventory_stock_status` AS `child_stock_default` ON child_stock_default.product_id = l.product_id
INNER JOIN `cataloginventory_stock_status` AS `parent_stock_default` ON parent_stock_default.product_id = le.entity_id
WHERE (le.entity_id IN (69987,69988,.......)) AND (child_stock_default.stock_status = 1 OR parent_stock_default.stock_status = 0) GROUP BY `le`.`entity_id`,`customer_group_id`,`website_id` 
ON DUPLICATE KEY UPDATE `min_price` = VALUES(`min_price`), `max_price` = VALUES(`max_price`), `tier_price` = VALUES(`tier_price`)

Above query is made with 1000+ configurable products but I am commenting based on the query logic which is in question. So, as you can see AND OR combination is already correct as per above query to DB. For confirming this, you can also check the same thing in Magento latest version or you can also comment if I missed anything in finding the required query. Accordingly I will check if this issue is there in the older versions like 2.4.2 and 2.4.3.

Thanks

@engcom-Echo engcom-Echo added the Issue: needs update Additional information is require, waiting for response label Jan 10, 2022
@engcom-Echo engcom-Echo added the Triage: Dev.Experience Issue related to Developer Experience and needs help with Triage to Confirm or Reject it label Jan 10, 2022
@steven-hoffman-jomashop

Hi @engcom-Echo,

It seems that the codebase was recently changed in commit 8fe11d7#diff-1d5f1e397520e3931003c3ac5f9ca5a547e23b67640799f9ec824eeebaeaf03f.

See: (On dec 19) e8c31f5

It does appear that there was an internal ticket that was being worked on; it does not appear to be visible on this platform. (GitHub issues or pull requests)

If you can confirm that this is a bug on version 2.4.3-p1 and lower that seems to be the next helpful step.

-SH

@m2-assistant
Copy link

m2-assistant bot commented Feb 7, 2022

Hi @engcom-Hotel. Thank you for working on this issue.
In order to make sure that issue has enough information and ready for development, please read and check the following instruction: 👇

  • 1. Verify that issue has all the required information. (Preconditions, Steps to reproduce, Expected result, Actual result).

    DetailsIf the issue has a valid description, the label Issue: Format is valid will be added to the issue automatically. Please, edit issue description if needed, until label Issue: Format is valid appears.

  • 2. Verify that issue has a meaningful description and provides enough information to reproduce the issue. If the report is valid, add Issue: Clear Description label to the issue by yourself.

  • 3. Add Component: XXXXX label(s) to the ticket, indicating the components it may be related to.

  • 4. Verify that the issue is reproducible on 2.4-develop branch

    Details- Add the comment @magento give me 2.4-develop instance to deploy test instance on Magento infrastructure.
    - If the issue is reproducible on 2.4-develop branch, please, add the label Reproduced on 2.4.x.
    - If the issue is not reproducible, add your comment that issue is not reproducible and close the issue and stop verification process here!

  • 5. Add label Issue: Confirmed once verification is complete.

  • 6. Make sure that automatic system confirms that report has been added to the backlog.

@engcom-Hotel
Copy link
Contributor

Hello @steven-hoffman-jomashop @webtekindo,

We will try to reproduce the issue in Magento 2.4-develop as well as 2.4.3-p1 and get back to you on this.

Thanks

@engcom-Hotel
Copy link
Contributor

Hello @webtekindo,

We have tried to reproduce this issue in Magento 2.4-develop branch with enable query log. But for me also the same query is not available there in the logs.

It might be possible that the issue is resolved in the current develop branch. We request you to please try to reproduce the issue in Magento 2.4-develop branch and let us know in case you are still able to reproduce the issue.

Thanks

@engcom-Hotel
Copy link
Contributor

Dear @webtekindo,

We have noticed that this issue has not been updated for a period of 14 Days. Hence we assume that this issue is fixed now, so we are closing it. Please raise a fresh ticket or reopen this ticket if you need more assistance on this.

Regards

@sebfie
Copy link

sebfie commented May 10, 2023

@steven-hoffman-jomashop Did you make a patch or something? Hard to find how to fix it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Issue: needs update Additional information is require, waiting for response Triage: Dev.Experience Issue related to Developer Experience and needs help with Triage to Confirm or Reject it
Projects
None yet
Development

No branches or pull requests

6 participants