Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix incorrect index->reltuples after VACUUM #217

Merged
merged 2 commits into from
Oct 9, 2023

Conversation

lss602726449
Copy link
Contributor

@lss602726449 lss602726449 commented Sep 22, 2023

closes: #214


Change logs

We enforce all QE to update stats after vacuum in the index, and all QD will collect stats and update the pg_class. Otherwises, only some QE will update stats and send back to QD which left index stats in incomplete state.

Why are the changes needed?

Describe why the changes are necessary.

Does this PR introduce any user-facing change?

no

How was this patch tested?

cherrypick

Contributor's Checklist

Here are some reminders and checklists before/when submitting your pull request, please check them:

  • Make sure your Pull Request has a clear title and commit message. You can take git-commit template as a reference.
  • Sign the Contributor License Agreement as prompted for your first-time contribution(One-time setup).
  • Learn the coding contribution guide, including our code conventions, workflow and more.
  • List your communication in the GitHub Issues or Discussions (if has or needed).
  • Document changes.
  • Add tests for the change
  • Pass make installcheck
  • Pass make -C src/test installcheck-cbdb-parallel
  • Feel free to @cloudberrydb/dev team for review and approval when your PR is ready🥳

Given the scenario of updating stats during VACUUM:

1) coordinator vacuums and updates stats of its own;
2) then coordinator dispatches vacuum to segments;
3) coordinator combines stats received from segments
   to overwrite the stats of its own.

Because upstream introduced a feature which could skip
full index scan uring cleanup of B-tree indexes when
possible (refer to: postgres/postgres@857f9c3),
there was a case in QD-QEs distributed deployment that
some QEs could skip full index scan and stop updating
statistics, resulting in QD being unable to collect all
QEs' stats thus overwrote a paritial accumulated value
to index->reltuples. More interesting, it usually happened
starting from the 3rd time of consecutively VACUUM after
fresh inserts due to above skipping index scan criteria.

This patch is intended to prevent from above issue by two aspects:
on QE, do not skip full index scan, report current statistics
to QD as requested;
on QD, restrict updating statistics only when collecting all QEs'
data completely, if the reporting QE number doesn't match
the total number of dispatched QEs, no stats update happens.
@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ lss602726449
❌ haolinw
You have signed the CLA already but the status is still pending? Let us recheck it.

my-ship-it
my-ship-it previously approved these changes Sep 22, 2023
avamingli
avamingli previously approved these changes Sep 22, 2023
There is conflict in _bt_vacuum_needs_cleanup between CloudberryDB and
GPDB. We update the _bt_vacuum_needs_cleanup in CloudberryDB. We enforce
update stats when Gp_role is GP_ROLE_EXECUTE, and set estimated_count to
false. which in the scan_index  We call vac_update_relstats to send back
stats to QE.

Fix serveral tests due to update stats after vacuum. We keep these tests
result same with GPDB.
Copy link
Contributor

@my-ship-it my-ship-it left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@my-ship-it my-ship-it merged commit 70ad653 into apache:main Oct 9, 2023
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] reltuples in pg_class may be wrong for index
4 participants