Why not use pgcompacttable? #30

Open
piskvorky opened this issue Feb 13, 2020 · 1 comment


piskvorky commented Feb 13, 2020

We're considering enabling pgcompacttable in our production DB, to reclaim disk space. I know this is a strange question, but what are the downsides (if any) to using pgcompacttable?

Why wouldn't one want to enable this extension? What are some common pitfalls / cons?

I guess there must be some, or else this would be already built into postgres, enabled by default 🤓

Melkij (Member) commented Feb 13, 2020

> I guess there must be some, or else this would be already built into postgres, enabled by default.

Because it requires much more work and much more time. Any nontrivial built-in feature in PostgreSQL is time-consuming. You can find my name on the reviewers list for the "reindex concurrently" feature from last year (it was merged in PG 12). Overall, that feature had been in development and under discussion since around 2012.

I don't remember any active discussion about non-blocking table rewrites on the pgsql-hackers list. The feature is waiting for someone to become its author.

Also, pgcompacttable is not an extension. It is an external tool that runs plain SQL. pgcompacttable relies on our knowledge of PG implementation details to do its job and does not modify the database at a low level. We perform useless (no-op) updates on the table; as a side effect, PG writes the new tuple version into free space near the beginning of the table. We use regular vacuum; at the end of a vacuum, PG truncates empty pages at the end of the table. We use create index concurrently and drop index concurrently to rebuild indexes, which is exactly what reindex concurrently does now. (reindex concurrently is cheaper, and it can process primary keys with dependencies.)
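To make the three steps above concrete, here is a minimal Python sketch that only builds the SQL statements involved. It is not pgcompacttable's actual code: the function names, the `_new` index suffix, and the parameters (`last_page`, `pages_to_clear`, `max_offset`) are hypothetical, and identifiers are assumed to be trusted and already quoted. The real tool does far more, for example checking where each updated row actually landed and retrying.

```python
def noop_update_sql(table, column, page, max_offset):
    """Build a no-op UPDATE aimed at every possible tuple slot on one heap page.

    Rows are addressed by ctid, written as '(page,offset)'. Setting a column
    to itself still creates a new tuple version, which PG may place in free
    space on an earlier page.
    """
    tids = ", ".join(f"'({page},{offset})'" for offset in range(1, max_offset + 1))
    return (
        f"UPDATE {table} SET {column} = {column} "
        f"WHERE ctid = ANY (ARRAY[{tids}]::tid[])"
    )


def rebuild_index_sql(table, index, columns):
    """Rebuild an index without blocking writes: create a copy, drop the old, rename."""
    tmp = f"{index}_new"  # hypothetical temporary name
    return [
        f"CREATE INDEX CONCURRENTLY {tmp} ON {table} ({columns})",
        f"DROP INDEX CONCURRENTLY {index}",
        f"ALTER INDEX {tmp} RENAME TO {index}",
    ]


def compaction_plan(table, column, last_page, pages_to_clear, max_offset,
                    index, index_columns):
    """Return the statements in order: empty the tail pages, vacuum, rebuild index."""
    plan = [
        noop_update_sql(table, column, page, max_offset)
        for page in range(last_page, last_page - pages_to_clear, -1)
    ]
    # Plain VACUUM can truncate pages at the end of the table once they are empty.
    plan.append(f"VACUUM {table}")
    plan += rebuild_index_sql(table, index, index_columns)
    return plan
```

Calling `compaction_plan("t", "id", 9, 2, 2, "t_idx", "id")` yields two UPDATE statements (for pages 9 and 8), then `VACUUM t`, then the three index statements. Note that VACUUM and the CONCURRENTLY statements cannot run inside a transaction block, so each would have to be sent in autocommit mode. A single no-op update is also not guaranteed to move a row to an earlier page, which this sketch does not handle.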

> Why wouldn't one want to enable this extension? What are some common pitfalls / cons?

No one wants to use pg_repack or pgcompacttable when there is no problem with bloat =) To keep bloat from building up in the first place: avoid long transactions on the primary, avoid long transactions on replicas with hot_standby_feedback=on, and use aggressive enough autovacuum settings.
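As a rough illustration of that advice, the following Python sketch builds two statements: a query against `pg_stat_activity` that lists transactions open longer than a threshold, and an `ALTER TABLE` that sets per-table autovacuum storage parameters. The helper names and the example values are assumptions for illustration only, not recommended settings; the table name is assumed to be trusted.

```python
def long_transactions_sql(min_minutes):
    """Query listing sessions whose current transaction is older than min_minutes."""
    return (
        "SELECT pid, usename, state, now() - xact_start AS xact_age "
        "FROM pg_stat_activity "
        "WHERE xact_start IS NOT NULL "
        f"AND now() - xact_start > interval '{int(min_minutes)} minutes' "
        "ORDER BY xact_start"
    )


def autovacuum_tuning_sql(table, settings):
    """ALTER TABLE statement setting per-table autovacuum storage parameters.

    `settings` maps parameter names such as autovacuum_vacuum_scale_factor
    to values; the values passed in are the caller's choice.
    """
    opts = ", ".join(f"{name} = {value}" for name, value in settings.items())
    return f"ALTER TABLE {table} SET ({opts})"
```

For example, `autovacuum_tuning_sql("t", {"autovacuum_vacuum_scale_factor": 0.01})` produces `ALTER TABLE t SET (autovacuum_vacuum_scale_factor = 0.01)`. The long-transaction query only sees sessions on the node where it runs, so it would need to be run on the primary and on each replica separately.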
