-
Notifications
You must be signed in to change notification settings - Fork 446
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multixid wraparound handling #6520
Labels
c/compute
Component: compute, excluding postgres itself
c/storage/pageserver
Component: storage: pageserver
t/bug
Issue Type: Bug
Comments
hlinnaka
added
c/storage/pageserver
Component: storage: pageserver
c/cloud/compute
labels
Jun 11, 2024
jcsp
added
c/compute
Component: compute, excluding postgres itself
and removed
c/cloud/compute
labels
Jun 19, 2024
hlinnaka
added a commit
that referenced
this issue
Jun 30, 2024
Whenever we see an XLOG_MULTIXACT_CREATE_ID WAL record, we need to update the nextMulti and NextMultiOffset fields in the pageserver's copy of the CheckPoint struct, to cover the new multi-XID. In PostgreSQL, this is done by updating an in-memory struct during WAL replay, but because in Neon you can start a compute node at any LSN, we need to have an up-to-date value pre-calculated in the pageserver at all times. We do the same for nextXid. However, we had a bug in WAL ingestion code that does that: the multi-XIDs will wrap around at 2^32, just like XIDs, so we need to do the comparisons in a wraparound-aware fashion. Fix that, and add tests. Fixes issue #6520 Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
hlinnaka
added a commit
that referenced
this issue
Jun 30, 2024
…6528) Whenever we see an XLOG_MULTIXACT_CREATE_ID WAL record, we need to update the nextMulti and NextMultiOffset fields in the pageserver's copy of the CheckPoint struct, to cover the new multi-XID. In PostgreSQL, this is done by updating an in-memory struct during WAL replay, but because in Neon you can start a compute node at any LSN, we need to have an up-to-date value pre-calculated in the pageserver at all times. We do the same for nextXid. However, we had a bug in WAL ingestion code that does that: the multi-XIDs will wrap around at 2^32, just like XIDs, so we need to do the comparisons in a wraparound-aware fashion. Fix that, and add tests. Fixes issue #6520 Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
@hlinnaka I see a fix merged recently, is it done? Is there any way to verify that this fix was enough? |
Yes. I'll close this.
Well, the fix included tests. But that doesn't guarantee we didn't miss something of course. I think it's as good as it gets. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
c/compute
Component: compute, excluding postgres itself
c/storage/pageserver
Component: storage: pageserver
t/bug
Issue Type: Bug
In
walingest.rs
, we have this:That doesn't do the right thing if multixid has wrapped around. Need to do wraparound aware comparison, as iin Postgres' MultiXactAdvanceNextMXact function, or as we do for XIDs in
update_next_xid
.The text was updated successfully, but these errors were encountered: