Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

issue a checkpoint when head revision moved outside an application transaction #2139

Merged
merged 2 commits into from
Nov 21, 2024

Conversation

vroldanbet
Copy link
Contributor

@vroldanbet vroldanbet commented Nov 21, 2024

it's possible for pg_current_snapshot() (used for HeadRevision) in PG to return a new xmin:xmax outside of application transactions. For example, running ANALYZE; could bump it.

In this situation, we need to emit a Checkpoint, in case clients triggered a call to Watch API based on a different HeadRevision from what they had cached locally (e.g. relevant to compute change deltas between revisions).

The proposal is to call HeadRevision if there are no changes. This turns the Watch API heartbeat into 2 queries instead of 1, which can add an extra load to the database.

@vroldanbet vroldanbet changed the title issue a checkpoint when head revision moved outside an application tr… issue a checkpoint when head revision moved outside an application transaction Nov 21, 2024
@github-actions github-actions bot added the area/datastore Affects the storage system label Nov 21, 2024
@vroldanbet vroldanbet force-pushed the pg-checkpoint-offband-revision-bumps branch 2 times, most recently from 9d930ab to 8b9d0ae Compare November 21, 2024 18:06
@github-actions github-actions bot added the area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools) label Nov 21, 2024
@vroldanbet vroldanbet force-pushed the pg-checkpoint-offband-revision-bumps branch 2 times, most recently from c7a9c69 to 5a0030d Compare November 21, 2024 18:35
@vroldanbet vroldanbet marked this pull request as ready for review November 21, 2024 18:36
@vroldanbet vroldanbet force-pushed the pg-checkpoint-offband-revision-bumps branch from 5a0030d to d664a7f Compare November 21, 2024 18:36
@vroldanbet vroldanbet self-assigned this Nov 21, 2024
Copy link
Contributor

@ecordell ecordell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice work finding this! very subtle bug.

left a couple of small comments, looks good overall

internal/datastore/postgres/revisions.go Show resolved Hide resolved
internal/testserver/datastore/postgres.go Show resolved Hide resolved
pkg/datastore/test/watch.go Outdated Show resolved Hide resolved
pkg/datastore/test/watch.go Outdated Show resolved Hide resolved
@@ -197,9 +219,18 @@ func (pgd *pgDatastore) Watch(
return updates, errs
}

func (pgd *pgDatastore) getNewRevisions(ctx context.Context, afterTX postgresRevision) ([]postgresRevision, error) {
func (pgd *pgDatastore) getNewRevisions(ctx context.Context, afterTX postgresRevision, returnHeadRevision bool) ([]postgresRevision, *postgresRevision, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

personally I think I'd find it easier to come back to this code if it were split into getNewRevisions and getNewRevisionsWithHead

Then you wouldn't have to do any nil checking for the optionalHeadRevision (because you'd just return an error from getNewRevisionsWithHead)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this method is only used in 1 place, that's the postgres watch loop. And we pass returnHeadRevision arg to it as true if the client requested datastore.WatchCheckpoints, which is optional (e.g. SpiceDB Watch API does not currently return checkpoints), so it was added as an optimization in case someone does not need checkpoints.

so that refactor would be useful if we used the method in 2 different places, and one of them needed either always head, or never head, but that's not the case - it's entirely driven by the Datastore.Watch call inputs. Does that make sense?

internal/datastore/postgres/watch.go Outdated Show resolved Hide resolved
@vroldanbet vroldanbet enabled auto-merge November 21, 2024 19:15
…ansaction

it's possible for `pg_current_snapshot()` (used for HeadRevision) in PG
to return a new xmin:xmax outside of application transactions. For
example, running `ANALYZE;` could bump it.

In this situation, we need to emit a Checkpoint, in case clients
triggered a call to Watch API based on a different HeadRevision
from what they had cached locally (e.g. relevant to compute
change deltas between revisions).

The proposal is to compute head revision in the same transaction
where we compute new revisions, but only if checkpoints were requested
as an optimization. If the transaction determines there are no new
SpiceDB transactions, we return the compute head revision, if and only
if checkpoints were requested. This Watch API poll loop into 2 queries
instead of 1, which can add an extra load to the database, but at
least it only happens when checkpoints are being requested.
@vroldanbet vroldanbet force-pushed the pg-checkpoint-offband-revision-bumps branch from d664a7f to 74dc7b2 Compare November 21, 2024 19:20
@vroldanbet vroldanbet added this pull request to the merge queue Nov 21, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 21, 2024
@vroldanbet vroldanbet added this pull request to the merge queue Nov 21, 2024
Merged via the queue into main with commit 5dc47f5 Nov 21, 2024
22 checks passed
@vroldanbet vroldanbet deleted the pg-checkpoint-offband-revision-bumps branch November 21, 2024 19:42
@github-actions github-actions bot locked and limited conversation to collaborators Nov 21, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area/datastore Affects the storage system area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants