Add a Snapshot Repository page to the Operate section #518

Merged: 3 commits, Feb 5, 2025

@pcholakov pcholakov commented Feb 4, 2025

Builds on #515

Closes #509

cloudflare-workers-and-pages bot commented Feb 4, 2025

Deploying documentation with Cloudflare Pages

Latest commit: 11bbbd8
Status: ✅  Deploy successful!
Preview URL: https://4e3a2183.documentation-beg.pages.dev
Branch Preview URL: https://feat-snapshot-repository.documentation-beg.pages.dev

@pcholakov pcholakov force-pushed the feat/snapshot-repository branch 2 times, most recently from 595462f to 24f4225, February 4, 2025 15:27
@tillrohrmann tillrohrmann left a comment

Well written snapshot documentation @pcholakov 👏. It reads really nicely! LGTM. +1 for merging.

Before merging, could you fix the link to the snapshotting documentation on the cluster deployment page?

When a node starts up with pre-existing partition state and finds that the partition's log has been trimmed to a point beyond the most recent locally-applied LSN, the node will attempt to download the latest snapshot from the configured repository. If a suitable snapshot is available, the processor will re-bootstrap its local state and resume applying the log.

<Admonition type="note" title="Handling log trim gap startup errors">
If you observe repeated `Shutting partition processor down because it encountered a trim gap in the log.` errors in the Restate server log, that is an indication that a processor is failing to start up due to missing log records. In order to record, you must ensure that a snapshot repository is correctly configured and accessible from the node experiencing errors. If no snapshots were taken previously, you will need to configure another node which has the necessary state, to first publish a snapshot for the affected partition(s).

@tillrohrmann commented:

"In order to resolve this failure, you must ..."?

@pcholakov replied:

I think I meant to write "recover" there 😅 Yours reads better, thank you!
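
The startup behavior described in the documentation excerpt quoted in this thread can be sketched as a small decision function. This is only an illustration in Python, not Restate's implementation: the names `StartupAction` and `decide_startup_action` are hypothetical, the trim point is assumed to be the highest LSN already removed from the log, and a snapshot is assumed to be "suitable" when it covers at least up to that trim point.

```python
from enum import Enum
from typing import Optional


class StartupAction(Enum):
    """Hypothetical outcomes for a partition processor at startup."""
    RESUME_FROM_LOCAL_STATE = "resume_from_local_state"
    RESTORE_FROM_SNAPSHOT = "restore_from_snapshot"
    FAIL_TRIM_GAP = "fail_trim_gap"


def decide_startup_action(
    applied_lsn: int,
    trim_point: int,
    snapshot_lsn: Optional[int],
) -> StartupAction:
    """Decide how a node with pre-existing partition state should start.

    Assumptions (not taken from the Restate code base):
    - trim_point is the highest LSN that has been trimmed from the log.
    - snapshot_lsn is the LSN covered by the latest snapshot in the
      repository, or None if no snapshot (or no repository) is available.
    """
    # No gap: the next record the node needs is still present in the log.
    if applied_lsn >= trim_point:
        return StartupAction.RESUME_FROM_LOCAL_STATE

    # Trim gap: local state is too old to catch up from the log alone,
    # so a snapshot that reaches at least the trim point is required.
    if snapshot_lsn is not None and snapshot_lsn >= trim_point:
        return StartupAction.RESTORE_FROM_SNAPSHOT

    # Trim gap and no usable snapshot: the processor cannot start, which
    # corresponds to the repeated "trim gap" errors in the excerpt.
    return StartupAction.FAIL_TRIM_GAP
```

Under these assumptions, the last branch is the case the admonition addresses: the fix is to make a repository reachable from the affected node and have a node that still holds the state publish a snapshot.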

@pcholakov pcholakov force-pushed the feat/snapshot-repository branch 2 times, most recently from 6cc3901 to a4c3b84, February 5, 2025 10:45
@pcholakov pcholakov force-pushed the feat/snapshot-repository branch from a4c3b84 to 7a761be, February 5, 2025 10:48
Base automatically changed from feat/cluster-into to main February 5, 2025 13:32
@pcholakov pcholakov force-pushed the feat/snapshot-repository branch from 7a761be to 11bbbd8, February 5, 2025 13:34
@pcholakov commented:

> Well written snapshot documentation @pcholakov 👏. It reads really nicely! LGTM. +1 for merging.
>
> Before merging, could you fix the link to the snapshotting documentation on the cluster deployment page?

Thanks @tillrohrmann - I split the initial page in two: a short one on configuring the repository, which I put under /deploy/snapshotting as you might originally have envisioned, and /operate/logs-and-snapshots, which primarily covers how log trimming works together with snapshotting. Going to merge this without waiting for another review, but happy to address anything as a follow-up!

@pcholakov pcholakov merged commit 14d2392 into main Feb 5, 2025
2 checks passed
@pcholakov pcholakov deleted the feat/snapshot-repository branch February 5, 2025 14:12
Development

Successfully merging this pull request may close these issues.

How to configure snapshots and trimming