Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(rust): post commit hook (v2), create checkpoint hook #2391

Merged
merged 9 commits into from
Apr 7, 2024

Conversation

ion-elgreco
Copy link
Collaborator

@ion-elgreco ion-elgreco commented Apr 6, 2024

Description

Introduces a post commit, which can do additional actions before returning the FinalizedCommit.

Current commit hook will creates a checkpoint if it meets the condition of the interval.

Also bumping the default interval to 100 commits. 10 commits can be a bit aggressive

Related Issue(s)

@github-actions github-actions bot added the binding/rust Issues for the Rust crate label Apr 6, 2024
@ion-elgreco ion-elgreco changed the title feat(rust): post commit hook (v2) + create checkpoint feat(rust): post commit hook (v2), create checkpoint hook Apr 6, 2024
@delta-io delta-io deleted a comment from github-actions bot Apr 6, 2024
@github-actions github-actions bot added the binding/python Issues for the Python package label Apr 6, 2024
@ion-elgreco ion-elgreco marked this pull request as ready for review April 6, 2024 17:25
@ion-elgreco ion-elgreco requested a review from Blajda April 6, 2024 17:25
Comment on lines +305 to +310
#[derive(Clone, Debug, Copy)]
/// Properties for post commit hook.
pub struct PostCommitHookProperties {
create_checkpoint: bool,
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove this struct and just keep the config in the commit builder and commit properties for now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was planning to add support for log retention cleanup, auto compact and auto vacuüm afterwards. So perhaps still good to keep this then?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sound good in that case.

@Blajda
Copy link
Collaborator

Blajda commented Apr 7, 2024

Overall it looks good to me. In the future if we are going to expand on post commit hooks then we should maybe refactor so the commit code is responsible for returning the snapshot version of that delta table that was successfully committed. This would remove the burden of updating the table from each operation too.

@ion-elgreco
Copy link
Collaborator Author

ion-elgreco commented Apr 7, 2024

@Blajda that's a good idea!

That would essentially make it easier to solve the issue for concurrent writers where sometimes it tries to advance the state after a commit, but the delta of the commit version and the table version is greater than 1

@ion-elgreco ion-elgreco enabled auto-merge (squash) April 7, 2024 16:15
@ion-elgreco ion-elgreco merged commit 5eade5e into delta-io:main Apr 7, 2024
23 checks passed
Blajda pushed a commit that referenced this pull request Apr 27, 2024
# Description
We advance the state in the post commit now, so it's done in a single
location as per suggestion from @Blajda here:
#2391 (comment)

This PR also supersedes this one:
#2280

# Related Issue(s)
- fixes #2279
- fixes #2262
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
binding/python Issues for the Python package binding/rust Issues for the Rust crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create Checkpoints based on table config when writing data
2 participants