[BUG] V2 Checkpoint implementation and Protocol are out of sync #2214

Closed
2 of 8 tasks
dhruvarya-db opened this issue Oct 20, 2023 · 0 comments
Labels
bug Something isn't working

Comments

@dhruvarya-db
Collaborator

Bug

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Describe the problem

The V2 Checkpoint implementation is inconsistent with the Delta spec in a few places (thanks to @ebyhr for pointing this out). Specifically, it omits two fields from the V2 Checkpoint-related actions and writes a third under a different name:

  1. `flavor` in the `checkpointMetadata` action (Spec Link)
  2. `type` in the `sidecar` action (Spec Link)
  3. Also, the specification requires that the sidecar's relative file path be specified under the field `fileName` in the `sidecar` action, but the implementation writes it under the field name `path`.

Given that V2 Checkpoints have not been out for long, we should update the PROTOCOL to match the implementation.
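
To make the mismatch concrete, here is a minimal Python sketch that reports which spec-named fields are absent from an action. It is only an illustration: the field names (`flavor`, `type`, `fileName`, `path`) come from the description above, while `missing_spec_fields`, `SPEC_FIELDS`, and the sample file name are hypothetical and not part of any Delta library.

```python
# Fields that, per the issue text, the spec named for each action before it
# was updated. This table is an illustration, not a copy of the PROTOCOL.
SPEC_FIELDS = {
    "checkpointMetadata": {"flavor"},
    "sidecar": {"type", "fileName"},
}


def missing_spec_fields(action):
    """Return {action_name: sorted spec-named fields absent from the action}."""
    report = {}
    for name, body in action.items():
        expected = SPEC_FIELDS.get(name)
        if expected is None:
            continue  # not a V2 Checkpoint-related action
        absent = sorted(expected - set(body))
        if absent:
            report[name] = absent
    return report


# Hypothetical sidecar action in the shape the issue attributes to the
# implementation: the relative path is stored under `path`, not `fileName`.
as_written = {"sidecar": {"path": "example-sidecar.parquet"}}

print(missing_spec_fields(as_written))  # {'sidecar': ['fileName', 'type']}
```

A check like this would flag every checkpoint the implementation has written, which is one way to see why changing the PROTOCOL is less disruptive than changing the writer.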

Willingness to contribute

The Delta Lake Community encourages bug fix contributions. Would you or another member of your organization be willing to contribute a fix for this bug to the Delta Lake code base?

  • Yes. I can contribute a fix for this bug independently.
  • Yes. I would be willing to contribute a fix for this bug with guidance from the Delta Lake community.
  • No. I cannot contribute a bug fix at this time.
dhruvarya-db added the bug label Oct 20, 2023
tdas pushed a commit that referenced this issue Nov 14, 2023
…mplementation

Follow-up for #2214.

The V2 Checkpoint implementation does not match the PROTOCOL in some places.
It does not write some fields in the V2 Checkpoint-related actions:
1. `flavor` in `checkpointMetadata`
2. `type` in `sidecar`
Also,
3. The implementation writes a field called `version` (the checkpoint version) in `checkpointMetadata` and relies on it, but the PROTOCOL does not specify any such field.
4. The PROTOCOL requires that the sidecar's relative file path be specified under the field `fileName` in the `sidecar` action, but the implementation writes it under the field name `path`.

This PR updates the specification so that it correctly reflects the implementation.

Closes #2249

GitOrigin-RevId: 39a11840e6eae8fcf24b792b39f83cf9f2cb8dd4
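
As a rough sketch of what a reader written against the as-implemented shape might look like, the Python below pulls the checkpoint version and the sidecar paths out of a list of actions. It assumes only the field names mentioned in the commit message (`version` in `checkpointMetadata`, `path` in `sidecar`). The function name and sample values are hypothetical, and this is not the Delta reader.

```python
def read_v2_checkpoint_actions(actions):
    """Collect the checkpoint version and sidecar paths from a list of actions.

    Assumes the as-implemented field names from the commit message:
    `version` inside `checkpointMetadata` and `path` inside `sidecar`.
    """
    version = None
    sidecar_paths = []
    for action in actions:
        if "checkpointMetadata" in action:
            version = action["checkpointMetadata"]["version"]
        elif "sidecar" in action:
            sidecar_paths.append(action["sidecar"]["path"])
    return version, sidecar_paths


# Hypothetical actions; the version number and file names are made up.
actions = [
    {"checkpointMetadata": {"version": 10}},
    {"sidecar": {"path": "example-sidecar-1.parquet"}},
    {"sidecar": {"path": "example-sidecar-2.parquet"}},
]

version, paths = read_v2_checkpoint_actions(actions)
print(version, paths)
```

A reader written against the earlier spec text would instead look up `fileName` and would have no `version` field to rely on, which is the gap the specification update closes.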
tdas closed this as completed Feb 1, 2024
2 participants