Skip to content

Conversation

@sririshindra
Copy link
Contributor

While testing rollback_to_snaphot feature, we found that this is feature is failing in some cases.

In the setSetCurrentSnapshot method 'null' is being passed for timestamp which eventually causes the failure of the following precondition check. The timeStamp should also be passed along with the Snapshot just like here

This PR fixes this bug in 0.13.x. I tested this with our internal tests and also by running the unit tests in Iceberg. This issue no longer exists in master because of 4089. The current fix is ported from here

Steps to reproduce the rollback_to_snapshot bug in spark.

Each insert operation below is done separately in its own spark app.

sql("INSERT INTO %s VALUES (3, 'zzz', 1)" % table1)    
sql("INSERT INTO %s VALUES (6, 'www', 1)" % table1)
sql("INSERT INTO %s VALUES (9, 'zyz', 1)" % table1)
sql("INSERT INTO %s VALUES (2, 'yy', 2)" % table1)
sql("INSERT INTO %s VALUES (4, 'xx', 2)" % table1)
sql("INSERT INTO %s VALUES (1, 'x', 3)" % table1)`

launch a new spark shell and follow the following steps

results, _ = sql("select id, data, id2 from %s order by id2, id" % table1) # some internal code to get the results.
results, _ = sql("select snapshot_id from %s.%s.snapshots" % (DATABASE, table1))
snapshots = self.get_rows(results) # some internal code to get snapshots in a python list.
sql("CALL spark_catalog.system.rollback_to_snapshot('%s', %s)" % (table1, str(snapshots[2][0])))`

@sririshindra
Copy link
Contributor Author

@rdblue As you suggested in #4088 I opened this PR on top of 0.13.x branch to address the rollback issue. Please review it at your convenience.

if (base.snapshot(newSnapshot.snapshotId()) != null) {
// this is a rollback operation
updated = base.replaceCurrentSnapshot(newSnapshot.snapshotId());
} else if (stageOnly) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you port the changes from the other fix? That removed methods from TableMetadata rather than adding new ones. I think it would be best to keep 0.13.x and master in sync rather than having slightly different fixes.

Thanks, @sririshindra!

Copy link
Contributor

@wypoon wypoon Feb 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rdblue, would the following be acceptable? (from line 285)

            TableMetadata.Builder update = TableMetadata.buildFrom(base);
            if (base.snapshot(newSnapshot.snapshotId()) != null) {
              // this is a rollback or cherrypick operation
              update.setCurrentSnapshot(newSnapshot.snapshotId());
              } else if (stageOnly) {
              update.addSnapshot(newSnapshot);
            } else {
              update.setCurrentSnapshot(newSnapshot);
            }

            TableMetadata updated = update.build();
            if (updated.changes().isEmpty()) {
              // do not commit if the metadata has not changed. for example, this may happen when setting the current
              // snapshot to an ID that is already current. note that this check uses identity.
              return;
            }

I know the point of #4089 is to replace setCurrentSnapshot with setBranchSnapshot, but in 0.13.x, we don't support branching yet, and the commits #4089 builds on are not present. The above mirrors your change to SnapshotProducer in #4089.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, calling the right setCurrentSnapshot is a good way to backport this. Thanks!

@rdblue
Copy link
Contributor

rdblue commented Feb 17, 2022

Looks good! I'll merge when tests are passing.

@sririshindra
Copy link
Contributor Author

Looks good! I'll merge when tests are passing.

Thanks @rdblue

@rdblue rdblue merged commit e2af91c into apache:0.13.x Feb 17, 2022
@rdblue rdblue added this to the Iceberg 0.13.2 Release milestone Feb 17, 2022
Comment on lines +504 to +507
public TableMetadata replaceCurrentSnapshot(long snapshotId) {
return new Builder(this).setCurrentSnapshot(snapshotId).build();
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was an oversight, right? We don't need this new method; it is not used.
In fact, with the above change to SnapshotProducer, even replaceCurrentSnapshot(Snapshot) is no longer used and can be removed, I think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's right. I missed this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I filed #4155 to remove the unused methods.

wypoon added a commit to wypoon/iceberg that referenced this pull request Feb 17, 2022
vanliu-tx pushed a commit to BKBASE-Plugin/iceberg that referenced this pull request Mar 30, 2022
samarthjain pushed a commit to samarthjain/incubator-iceberg that referenced this pull request Apr 6, 2022
puchengy pushed a commit to puchengy/iceberg that referenced this pull request Apr 17, 2022
puchengy added a commit to pinterest/iceberg that referenced this pull request Apr 18, 2022
(cherry picked from commit e2af91c)

Co-authored-by: Rishi <sririshindra@gmail.com>
vanliu-tx pushed a commit to BKBASE-Plugin/iceberg that referenced this pull request May 11, 2022
sunchao pushed a commit to sunchao/iceberg that referenced this pull request May 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants