Skip to content

Conversation

@yyanyy
Copy link
Contributor

@yyanyy yyanyy commented Apr 27, 2021

In addition to fixing the NPE problem, this PR attempts to follow the current approach (of nullable containsNaN) to update spec with the current state. We can continue the discussion of how to handle it here and I'll make sure to reflect it in the actual spec before merge.

Copy link
Member

@RussellSpitzer RussellSpitzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the right approach for this fix, one of the Spec changes seems unrelated to this issues (field field) but I have no problem with fixing typos in the same pr.

@chenjunjiedada
Copy link
Collaborator

Should we add a unit test to read a v1 manifest file?

@RussellSpitzer
Copy link
Member

Should we add a unit test to read a v1 manifest file?

I would also like to see this, but I think we probably need a dedicated backwards compatibility suite somewhere

@jackye1995
Copy link
Contributor

we probably need a dedicated backwards compatibility suite somewhere

+1. As we are thinking about v3 now, I am trying to implement this in a way that can be flexible for future spec versions, will send a PR about this soon.

@yyanyy
Copy link
Contributor Author

yyanyy commented Apr 27, 2021

Thank you for the quick review and feedback everyone!

Regarding backward compatibility tests for reading manifest files, I think there are a few things worth mentioning:

  1. Iceberg does have dedicated tests on verifying backwards compatibility for both manifest file and manifest list.
  2. even with these backward compatible tests, this issue can still occur (I did update the test in the original PR). The reason is I think these are two separate issues:
    • My understanding of the problem that's fixed by this specific PR is not related to generally reading of the manifest files/manifest list itself, but rather the display of a metadata table manifests table which is a separate logic.
    • The problem before the fix was that the schema we specified for this manifest table is wrong comparing to the correct specification we have for the actual requirement of manifest files here, and this mismatch caused the NPE problem since we require an originally nullable field to be not null.
    • If we reused the schema construct we defined from ManifestFile (which is the source of truth in persisting manifest list file) in ManifestsTable we can avoid this problem completely, but currently ManifestFile doesn't want to expose them out, and has to specifically define the ID to ensure there's no duplication. I think we can still reuse them if we want, but that requires some code changes, and the current approach of duplicating definition of the same concepts in multiple places seems like a more widely adopted approach across Iceberg project.
    • I think the ultimate way of preventing this specific problem is to create tests around metadata tables, and for this specific problem, we create v1 and v2 normal table, write data to them, and then query the manifest table and ensure output meets expected. However I think we don't have any test around metadata tables today.

I think the follow up item after this PR should be to create an issue to add metadata tables tests to make sure they don't break when reading different versions of tables. Any comment/feedback/suggestions?

Copy link
Collaborator

@chenjunjiedada chenjunjiedada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree to follow up with a separated PR.

@RussellSpitzer
Copy link
Member

I was wondering about the other NPE in #2495, i was hoping that would have been caught by some backwards compatibility tests. I also have no problem with doing more testing and such in another pr

@yyanyy
Copy link
Contributor Author

yyanyy commented Apr 28, 2021

I was wondering about the other NPE in #2495, i was hoping that would have been caught by some backwards compatibility tests. I also have no problem with doing more testing and such in another pr

I think the hard part is that this specific change (and some other spec changes) is adding a field regardless of table versions, and to ensure backward compatibility which means to test on a table before this logic is introduced, we need to have a fine grain control of creating these metadata files to mimic the old behavior when creating the table, so that we can drop things introduced in the new logic. And this basically means to create the table from scratch.

  • Tests like TestManifestListVersions covers part of that but it only test the basic interaction, no the end to end table operation exists in such tests that could catch problem in ManifestFileUtil which is used during table commit.
  • Some tests extending TableTestBase in core package try to test backward compatibility by running tests with table v1 and v2 format, but again they won't capture logic introduced in all versions of the spec.

I think a way of achieving it could be to add resource files (including data file and all metadata files all the way to the json file) with all optional/later introduced fields missing, and these resource files could be readily readable as an existing table (instead of creating tables via code as those tests currently do). And then we let tests that extend TableTestBase to interact with this table. I've created an issue to track this: #2542


I'm merging in this change to unblock bug fixes, but please feel free to continue the discussion here/in #2542 or ping me directly on slack. Thank you again @chenjunjiedada @RussellSpitzer @jackye1995 for the review and quick response! And thanks Jack for creating the issue!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

NPE when reading metadata for manifests.

4 participants