-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Iceberg fixes for reading table metadata #2810
Conversation
3ca6c31
to
56a21c0
Compare
What more do we need here? (I think there's some more test coverage that we could use here?) |
ca9eba9
to
b4f7e5e
Compare
Added test coverage. There are issues though. If we overwrite the existing data (say running the |
testdata/generate_pyiceberg.py
Outdated
How to run this script: | ||
====================== | ||
|
||
$ python3 -m venv venv | ||
$ source venv/bin/activate | ||
$ pip install "pyiceberg[pyarrow,sql-sqlite]" | ||
$ pip install botocore | ||
$ python ./testdata/generate_pyiceberg.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the data set ends up being pretty big, and while I don't think it really matters, I think it'd be good to avoid just committing lots of test data to the repo just cause, when we have other options (e.g. uploading the data to the test bucket, or generating as part of a fixture in one of the pytests).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, thinking the same thing.
Signed-off-by: Vaibhav <vrongmeal@gmail.com>
Signed-off-by: Vaibhav <vrongmeal@gmail.com>
Signed-off-by: Vaibhav <vrongmeal@gmail.com>
Signed-off-by: Vaibhav <vrongmeal@gmail.com>
5e411ec
to
c7bdfaa
Compare
Signed-off-by: Vaibhav <vrongmeal@gmail.com>
@tychoish made the required changes for pytest |
Signed-off-by: Vaibhav <vrongmeal@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks great, one concern about the size of test data.
the current test failure is orthogonal to this PR and I can push through it.
is there anything that needs to be done here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
45mb feels big for a test file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'll upload it on GCS. We have a script to handle big data files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was a simple HTTP url, added it into the fixture to download the data.
Signed-off-by: Vaibhav <vrongmeal@gmail.com>
Fixes #2603