-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix reading empty Parquet DataPage #10121
Conversation
✅ Deploy Preview for meta-velox canceled.
|
@yingsu00 can you take a look at this fix? Thanks! |
ea56126
to
fb84bca
Compare
@yingsu00 can you take a look at this PR? |
@yingsu00, @nmahadevuni Can you review this fix as well? |
TEST_F(ParquetReaderTest, testEmptyDataPage) { | ||
const std::string sample(getExampleFilePath("snappy.parquet")); | ||
|
||
facebook::velox::dwio::common::ReaderOptions readerOptions{leafPool_.get()}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we should put the entire test suite under a facebook::velox::dwio::test
namespace so you don't need to specify the full path to everything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed this to dwio::common::ReaderOptions
. There are a couple of using namespace ...
on top. This is now consistent with the velox/dwio/dwrf/test/ReaderTest.cpp
test. Both are not under any namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cleaned up the remaining as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@majetideepak Thanks for fixing the bug! It looks good, just one nit.
@@ -1215,3 +1208,24 @@ TEST_F(ParquetReaderTest, testLzoDataPage) { | |||
.str(), | |||
"31232"); | |||
} | |||
|
|||
TEST_F(ParquetReaderTest, testEmptyDataPage) { | |||
const std::string sample(getExampleFilePath("snappy.parquet")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe rename this file to empty_v2datapage.parquet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @majetideepak !
@bikramSingh91 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@bikramSingh91 merged this pull request in 06c8a0c. |
Conbench analyzed the 1 benchmark run on commit There were no benchmark performance regressions. 🎉 The full Conbench report has more details. |
Resolves the compression issue 7 here #9560