cannot read parquet file #1515
Comments
and this is my parquet file |
Can confirm the issue with the given file:
The column path that can't be found is ["tags", "name"] but should be ["table_info", "tags", "name"]. |
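To illustrate the path mismatch described above, here is a minimal std-only sketch. It is not the parquet crate's code: the `Node` type and the `leaf_paths`, `issue_schema_paths`, and `column_index` helpers are hypothetical names made up for this example. It only assumes that a leaf column is identified by its full path from the top-level field down (the root message name is not part of the path), so a lookup with the truncated path finds nothing, and calling `unwrap()` on that result would panic.

```rust
/// Hypothetical, simplified schema node (not the parquet crate's type).
enum Node {
    Leaf(&'static str),
    Group(&'static str, Vec<Node>),
}

/// Collect the full path of every leaf column, depth first.
fn leaf_paths(node: &Node, prefix: &mut Vec<String>, out: &mut Vec<Vec<String>>) {
    match node {
        Node::Leaf(name) => {
            let mut path = prefix.clone();
            path.push(name.to_string());
            out.push(path);
        }
        Node::Group(name, children) => {
            prefix.push(name.to_string());
            for child in children {
                leaf_paths(child, prefix, out);
            }
            prefix.pop();
        }
    }
}

/// Leaf paths for the schema from this issue (root message `table` omitted).
fn issue_schema_paths() -> Vec<Vec<String>> {
    let table_info = Node::Group(
        "table_info",
        vec![
            Node::Leaf("name"),
            Node::Group(
                "cols",
                vec![Node::Leaf("name"), Node::Leaf("type"), Node::Leaf("length")],
            ),
            Node::Group(
                "tags",
                vec![Node::Leaf("name"), Node::Leaf("type"), Node::Leaf("length")],
            ),
        ],
    );
    let mut out = Vec::new();
    leaf_paths(&table_info, &mut Vec::new(), &mut out);
    out
}

/// Look up a column by path; returns None when the path is not a full leaf path.
fn column_index(paths: &[Vec<String>], wanted: &[&str]) -> Option<usize> {
    paths
        .iter()
        .position(|p| p.iter().map(String::as_str).eq(wanted.iter().copied()))
}
```

With these helpers, `column_index(&paths, &["table_info", "tags", "name"])` finds the column, while `column_index(&paths, &["tags", "name"])` returns `None`, which is the kind of miss that turns into a panic if the caller unwraps it.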
The current But after the fix, you still get:
It is another issue, I think. |
Thank you so much for the help. For the new problem, should I or someone else create a new GitHub issue, or should it be fixed under this one? |
It seems to be a known limitation, though I can't find a related issue. I think it is a separate issue, so maybe you can create a new one. |
Describe the bug
I want to read a parquet file I generated, but when I use the "get_row_iter" API I get this error:
thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', /home/yzhao/.cargo/registry/src/github.com-1ecc6299db9ec823/parquet-11.0.0/src/record/reader.rs:132:52
To Reproduce
Steps to reproduce the behavior:
This is my schema:
message table {
  REPEATED group table_info {
    REQUIRED BYTE_ARRAY name;
    REPEATED group cols {
      REQUIRED BYTE_ARRAY name;
      REQUIRED INT32 type;
      OPTIONAL INT32 length;
    }
    REPEATED group tags {
      REQUIRED BYTE_ARRAY name;
      REQUIRED INT32 type;
      OPTIONAL INT32 length;
    }
  }
}
I can successfully read the parquet file if I change the schema to:
message table {
  REPEATED group table_info {
    REQUIRED BYTE_ARRAY name;
    REPEATED group cols {
      REQUIRED BYTE_ARRAY name;
      REQUIRED INT32 type;
      OPTIONAL INT32 length;
    }
  }
}
Expected behavior
I can successfully read my generated parquet file with parquet-tools on my Mac:
Additional context