Failed to read the parquet file #10395

Closed
weixiuli opened this issue Jul 4, 2024 · 3 comments
Labels
build triage Newly created issue that needs attention.

weixiuli commented Jul 4, 2024

Problem description

io.glutenproject.exception.GlutenException: java.lang.RuntimeException: Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: ( vs. )
Retriable: False
Expression: header == strings + numBytes
Context: Split [Hive: oss://•d_di/dtm=20240623/part-00994-c947690b-d0a8-4bc9-8a30-f4c45b41963b.c000.zstd.parquet 0 - 17091072] Task Gluten_Stage_1_TID_7
Top-Level Context: Same as context.
Function: prepareDictionary
File: /root/zhongqing/git/gluten/ep/build-velox/build/velox_ep/velox/dwio/parquet/reader/PageReader.cpp
Line: 425
Stack trace:
# 0  _ZN8facebook5velox7process10StackTraceC1Ei
# 1  _ZN8facebook5velox14VeloxExceptionC1EPKcmS3_St17basic_string_viewIcSt11char_traitsIcEES7_S7_S7_bNS1_4TypeES7_
# 2  _ZN8facebook5velox6detail14veloxCheckFailINS0_17VeloxRuntimeErrorERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEvRKNS1_18VeloxCheckFailArgsET0_
# 3  _ZN8facebook5velox7parquet10PageReader17prepareDictionaryERKNS1_6thrift10PageHeaderE
# 4  _ZN8facebook5velox7parquet10PageReader10seekToPageEl
# 5  _ZN8facebook5velox7parquet10PageReader11rowsForPageERNS0_4dwio6common21SelectiveColumnReaderEbbRN5folly5RangeIPKiEERPKm
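The failed expression `header == strings + numBytes` in `prepareDictionary` reads like an end-of-buffer check: after walking every length-prefixed string in the dictionary page, the cursor should land exactly on the end of the page data. The Python sketch below illustrates that kind of invariant only; it is not the Velox code. It assumes a PLAIN-encoded BYTE_ARRAY dictionary page (each value stored as a 4-byte little-endian length followed by that many bytes, as in the Parquet format), and the function names are hypothetical.

```python
import struct
from typing import List


def encode_plain_byte_arrays(values: List[bytes]) -> bytes:
    """Hypothetical helper: build a PLAIN-encoded BYTE_ARRAY page body."""
    return b"".join(struct.pack("<i", len(v)) + v for v in values)


def parse_plain_byte_array_dictionary(page: bytes, num_values: int) -> List[bytes]:
    """Decode num_values length-prefixed strings and check the page is fully consumed."""
    cursor = 0  # plays the role of `header` in the failed expression
    values = []
    for _ in range(num_values):
        if cursor + 4 > len(page):
            raise ValueError("length prefix runs past the end of the page")
        (length,) = struct.unpack_from("<i", page, cursor)
        cursor += 4
        if length < 0 or cursor + length > len(page):
            raise ValueError("string value runs past the end of the page")
        values.append(page[cursor:cursor + length])
        cursor += length
    # Analogue of `header == strings + numBytes`: the declared value count
    # and the declared page size must describe the same bytes.
    if cursor != len(page):
        raise ValueError(
            f"dictionary page size mismatch: consumed {cursor} of {len(page)} bytes"
        )
    return values
```

With a well-formed page, `parse_plain_byte_array_dictionary(encode_plain_byte_arrays([b"a", b"bc"]), 2)` returns the two values. Appending a stray byte to the page, or passing a value count that does not match the page contents, raises `ValueError`, which is the same class of mismatch the check in the trace reports.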

System information

Velox System Info v0.0.2
Commit: adc5219
CMake Version: 3.20.2
System: Linux-5.4.119-19-0009.11
Arch: x86_64
C++ Compiler: /opt/rh/gcc-toolset-9/root/usr/bin/c++
C++ Compiler Version: 9.2.1
C Compiler: /opt/rh/gcc-toolset-9/root/usr/bin/cc
C Compiler Version: 9.2.1
CMake Prefix Path: /usr/local;/usr;/;/usr;/usr/local;/usr/X11R6;/usr/pkg;/opt

CMake log

No response

weixiuli added the build and triage labels Jul 4, 2024
weixiuli changed the title from "Failed to read parquet file" to "Failed to read the parquet file" Jul 4, 2024
Collaborator

yingsu00 commented Jul 5, 2024

@weixiuli Will you be able to attach a Parquet file that can reproduce this error? Thanks!

Author

weixiuli commented Jul 8, 2024

> @weixiuli Will you be able to attach a Parquet file that can reproduce this error? Thanks!

PR #9223 may fix this issue; I have cherry-picked it.

@majetideepak
Collaborator

Please re-open if #9223 does not fix this.
