
[Bug] Doris cold storage on HDFS frequently hits block read failures, causing SQL execution failures #44236

Open
3 tasks done
fdf1779 opened this issue Nov 19, 2024 · 1 comment

Comments

@fdf1779

fdf1779 commented Nov 19, 2024

Search before asking

  • I had searched in the issues and found no similar issues.

Version

2.1.2

What's Wrong?

We use Doris for cold backup. The data volume is large and query concurrency on Doris is high. Reads frequently fail with "missing blocks" errors, even though the HDFS blocks are in fact intact; the errors only appear when read concurrency is too high. If each query covers fewer partitions and the work is split into several queries, results return normally, so we are certain no HDFS blocks are actually lost (HDFS itself also reports no missing blocks). Rather, when a read error occurs under high concurrency, Doris fails the query immediately.
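The workaround described above (scanning fewer partitions per query and merging the results) can be sketched as follows. This is only an illustration: `run_query`, the table name, and the partition names are hypothetical placeholders for whatever MySQL-protocol client you use against the Doris FE, not a Doris API.

```python
# Sketch of the manual workaround: split one large scan into several
# smaller per-partition queries, so each scan touches fewer cold HDFS
# blocks concurrently. `run_query(sql)` is an assumed callable that
# executes SQL and returns a list of rows.

def query_in_partition_batches(run_query, table, partitions, batch_size=2):
    """Query `table` a few partitions at a time instead of all at once."""
    results = []
    for i in range(0, len(partitions), batch_size):
        batch = partitions[i:i + batch_size]
        # Doris supports scanning explicit partitions via PARTITION (...).
        sql = f"SELECT * FROM {table} PARTITION ({', '.join(batch)})"
        results.extend(run_query(sql))
    return results
```

Each smaller query keeps the number of concurrent cold-HDFS block reads low enough to avoid the spurious "missing blocks" failures, at the cost of extra round trips.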

What You Expected?

1. Add retries for write failures during cold migration to HDFS, ideally at the HDFS block level.
2. Add an integrity check on tablets once cold migration to HDFS completes, to guarantee the reported state is complete.
3. Automatically clean up tablets whose cold migration to HDFS failed.
4. Add automatic retry of block-level read errors on cold HDFS storage, ideally with a configurable maximum retry count.
5. Allow users to configure returning the remaining data when tablets are missing and partial data loss is acceptable, similar to Spark's spark.sql.files.ignoreCorruptFiles=true.
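The retry behavior requested in item 4 amounts to a standard retry-with-backoff wrapper around a single block read. A minimal sketch, assuming a `read_block(block_id)` callable that raises `IOError` on a transient failure (both names are illustrative, not Doris internals):

```python
import time

def read_with_retry(read_block, block_id, max_retries=3, backoff_s=0.0):
    """Retry a single block read up to `max_retries` extra times before
    failing the whole query, instead of failing on the first error."""
    last_exc = None
    for attempt in range(1 + max_retries):
        try:
            return read_block(block_id)
        except IOError as exc:
            last_exc = exc
            # Exponential backoff between attempts (0 delay by default).
            time.sleep(backoff_s * (2 ** attempt))
    raise last_exc
```

With `max_retries` exposed as a BE config knob, transient read errors under high concurrency would be absorbed instead of surfacing as query failures.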

How to Reproduce?

No response

Anything Else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

  • I agree to follow this project's Code of Conduct.

@fdf1779
Author

fdf1779 commented Nov 19, 2024

Regarding item 2 (integrity check on tablets after cold migration to HDFS completes):
One more thing: during cold migration to HDFS, if the data was stored with multiple replicas beforehand, afterwards both HDFS and Doris each end up holding a copy. It would be good to automatically deduplicate the replicas on local disk and keep only the file in remote HDFS storage, while still keeping the actual BE metadata in multiple replicas (metadata retained on two nodes).
