[Feature]: Self-Optimizing scan files from metadata instead of from file info cache #1093

wangtaohz · 2023-02-09T09:34:16Z

Description

Self-Optimizing should scan files from table metadata using the TableScan API.

Use case/motivation

Currently, Self-Optimizing for both KeyedTable and UnkeyedTable scans files from the file info cache. This makes data consistency depend too heavily on the correctness of the file info cache, which hurts the stability of Self-Optimizing.

Describe the solution

For UnkeyedTable, use the Table.newScan() API to get all data files and delete files.

For KeyedTable, use the KeyedTable.newScan() API to get all insert files, delete files and base files.
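
A minimal sketch of the scan-based idea, for illustration only. In Iceberg, `Table.newScan().planFiles()` yields `FileScanTask`s, each exposing its data file via `file()` and the applicable delete files via `deletes()`. To stay self-contained, the code below does not call Iceberg or Arctic; `ScanTask`, `OptimizeInput` and `collect` are hypothetical stand-ins, not names from the project. It only shows how the optimizer's input file set could be derived from planned scan tasks rather than from a cache.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class MetadataScanSketch {

    // Hypothetical stand-in for one planned scan task: a data file plus the
    // delete files that apply to it (loosely modelled on a FileScanTask).
    static final class ScanTask {
        final String dataFile;
        final List<String> deleteFiles;

        ScanTask(String dataFile, List<String> deleteFiles) {
            this.dataFile = dataFile;
            this.deleteFiles = deleteFiles;
        }
    }

    // Hypothetical container for the files an optimizing run would read.
    static final class OptimizeInput {
        final List<String> dataFiles = new ArrayList<>();
        // A set, because one delete file may apply to several data files
        // and would otherwise be collected more than once.
        final Set<String> deleteFiles = new LinkedHashSet<>();
    }

    // Derive the input file set directly from the planned scan tasks,
    // so no separately maintained file info cache is consulted.
    static OptimizeInput collect(Iterable<ScanTask> tasks) {
        OptimizeInput input = new OptimizeInput();
        for (ScanTask task : tasks) {
            input.dataFiles.add(task.dataFile);
            input.deleteFiles.addAll(task.deleteFiles);
        }
        return input;
    }

    public static void main(String[] args) {
        List<ScanTask> tasks = List.of(
            new ScanTask("data-1.parquet", List.of("eq-delete-1.parquet")),
            new ScanTask("data-2.parquet",
                List.of("eq-delete-1.parquet", "pos-delete-1.parquet")));

        OptimizeInput input = collect(tasks);
        System.out.println("data files:   " + input.dataFiles);
        System.out.println("delete files: " + input.deleteFiles);
    }
}
```

Because the file list is rebuilt from the scan result on every planning round, its correctness depends only on the table metadata, which is the motivation stated above.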

Subtasks

No response

Related issues

No response

Are you willing to submit a PR?

Yes I am willing to submit a PR!

Code of Conduct

I agree to follow this project's Code of Conduct

wangtaohz added the type:feature Feature Requests label Feb 9, 2023

wangtaohz mentioned this issue Feb 13, 2023

[ARCTIC-1093] Self-Optimizing scan files from metadata instead of from file info cache #1100

Merged

3 tasks

zhoujinsong closed this as completed in #1100 Feb 14, 2023

wangtaohz mentioned this issue Feb 16, 2023

[ARCTIC-1093] fix Minor Optimizing scans files with iceberg sequence and add more test cases for hive support table #1125

Merged

3 tasks

zhoujinsong added this to the Release 0.4.1 milestone Feb 21, 2023

zhoujinsong added the priority:blocker security, data-loss, correctness, etc. label Feb 21, 2023
