@gaborkaszab
Collaborator

#14640 introduced a new API to scan partition stats. The next step is to replace usages of the old reader function with the new API. The work is divided into 2 steps:

  • Current PR: Replace the usage of PartitionStatsHandler.readPartitionStatsFile() in the prod code and check that the tests still pass
  • Follow-up PR: Replace the usage of the above in the test code too (including replacing core/PartitionStats usage with api/PartitionStatistics)

&& Objects.equals(stats1.lastUpdatedSnapshotId(), stats2.lastUpdatedSnapshotId());
}

@SuppressWarnings("checkstyle:CyclomaticComplexity")
Collaborator Author

This is temporarily needed because the tests write stats using PartitionStatistics but still use PartitionStats on the read path, so we need this function when comparing expected stats with actual stats. It will be dropped in the follow-up PR.


private static final int STATS_COUNT = 13;

public BasePartitionStatistics(StructLike partition, int specId) {
Contributor

Could this be package private?

Collaborator Author

Done

}

/** Used by internal readers to instantiate this class with a projection schema. */
public BasePartitionStatistics(Types.StructType projection) {
Contributor

Maybe even this?

Collaborator Author

Done

value,
(existingEntry, newEntry) -> {
existingEntry.appendStats(newEntry);
((BasePartitionStatistics) existingEntry).appendStats(newEntry);
Contributor

I don't really like this. Why are we sure that this is an instance of BasePartitionStatistics?

Collaborator Author

Refactored the code to move the liveEntry(), appendStats() etc. functions into PartitionStatsHandler and populate the stats there using the setter inherited from StructLike. This eliminated the need for the casts.
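
For readers following along, here is a rough sketch of the idea, not the merged code. It assumes a local stand-in interface that mirrors the positional get/set accessors of StructLike; the ArrayStruct class, the two field positions and the helper names are hypothetical and only serve the illustration. The point is that a static helper can accumulate stats through the positional setter, so the caller never needs to cast to BasePartitionStatistics.

```java
import java.util.Arrays;

public class AppendStatsSketch {

  /** Local stand-in mirroring the positional accessors of StructLike (assumption for this sketch). */
  interface StructLike {
    int size();

    <T> T get(int pos, Class<T> javaClass);

    <T> void set(int pos, T value);
  }

  /** Hypothetical array-backed struct, used only to make the sketch runnable. */
  static final class ArrayStruct implements StructLike {
    private final Object[] values;

    ArrayStruct(Object... values) {
      this.values = Arrays.copyOf(values, values.length);
    }

    @Override
    public int size() {
      return values.length;
    }

    @Override
    public <T> T get(int pos, Class<T> javaClass) {
      return javaClass.cast(values[pos]);
    }

    @Override
    public <T> void set(int pos, T value) {
      values[pos] = value;
    }
  }

  // Hypothetical field positions; the real stats schema has more fields.
  static final int DATA_RECORD_COUNT = 0;
  static final int DATA_FILE_COUNT = 1;

  /** Adds the counters of source into target using only the interface, so no cast is needed. */
  static void appendStats(StructLike target, StructLike source) {
    addLong(target, source, DATA_RECORD_COUNT);
    addInt(target, source, DATA_FILE_COUNT);
  }

  private static void addLong(StructLike target, StructLike source, int pos) {
    Long current = target.get(pos, Long.class);
    Long delta = source.get(pos, Long.class);
    if (delta != null) {
      long sum = (current == null ? 0L : current) + delta;
      target.set(pos, sum);
    }
  }

  private static void addInt(StructLike target, StructLike source, int pos) {
    Integer current = target.get(pos, Integer.class);
    Integer delta = source.get(pos, Integer.class);
    if (delta != null) {
      int sum = (current == null ? 0 : current) + delta;
      target.set(pos, sum);
    }
  }

  public static void main(String[] args) {
    StructLike existing = new ArrayStruct(10L, 2);
    StructLike incoming = new ArrayStruct(5L, 1);
    appendStats(existing, incoming);
    System.out.println(existing.get(DATA_RECORD_COUNT, Long.class));
    System.out.println(existing.get(DATA_FILE_COUNT, Integer.class));
  }
}
```

With a helper shaped like this, the merge lambda can call appendStats(existingEntry, newEntry) on whatever implementation it receives, which is the property the review comment above was asking for.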

@gaborkaszab gaborkaszab force-pushed the main_use_partition_stat_scan_api branch from c2b4a14 to f9beb4a Compare January 12, 2026 11:40
@github-actions github-actions bot added the API label Jan 12, 2026
@gaborkaszab gaborkaszab force-pushed the main_use_partition_stat_scan_api branch from f9beb4a to 584f943 Compare January 12, 2026 11:41
@gaborkaszab gaborkaszab force-pushed the main_use_partition_stat_scan_api branch from 584f943 to 85124af Compare January 12, 2026 11:42
@pvary pvary merged commit 7f81e1e into apache:main Jan 12, 2026
54 of 56 checks passed
@pvary
Contributor

pvary commented Jan 12, 2026

Merged to main.
Thanks @gaborkaszab for the PR!
