@singhpk234 singhpk234 commented May 8, 2025

About the change

Solves #13005

This PR resumes the work on scan planning (previous PR: #11180). Because the client is not ready yet, scan planning can't be used even when the IRC server (for example, Apache Polaris) supports it. Scan planning unblocks many use cases, such as interop.

This is Part 1 of the changes described in #13004 (comment). I have the remaining 2 parts working locally and will send them over part by part!

Parsers :
[1] ContentFileParser for Delete and DataFile is reused. There is one API change, though: fromJson and toJson now require specsById, because in the REST case we don't know which spec a file belongs to until the file is parsed.
[2] TableScanResponseParser, a stateful parser, is added (introduced via #13191)

[3] Request (and its corresponding models)
[a] PlanTableScanRequestParser : parser for https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L4370
[b] FetchScanTasksRequestParser : parser for https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L4434
[4] Response (and its corresponding models)

[a] FetchPlanningResultResponseParser : parser for https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L4700
[b] FetchScanTasksResponseParser : parser for https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L4707
[c] PlanTableScanResponseParser : parser for https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L4693.

Acknowledgement

I connected with the author, @rahil-c, to see if he plans to resume it. He doesn't have cycles at the moment and was kind enough to allow me to push this to the end.

I just rebased the PR on the current main and will break it into smaller chunks while addressing the pending review feedback. I have something working locally and will publish the chunks soon.

(Side note: I was one of the reviewers of the PR too, so it will be interesting to see the other side 😌)

Co-author: @rahil-c

@singhpk234 singhpk234 force-pushed the 11180 branch 6 times, most recently from 28b5fda to 18de49b Compare May 13, 2025 03:32
@singhpk234 singhpk234 closed this May 13, 2025
@singhpk234 singhpk234 reopened this May 13, 2025

singhpk234 commented May 13, 2025

Breaking this change into 3 logical changes. I have all 3 working locally and will send them as we go!

  • Request / Response Models, Parsers
  • Plumbing (RestTable /RestScan) to core module
  • Spark Integ

I have a couple of things to propose to the spec as well; I will do that in the Plumbing and Spark Integ phases of the PR.

@singhpk234 singhpk234 force-pushed the 11180 branch 4 times, most recently from be65a34 to d445830 Compare May 15, 2025 05:48
@singhpk234 singhpk234 marked this pull request as ready for review May 15, 2025 15:38
@github-actions github-actions bot removed the API label May 18, 2025
@singhpk234 singhpk234 force-pushed the 11180 branch 3 times, most recently from 84b879e to 75db076 Compare May 18, 2025 21:23
@singhpk234 singhpk234 closed this May 18, 2025
@singhpk234 singhpk234 reopened this May 18, 2025
@singhpk234 singhpk234 changed the title Support Scan Planning in Rest Client Part 1: Support Scan Planning in Rest Client May 19, 2025
@singhpk234 singhpk234 force-pushed the 11180 branch 2 times, most recently from 320f764 to 39da580 Compare May 30, 2025 05:13
@singhpk234 singhpk234 force-pushed the 11180 branch 2 times, most recently from fb035f7 to 834fa2b Compare May 30, 2025 20:48

@amogh-jahagirdar amogh-jahagirdar left a comment


Thanks @singhpk234! I think this is close, just one thing that I think should not be public


@amogh-jahagirdar amogh-jahagirdar left a comment


Thanks @singhpk234! Minor style comment but overall looks good to me. I will hold for a bit in case @danielcweeks or any others have any remaining comments.

  PartitionData partitionData = null;
  if (jsonNode.has(PARTITION)) {
-   partitionData = new PartitionData(spec.partitionType());
+   partitionData = new PartitionData(specsById.get(specId).partitionType());

@amogh-jahagirdar amogh-jahagirdar Aug 14, 2025


Minor followup: Just noticed this but in the new structure we should probably have a clearer message rather than a potential NPE in case the file somehow references a spec ID that doesn't exist for some reason. It's unlikely but I think it's generally better to have clearer error messages so that if it does happen a user knows what to check rather than having to go deeper in the implementation. We can also probably extract specsById.get(specId) into a separate variable since I see it's called three times.
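
For illustration, the suggestion above could look roughly like the sketch below: look the spec up once, keep it in a local variable, and fail with a descriptive message when the ID is unknown. This is a sketch under stated assumptions, not the change that was merged. `SpecLookup` and `requireSpec` are hypothetical names that do not exist in Iceberg, and the generic type `T` stands in for the partition spec type so the example runs without any Iceberg dependency.

```java
import java.util.Map;

// Hypothetical helper class, NOT part of the Iceberg API.
public class SpecLookup {

  // Looks up the entry for specId once and fails with a descriptive message
  // instead of letting a later dereference throw a NullPointerException.
  // T is a stand-in for the partition spec type (assumption for this sketch).
  public static <T> T requireSpec(Map<Integer, T> specsById, int specId) {
    T spec = specsById.get(specId);
    if (spec == null) {
      throw new IllegalArgumentException(
          String.format(
              "Cannot find partition spec with ID %d; known spec IDs: %s",
              specId, specsById.keySet()));
    }
    return spec;
  }

  public static void main(String[] args) {
    Map<Integer, String> specsById = Map.of(0, "unpartitioned");

    // Known ID: the caller gets the spec and can reuse the local variable.
    String spec = requireSpec(specsById, 0);
    System.out.println(spec);

    // Unknown ID: the message names the missing ID and the IDs that do exist.
    try {
      requireSpec(specsById, 7);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

With a helper like this, the parser would call the lookup once per file and reuse the result wherever `specsById.get(specId)` currently appears.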

@singhpk234 singhpk234 force-pushed the 11180 branch 2 times, most recently from 68774f6 to 75cb595 Compare August 14, 2025 20:40

@danielcweeks danielcweeks left a comment


Minor comments, but +1. Thanks @singhpk234 !

@amogh-jahagirdar amogh-jahagirdar merged commit d5476ca into apache:main Aug 15, 2025
42 checks passed
@amogh-jahagirdar amogh-jahagirdar changed the title Part 1: Support Scan Planning in Rest Client Core: Request/Response models and parsers for REST Scan Planning Aug 15, 2025
@amogh-jahagirdar

Thanks @singhpk234 @rahil-c for the implementation and patience in following through on this. Thanks @danielcweeks for the review!


rahil-c commented Aug 16, 2025

Thanks @singhpk234 for pushing this thru, happy to see it finally get merged!
Thanks @amogh-jahagirdar for reviewing this for such a long time!
