@uros-db uros-db commented Oct 15, 2025

What changes were proposed in this pull request?

Introduce two new physical types to Spark:

  • PhysicalGeographyType
  • PhysicalGeometryType

This PR also adds appropriate mapping from the logical geospatial types (introduced in: #52491) to the new physical types.
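As a rough illustration of what a logical-to-physical mapping can look like, here is a minimal, self-contained sketch. It does not use Spark's actual classes: `LogicalType`, `PhysicalType`, `toPhysical`, and the `srid` field are all hypothetical stand-ins chosen for the example, and the real mapping in this PR may be structured differently.

```java
final class PhysicalTypeMappingSketch {
    /** Hypothetical stand-ins for logical types; not Spark's real classes. */
    interface LogicalType {}

    static final class GeographyType implements LogicalType {
        final int srid; // assumed parameter, for illustration only
        GeographyType(int srid) { this.srid = srid; }
    }

    static final class GeometryType implements LogicalType {
        final int srid; // assumed parameter, for illustration only
        GeometryType(int srid) { this.srid = srid; }
    }

    static final class IntegerType implements LogicalType {}

    /** Hypothetical physical representations. */
    enum PhysicalType { GEOGRAPHY, GEOMETRY, INTEGER }

    /** Maps each logical type to the physical type that backs it. */
    static PhysicalType toPhysical(LogicalType t) {
        if (t instanceof GeographyType) return PhysicalType.GEOGRAPHY;
        if (t instanceof GeometryType) return PhysicalType.GEOMETRY;
        if (t instanceof IntegerType) return PhysicalType.INTEGER;
        throw new IllegalArgumentException(
            "No physical type for " + t.getClass().getSimpleName());
    }

    public static void main(String[] args) {
        System.out.println(toPhysical(new GeographyType(4326)));
        System.out.println(toPhysical(new GeometryType(0)));
    }
}
```

In this sketch, logical types that differ only in their parameters share one physical type, which is the usual reason to separate the two layers.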

Why are the changes needed?

Extending the implementation of GEOMETRY and GEOGRAPHY types in Spark, laying the groundwork for full geospatial data type support.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added new tests to:

  • GeographyValSuite
  • GeometryValSuite

Also, added appropriate test cases to:

  • GeographyTypeSuite
  • GeometryTypeSuite

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Oct 15, 2025

@uros-db uros-db left a comment

@mkaravel @szehon-ho Please review.

@szehon-ho szehon-ho left a comment

looks mostly good, some comments/questions

@uros-db uros-db requested a review from szehon-ho October 22, 2025 06:59
@cloud-fan

thanks, merging to master!

@cloud-fan cloud-fan closed this in 96093bd Oct 23, 2025
cloud-fan pushed a commit that referenced this pull request Oct 28, 2025
…ross Catalyst

### What changes were proposed in this pull request?
Added Geography and Geometry accessors to core row/column interfaces, extended codegen and physical type handling to properly recognize geospatial types, enabled writing/reading of Geography and Geometry values in unsafe writer, and added other necessary plumbing for Geography and Geometry in projection/row utilities in order to thread through the new accessors.

Note that the GEOMETRY and GEOGRAPHY physical types were recently added to Spark SQL as part of #52629.
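
To make the writer/accessor round trip described above more concrete, here is a minimal sketch of a row that stores a variable-length binary value behind a fixed 8-byte slot holding a packed offset and size, which is the general scheme `UnsafeRow` uses for variable-length fields. Everything here is a simplified assumption: `TinyRow`, `GeometryBytes`, `writeGeometry`, and `getGeometry` are hypothetical names, and the sketch omits the null bitset and 8-byte padding that the real unsafe format has. It also assumes geospatial values are written like other binary payloads, which this description does not spell out.

```java
import java.nio.ByteBuffer;

final class GeoRowSketch {
    /** Hypothetical holder for serialized geometry bytes. */
    static final class GeometryBytes {
        private final byte[] bytes;
        GeometryBytes(byte[] bytes) { this.bytes = bytes.clone(); }
        byte[] getBytes() { return bytes.clone(); }
    }

    /** One fixed 8-byte slot per field, followed by a variable-length region. */
    static final class TinyRow {
        private final ByteBuffer buf;
        private int cursor; // next free byte in the variable-length region

        TinyRow(int numFields, int capacity) {
            this.buf = ByteBuffer.allocate(capacity);
            this.cursor = numFields * 8;
        }

        /** Copies the payload and records (offset << 32 | size) in the slot. */
        void writeGeometry(int ordinal, GeometryBytes value) {
            byte[] payload = value.getBytes();
            for (int i = 0; i < payload.length; i++) {
                buf.put(cursor + i, payload[i]);
            }
            long offsetAndSize = ((long) cursor << 32) | (long) payload.length;
            buf.putLong(ordinal * 8, offsetAndSize);
            cursor += payload.length;
        }

        /** Typed accessor: unpacks the slot and reads the payload back. */
        GeometryBytes getGeometry(int ordinal) {
            long offsetAndSize = buf.getLong(ordinal * 8);
            int offset = (int) (offsetAndSize >> 32);
            int size = (int) offsetAndSize;
            byte[] out = new byte[size];
            for (int i = 0; i < size; i++) {
                out[i] = buf.get(offset + i);
            }
            return new GeometryBytes(out);
        }
    }
}
```

A caller would write with `row.writeGeometry(0, value)` and read with `row.getGeometry(0)`; the typed accessor is what lets generated projection code avoid going through a generic binary getter.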

### Why are the changes needed?
To provide first-class support for GEOGRAPHY and GEOMETRY within Catalyst.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Added new tests to:
  - `GenerateUnsafeProjectionSuite.scala`
  - `UnsafeRowWriterSuite.scala`

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #52723 from uros-db/geo-interfaces.

Authored-by: Uros Bojanic <uros.bojanic@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
cloud-fan pushed a commit that referenced this pull request Oct 28, 2025
…cution classes

### What changes were proposed in this pull request?
Introduce internal server-side `Geography` and `Geometry` execution classes in Catalyst.

Note that the corresponding low-level physical holders (`GeographyVal` and `GeometryVal`) for geospatial types have been previously added as part of #52629.
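
The layering described above, with a low-level holder underneath and an execution class on top, could be sketched roughly as follows. This is not the actual `Geometry` or `GeometryVal` API: `GeometryHolder`, `GeometryExec`, `fromBytes`, `toHolder`, and `sizeInBytes` are hypothetical names used only to show the separation of concerns.

```java
import java.util.Arrays;

final class GeoLayeringSketch {
    /** Hypothetical low-level holder: just the serialized bytes, no behavior. */
    static final class GeometryHolder {
        private final byte[] bytes;
        GeometryHolder(byte[] bytes) { this.bytes = bytes; }
        byte[] getBytes() { return bytes; }
    }

    /** Hypothetical execution-layer class: validates input, exposes operations. */
    static final class GeometryExec {
        private final GeometryHolder holder;

        private GeometryExec(GeometryHolder holder) { this.holder = holder; }

        /** Builds an execution object from raw bytes, rejecting empty payloads. */
        static GeometryExec fromBytes(byte[] bytes) {
            if (bytes == null || bytes.length == 0) {
                throw new IllegalArgumentException("empty geometry payload");
            }
            return new GeometryExec(new GeometryHolder(Arrays.copyOf(bytes, bytes.length)));
        }

        /** Hands the physical value back to the row/storage layer. */
        GeometryHolder toHolder() { return holder; }

        int sizeInBytes() { return holder.getBytes().length; }
    }
}
```

Keeping the holder free of behavior means the row and storage code only has to move bytes, while validation and future ST-style operations live in one place.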

### Why are the changes needed?
Establishing a clear internal execution layer for geospatial operations in catalyst and unblocking downstream work for implementing built-in ST functions and geospatial storage support.

### Does this PR introduce _any_ user-facing change?
No. This PR does not introduce any new public APIs, Catalyst expressions, or user-facing SQL functions. Those will be added in the future.

### How was this patch tested?
Added new Java test suites for the execution classes:
- `GeographyExecutionSuite`
- `GeometryExecutionSuite`

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #52737 from uros-db/geo-server-classes.

Authored-by: Uros Bojanic <uros.bojanic@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Yicong-Huang pushed a commit to Yicong-Huang/spark that referenced this pull request Oct 30, 2025
Closes apache#52629 from uros-db/geo-physical-types.

Authored-by: Uros Bojanic <uros.bojanic@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Yicong-Huang pushed a commit to Yicong-Huang/spark that referenced this pull request Oct 30, 2025
…ross Catalyst

Yicong-Huang pushed a commit to Yicong-Huang/spark that referenced this pull request Oct 30, 2025
…cution classes
