
Conversation

@nikolamand-db (Contributor) commented Feb 20, 2024

What changes were proposed in this pull request?

The only occurrence of `_LEGACY_ERROR_TEMP_1175` appears in the conversion from Spark data types to Parquet types. All supported documented [Spark data types](https://spark.apache.org/docs/latest/sql-ref-datatypes.html) are covered by the [conversion function](https://github.com/apache/spark/blob/3e0808c33f185c13808ce2d547ce9ba0057d31a6/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala#L517-L745) (`VarcharType` and `CharType` are not present, but they are passed down as strings before reaching the conversion function), so under normal circumstances a user cannot trigger this error.

Convert the error class to `INTERNAL_ERROR`.
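
For illustration, a minimal sketch of the shape of such a conversion, assuming Spark's `SparkException.internalError` helper; the method name `cannotConvertToParquetError` and the message text here are illustrative, and the exact call site in the PR may differ:

```scala
import org.apache.spark.SparkException
import org.apache.spark.sql.types.StructField

// Sketch only: raise an internal error for the unreachable fallback case,
// instead of an AnalysisException carrying the legacy error class.
def cannotConvertToParquetError(field: StructField): Nothing =
  throw SparkException.internalError(
    s"Cannot convert Spark data type ${field.dataType.catalogString} " +
      "to any Parquet type.")
```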

Why are the changes needed?

Remove legacy error classes as part of the activity in [SPARK-37935](https://issues.apache.org/jira/browse/SPARK-37935).

Does this PR introduce any user-facing change?

If Spark works correctly, users should not be able to run into `INTERNAL_ERROR` through the public API.

How was this patch tested?

Added a test to `QueryCompilationErrorsSuite` and ran it with sbt:

```
project sql
testOnly *QueryCompilationErrorsSuite
```

Was this patch authored or co-authored using generative AI tooling?

No.

github-actions bot added the SQL label Feb 20, 2024
```diff
- errorClass = "_LEGACY_ERROR_TEMP_1175",
- messageParameters = Map("dataType" -> field.dataType.catalogString))
+ errorClass = "UNSUPPORTED_DATATYPE",
+ messageParameters = Map("typeName" -> field.dataType.catalogString))
```
Member commented on the diff:

Could you quote the type using `toSQLType()`? That function uses a slightly different converter from Catalyst types to strings, which is more consistent with the other errors.
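
For context, `toSQLType` renders a type in quoted upper-case SQL form rather than Catalyst's lower-case `catalogString`. A rough approximation of the difference, for illustration only; the hypothetical `quoteType` stands in for the real helper, which lives in Spark's internal error utilities and handles more cases:

```scala
import org.apache.spark.sql.types.{ArrayType, DataType, IntegerType}

// Approximation of toSQLType's quoting: wrap the upper-case SQL form in quotes.
def quoteType(t: DataType): String = "\"" + t.sql + "\""

val t = ArrayType(IntegerType)
println(t.catalogString) // array<int>
println(quoteType(t))    // "ARRAY<INT>"
```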

```scala
}
checkError(
  exception = intercept[AnalysisException] {
    converter.convertField(StructField("test", dummyDataType))
```
@MaxGekk (Member) commented Feb 21, 2024:

Is it possible to reproduce the error using the public API? If not, we should think about converting it to an internal error.

@nikolamand-db (Contributor, author) replied:

@MaxGekk Replaced it with `INTERNAL_ERROR`; please check the updated PR description.
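
For illustration, a hedged sketch of what the updated assertion could look like once the error surfaces as `INTERNAL_ERROR`, reusing `converter` and `dummyDataType` from the excerpt above; the exact exception type, message, and parameters in the merged test may differ:

```scala
// Hypothetical sketch: the internal error surfaces as a SparkException with
// error class INTERNAL_ERROR rather than an AnalysisException.
checkError(
  exception = intercept[SparkException] {
    converter.convertField(StructField("test", dummyDataType))
  },
  errorClass = "INTERNAL_ERROR",
  parameters = Map("message" ->
    s"Cannot convert Spark data type ${dummyDataType.sql} to any Parquet type."))
```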

nikolamand-db and others added 2 commits February 21, 2024 21:56
…ompilationErrors.scala

Co-authored-by: Maxim Gekk <max.gekk@gmail.com>
@MaxGekk (Member) commented Feb 22, 2024:

+1, LGTM. Merging to master.
Thank you, @nikolamand-db and @HyukjinKwon, for the review.

@MaxGekk MaxGekk closed this in 78f7c30 Feb 22, 2024
@MaxGekk (Member) commented Feb 22, 2024:

@nikolamand-db Congratulations on your first contribution to Apache Spark!

@nikolamand-db nikolamand-db deleted the nikolamand-db/SPARK-42328 branch February 22, 2024 09:12
ericm-db pushed a commit to ericm-db/spark that referenced this pull request Mar 5, 2024

Closes apache#45183 from nikolamand-db/nikolamand-db/SPARK-42328.

Authored-by: Nikola Mandic <nikola.mandic@databricks.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
