Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-46810][DOCS] Align error class terminology with SQL standard #44902

Closed

Conversation

nchammas
Copy link
Contributor

@nchammas nchammas commented Jan 26, 2024

What changes were proposed in this pull request?

  • Clarify the error class terminology in our internal errors README per the proposal in SPARK-46810.
  • Rename error-classes.json to error-conditions.json and update the codebase accordingly.
  • Rename error-categories.json to error-classes.json and update the codebase accordingly.
  • Improve the formatting of the code snippets in the errors README.

Why are the changes needed?

We should use error class terminology consistently and unambiguously, and we should stick as closely to the SQL standard as possible.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing unit tests.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the DOCS label Jan 26, 2024
@nchammas
Copy link
Contributor Author

cc @MaxGekk and @srielau

@nchammas nchammas changed the title [SPARK-46810][DOCS] Explain error class terminology in internal README [SPARK-46810][DOCS] Align error class terminology with SQL standard Jan 28, 2024
@nchammas
Copy link
Contributor Author

@MaxGekk - Given the recent discussion on SPARK-46810, shall I expand this PR to include renaming errorClass across the codebase? Or would you like to break that into its own PR?

@MaxGekk
Copy link
Member

MaxGekk commented Mar 14, 2024

Or would you like to break that into its own PR?

Let's do that in a separate PR because I expect massive changes. It should be easier to review only one kind of changes.

@nchammas
Copy link
Contributor Author

I've created SPARK-47429 to track renaming errorClass to errorCondition.

The build is passing, and this is ready for review, @MaxGekk.

@nchammas
Copy link
Contributor Author

Since error-classes.json is frequently updated, it will repeatedly cause merge conflicts here. I will refrain from resolving those merge conflicts until this PR is actively under review.

@nchammas
Copy link
Contributor Author

@MaxGekk - Is there something I can do to move this along?

Once this is merged in, I am happy to work on SPARK-47429, as I noted in the comments there. (But it's also fine if you'd prefer to work on it yourself.)

@srielau
Copy link
Contributor

srielau commented Apr 1, 2024

@MaxGekk - Is there something I can do to move this along?

Once this is merged in, I am happy to work on SPARK-47429, as I noted in the comments there. (But it's also fine if you'd prefer to work on it yourself.)

@nchammas Not sure if we can count on a response from @MaxGekk right now. @cloud-fan How can we proceed here?

@cloud-fan
Copy link
Contributor

The JSON field still has "subClass", do we need to rename it?

@nchammas
Copy link
Contributor Author

The JSON field still has "subClass", do we need to rename it?

I was planning to change that as part of SPARK-47429, since that affects the internal API and will likely require a large diff. If you agree with that, I will update the description for SPARK-47429 accordingly.

@cloud-fan
Copy link
Contributor

oh, sounds good to do it later. Let's rebase this PR and then merge it ASAP.

@nchammas
Copy link
Contributor Author

@cloud-fan - Ready for review.

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in c5b8e60 Apr 16, 2024
@nchammas nchammas deleted the SPARK-46810-error-class-terminology branch April 16, 2024 16:00

Fixing this will require renaming `SparkException.errorClass` to `SparkException.errorCondition` and making similar changes to `ErrorClassesJsonReader` and other parts of the codebase. We will address this in [SPARK-47429]. Until that is complete, we will have to live with the fact that a string like `DATATYPE_MISSING_SIZE` is called an "error condition" in our user-facing documentation but an "error class" in the code.

For more details, please see [SPARK-46810][SPARK-46810].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to write [SPARK-46810] twice here? Is it because of the special syntax for markdown? Or do we actually want to write: [SPARK-46810] [SPARK-47429]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, I meant to link to SPARK-46810. The first [SPARK-46810] is the text I want to link, and the second [SPARK-46810] is the reference to the link URL I have defined just below in the same document:

[SPARK-46810]: https://issues.apache.org/jira/browse/SPARK-46810

HyukjinKwon pushed a commit that referenced this pull request Apr 18, 2024
### What changes were proposed in this pull request?
The pr is following up #44902, to make some `reference files links` clickable.

### Why are the changes needed?
Convenient for developers to navigate directly when read this file `README.md`.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manually test.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #46105 from panbingkun/SPARK-46810_FOLLOWUP.

Authored-by: panbingkun <panbingkun@baidu.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
MaxGekk added a commit that referenced this pull request Sep 10, 2024
…kError()`

### What changes were proposed in this pull request?
In the PR, I propose to rename the `errorClass` parameter of the `checkError()` functions (and related and similar functions) to `condition` to follow the naming convention introduced by SPARK-46810.

For example:

```scala
checkError(
  exception = e,
  condition = "UNRESOLVED_COLUMN.WITH_SUGGESTION",
  parameters = Map("objectName" -> "`non_exist`", "proposal" -> "`i`, `j`"))
```

### Why are the changes needed?
To follow new naming convention introduced by #44902.

### Does this PR introduce _any_ user-facing change?
No, the changes affect **only in tests**.

### How was this patch tested?
By running the existing test suites and GAs.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #48027 from MaxGekk/rename-checkError-condition.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
IvanK-db pushed a commit to IvanK-db/spark that referenced this pull request Sep 20, 2024
…kError()`

### What changes were proposed in this pull request?
In the PR, I propose to rename the `errorClass` parameter of the `checkError()` functions (and related and similar functions) to `condition` to follow the naming convention introduced by SPARK-46810.

For example:

```scala
checkError(
  exception = e,
  condition = "UNRESOLVED_COLUMN.WITH_SUGGESTION",
  parameters = Map("objectName" -> "`non_exist`", "proposal" -> "`i`, `j`"))
```

### Why are the changes needed?
To follow new naming convention introduced by apache#44902.

### Does this PR introduce _any_ user-facing change?
No, the changes affect **only in tests**.

### How was this patch tested?
By running the existing test suites and GAs.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#48027 from MaxGekk/rename-checkError-condition.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
attilapiros pushed a commit to attilapiros/spark that referenced this pull request Oct 4, 2024
…kError()`

### What changes were proposed in this pull request?
In the PR, I propose to rename the `errorClass` parameter of the `checkError()` functions (and related and similar functions) to `condition` to follow the naming convention introduced by SPARK-46810.

For example:

```scala
checkError(
  exception = e,
  condition = "UNRESOLVED_COLUMN.WITH_SUGGESTION",
  parameters = Map("objectName" -> "`non_exist`", "proposal" -> "`i`, `j`"))
```

### Why are the changes needed?
To follow new naming convention introduced by apache#44902.

### Does this PR introduce _any_ user-facing change?
No, the changes affect **only in tests**.

### How was this patch tested?
By running the existing test suites and GAs.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#48027 from MaxGekk/rename-checkError-condition.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
himadripal pushed a commit to himadripal/spark that referenced this pull request Oct 19, 2024
…kError()`

### What changes were proposed in this pull request?
In the PR, I propose to rename the `errorClass` parameter of the `checkError()` functions (and related and similar functions) to `condition` to follow the naming convention introduced by SPARK-46810.

For example:

```scala
checkError(
  exception = e,
  condition = "UNRESOLVED_COLUMN.WITH_SUGGESTION",
  parameters = Map("objectName" -> "`non_exist`", "proposal" -> "`i`, `j`"))
```

### Why are the changes needed?
To follow new naming convention introduced by apache#44902.

### Does this PR introduce _any_ user-facing change?
No, the changes affect **only in tests**.

### How was this patch tested?
By running the existing test suites and GAs.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#48027 from MaxGekk/rename-checkError-condition.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants