@anchovYu anchovYu commented Apr 20, 2022

What changes were proposed in this pull request?

Improve the error messages for cast failures in ANSI mode.
As mentioned in https://issues.apache.org/jira/browse/SPARK-38929, this PR targets two cast-to types: numeric types and date types.

  • For numeric types (int, smallint, double, float, decimal, ...), it embeds the cast-to type in the error message. For example:
    Invalid input value for type INT: '1.0'. To return NULL instead, use 'try_cast'. If necessary set spark.sql.ansi.enabled to false to bypass this error.
    
    It uses toSQLType and toSQLValue to wrap the corresponding types and literals.
  • For date types, it does the same. For example:
    Invalid input value for type TIMESTAMP: 'a'. To return NULL instead, use 'try_cast'. If necessary set spark.sql.ansi.enabled to false to bypass this error.
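
To illustrate how the two example messages are put together, here is a minimal Python sketch. It is not Spark's implementation: the real helpers are the Scala methods toSQLType and toSQLValue, and the Python function names, the upper-casing and single-quoting rules, and the int-only cast below are assumptions made for illustration.

```python
# Illustrative sketch only. Spark's real helpers are Scala methods
# (toSQLType, toSQLValue); every function here is a hypothetical stand-in.

ANSI_CONFIG = "spark.sql.ansi.enabled"


def to_sql_type(type_name: str) -> str:
    """Render a type name in SQL style, e.g. 'int' -> 'INT' (assumed rule)."""
    return type_name.upper()


def to_sql_value(value: str) -> str:
    """Render a string input as a single-quoted literal (assumed rule)."""
    return "'" + value + "'"


def invalid_input_message(value: str, target_type: str) -> str:
    """Build the message text shown in the examples above."""
    return (
        f"Invalid input value for type {to_sql_type(target_type)}: "
        f"{to_sql_value(value)}. To return NULL instead, use 'try_cast'. "
        f"If necessary set {ANSI_CONFIG} to false to bypass this error."
    )


def ansi_cast_to_int(value: str) -> int:
    """Strict cast: fail with the descriptive message on invalid input."""
    try:
        return int(value)
    except ValueError:
        raise ValueError(invalid_input_message(value, "int")) from None


def try_cast_to_int(value: str):
    """Lenient cast: return None (standing in for NULL) instead of failing."""
    try:
        return ansi_cast_to_int(value)
    except ValueError:
        return None
```

In this sketch, ansi_cast_to_int("1.0") raises an error carrying the INT message, while try_cast_to_int("1.0") returns None, mirroring the 'try_cast' hint in the message.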
    

Why are the changes needed?

To make cast error messages more informative: they name the cast-to type and quote the invalid input value.

Does this PR introduce any user-facing change?

Yes. It changes the error messages shown when a cast fails in ANSI mode.

How was this patch tested?

The related unit tests are updated.

Authored-by: Xinyi Yu <xinyi.yu@databricks.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
(cherry picked from commit f76b3e7)

Closes apache#36241 from anchovYu/ansi-error-improve.

@anchovYu
Contributor Author

Hi @MaxGekk , this is the cherry-picked PR. Thank you!

@MaxGekk
Member

MaxGekk commented Apr 20, 2022

+1, LGTM. Merging to 3.3.
Thank you, @anchovYu.

MaxGekk pushed a commit that referenced this pull request Apr 20, 2022
Closes #36275 from anchovYu/ansi-error-improve-3.3.

@MaxGekk MaxGekk closed this Apr 20, 2022