Skip to content

Conversation

@MaxGekk
Copy link
Member

@MaxGekk MaxGekk commented Sep 12, 2022

What changes were proposed in this pull request?

In the PR, I propose to remove the error class INDEX_OUT_OF_BOUNDS from error-classes.json and the exception SparkIndexOutOfBoundsException. And replace the last one by a SparkException w/ the error class INTERNAL_ERROR because the exception should not be raised in regular cases.

ArrayDataIndexedSeq throws the exception from apply(), and ArrayDataIndexedSeq can be created from ArrayData.toSeq only. The last one is invoked from 2 places:

  1. The Slice expression ( or slice function):
    val data = arr.toSeq[AnyRef](elementType)
    new GenericArrayData(data.slice(startIndex, startIndex + lengthInt))

where any access to the produced array is guarded:

spark-sql> set spark.sql.ansi.enabled=true;
spark.sql.ansi.enabled	true
Time taken: 2.415 seconds, Fetched 1 row(s)
spark-sql> SELECT slice(array(1, 2, 3, 4), 2, 2)[4];
...
org.apache.spark.SparkArrayIndexOutOfBoundsException: [INVALID_ARRAY_INDEX] The index 4 is out of bounds. The array has 2 elements. Use the SQL function `get()` to tolerate accessing element at invalid index and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
== SQL(line 1, position 8) ==
SELECT slice(array(1, 2, 3, 4), 2, 2)[4]
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

	at org.apache.spark.sql.errors.QueryExecutionErrors$.invalidArrayIndexError(QueryExecutionErrors.scala:239)
	at org.apache.spark.sql.catalyst.expressions.GetArrayItem.nullSafeEval(complexTypeExtractors.scala:271)

see

if (index >= baseValue.numElements() || index < 0) {
if (failOnError) {
throw QueryExecutionErrors.invalidArrayIndexError(
index, baseValue.numElements, getContextOrNull())

  1. MapObjects.convertToSeq:

where any access to the produced IndexedSeq is guarded via map-way access in

inputCollection.iterator.map { element =>
row.update(0, element)
lambdaFunction.eval(row)
}

Why are the changes needed?

To improve code maintenance.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

By running the affected test suite:

$ build/sbt "core/testOnly *SparkThrowableSuite"
$ build/sbt "test:testOnly *ArrayDataIndexedSeqSuite"

@MaxGekk MaxGekk changed the title [WIP][SPARK-38734][SQL] Remove the error class INDEX_OUT_OF_BOUNDS [SPARK-38734][SQL] Remove the error class INDEX_OUT_OF_BOUNDS Sep 12, 2022
@MaxGekk MaxGekk marked this pull request as ready for review September 12, 2022 17:12
@MaxGekk
Copy link
Member Author

MaxGekk commented Sep 12, 2022

cc @panbingkun

MaxGekk referenced this pull request Sep 12, 2022
…xecutionErrors

### What changes were proposed in this pull request?
When we refactor the query execution errors to use error classes in QueryExecutionErrors, we need define some exception that mix SparkThrowable into a base Exception type.
according the example [SparkArithmeticException](https://github.com/apache/spark/blob/f90eb6a5db0778fd18b0b544f93eac3103bbf03b/core/src/main/scala/org/apache/spark/SparkException.scala#L75)

Add SparkXXXException as follows:
- `SparkClassNotFoundException`
- `SparkConcurrentModificationException`
- `SparkDateTimeException`
- `SparkFileAlreadyExistsException`
- `SparkFileNotFoundException`
- `SparkNoSuchMethodException`
- `SparkIndexOutOfBoundsException`
- `SparkIOException`
- `SparkSecurityException`
- `SparkSQLException`
- `SparkSQLFeatureNotSupportedException`

Refactor some exceptions in QueryExecutionErrors to use error classes and new exception for testing new exception

Some added by [PR](#33538) as follows:

- `SparkUnsupportedOperationException`
- `SparkIllegalStateException`
- `SparkNumberFormatException`
- `SparkIllegalArgumentException`
- `SparkArrayIndexOutOfBoundsException`
- `SparkNoSuchElementException`

### Why are the changes needed?
[SPARK-36336](https://issues.apache.org/jira/browse/SPARK-36336)

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
existed ut test

Closes #33573 from Peng-Lei/SPARK-36336.

Authored-by: PengLei <peng.8lei@gmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
@MaxGekk
Copy link
Member Author

MaxGekk commented Sep 13, 2022

Merging to master. Thank you, @HyukjinKwon @cloud-fan for review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants