[SPARK-12340][SQL] Fix Int overflow in SparkPlan.executeTake, RDD.take and AsyncRDDActions.takeAsync #10562
Changes from all commits (sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala):

@@ -2067,4 +2067,16 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
     )
   }

+  test("SPARK-12340: overstep the bounds of Int in SparkPlan.executeTake") {
+    val rdd = sqlContext.sparkContext.parallelize(1 to 3 , 3 )

Contributor: Also remove the extra space before the comma here.

Contributor: For the SQL part I'd just move this into the existing limit test case, and add a line of comment explaining this. Also, you should explain in the comment why 2147483638 is chosen as a value.

+    rdd.toDF("key").registerTempTable("spark12340")
+    checkAnswer(
+      sql("select key from spark12340 limit 2147483638"),
+      Row(1) :: Row(2) :: Row(3) :: Nil
+    )
+    assert(rdd.take(2147483638).size === 3)

Contributor: We should have a unit test in RDDSuite for the RDD tests, not in SQLQuerySuite.

+    assert(rdd.takeAsync(2147483638).get.size === 3)
+  }
 }
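The overflow this test guards against can be reproduced without Spark at all. The following is a minimal illustrative sketch, not Spark code: the names partsScanned and numPartsToTry are borrowed from the review discussion below, and the concrete values are hypothetical.

```scala
object IntOverflowSketch {
  def main(args: Array[String]): Unit = {
    // A limit close to Int.MaxValue, as used in the test above.
    val num = 2147483638
    val partsScanned = 10    // parts already scanned (hypothetical value)
    val numPartsToTry = num  // worst case for the next scheduling round

    // As Int arithmetic, the sum wraps around past Int.MaxValue to a negative number...
    val asInt: Int = partsScanned + numPartsToTry
    // ...while widening one operand to Long before adding gives the intended value.
    val asLong: Long = partsScanned.toLong + numPartsToTry

    assert(asInt < 0)              // 2147483648 does not fit in an Int
    assert(asLong == 2147483648L)
    println(s"asInt=$asInt, asLong=$asLong")
  }
}
```

A negative bound like asInt would silently break any comparison such as `partsScanned < upperBound`, which is why the patch widens the arithmetic.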
why is this change necessary? When can partsScanned go above 2B?
Ah, you're right. partsScanned cannot exceed the value of totalParts. I'll return it to Int.
I think there is a legit problem here. Imagine totalParts is close to Int.MaxValue, and imagine partsScanned is close to totalParts. Adding p.size to it below could cause it to roll over. I think this change is needed.
That's never possible -- if we have anywhere near 2B partitions, the scheduler won't be fast enough to schedule them. As a matter of fact, with anything larger than a few million partitions, the scheduler will likely crash.
Fair point, in practice this all but certainly won't happen. Note that this patch was already committed to master, making this a Long. It doesn't hurt and is, very theoretically, more correct locally. I suppose I don't think it's worth updating again, but I do not feel strongly about it.
I'd prefer to change it back since it is so little work, so this does not start a trend of changing all Ints to Longs for no reason. Note that this would also raise questions about why this value can be greater than Int.MaxValue when somebody reads this code in the future.

Also @srowen, even if totalParts is close to Int.MaxValue, I don't think partsScanned can be greater than Int.MaxValue, because we never scan more parts than the number of parts available.
Ah ok, you were referring to partsScanned + numPartsToTry. We should just cast that to Long to minimize the impact.
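The minimal fix suggested here can be sketched as follows. This is a simplified standalone model of a take()-style scan loop, not the actual Spark implementation: the growth policy for numPartsToTry and the real result collection are omitted, and partsScanned deliberately stays an Int while only the sum is widened.

```scala
object TakeLoopSketch {
  def main(args: Array[String]): Unit = {
    val totalParts = 3        // partitions in the RDD from the test
    val num = 2147483638      // requested number of elements, near Int.MaxValue
    var partsScanned = 0      // stays an Int, as the reviewers prefer
    var collected = 0

    while (collected < num && partsScanned < totalParts) {
      val numPartsToTry = num // deliberately huge, as in the bug report
      // Widen to Long before adding so the upper bound cannot wrap negative,
      // then clamp to totalParts, which is known to fit back into an Int.
      val upTo =
        math.min(partsScanned.toLong + numPartsToTry, totalParts.toLong).toInt
      collected += (upTo - partsScanned) // pretend each partition yields 1 element
      partsScanned = upTo
    }

    assert(partsScanned == 3 && collected == 3)
    println(s"partsScanned=$partsScanned, collected=$collected")
  }
}
```

With a plain Int sum, `0 + 2147483638 + …` could wrap negative on a later iteration and the min would pick the garbage value; widening just this one expression fixes the bound without turning partsScanned itself into a Long.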