[SPARK-19437] Rectify spark executor id in HeartbeatReceiverSuite. #16779
|
CC @zsxwing |
|
Test build #3553 has finished for PR 16779 at commit
|
|
@jinxing64 Good catch. Since you are touching this file, could you also replace other |
|
LGTM pending tests |
|
@zsxwing |
|
ok to test |
|
Test build #72297 has finished for PR 16779 at commit
|
|
Thanks! Merging to master. |
|
Thanks a lot for reviewing this. |
## What changes were proposed in this pull request?
In the current `HeartbeatReceiverSuite`, the executor IDs are set as below:
```
private val executorId1 = "executor-1"
private val executorId2 = "executor-2"
```
The executor ID is sent to the driver when the executor registers, as below:
```
test("expire dead hosts should kill executors with replacement (SPARK-8119)") {
  ...
  fakeSchedulerBackend.driverEndpoint.askSync[Boolean](
    RegisterExecutor(executorId1, dummyExecutorEndpointRef1, "1.2.3.4", 0, Map.empty))
  ...
}
```
When `CoarseGrainedSchedulerBackend` receives `RegisterExecutor`, it parses the executor ID as an integer and compares it with `currentExecutorIdCounter`, as below:
```
case RegisterExecutor(executorId, executorRef, hostname, cores, logUrls) =>
  if (executorDataMap.contains(executorId)) {
    executorRef.send(RegisterExecutorFailed("Duplicate executor ID: " + executorId))
    context.reply(true)
  } else {
    ...
    executorDataMap.put(executorId, data)
    if (currentExecutorIdCounter < executorId.toInt) {
      currentExecutorIdCounter = executorId.toInt
    }
    ...
```
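With a non-numeric ID such as `"executor-1"`, `executorId.toInt` throws a `NumberFormatException`.
The unit test currently passes only because of `askWithRetry`: when the first call fails with the exception, the RPC is sent again. By then the executor ID is already in `executorDataMap`, so the retry takes the `if` branch (duplicate executor ID) and replies `true`.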
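The masking effect can be reproduced outside Spark. The following is a minimal sketch under stated assumptions, not Spark's implementation: `register` only mirrors the shape of the handler quoted above, and `askWithRetrySketch` is a hypothetical stand-in for the retrying RPC call.

```scala
import scala.collection.mutable
import scala.util.Try

object RetryMaskingSketch {
  // Hypothetical stand-ins for the scheduler backend's state.
  val executorDataMap = mutable.HashMap.empty[String, String]
  var currentExecutorIdCounter = 0

  // Mirrors the quoted handler: the ID is stored before it is parsed,
  // so a parse failure leaves the entry behind in the map.
  def register(executorId: String): Boolean = {
    if (executorDataMap.contains(executorId)) {
      true // duplicate-ID branch; the quoted handler also replies true here
    } else {
      executorDataMap.put(executorId, "data")
      if (currentExecutorIdCounter < executorId.toInt) {
        currentExecutorIdCounter = executorId.toInt
      }
      true
    }
  }

  // Hypothetical retry loop: repeats the call until it succeeds
  // or the attempts run out.
  def askWithRetrySketch(executorId: String, maxAttempts: Int): Boolean = {
    var result = Try(register(executorId))
    var attempts = 1
    while (result.isFailure && attempts < maxAttempts) {
      result = Try(register(executorId))
      attempts += 1
    }
    result.get // rethrows the last failure if every attempt failed
  }
}
```

A single call to `register("executor-1")` throws, while `askWithRetrySketch("executor-2", 3)` returns `true` on its second attempt, which is why a retrying call can hide the bug from the test.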
**To fix**
Use executor IDs that parse as integers and replace `askWithRetry` with `askSync`; see apache#16690.
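As a rough illustration of what a rectified ID must satisfy, the sketch below checks whether an ID survives the same `toInt` parse. The values `"1"` and `"2"` and the helper `parses` are assumptions made for this example; the description above does not state the exact replacement values.

```scala
import scala.util.Try

object NumericIdSketch {
  // Assumed replacement values, chosen only because they parse as integers.
  val executorId1 = "1"
  val executorId2 = "2"

  // Hypothetical helper: true if the ID survives the same toInt parse
  // that the registration handler performs.
  def parses(executorId: String): Boolean = Try(executorId.toInt).isSuccess
}
```

With IDs like these the first registration call succeeds, so a single non-retrying call is enough.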
## How was this patch tested?
This fix only changes a unit test, so there is no need to add another test.
Author: jinxing <jinxing@meituan.com>
Closes apache#16779 from jinxing64/SPARK-19437.