-
Notifications
You must be signed in to change notification settings - Fork 13.8k
[FLINK-11486][tests] Port RecoveryITCase to new code base #7683
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community Review Progress
Please see the Pull Request Review Guide for a full explanation of the review process. DetailsBot commandsThe @flinkbot bot supports the following commands:
|
bf63261 to
a876d24
Compare
| */ | ||
| @Test | ||
| public void testNewTaskExecutorJoinsCluster() throws Exception { | ||
| public void testCleanClusterStateAfterTaskExecutorTermination() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name is not very intuitive, I think testJobReExecutionAfterTaskExecutorTermination would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. I will apply the changes.
960e758 to
5f75aa8
Compare
| CommonTestUtils.waitUntilCondition( | ||
| jobIsRunning(() -> miniCluster.getExecutionGraph(jobGraph.getJobID())), | ||
| Deadline.fromNow(TESTING_TIMEOUT), | ||
| 20L); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a bug in waitUntilCondition which leads to 20 being ignored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Will fix it while merging this PR.
|
Thanks for the review @GJL. I'll address your comment while merging this PR. |
…ception, Deadline, long) Properly pass the retryIntervalMillis to the sleep call.
- "recover a task manager failure" --> TaskExecutorITCase#testJobRecoveryWithFailingTaskExecutor - "recover once failing forward job" --> JobRecoveryITCase#testTaskFailureRecovery - "recover once failing forward job with slot sharing" --> JobRecoveryITCase#testTaskFailureWithSlotSharingRecovery
Increase the retry interval for TaskExecutorITCase and ZooKeeperLeaderElectionITCase since they take so long that the low retry interval won't have a big effect apart from a higher CPU load.
5f75aa8 to
21fabef
Compare
- "recover a task manager failure" --> TaskExecutorITCase#testJobRecoveryWithFailingTaskExecutor - "recover once failing forward job" --> JobRecoveryITCase#testTaskFailureRecovery - "recover once failing forward job with slot sharing" --> JobRecoveryITCase#testTaskFailureWithSlotSharingRecovery This closes apache#7683.
What is the purpose of the change
Port
RecoveryITCaseto new code base.This PR is based on #7676.
Brief change log
Verifying this change
Does this pull request potentially affect one of the following parts:
@Public(Evolving): (no)Documentation