Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

speedup test_listen_and_serv_op #11126

Merged

Conversation

Yancey1989
Copy link
Contributor

@Yancey1989 Yancey1989 commented Jun 1, 2018

Fixed #11127


# raise SIGTERM to pserver
self._raise_signal(os.getpid(), signal.SIGTERM)
os.kill(pid, signal.SIGINT)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Signal shoule be SIGTERM here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

retry_times = self.ps_timeout
while True:
time.sleep(1)
assert retry_times >= 0, "wait ps ready failed"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assert line and the time.sleep(1) line can be exchanged to detect the timeout ealier

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

# FIXME(Yancey1989): this test would cost much more time on CUDAPlace
# since load cudnn libraries, so we use a longer timeout to make this
# unit test stability.
set_tests_properties(test_listen_and_serv_op PROPERTIES TIMEOUT 30)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

30 seconds is enough for distributed tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in this unit test, pserver would be killed when pserver is ready.

def _wait_ps_ready(self, pid):
retry_times = self.ps_timeout
while True:
time.sleep(1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can sleep time < 1s and the test will be faster.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but the pserver would cost about 1 seconds until become ready. BTW, the most time-consuming place is to load and initialize the cuda libraries.

@Yancey1989 Yancey1989 merged commit 524c81e into PaddlePaddle:develop Jun 4, 2018
@Yancey1989 Yancey1989 deleted the polish_test_listen_and_serv_op branch June 4, 2018 06:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants