[Serve] Mark `long_running_serve_failure` test as `stable` #32063

shrekris-anyscale · 2023-01-30T20:02:28Z

Signed-off-by: Shreyas Krishnaswamy <shrekris@anyscale.com>

Why are these changes needed?

The `long_running_serve_failure` release test is currently marked as unstable due to recent failures. #31945 and #32011 resolved the root causes of these failures. After those changes, the test ran successfully for 15+ hours without failure. This change limits the test's iterations, so it doesn't run forever, and it marks the test as stable.
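The iteration cap described above could look roughly like the following. This is a minimal sketch, not the actual change to the release test: `run_bounded`, `step`, and `max_iterations` are hypothetical names, and the default of 350 is taken from this PR's "Limit iterations to 350" commit title.

```python
import itertools
from typing import Callable, Optional


def run_bounded(step: Callable[[int], None],
                max_iterations: Optional[int] = 350) -> int:
    """Call step(i) once per iteration and return how many iterations ran.

    Hypothetical helper: max_iterations=None keeps the old "run forever"
    behavior, while an integer makes the loop terminate on its own.
    """
    # range() gives a bounded loop; itertools.count() never ends.
    indices = itertools.count() if max_iterations is None else range(max_iterations)
    completed = 0
    for i in indices:
        step(i)
        completed += 1
    return completed
```

With a cap in place, the test finishes after a fixed amount of work instead of relying on an external timeout to stop it.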

Related issue number

Closes #31741

Checks

- I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- I've run `scripts/format.sh` to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - Release tests
    - This change modifies the `long_running_serve_failure` test.

cadedaniel

LGTM, assuming the dashboard change that broke this last time is fixed: #31741 (comment)

architkulkarni

Remove `iteration += 1` from line 169?

architkulkarni · 2023-01-30T20:18:13Z

Oh, I guess it's a nit. I thought it would halve the number of iterations, but it actually doesn't.

shrekris-anyscale · 2023-01-30T20:18:43Z

> Remove `iteration += 1` from line 169?

Good catch, I removed it.
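The test's loop is not shown in this thread, so the following is only an illustration of one Python behavior consistent with the remark that the extra increment does not halve the iteration count. It assumes the loop is a `for ... in range(...)` loop. The function names are hypothetical.

```python
def count_for_loop(limit: int) -> int:
    """Count iterations of a for loop that also increments its loop variable."""
    executed = 0
    for iteration in range(limit):
        executed += 1
        # Redundant: range() rebinds `iteration` at the start of the next
        # pass, so this increment never affects how many passes run.
        iteration += 1
    return executed


def count_while_loop(limit: int) -> int:
    """Same idea with a while loop, where a second increment does matter."""
    executed = 0
    iteration = 0
    while iteration < limit:
        executed += 1
        iteration += 1  # the loop's own increment
        iteration += 1  # an extra increment skips every other value
    return executed
```

Under that assumption, removing the stray increment changes nothing about how long the test runs; it only removes a misleading line.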

shrekris-anyscale · 2023-01-30T22:42:02Z

Two tests are failing, but they're unrelated:

- linkcheck: unrelated
- tutorial_rllib: failing on master

This is ready to merge. @sihanwang41 @architkulkarni

…ct#32063) The long_running_serve_failure release test is marked as unstable due to recent failures. Recently, ray-project#31945 and ray-project#32011 have resolved the root causes of these failures. After those changes, the test ran successfully for 15+ hours without failure. This change limits the test's iterations, so it doesn't run forever, and it marks the test as stable.

…32181) #32063 fixed some issues with the long_running_serve_failure release test and then marked it stable. The test ran successfully afterwards (see test run), but the CI failed to access logs from the cluster and reported the test as errored. The logs were inaccessible on the cluster due to an issue with the cluster setup. Since this test can run without persisting logs, this change drops the logging requirement for this test. Related issue number Closes #32169

Limit iterations to 350

fffffa3

Signed-off-by: Shreyas Krishnaswamy <shrekris@anyscale.com>

shrekris-anyscale requested review from cadedaniel, architkulkarni and sihanwang41 January 30, 2023 20:02

shrekris-anyscale assigned cadedaniel, architkulkarni and sihanwang41 Jan 30, 2023

Mark test stable

25d57bd

Signed-off-by: Shreyas Krishnaswamy <shrekris@anyscale.com>

cadedaniel approved these changes Jan 30, 2023

architkulkarni approved these changes Jan 30, 2023

architkulkarni requested changes Jan 30, 2023

architkulkarni approved these changes Jan 30, 2023

Stop incrementing iteration

0de29af

Signed-off-by: Shreyas Krishnaswamy <shrekris@anyscale.com>

sihanwang41 approved these changes Jan 30, 2023

shrekris-anyscale added the tests-ok label ("The tagger certifies test failures are unrelated and assumes personal liability.") Jan 30, 2023

architkulkarni merged commit b350f8d into ray-project:master Jan 30, 2023

shrekris-anyscale mentioned this pull request Feb 1, 2023

[Serve] Remove logging requirement for long_running_serve_failure #32181

Merged

5 tasks
