
tests: Stop and Wait for workspace at the end of each tests #12572

Merged — 4 commits merged into main from to/stablization-test on Sep 2, 2022

Conversation

utam0k (Contributor) commented on Sep 1, 2022

Description

Stop and wait for the workspace at the end of each test, to improve stability.

Related Issue(s)

Relates #12248

How to test

Run the test

Release Notes

NONE

Documentation

Werft options:

  • /werft with-preview

@utam0k utam0k requested a review from a team September 1, 2022 02:51
@github-actions github-actions bot added the team: workspace Issue belongs to the Workspace team label Sep 1, 2022
@utam0k utam0k force-pushed the to/stablization-test branch 4 times, most recently from 6c3a9de to 35b08fa Compare September 1, 2022 09:18
@utam0k utam0k force-pushed the to/stablization-test branch from 35b08fa to 5c4e0d7 Compare September 1, 2022 10:43
@utam0k utam0k force-pushed the to/stablization-test branch from 5c4e0d7 to f3d954e Compare September 1, 2022 11:49
jenting (Contributor) commented on Sep 1, 2022

kylos101 (Contributor) left a comment:

What is the total time it takes to run tests assuming no failures? I ask because with increased timeouts, we want to make sure the test run doesn't take too long.

FYI, @jenting shared a test run that had failed tests:
https://werft.gitpod-dev.com/job/gitpod-custom-to-stablization-test.18#integration%20test:test-workspace

Also, I just ran a separate test run in werft, but had trouble with the environment, and notified the platform team:
https://gitpod.slack.com/archives/C03E52788SU/p1662051454479299

I added some questions to the PR, thanks a lot for putting this together, @utam0k !

@@ -137,6 +137,8 @@ args=()
args+=( "-kubeconfig=/home/gitpod/.kube/config" )
args+=( "-namespace=default" )
[[ "$USERNAME" != "" ]] && args+=( "-username=$USERNAME" )
args+=( "-timeout=60m" )
Reviewer (Contributor):

Is this enough time? I ask because the werft job ran for 67 minutes.
https://werft.gitpod-dev.com/job/gitpod-custom-to-stablization-test.17

utam0k (Contributor Author):

Yes, it is, because these args are passed to each test component separately.

@@ -137,6 +137,8 @@ args=()
args+=( "-kubeconfig=/home/gitpod/.kube/config" )
args+=( "-namespace=default" )
[[ "$USERNAME" != "" ]] && args+=( "-username=$USERNAME" )
args+=( "-timeout=60m" )
args+=( "-p=1" )
Reviewer (Contributor):

Is this to run each test binary in parallel?

utam0k (Contributor Author):

No, the opposite: `-p=1` limits `go test` to running one test binary at a time, so the tests run serially.

select {
case err := <-execErrs:
if err != nil {
return nil, closer, err
}
return nil, closer, fmt.Errorf("agent stopped unexepectedly")
case <-time.After(1 * time.Second):
case <-time.After(30 * time.Second):
Reviewer (Contributor):

My understanding is that this is the wait time in between test binaries, yes? In other words, it is not the wait time for each individual test that we run.

utam0k (Contributor Author):

I assume this is the waiting time for the start-up of the agent.

utam0k (Contributor Author) commented on Sep 2, 2022

FYI https://werft.gitpod-dev.com/job/gitpod-custom-to-stablization-test.18

@jenting @kylos101
This PR improves stability; it does not completely fix the integration tests. I have already found the issues below.

jenting (Contributor) commented on Sep 2, 2022

> FYI https://werft.gitpod-dev.com/job/gitpod-custom-to-stablization-test.18
>
> @jenting @kylos101 This PR improves stabilization, not complete fixing the integration test. I have already found the below issues

@utam0k I added the port-forwarding issue link 😉

utam0k (Contributor Author) commented on Sep 2, 2022

> > FYI https://werft.gitpod-dev.com/job/gitpod-custom-to-stablization-test.18
> >
> > @jenting @kylos101 This PR improves stabilization, not complete fixing the integration test. I have already found the below issues
>
> @utam0k I add the port-forwarding issue link 😉

Oh, I missed it. I'm trying to fix it in #12248, so I will close your issue. Thanks!

jenting (Contributor) commented on Sep 2, 2022

Thanks for your effort @utam0k

I thought all the integration test cases failed for the same reason, #10671. But with the changes introduced in this PR, it seems not. I see two reasons for the flaky integration tests:

  • we can't have multiple workspaces running while the integration tests run (I think we could support that, no?)
  • timeouts (probably because the preview environment's CPU/memory/disk are slow...)

utam0k (Contributor Author) commented on Sep 2, 2022

> Thanks for your effort @utam0k
>
> I thought all the integration test cases failed for the same reason #10671 But with this PR introduced changes, it seems no. And I summarize two reasons for the flaky integration test
>
> • we can't have multiple workspaces when running the integration test (I think we could support it, no?)
> • the timeout issue (probably due to preview env CPU/Memory/Disk are slow...)

Yes, I thought both had the same root cause: lack of resources. Ideally we could run all integration tests in parallel, but first I think we should focus on getting the integration tests to pass; after that, we can try it.

jenting (Contributor) left a comment:

LGTM

Let's see how it improves the integration test robustness 💪

@roboquat roboquat merged commit f04a405 into main Sep 2, 2022
@roboquat roboquat deleted the to/stablization-test branch September 2, 2022 01:25
kylos101 (Contributor) commented on Sep 2, 2022

> Thanks for your effort @utam0k
> I thought all the integration test cases failed for the same reason #10671 But with this PR introduced changes, it seems no. And I summarize two reasons for the flaky integration test
>
> • we can't have multiple workspaces when running the integration test (I think we could support it, no?)
> • the timeout issue (probably due to preview env CPU/Memory/Disk are slow...)
>
> Yes, I thought both were the same reason, lack of resources. Ideally, we can run all integration tests in parallel, but first I think we should focus on passing the integration; next, we can try it.

Indeed, @utam0k, great point. In other words: get the tests to pass (so we have confidence), then make test runs faster (so we have shorter feedback loops). Some related thoughts.

@roboquat roboquat added deployed: workspace Workspace team change is running in production deployed Change is completely running in production labels Sep 7, 2022