
End-to-end tests fail non-deterministically, timeout errors. #1026

Closed
mattbonnell opened this issue Jan 31, 2021 · 4 comments

@mattbonnell (Contributor)

Describe the bug

I noticed while submitting PR #1024 that the ci/circleci: test-e2e-[db] tests seem to fail non-deterministically on some timeout error. I also noticed this while submitting PR #1022.

On consecutive runs of the pipeline, one of these tests would fail seemingly randomly:
[Screenshots: two CircleCI pipeline runs, each with a different test-e2e job failing]

Eventually, with no code changes, they all passed. This non-deterministic timeout behaviour suggests a configuration issue that could be resolved by increasing the timeout duration at the failing stage of the pipeline.

Reproducing the bug
Seeing as the issue is non-deterministic, reproducibility is not guaranteed, but you'll probably experience the same thing at some point.
Steps to reproduce the behavior:

  1. Submit a PR with no code changes
  2. Wait for the pipeline to fail on one of these e2e tests

Server logs

waiting for 1 resources: http-get://127.0.0.1:4457/
wait-on(6578) Timed out waiting for: http-get://127.0.0.1:4457/; exiting with error
Error: Timed out waiting for: http-get://127.0.0.1:4457/
    at MergeMapSubscriber.project (/go/src/github.com/ory/kratos/node_modules/wait-on/lib/wait-on.js:130:25)
    at MergeMapSubscriber._tryNext (/go/src/github.com/ory/kratos/node_modules/rxjs/internal/operators/mergeMap.js:69:27)
    at MergeMapSubscriber._next (/go/src/github.com/ory/kratos/node_modules/rxjs/internal/operators/mergeMap.js:59:18)
    at MergeMapSubscriber.Subscriber.next (/go/src/github.com/ory/kratos/node_modules/rxjs/internal/Subscriber.js:66:18)
    at AsyncAction.dispatch [as work] (/go/src/github.com/ory/kratos/node_modules/rxjs/internal/observable/timer.js:31:16)
    at AsyncAction._execute (/go/src/github.com/ory/kratos/node_modules/rxjs/internal/scheduler/AsyncAction.js:71:18)
    at AsyncAction.execute (/go/src/github.com/ory/kratos/node_modules/rxjs/internal/scheduler/AsyncAction.js:59:26)
    at AsyncScheduler.flush (/go/src/github.com/ory/kratos/node_modules/rxjs/internal/scheduler/AsyncScheduler.js:52:32)
    at listOnTimeout (internal/timers.js:549:17)
    at processTimers (internal/timers.js:492:7)
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! @ wait-on: `wait-on "-l" "-t" "30000" "http-get://127.0.0.1:4434/health/ready" "http-get://127.0.0.1:4455/health" "http-get://127.0.0.1:4445/health/ready" "http-get://127.0.0.1:4446/" "http-get://127.0.0.1:4456/health" "http-get://127.0.0.1:4457/" "http-get://127.0.0.1:4437/mail"`
npm ERR! Exit status 1
npm ERR! 
npm ERR! Failed at the @ wait-on script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR!     /home/circleci/.npm/_logs/2021-01-31T17_07_48_756Z-debug.log


Exited with code exit status 1
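
One way to test the timeout hypothesis would be to raise the -t value in the wait-on invocation captured in the log above. This is only a sketch of that change; the invocation itself is copied from the log, but where the command is actually defined (presumably the npm "wait-on" script used by the e2e setup) would need to be confirmed in the repo.

# Sketch only: same wait-on invocation as in the log above, with the
# timeout raised from 30s (-t 30000) to 120s. Where this command lives
# (assumed: the npm "wait-on" script in the e2e test setup) needs checking.
wait-on -l -t 120000 \
  http-get://127.0.0.1:4434/health/ready \
  http-get://127.0.0.1:4455/health \
  http-get://127.0.0.1:4445/health/ready \
  http-get://127.0.0.1:4446/ \
  http-get://127.0.0.1:4456/health \
  http-get://127.0.0.1:4457/ \
  http-get://127.0.0.1:4437/mail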

Server configuration

Expected behavior

Deterministic behaviour: the e2e tests pass consistently on unchanged code, with no spurious timeout failures.

Environment

CircleCI pipeline

@aeneasr (Member)

aeneasr commented Feb 1, 2021

Yes, absolutely. Unfortunately, I don't really know what causes the timeout; it might be some connectivity issue. We probably have to improve the failure logging in the e2e pipeline to catch what's going on (4457 is the mailhog port, if I am not mistaken).
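
As a rough illustration of what that extra failure logging could look like (a sketch only, not something currently in the pipeline): when wait-on fails, probe each endpoint from its list and print the status, so the CI output shows which service never came up.

# Sketch: probe every endpoint from the wait-on list (copied from the server
# logs above) and print its HTTP status; "000" means the service was not
# reachable at all. How this gets wired into the CI config is left open.
for url in \
  http://127.0.0.1:4434/health/ready \
  http://127.0.0.1:4455/health \
  http://127.0.0.1:4445/health/ready \
  http://127.0.0.1:4446/ \
  http://127.0.0.1:4456/health \
  http://127.0.0.1:4457/ \
  http://127.0.0.1:4437/mail; do
  code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$url")
  echo "$url -> HTTP $code"
done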

@xuwang-wish

I got the same issue when I run the end-to-end test (./test/e2e/run.sh mysql), but for me it's not "non-deterministic"; rather, I always get the timeout.

@aeneasr (Member)

aeneasr commented Sep 14, 2021

That is an unrelated issue, as this one tracks flaky test failures. If you think you have found a bug, and it is not related to e.g. broken Docker on your machine, please open a separate issue with as many details as possible and steps to reproduce. But given that this works for everyone else, it is probably an issue with your local machine.
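
For local triage along those lines, a generic starting point might look like the following. Nothing here is specific to the Kratos e2e scripts; it only assumes, as suggested above, that the local services run in Docker.

# Sketch for local debugging, assuming the e2e services run in Docker.
# No container names are assumed; list them first.
docker ps --all                     # are the expected containers running, or did some exit?
docker logs <container-id>          # inspect a container that exited early
curl -sS -i http://127.0.0.1:4457/  # probe the endpoint wait-on timed out on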

@aeneasr (Member)

aeneasr commented Oct 19, 2021

Resolved

@aeneasr aeneasr closed this as completed Oct 19, 2021