Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

E2E Tests: Improve Prometheus configurability and make query tests more resilient #5181

Merged
merged 5 commits into from
Mar 16, 2022

Conversation

matej-g
Copy link
Collaborator

@matej-g matej-g commented Feb 24, 2022

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

I have noticed that some of the query E2E tests are prone to flakiness, I've seen a couple CI failures (e.g. https://github.com/thanos-io/thanos/runs/5262364980?check_suite_focus=true#step:5:383) as well as failed runs locally.

The tests fail due to an out of bounds error upon trying to remote write, excerpt:

=== CONT  TestSidecarStorePushdown
Error:     query_test.go:702: query_test.go:702:
        
         unexpected error: server returned HTTP status 400 Bad Request: out of bounds
        
03:21:07 prometheus-p1: level=error ts=2022-02-20T03:21:07.379Z caller=write_handler.go:53 component=web msg="Out of order sample from remote write" err="out of bounds"

I suspect this is due to the fact that in some test cases, we're remote writing time series with timestamps in the past. However, since with the current default Prometheus configuration, Prometheus is configured to scrape itself as well. I suspect that on occasion, if Prometheus is scraped before remote write, this makes the minimum valid time to be set to 'now', which causes the remote write requests with past timestamps to return out of bounds error.

The change to mitigate here is to make the self-scraping of the test Prometheus instance optional, as for these mentioned test cases this might interfere with remote write requests. Additionally, I moved the Prometheus config method to e2ethanos package from the query test file, since it's used across multiple E2E tests and fits better there. Lastly, I bumped slightly the wait time for minio to be ready, since I'm still seeing on occasion error when minio is not ready to accept requests.

Verification

Ran E2E locally multiple times with success (as opposed to hitting out of bounds error before).

- Move prom config method to shared e2ethanos package
- Make scraping Prometheus instance optional

Signed-off-by: Matej Gera <matejgera@gmail.com>
Signed-off-by: Matej Gera <matejgera@gmail.com>
test/e2e/e2ethanos/services.go Outdated Show resolved Hide resolved
test/e2e/e2ethanos/services.go Outdated Show resolved Hide resolved
Signed-off-by: Matej Gera <matejgera@gmail.com>
@matej-g matej-g requested a review from hanjm February 25, 2022 13:20
saswatamcode
saswatamcode previously approved these changes Mar 2, 2022
Copy link
Member

@saswatamcode saswatamcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling test flakiness! 🙂

bwplotka
bwplotka previously approved these changes Mar 2, 2022
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏽 thanks! just minor optional nit - up to you

test/e2e/e2ethanos/services.go Outdated Show resolved Hide resolved
Signed-off-by: Matej Gera <matejgera@gmail.com>
@matej-g matej-g dismissed stale reviews from bwplotka and saswatamcode via a8aa8a1 March 3, 2022 13:48
GiedriusS
GiedriusS previously approved these changes Mar 4, 2022
Copy link
Member

@GiedriusS GiedriusS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 let's fix the conflicts

@matej-g
Copy link
Collaborator Author

matej-g commented Mar 16, 2022

Done, should be good to go now!

@matej-g matej-g requested review from GiedriusS and bwplotka March 16, 2022 15:19
@GiedriusS GiedriusS merged commit 9ec283d into thanos-io:main Mar 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants