Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🌱 test/e2e,cmd/test: scrape metrics for test servers and e2e tests #2774

Merged
merged 1 commit into from
Feb 17, 2023

Conversation

s-urbaniak
Copy link
Contributor

Summary

This adds support for scraping metrics via Prometheus for e2e tests and local shard testing. If the environment variable PROMETHEUS_URL is set, scraping targets are being configured automatically.

For e2e tests, ARTIFACT_DIR needs to be set additionally.

This is the first part of adding scraping support. The second part will add wiring in CI.

Related issue(s)

@openshift-ci openshift-ci bot requested review from csams and davidfestal February 10, 2023 11:27
test/e2e/framework/kcp.go Outdated Show resolved Hide resolved
Copy link
Contributor

@stevekuznetsov stevekuznetsov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where are the bits in e.g. make that actually start Prometheus, etc?

cmd/test-server/kcp/shard.go Outdated Show resolved Hide resolved
test/e2e/framework/kcp.go Outdated Show resolved Hide resolved
test/e2e/framework/kcp.go Outdated Show resolved Hide resolved
@s-urbaniak
Copy link
Contributor Author

s-urbaniak commented Feb 10, 2023

@stevekuznetsov re:

Where are the bits in e.g. make that actually start Prometheus, etc?

this is part 2, as mentioned in the summary 😛 working on that now, wanted to have the code in first.

EDIT: I made all the necessary changes in the Makefile in this PR as well.

@s-urbaniak s-urbaniak force-pushed the prometheus-e2e branch 11 times, most recently from d44f05a to f16f379 Compare February 13, 2023 15:28
test/e2e/framework/kcp.go Outdated Show resolved Hide resolved
test/e2e/framework/kcp.go Outdated Show resolved Hide resolved
test/e2e/framework/kcp.go Outdated Show resolved Hide resolved
Copy link
Contributor

@nrb nrb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the make targets as written won't actually stop prometheus when e2e tests fail. Added comments where I think changes are needed.

Makefile Outdated Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
@s-urbaniak s-urbaniak force-pushed the prometheus-e2e branch 2 times, most recently from 253b6e0 to 09f40d2 Compare February 15, 2023 15:21
hack/test-e2e.sh Outdated Show resolved Hide resolved
hack/test-e2e.sh Outdated Show resolved Hide resolved
hack/test-e2e.sh Outdated Show resolved Hide resolved
hack/test-e2e.sh Outdated Show resolved Hide resolved
hack/test-e2e.sh Outdated Show resolved Hide resolved
hack/test-e2e.sh Outdated Show resolved Hide resolved
hack/test-e2e.sh Outdated Show resolved Hide resolved
test/e2e/framework/kcp.go Outdated Show resolved Hide resolved
@s-urbaniak s-urbaniak force-pushed the prometheus-e2e branch 3 times, most recently from 952eeec to c474764 Compare February 16, 2023 12:27
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 16, 2023
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 16, 2023
Makefile Outdated Show resolved Hide resolved
@s-urbaniak s-urbaniak force-pushed the prometheus-e2e branch 4 times, most recently from 848ecb4 to 8c3160d Compare February 17, 2023 10:16
@sttts
Copy link
Member

sttts commented Feb 17, 2023

/lgtm
/approve

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Feb 17, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Feb 17, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sttts

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 17, 2023
@ncdc
Copy link
Member

ncdc commented Feb 17, 2023

Do we want tests to fail if there are (unrelated) issues with Prometheus?

2023-02-17T11:17:20.9252004Z --- FAIL: TestSyncerTunnel (47.06s)
2023-02-17T11:17:20.9258266Z     tunnels_test.go:59: Saving test artifacts for test "TestSyncerTunnel" under "/tmp/e2e/TestSyncerTunnel/1144179117".
2023-02-17T11:17:20.9259388Z     tunnels_test.go:59: Starting kcp servers...
2023-02-17T11:17:20.9315003Z     kcp.go:687: running: /home/runner/work/kcp/kcp/bin/kcp start --root-directory /tmp/TestSyncerTunnel3196772349/003/kcp/main --secure-port=41259 --embedded-etcd-client-port=43919 --embedded-etcd-peer-port=33053 --embedded-etcd-wal-size-bytes=5000 --kubeconfig-path=/tmp/TestSyncerTunnel3196772349/003/kcp/main/admin.kubeconfig --feature-gates=CustomResourceValidationExpressions=true,KCPSyncerTunnel=true --audit-log-path /tmp/e2e/TestSyncerTunnel/1144179117/artifacts/kcp/main/kcp.audit -v=4 --token-auth-file /tmp/TestSyncerTunnel3196772349/001/auth-tokens.csv --audit-policy-file /tmp/TestSyncerTunnel3196772349/002/audit-policy.yaml
2023-02-17T11:17:20.9322783Z     kcp.go:1190: waiting for readiness for server at https://10.1.0.39:41259
2023-02-17T11:17:20.9323443Z     kcp.go:1233: success contacting https://10.1.0.39:41259/readyz
2023-02-17T11:17:20.9328461Z     kcp.go:1233: success contacting https://10.1.0.39:41259/livez
2023-02-17T11:17:20.9329524Z     kcp.go:1211: server at https://10.1.0.39:41259 is ready
2023-02-17T11:17:20.9339402Z     kcp.go:229: 
2023-02-17T11:17:20.9339988Z         	Error Trace:	kcp.go:229
2023-02-17T11:17:20.9340564Z         	            				kcp.go:446
2023-02-17T11:17:20.9346086Z         	            				kcp.go:139
2023-02-17T11:17:20.9346790Z         	            				tunnels_test.go:59
2023-02-17T11:17:20.9351680Z         	Error:      	Received unexpected error:
2023-02-17T11:17:20.9352643Z         	            	kcp prometheus scrape target never became healthy
2023-02-17T11:17:20.9373765Z         	Test:       	TestSyncerTunnel
2023-02-17T11:17:20.9374264Z     kcp.go:674: cleanup: canceling context
2023-02-17T11:17:20.9383333Z     kcp.go:678: cleanup: waiting for shutdownComplete
2023-02-17T11:17:20.9383929Z     kcp.go:682: cleanup: received shutdownComplete
2023-02-17T11:17:20.9392662Z FAIL
2023-02-17T11:17:20.9393232Z FAIL	github.com/kcp-dev/kcp/test/e2e/syncer	142.785s
2023-02-17T11:17:20.9393768Z ok  	github.com/kcp-dev/kcp/test/e2e/syncer/dns	108.474s
2023-02-17T11:17:20.9405062Z ?   	github.com/kcp-dev/kcp/test/e2e/syncer/dns/workspace1	[no test files]
2023-02-17T11:17:20.9405698Z ?   	github.com/kcp-dev/kcp/test/e2e/syncer/dns/workspace2	[no test files]
2023-02-17T11:17:21.0786016Z ok  	github.com/kcp-dev/kcp/test/e2e/virtual/apiexport	1.374s
2023-02-17T11:17:40.0307249Z ok  	github.com/kcp-dev/kcp/test/e2e/virtual/initializingworkspaces	1.087s
2023-02-17T11:17:40.5568997Z ok  	github.com/kcp-dev/kcp/test/e2e/virtual/syncer	0.977s
2023-02-17T11:18:00.5678430Z ok  	github.com/kcp-dev/kcp/test/e2e/watchcache	0.974s
2023-02-17T11:18:00.7017815Z ok  	github.com/kcp-dev/kcp/test/e2e/workspacetype	1.135s
2023-02-17T11:18:00.7088121Z FAIL
2023-02-17T11:18:02.5264217Z make: *** [Makefile:348: test-e2e-sharded] Error 1
2023-02-17T11:18:02.5511521Z I0217 11:18:02.548097   27348 shard.go:242] "gathering shard metrics" shard="kcp-0"
2023-02-17T11:18:02.5541775Z + EXIT_CODE=2

@sttts
Copy link
Member

sttts commented Feb 17, 2023

/retest

@sttts
Copy link
Member

sttts commented Feb 17, 2023

"kcp prometheus scrape target never became healthy" - what does this mean exactly?

@s-urbaniak
Copy link
Contributor Author

looking

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Feb 17, 2023
@s-urbaniak
Copy link
Contributor Author

@ncdc @sttts I removed the hard check which verifies if the target is up. I am not sure why it was not detected (it is present in the scraped metrics) but it's not worth checking it. The only risk we have is that we'll miss metrics in case of e2e tests that run very fast.

@ncdc
Copy link
Member

ncdc commented Feb 17, 2023

Re-tagging as @sttts was happy with it 😄
/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Feb 17, 2023
@openshift-merge-robot openshift-merge-robot merged commit 3730234 into kcp-dev:main Feb 17, 2023
@s-urbaniak s-urbaniak deleted the prometheus-e2e branch February 17, 2023 14:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants