Commit a9de745

Automate longevity test
Problem: NFR tests are a burden to run manually, taking a lot of time and effort.

Solution: Automate the longevity test to make it easier and faster for a developer to run. Because it is long-lived, this test is run separately from the other NFR tests and should not be run in the pipeline. There is still a manual step of collecting dashboard results.

Also separated the functional and NFR tests in the Makefile and README to better distinguish the two types of tests. These changes force NFR tests to be run in a GKE environment.
1 parent 2c8f750 commit a9de745

35 files changed (+389 -237 lines changed)

Diff for: .gitignore (+3)

@@ -51,3 +51,6 @@ internal/mode/static/nginx/modules/coverage
 
 # Credential files
 **/gha-creds-*.json
+
+# SSH config files
+*.ssh

Diff for: .yamllint.yaml (+1 -1)

@@ -41,7 +41,7 @@ rules:
 .github/
 deploy/manifests/nginx-gateway.yaml
 deploy/manifests/crds
-tests/longevity/manifests/cronjob.yaml
+tests/suite/manifests/longevity/cronjob.yaml
 .goreleaser.yml
 new-line-at-end-of-file: enable
 new-lines: enable

Diff for: tests/Makefile (+53 -25)

@@ -32,6 +32,10 @@ help: Makefile ## Display this help
 create-kind-cluster: ## Create a kind cluster
 	cd .. && make create-kind-cluster
 
+.PHONY: delete-kind-cluster
+delete-kind-cluster: ## Delete kind cluster
+	kind delete cluster
+
 .PHONY: build-images
 build-images: ## Build NGF and NGINX images
 	cd .. && make PREFIX=$(PREFIX) TAG=$(TAG) build-images
@@ -48,51 +52,75 @@ load-images: ## Load NGF and NGINX images on configured kind cluster
 load-images-with-plus: ## Load NGF and NGINX Plus images on configured kind cluster
 	cd .. && make PREFIX=$(PREFIX) TAG=$(TAG) load-images-with-plus
 
-test: ## Run the system tests against your default k8s cluster
-	go test -v ./suite $(GINKGO_FLAGS) -args --gateway-api-version=$(GW_API_VERSION) \
-		--gateway-api-prev-version=$(GW_API_PREV_VERSION) --image-tag=$(TAG) --version-under-test=$(NGF_VERSION) \
-		--plus-enabled=$(PLUS_ENABLED) --ngf-image-repo=$(PREFIX) --nginx-image-repo=$(NGINX_PREFIX) \
-		--pull-policy=$(PULL_POLICY) --k8s-version=$(K8S_VERSION) --service-type=$(GW_SERVICE_TYPE) \
-		--is-gke-internal-lb=$(GW_SVC_GKE_INTERNAL)
+.PHONY: setup-gcp-and-run-tests
+setup-gcp-and-run-tests: create-gke-router create-and-setup-vm run-tests-on-vm ## Create and setup a GKE router and GCP VM for tests and run the functional tests
 
-.PHONY: delete-kind-cluster
-delete-kind-cluster: ## Delete kind cluster
-	kind delete cluster
+.PHONY: setup-gcp-and-run-nfr-tests
+setup-gcp-and-run-nfr-tests: create-gke-router create-and-setup-vm nfr-test ## Create and setup a GKE router and GCP VM for tests and run the NFR tests
 
-.PHONY: run-tests-on-vm
-run-tests-on-vm: ## Run the tests on a GCP VM
-	bash scripts/run-tests-gcp-vm.sh
+.PHONY: create-gke-cluster
+create-gke-cluster: ## Create a GKE cluster
+	bash scripts/create-gke-cluster.sh $(CI)
 
 .PHONY: create-and-setup-vm
 create-and-setup-vm: ## Create and setup a GCP VM for tests
 	bash scripts/create-and-setup-gcp-vm.sh
 
-.PHONY: cleanup-vm
-cleanup-vm: ## Delete the test GCP VM and delete the firewall rule
-	bash scripts/cleanup-vm.sh
-
 .PHONY: create-gke-router
 create-gke-router: ## Create a GKE router to allow egress traffic from private nodes (allows for external image pulls)
 	bash scripts/create-gke-router.sh
 
-.PHONY: cleanup-router
-cleanup-router: ## Delete the GKE router
-	bash scripts/cleanup-router.sh
+.PHONY: sync-files-to-vm
+sync-files-to-vm: ## Syncs your local NGF files with the NGF repo on the VM
+	bash scripts/sync-files-to-vm.sh
 
-.PHONY: setup-gcp-and-run-tests
-setup-gcp-and-run-tests: create-gke-router create-and-setup-vm run-tests-on-vm ## Create and setup a GKE router and GCP VM for tests and run the tests
+.PHONY: run-tests-on-vm
+run-tests-on-vm: ## Run the functional tests on a GCP VM
+	bash scripts/run-tests-gcp-vm.sh
+
+.PHONY: nfr-test
+nfr-test: ## Run the NFR tests on a GCP VM
+	bash scripts/run-tests-gcp-vm.sh true
+
+.PHONY: start-longevity-test
+start-longevity-test: ## Start the longevity test to run for 4 days in GKE
+	START_LONGEVITY=true $(MAKE) nfr-test
+
+.PHONY: stop-longevity-test
+stop-longevity-test: ## Stops the longevity test and collects results
+	STOP_LONGEVITY=true $(MAKE) nfr-test
+
+.PHONY: .vm-nfr-test
+.vm-nfr-test: ## Runs the NFR tests on the GCP VM (called by `nfr-test`)
+	go test -v ./suite -ginkgo.label-filter "nfr" $(GINKGO_FLAGS) -ginkgo.v -args --gateway-api-version=$(GW_API_VERSION) \
+		--gateway-api-prev-version=$(GW_API_PREV_VERSION) --image-tag=$(TAG) --version-under-test=$(NGF_VERSION) \
+		--plus-enabled=$(PLUS_ENABLED) --ngf-image-repo=$(PREFIX) --nginx-image-repo=$(NGINX_PREFIX) \
+		--pull-policy=$(PULL_POLICY) --k8s-version=$(K8S_VERSION) --service-type=$(GW_SERVICE_TYPE) \
+		--is-gke-internal-lb=$(GW_SVC_GKE_INTERNAL)
+
+.PHONY: test
+test: ## Runs the functional tests on your default k8s cluster
+	go test -v ./suite -ginkgo.label-filter "functional" $(GINKGO_FLAGS) -args --gateway-api-version=$(GW_API_VERSION) \
+		--gateway-api-prev-version=$(GW_API_PREV_VERSION) --image-tag=$(TAG) --version-under-test=$(NGF_VERSION) \
+		--plus-enabled=$(PLUS_ENABLED) --ngf-image-repo=$(PREFIX) --nginx-image-repo=$(NGINX_PREFIX) \
+		--pull-policy=$(PULL_POLICY) --k8s-version=$(K8S_VERSION) --service-type=$(GW_SERVICE_TYPE) \
+		--is-gke-internal-lb=$(GW_SVC_GKE_INTERNAL)
 
 .PHONY: cleanup-gcp
 cleanup-gcp: cleanup-router cleanup-vm delete-gke-cluster ## Cleanup all GCP resources
 
-.PHONY: create-gke-cluster
-create-gke-cluster: ## Create a GKE cluster
-	bash scripts/create-gke-cluster.sh $(CI)
+.PHONY: cleanup-router
+cleanup-router: ## Delete the GKE router
+	bash scripts/cleanup-router.sh
+
+.PHONY: cleanup-vm
+cleanup-vm: ## Delete the test GCP VM and delete the firewall rule
+	bash scripts/cleanup-vm.sh
 
 .PHONY: delete-gke-cluster
 delete-gke-cluster: ## Delete the GKE cluster
 	bash scripts/delete-gke-cluster.sh
 
 .PHONY: add-local-ip-to-cluster
 add-local-ip-to-cluster: ## Add local IP to the GKE cluster master-authorized-networks
-	bash scripts/add-local-ip-to-cluster.sh
+	bash scripts/add-local-ip-auth-networks.sh

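The `start-longevity-test` and `stop-longevity-test` targets both call `nfr-test` and differ only in the environment variable they set. The commit does not show how that variable is consumed further down, so the following is only a minimal sketch of one way a Go test helper could interpret the two flags. The function `longevityPhase` and the phase names are hypothetical and not taken from the repository.

```go
package main

import (
	"fmt"
	"os"
)

// longevityPhase maps the START_LONGEVITY / STOP_LONGEVITY variables set by
// the Makefile targets to a phase name. The function and the phase names are
// assumptions made for illustration; the commit does not show this logic.
func longevityPhase(getenv func(string) string) (string, error) {
	start := getenv("START_LONGEVITY") == "true"
	stop := getenv("STOP_LONGEVITY") == "true"

	switch {
	case start && stop:
		// The two Makefile targets never set both, so treat it as an error.
		return "", fmt.Errorf("START_LONGEVITY and STOP_LONGEVITY are both set")
	case start:
		return "start", nil
	case stop:
		return "stop", nil
	default:
		return "none", nil
	}
}

func main() {
	phase, err := longevityPhase(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("longevity phase:", phase)
}
```

Taking the lookup function as a parameter keeps the helper testable without touching the real process environment.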
Diff for: tests/README.md (+79 -16)

@@ -4,33 +4,36 @@ The tests in this directory are meant to be run on a live Kubernetes environment
 are similar to the existing [conformance tests](../conformance/README.md), but will verify things such as:
 
 - NGF-specific functionality
-- Non-Functional requirements testing (such as performance, scale, etc.)
+- Non-Functional requirements (NFR) testing (such as performance, scale, etc.)
 
 When running locally, the tests create a port-forward from your NGF Pod to localhost using a port chosen by the
 test framework. Traffic is sent over this port. If running on a GCP VM targeting a GKE cluster, the tests will create an
 internal LoadBalancer service which will receive the test traffic.
 
+**Important**: NFR tests can only be run on a GKE cluster.
+
 Directory structure is as follows:
 
 - `framework`: contains utility functions for running the tests
-- `suite`: contains the test files
 - `results`: contains the results files
+- `scripts`: contain scripts used to set up the environment and run the tests
+- `suite`: contains the test files
 
-**Note**: Existing NFR tests will be migrated into this testing `suite` and results stored in the `results` directory.
+> Note: Existing NFR tests will be migrated into this testing `suite` and results stored in the `results` directory.
 
 ## Prerequisites
 
 - Kubernetes cluster.
 - Docker.
 - Golang.
 
-If running the tests on a VM (`make create-vm-and-run-tests` or `make run-tests-on-vm`):
+If running NFR tests, or running functional tests in GKE:
 
 - The [gcloud CLI](https://cloud.google.com/sdk/docs/install)
 - A GKE cluster (if `master-authorized-networks` is enabled, please set `ADD_VM_IP_AUTH_NETWORKS=true` in your vars.env file)
 - Access to GCP Service Account with Kubernetes admin permissions
 
-**Note**: all commands in steps below are executed from the `tests` directory
+> Note: all commands in steps below are executed from the `tests` directory
 
 ```shell
 make
@@ -52,9 +55,14 @@ delete-kind-cluster Delete kind cluster
 help                           Display this help
 load-images-with-plus          Load NGF and NGINX Plus images on configured kind cluster
 load-images                    Load NGF and NGINX images on configured kind cluster
-run-tests-on-vm                Run the tests on a GCP VM
-setup-gcp-and-run-tests        Create and setup a GKE router and GCP VM for tests and run the tests
-test                           Run the system tests against your default k8s cluster
+nfr-test                       Run the NFR tests on a GCP VM
+run-tests-on-vm                Run the functional tests on a GCP VM
+setup-gcp-and-run-nfr-tests    Create and setup a GKE router and GCP VM for tests and run the NFR tests
+setup-gcp-and-run-tests        Create and setup a GKE router and GCP VM for tests and run the functional tests
+start-longevity-test           Start the longevity test to run for 4 days in GKE
+stop-longevity-test            Stops the longevity test and collects results
+sync-files-to-vm               Syncs your local NGF files with the NGF repo on the VM
+test                           Runs the functional tests on your default k8s cluster
 ```
 
 **Note:** The following variables are configurable when running the below `make` commands:
@@ -78,6 +86,8 @@ test                           Run the system tests against your default k8s clu
 
 This can be done in a cloud provider of choice, or locally using `kind`.
 
+**Important**: NFR tests can only be run on a GKE cluster.
+
 To create a local `kind` cluster:
 
 ```makefile
@@ -128,7 +138,7 @@ make build-images-with-plus load-images-with-plus TAG=$(whoami)
 
 ## Step 3 - Run the tests
 
-### 3a - Run the tests locally
+### 3a - Run the functional tests locally
 
 ```makefile
 make test TAG=$(whoami)
@@ -142,9 +152,9 @@ make test TAG=$(whoami) PLUS_ENABLED=true
 
 ### 3b - Run the tests on a GKE cluster from a GCP VM
 
-This step only applies if you would like to run the tests on a GKE cluster from a GCP based VM.
+This step only applies if you are running the NFR tests, or would like to run the functional tests on a GKE cluster from a GCP based VM.
 
-Before running the below `make` command, copy the `scripts/vars.env-example` file to `scripts/vars.env` and populate the
+Before running the below `make` commands, copy the `scripts/vars.env-example` file to `scripts/vars.env` and populate the
 required env vars. `GKE_SVC_ACCOUNT` needs to be the name of a service account that has Kubernetes admin permissions.
 
 In order to run the tests in GCP, you need a few things:
@@ -153,30 +163,81 @@ In order to run the tests in GCP, you need a few things:
 - this assumes that your GKE cluster is using private nodes. If using public nodes, you don't need this.
 - GCP VM and firewall rule to send ingress traffic to GKE
 
+To just set up the VM with no router (this will not run the tests):
+
+```makefile
+make create-and-setup-vm
+```
+
+Otherwise, you can set up the VM, router, and run the tests with a single command. See the options in the sections below.
+
+By default, the tests run using the version of NGF that was `git cloned` during the setup. If you want to make
+incremental changes and copy your local changes to the VM to test, you can run
+
+```makefile
+make sync-files-to-vm
+```
+
+#### Functional Tests
+
 To set up the GCP environment with the router and VM and then run the tests, run the following command:
 
 ```makefile
 make setup-gcp-and-run-tests
 ```
 
-If you just need a VM and no router (this will not run the tests):
+To use an existing VM to run the tests, run the following
 
 ```makefile
-make create-and-setup-vm
+make run-tests-on-vm
+```
+
+#### NFR tests
+
+To set up the GCP environment with the router and VM and then run the tests, run the following command:
+
+
+```makefile
+make setup-gcp-and-run-nfr-tests
 ```
 
 To use an existing VM to run the tests, run the following
 
 ```makefile
-make run-tests-on-vm
+make nfr-test
 ```
 
+##### Longevity testing
+
+This test is run on its own (and also not in a pipeline) due to its long-running nature. It will run for 4 days before
+the tester must collect the results and complete the test.
+
+To start the longevity test, set up your VM (`create-and-setup-vm`) and run
+
+```makefile
+make start-longevity-test
+```
+
+> Note: If you want to re-run the longevity test, you need to clear out the `cafe.example.com` entry from the `/etc/hosts` file on your VM.
+
+You can verify the test is working by checking nginx logs to see traffic flow, and check that the cronjob is running and redeploying apps.
+
+To complete the longevity test and collect results, first visit the [GCP Monitoring Dashboards](https://console.cloud.google.com/monitoring/dashboards) page and select the `NGF Longevity Test` dashboard. Take PNG screenshots of each chart for the time period in which your test ran, and save those to be added to the results file.
+
+Next, run:
+
+```makefile
+make stop-longevity-test
+```
+
+This will tear down the test and collect results into a file, where you can add the PNGs of the dashboard.
+
 ### Common test amendments
 
-To run all tests with the label "performance", use the GINKGO_LABEL variable:
+To run all tests with the label "my-label", use the GINKGO_LABEL variable:
 
 ```makefile
-make test TAG=$(whoami) GINKGO_LABEL=performance
+make test TAG=$(whoami) GINKGO_LABEL=my-label
 ```
 
 or to pass a specific flag, e.g. run a specific test, use the GINKGO_FLAGS variable:
@@ -185,6 +246,8 @@ or to pass a specific flag, e.g. run a specific test, use the GINKGO_FLAGS varia
 make test TAG=$(whoami) GINKGO_FLAGS='-ginkgo.focus "writes the system info to a results file"'
 ```
 
+> Note: if filtering on NFR tests (or functional tests on GKE), set the filter in the appropriate field in your `vars.env` file.
+
 If you are running the tests in GCP, add your required label/ flags to `scripts/var.env`.
 
 You can also modify the tests code for a similar outcome. To run a specific test, you can "focus" it by adding the `F`

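The README change notes that re-running the longevity test requires clearing the `cafe.example.com` entry from `/etc/hosts` on the VM. The commit leaves this as a manual step. As a rough sketch of what the cleanup amounts to, the hypothetical helper below filters hosts-file content held in a string. The function name `removeHostEntry` and the sample address are assumptions for illustration, and the sketch deliberately does not touch the real file.

```go
package main

import (
	"fmt"
	"strings"
)

// removeHostEntry returns hosts-file content without the lines that map the
// given hostname. Blank lines and comment-only lines are kept unchanged.
// Hypothetical helper for illustration; it is not part of the commit.
func removeHostEntry(content, hostname string) string {
	var kept []string
	for _, line := range strings.Split(content, "\n") {
		// Ignore a trailing comment when looking for hostnames.
		entry, _, _ := strings.Cut(line, "#")
		fields := strings.Fields(entry)

		drop := false
		// fields[0] is the address; every later field is a hostname or alias.
		for i := 1; i < len(fields); i++ {
			if fields[i] == hostname {
				drop = true
				break
			}
		}
		if !drop {
			kept = append(kept, line)
		}
	}
	return strings.Join(kept, "\n")
}

func main() {
	// Sample content; the 10.0.0.5 address is made up for the example.
	hosts := "127.0.0.1 localhost\n10.0.0.5 cafe.example.com\n"
	fmt.Print(removeHostEntry(hosts, "cafe.example.com"))
}
```

Matching on whole fields rather than substrings avoids dropping unrelated entries such as `mycafe.example.com`.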
Diff for: tests/framework/results.go (+9)

@@ -77,6 +77,15 @@ func WriteResults(resultsFile *os.File, metrics *Metrics) error {
 	return reporter.Report(resultsFile)
 }
 
+// WriteContent writes basic content to the results file.
+func WriteContent(resultsFile *os.File, content string) error {
+	if _, err := fmt.Fprintln(resultsFile, content); err != nil {
+		return err
+	}
+
+	return nil
+}
+
 // NewCSVEncoder returns a vegeta CSV encoder.
 func NewCSVEncoder(w io.Writer) vegeta.Encoder {
 	return vegeta.NewCSVEncoder(w)

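For reference, a minimal usage sketch of the new `WriteContent` helper. The function body is copied from the diff so the example is self-contained. The temporary file, the `writeAndRead` wrapper, and the heading text are assumptions for illustration and do not reflect how the test suite names or structures its results files.

```go
package main

import (
	"fmt"
	"os"
)

// WriteContent is copied from the diff above: it writes the content followed
// by a newline to the results file.
func WriteContent(resultsFile *os.File, content string) error {
	if _, err := fmt.Fprintln(resultsFile, content); err != nil {
		return err
	}

	return nil
}

// writeAndRead writes content to a temporary results file and returns what
// ended up on disk. Hypothetical wrapper used only for this example.
func writeAndRead(content string) (string, error) {
	f, err := os.CreateTemp("", "results-*.md")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	if err := WriteContent(f, content); err != nil {
		return "", err
	}

	data, err := os.ReadFile(f.Name())
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	out, err := writeAndRead("## Longevity test results")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%q\n", out)
}
```

Because `WriteContent` uses `fmt.Fprintln`, each call ends with a newline, so successive calls produce one line per call.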