Improve e2e test reliability #2580

Merged on Feb 11, 2022
48 commits
d6b3f05
Reduce concurrent executions from 6 to 5
JorTurFer Jan 29, 2022
2be3910
Update e2e test to 8
JorTurFer Jan 29, 2022
b8f0844
Increase az pipelines timeouts
JorTurFer Jan 29, 2022
d3ec3a0
Increase mysql timeouts
JorTurFer Jan 29, 2022
93409a2
Increase predictkube timeouts
JorTurFer Jan 29, 2022
2f2c965
Increase selenium timeouts
JorTurFer Jan 29, 2022
0327f0f
Increase azure-queue timeouts
JorTurFer Jan 29, 2022
8502c3a
Increase redis-streams timeouts
JorTurFer Jan 30, 2022
ccddc65
Increase open-stack timeouts
JorTurFer Jan 30, 2022
21184ca
Remove unecessary import
JorTurFer Jan 30, 2022
638bb54
modify az pipeline e2e test
JorTurFer Jan 30, 2022
a7e4542
update changelog
JorTurFer Jan 30, 2022
f04baa4
Increase redis-streams timeouts
JorTurFer Jan 30, 2022
36d4ed3
fix style
JorTurFer Jan 30, 2022
0455578
Increase azure-queue timeout
JorTurFer Jan 30, 2022
dc7e0b4
Reduce concurrency on tests
JorTurFer Jan 30, 2022
4ccc689
Increase time for integration test and concurrency on e2e
JorTurFer Jan 30, 2022
c29efc1
reduce e2e concurrency from 4 to 3
JorTurFer Jan 30, 2022
3cef8b8
Incrase concurrency to 6 and azure-queue timeouts
Jan 31, 2022
747a9f8
concurrency to 4 & increase selenium timeouts
Jan 31, 2022
9843bad
Increase some timeouts
JorTurFer Jan 31, 2022
577975a
8 concurrent tests
JorTurFer Jan 31, 2022
d33f82e
Change timeout during integration tests
JorTurFer Jan 31, 2022
66a3d13
Update timeoutz
Feb 1, 2022
e158bb8
test with 2 concurrent test
Feb 1, 2022
9449f15
Increase timeout in an integration test
Feb 1, 2022
5821ce9
set concurrency to 6
Feb 1, 2022
a7fa172
Remove resources from e2e pods
JorTurFer Feb 1, 2022
d4b0c66
Increase cassandra timeouts
JorTurFer Feb 1, 2022
4a69ff6
Increase selenium timeouts
JorTurFer Feb 1, 2022
68b1d98
Increase redis timeouts
JorTurFer Feb 1, 2022
26eec99
Increase ava timeouts
Feb 2, 2022
3cbfe2a
Reduce waiting time selenium
Feb 2, 2022
ebd1447
Update new relic test
Feb 2, 2022
86db63c
set max instances for cassandra test to 2
JorTurFer Feb 2, 2022
7333e2a
Increase argo timeouts
JorTurFer Feb 2, 2022
ddcbb9e
Add some configs to selenium
JorTurFer Feb 2, 2022
899d0b6
undo change in integration test
JorTurFer Feb 2, 2022
4e3a488
Update selenium test
JorTurFer Feb 3, 2022
7a18cdb
Add retry for failing e2e tests
Feb 4, 2022
95338db
Update retry system
Feb 4, 2022
3ce106a
fix and error waiting
Feb 4, 2022
1455c79
FORCE TEST FAILURE
JorTurFer Feb 5, 2022
cf24adf
Undo test failure probe
JorTurFer Feb 5, 2022
162e09a
Fix azure-queue-restore-original-replicas.test.ts
JorTurFer Feb 10, 2022
1621292
Update azure queue tests to avoid condition races
JorTurFer Feb 10, 2022
16ba72d
Fix some styles
Feb 11, 2022
149cada
Improve azure queues
Feb 11, 2022
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -34,7 +34,7 @@

 ### Improvements

-- TODO ([#XXX](https://github.com/kedacore/keda/issue/XXX))
+- Improve e2e tests reliability ([#2580](https://github.com/kedacore/keda/issues/2580))

 ### Breaking Changes

621 changes: 431 additions & 190 deletions config/crd/bases/keda.sh_scaledjobs.yaml

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion tests/package.json
@@ -9,7 +9,7 @@
 "require": [
 "ts-node/register"
 ],
-"timeout": "10m"
+"timeout": "30m"
 },
 "scripts": {
 "test": "ava"
40 changes: 37 additions & 3 deletions tests/run-all.sh
@@ -6,7 +6,7 @@ E2E_REGEX=${E2E_TEST_REGEX:-*.test.ts}
 DIR=$(dirname "$0")
 cd $DIR

-concurrent_tests_limit=5
+concurrent_tests_limit=6
 pids=()
 lookup=()
 failed_count=0
@@ -23,18 +23,52 @@ function run_tests {
 for test_case in $(find scalers -name "$E2E_REGEX" | shuf)
 do
 counter=$((counter+1))
-./node_modules/.bin/ava $test_case > "${test_case}.log" 2>&1 &
+./node_modules/.bin/ava $test_case > "${test_case}.1.log" 2>&1 &
 pid=$!
 echo "Running $test_case with pid: $pid"
 pids+=($pid)
 lookup[$pid]=$test_case
 # limit concurrent runs
-if [[ "$counter" -gt "$concurrent_tests_limit" ]]; then
+if [[ "$counter" -ge "$concurrent_tests_limit" ]]; then
 wait_for_jobs
 counter=0
 pids=()
 fi
 done
+
+wait_for_jobs
+
+# Retry failing tests
+if [ ${#failed_lookup[@]} -ne 0 ]; then
+
+printf "\n\n##############################################\n"
+printf "##############################################\n\n"
+printf "FINISHED FIRST EXECUTION, RETRYING FAILING TESTS"
+printf "\n\n##############################################\n"
+printf "##############################################\n\n"
+
+retry_lookup=("${failed_lookup[@]}")
+counter=0
+pids=()
+failed_count=0
+failed_lookup=()
+
+for test_case in "${retry_lookup[@]}"
+do
+counter=$((counter+1))
+./node_modules/.bin/ava $test_case > "${test_case}.2.log" 2>&1 &
+pid=$!
+echo "Rerunning $test_case with pid: $pid"
+pids+=($pid)
+lookup[$pid]=$test_case
+# limit concurrent runs
+if [[ "$counter" -ge "$concurrent_tests_limit" ]]; then
+wait_for_jobs
+counter=0
+pids=()
+fi
+done
+fi
 }

 function mark_failed {
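
The switch from `-gt` to `-ge` in `run_tests` interacts with the limit change, because `counter` is incremented before the comparison. The following is a minimal TypeScript sketch of that counting logic only. `batchSizes` is a hypothetical function written for illustration; it launches nothing and is not part of this PR.

```typescript
type Comparison = 'gt' | 'ge'

// Hypothetical simulation of the batching loop in run-all.sh: returns how many
// tests would be in flight each time the script stops to wait for its jobs.
function batchSizes(totalTests: number, limit: number, cmp: Comparison): number[] {
  const batches: number[] = []
  let counter = 0
  for (let i = 0; i < totalTests; i++) {
    counter += 1 // the script increments the counter before comparing
    const limitReached = cmp === 'gt' ? counter > limit : counter >= limit
    if (limitReached) {
      batches.push(counter)
      counter = 0
    }
  }
  if (counter > 0) {
    batches.push(counter) // leftovers are picked up by the final wait_for_jobs
  }
  return batches
}

console.log(batchSizes(13, 5, 'gt')) // old settings: [ 6, 6, 1 ]
console.log(batchSizes(13, 6, 'ge')) // new settings: [ 6, 6, 1 ]
console.log(batchSizes(13, 6, 'gt')) // limit 6 with the old comparison: [ 7, 6 ]
```

Under this reading, the old pair (limit 5 with `-gt`) and the new pair (limit 6 with `-ge`) both run six tests per batch; the new comparison simply makes the variable name match the actual batch size.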
6 changes: 0 additions & 6 deletions tests/scalers/activemq.test.ts
@@ -135,12 +135,6 @@ spec:
 name: mqtt
 protocol: TCP
 resources:
-requests:
-memory: 500Mi
-cpu: 200m
-limits:
-memory: 1000Mi
-cpu: 400m
 volumeMounts:
 - name: activemq-config
 mountPath: /opt/apache-activemq-5.16.3/webapps/api/WEB-INF/classes/jolokia-access.xml
12 changes: 6 additions & 6 deletions tests/scalers/argo-rollouts.test.ts
@@ -74,31 +74,31 @@ test.serial(`Rollouts should scale to 5 (the max) with HTTP Requests exceeding i

 // keda based rollout should start scaling up with http requests issued
 let replicaCount = '0'
-for (let i = 0; i < 60 && replicaCount !== '5'; i++) {
-t.log(`Waited ${5 * i} seconds for prometheus-based rollout to scale up`)
+for (let i = 0; i < 120 && replicaCount !== '5'; i++) {
+t.log(`Waited ${10 * i} seconds for prometheus-based rollout to scale up`)
 const jobLogs = sh.exec(`kubectl logs -l job-name=generate-requests -n ${testNamespace}`).stdout
 t.log(`Logs from the generate requests: ${jobLogs}`)

 replicaCount = sh.exec(
 `kubectl get rollouts.argoproj.io/keda-test-app --namespace ${testNamespace} -o jsonpath="{.spec.replicas}"`
 ).stdout
 if (replicaCount !== '5') {
-await sleep(5000)
+await sleep(10000)
 }
 }

 t.is('5', replicaCount, 'Replica count should be maxed at 5')

-for (let i = 0; i < 50 && replicaCount !== '0'; i++) {
+for (let i = 0; i < 90 && replicaCount !== '0'; i++) {
 replicaCount = sh.exec(
 `kubectl get rollouts.argoproj.io/keda-test-app --namespace ${testNamespace} -o jsonpath="{.spec.replicas}"`
 ).stdout
 if (replicaCount !== '0') {
-await sleep(5000)
+await sleep(10000)
 }
 }

-t.is('0', replicaCount, 'Replica count should be 0 after 3 minutes')
+t.is('0', replicaCount, 'Replica count should be 0 after 15 minutes')
 })

 test.after.always.cb('clean up argo-rollouts testing deployment', t => {
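
The new loop bounds in the argo-rollouts test translate into a wait budget of attempts times sleep interval. A small sketch of that arithmetic follows; `maxSleepSeconds` is a hypothetical helper introduced here, and it ignores the time each `kubectl` call takes, so real runs can last longer.

```typescript
// Upper bound on the time spent sleeping in a poll loop that sleeps once
// after every unsuccessful attempt. Hypothetical helper, not part of the PR.
function maxSleepSeconds(attempts: number, intervalMs: number): number {
  return (attempts * intervalMs) / 1000
}

// Scale-up loop: 60 x 5 s before, 120 x 10 s after.
console.log(maxSleepSeconds(60, 5000))   // 300 (5 minutes)
console.log(maxSleepSeconds(120, 10000)) // 1200 (20 minutes)

// Scale-down loop: 50 x 5 s before, 90 x 10 s after.
console.log(maxSleepSeconds(50, 5000))   // 250 (about 4 minutes)
console.log(maxSleepSeconds(90, 10000))  // 900 (15 minutes)
```

The 900-second result agrees with the updated assertion message, "Replica count should be 0 after 15 minutes".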
3 changes: 0 additions & 3 deletions tests/scalers/artemis-helpers.ts
@@ -159,9 +159,6 @@ spec:
 image: docker.io/vromero/activemq-artemis:2.6.2
 imagePullPolicy:
 resources:
-requests:
-cpu: 100m
-memory: 256Mi
 env:
 - name: ARTEMIS_PASSWORD
 valueFrom:
59 changes: 32 additions & 27 deletions tests/scalers/azure-pipelines.test.ts
@@ -15,6 +15,8 @@ const projectName = process.env['AZURE_DEVOPS_PROJECT']
 const buildDefinitionID = process.env['AZURE_DEVOPS_BUILD_DEFINITON_ID']
 const poolName = process.env['AZURE_DEVOPS_POOL_NAME']

+let poolID: number
+
 test.before(async t => {
 if (!organizationURL || !personalAccessToken || !projectName || !buildDefinitionID || !poolName) {
 t.fail('AZURE_DEVOPS_ORGANIZATION_URL, AZURE_DEVOPS_PAT, AZURE_DEVOPS_PROJECT, AZURE_DEVOPS_BUILD_DEFINITON_ID and AZURE_DEVOPS_POOL_NAME environment variables are required for azure pipelines tests')
@@ -25,7 +27,7 @@ test.before(async t => {

 let taskAgent: ta.ITaskAgentApiBase = await connection.getTaskAgentApi();
 let agentPool: ti.TaskAgentPool[] = await taskAgent.getAgentPools(poolName)
-let poolID: number = agentPool[0].id
+poolID = agentPool[0].id

 if(!poolID) {
 t.fail("failed to convert poolName to poolID")
@@ -40,38 +42,43 @@ test.before(async t => {
 .replace('{{AZP_URL}}', organizationURL))
 sh.exec(`kubectl create namespace ${defaultNamespace}`)
 t.is(0, sh.exec(`kubectl apply -f ${deployFile.name} --namespace ${defaultNamespace}`).code, 'creating a deployment should work.')
+})
+
+test.serial('Deployment should have 1 replicas on start', async t => {
+t.true(await waitForDeploymentReplicaCount(1, 'test-deployment', defaultNamespace, 120, 1000), 'replica count should start out as 1')
+})
+
+
+test.serial('Deployment should have 0 replicas after scale', async t => {
+// wait for the first agent to be registered in the agent pool
+await sleep(20 * 1000)
+
 const scaledObjectFile = tmp.fileSync()
 fs.writeFileSync(scaledObjectFile.name, poolIdScaledObject
 .replace('{{AZP_POOL_ID}}', poolID.toString()))
 t.is(0, sh.exec(`kubectl apply -f ${scaledObjectFile.name} --namespace ${defaultNamespace}`).code, 'creating ScaledObject with poolId should work.')
-})

-test.serial('Deployment should have 1 replicas on start', async t => {
-t.true(await waitForDeploymentReplicaCount(1, 'test-deployment', defaultNamespace, 120, 1000), 'replica count should start out as 1')
+t.true(await waitForDeploymentReplicaCount(0, 'test-deployment', defaultNamespace, 120, 1000), 'replica count should be 0 if no pending jobs')
 })

-test.serial('PoolID: Deployment should scale to 3 replicas after queueing 3 jobs', async t => {
+
+test.serial('PoolID: Deployment should scale to 1 replica after queueing job', async t => {
 let authHandler = azdev.getPersonalAccessTokenHandler(personalAccessToken);
 let connection = new azdev.WebApi(organizationURL, authHandler);
 let build: ba.IBuildApi = await connection.getBuildApi();
 var definitionID = parseInt(buildDefinitionID)

-// wait for the first agent to be registered in the agent pool
-await sleep(20 * 1000)
+await build.queueBuild(null, projectName, null, null, null, definitionID)

-for(let i = 0; i < 3; i++) {
-await build.queueBuild(null, projectName, null, null, null, definitionID)
-}
-
-t.true(await waitForDeploymentReplicaCount(3, 'test-deployment', defaultNamespace, 30, 5000), 'replica count should be 3 after starting 3 jobs')
+t.true(await waitForDeploymentReplicaCount(1, 'test-deployment', defaultNamespace, 30, 5000), 'replica count should be 1 after starting a job')
 })

-test.serial('PoolID: Deployment should scale to 1 replica after finishing 3 jobs', async t => {
+test.serial('PoolID: Deployment should scale to 0 replicas after finishing job', async t => {
 // wait 10 minutes for the jobs to finish and scale down
-t.true(await waitForDeploymentReplicaCount(1, 'test-deployment', defaultNamespace, 60, 10000), 'replica count should be 1 after finishing 3 jobs')
+t.true(await waitForDeploymentReplicaCount(0, 'test-deployment', defaultNamespace, 120, 10000), 'replica count should be 0 after finishing')
 })

-test.serial('PoolName: Deployment should scale to 3 replicas after queueing 3 jobs', async t => {
+test.serial('PoolName: Deployment should scale to 1 replica after queueing job', async t => {
 const poolNameScaledObjectFile = tmp.fileSync()
 fs.writeFileSync(poolNameScaledObjectFile.name, poolNameScaledObject
 .replace('{{AZP_POOL}}', poolName))
@@ -82,16 +89,14 @@ test.serial('PoolName: Deployment should scale to 3 replicas after queueing 3 jo
 let build: ba.IBuildApi = await connection.getBuildApi();
 var definitionID = parseInt(buildDefinitionID)

-for(let i = 0; i < 3; i++) {
-await build.queueBuild(null, projectName, null, null, null, definitionID)
-}
+await build.queueBuild(null, projectName, null, null, null, definitionID)

-t.true(await waitForDeploymentReplicaCount(3, 'test-deployment', defaultNamespace, 30, 5000), 'replica count should be 3 after starting 3 jobs')
+t.true(await waitForDeploymentReplicaCount(1, 'test-deployment', defaultNamespace, 30, 5000), 'replica count should be 1 after starting a job')
 })

-test.serial('PoolName: should scale to 1 replica after finishing 3 jobs', async t => {
+test.serial('PoolName: should scale to 0 replicas after finishing job', async t => {
 // wait 10 minutes for the jobs to finish and scale down
-t.true(await waitForDeploymentReplicaCount(1, 'test-deployment', defaultNamespace, 60, 10000), 'replica count should be 1 after finishing 3 jobs')
+t.true(await waitForDeploymentReplicaCount(0, 'test-deployment', defaultNamespace, 120, 10000), 'replica count should be 0 after finishing')
 })

 test.after.always('clean up azure-pipelines deployment', t => {
@@ -157,9 +162,9 @@ metadata:
 spec:
 scaleTargetRef:
 name: test-deployment
-minReplicaCount: 1
-maxReplicaCount: 3
-pollingInterval: 50
+minReplicaCount: 0
+maxReplicaCount: 1
+pollingInterval: 30
 cooldownPeriod: 60
 advanced:
 horizontalPodAutoscalerConfig:
@@ -179,9 +184,9 @@ metadata:
 spec:
 scaleTargetRef:
 name: test-deployment
-minReplicaCount: 1
-maxReplicaCount: 3
-pollingInterval: 50
+minReplicaCount: 0
+maxReplicaCount: 1
+pollingInterval: 30
 cooldownPeriod: 60
 advanced:
 horizontalPodAutoscalerConfig:
48 changes: 20 additions & 28 deletions tests/scalers/azure-queue-restore-original-replicas.test.ts
@@ -1,16 +1,23 @@
+import * as azure from 'azure-storage'
 import * as fs from 'fs'
 import * as sh from 'shelljs'
 import * as tmp from 'tmp'
 import test from 'ava'
+import {waitForDeploymentReplicaCount} from "./helpers";

 const defaultNamespace = 'azure-queue-restore-original-replicas-test'
+const queueName = 'queue-name-restore'
 const connectionString = process.env['TEST_STORAGE_CONNECTION_STRING']

 test.before(t => {
 if (!connectionString) {
 t.fail('TEST_STORAGE_CONNECTION_STRING environment variable is required for queue tests')
 }

+const queueSvc = azure.createQueueService(connectionString)
+queueSvc.messageEncoder = new azure.QueueMessageEncoder.TextBase64QueueMessageEncoder()
+queueSvc.createQueueIfNotExists(queueName, _ => {})
+
 sh.config.silent = true
 const base64ConStr = Buffer.from(connectionString).toString('base64')
 const tmpFile = tmp.fileSync()
@@ -23,11 +30,8 @@ test.before(t => {
 )
 })

-test.serial('Deployment should have 2 replicas on start', t => {
-const replicaCount = sh.exec(
-`kubectl get deployment.apps/test-deployment --namespace ${defaultNamespace} -o jsonpath="{.spec.replicas}"`
-).stdout
-t.is(replicaCount, '2', 'replica count should start out as 2')
+test.serial('Deployment should have 2 replicas on start', async t => {
+t.true(await waitForDeploymentReplicaCount(2, 'test-deployment', defaultNamespace, 15, 1000), 'replica count should be 2 after 15 seconds')
 })

 test.serial('Creating ScaledObject should work', t => {
@@ -44,34 +48,16 @@ test.serial('Creating ScaledObject should work', t => {

 test.serial(
 'Deployment should scale to 0 and then shold be back to 2 after deletion of ScaledObject',
-t => {
-let replicaCount = '100'
-for (let i = 0; i < 50 && replicaCount !== '0'; i++) {
-replicaCount = sh.exec(
-`kubectl get deployment.apps/test-deployment --namespace ${defaultNamespace} -o jsonpath="{.spec.replicas}"`
-).stdout
-if (replicaCount !== '0') {
-sh.exec('sleep 5s')
-}
-}
-t.is('0', replicaCount, 'Replica count should be 0')
-
+async t => {
+t.true(await waitForDeploymentReplicaCount(0, 'test-deployment', defaultNamespace, 120, 1000), 'replica count should be 0 after 2 minutes')

 t.is(
 0,
 sh.exec(`kubectl delete scaledobject.keda.sh/test-scaledobject --namespace ${defaultNamespace}`).code,
 'deletion of ScaledObject should work.'
 )

-for (let i = 0; i < 50 && replicaCount !== '2'; i++) {
-replicaCount = sh.exec(
-`kubectl get deployment.apps/test-deployment --namespace ${defaultNamespace} -o jsonpath="{.spec.replicas}"`
-).stdout
-if (replicaCount !== '2') {
-sh.exec('sleep 5s')
-}
-}
-t.is('2', replicaCount, 'Replica count should be back at orignal 2')
+t.true(await waitForDeploymentReplicaCount(2, 'test-deployment', defaultNamespace, 120, 1000), 'replica count should be 2 after 2 minutes')
 }
 )

@@ -86,7 +72,13 @@ test.after.always.cb('clean up azure-queue deployment', t => {
 sh.exec(`kubectl delete ${resource} --namespace ${defaultNamespace}`)
 }
 sh.exec(`kubectl delete namespace ${defaultNamespace}`)
-t.end()
+
+// delete test queue
+const queueSvc = azure.createQueueService(connectionString)
+queueSvc.deleteQueueIfExists(queueName, err => {
+t.falsy(err, 'should delete test queue successfully')
+t.end()
+})
 })

 const deployYaml = `apiVersion: v1
@@ -145,5 +137,5 @@ spec:
 triggers:
 - type: azure-queue
 metadata:
-queueName: queue-name
+queueName: ${queueName}
 connectionFromEnv: AzureWebJobsStorage`
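
The rewritten tests replace the hand-rolled `kubectl` loops with `waitForDeploymentReplicaCount(target, name, namespace, iterations, interval)` from `./helpers`, whose implementation does not appear in this diff. The sketch below shows one way such a bounded polling helper could look. `waitForValue` and its `read` callback are hypothetical names, and the assumption that the helper checks first and sleeps only after a miss is mine, not confirmed by the source.

```typescript
const sleep = (ms: number): Promise<void> =>
  new Promise(resolve => setTimeout(resolve, ms))

// Hypothetical bounded poll: calls `read` up to `iterations` times and
// resolves to true as soon as it returns `target`, false if it never does.
async function waitForValue<T>(
  read: () => T,
  target: T,
  iterations: number,
  intervalMs: number
): Promise<boolean> {
  for (let i = 0; i < iterations; i++) {
    if (read() === target) {
      return true
    }
    await sleep(intervalMs) // assumed: sleep only after an unsuccessful check
  }
  return false
}

// Usage with a fake reader that counts down 2, 1, 0 instead of calling kubectl.
let replicas = 3
waitForValue(() => --replicas, 0, 5, 10).then(reached => {
  console.log(`reached target: ${reached}`) // reached target: true
})
```

Returning a boolean rather than throwing lets the caller wrap the wait in a single `t.true(...)` with a descriptive message, which is the pattern the updated tests use.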
7 changes: 5 additions & 2 deletions tests/scalers/azure-queue-trigger-auth.test.ts
@@ -7,7 +7,7 @@ import test from 'ava'
 import {waitForDeploymentReplicaCount} from "./helpers";

 const testNamespace = 'azure-queue-auth-test'
-const queueName = 'queue-name'
+const queueName = 'queue-name-trigger'
 const connectionString = process.env['TEST_STORAGE_CONNECTION_STRING']

 test.before(async t => {
@@ -44,7 +44,10 @@ test.serial(
 )

 // Scaling out when messages available
-t.true(await waitForDeploymentReplicaCount(1, 'test-deployment', testNamespace, 60, 1000), 'replica count should be 3 after 1 minute')
+t.true(await waitForDeploymentReplicaCount(1, 'test-deployment', testNamespace, 60, 1000), 'replica count should be 1 after 1 minute')
+
+queueSvc.clearMessages(queueName, _ => {})
+
 // Scaling in when no available messages
 t.true(await waitForDeploymentReplicaCount(0, 'test-deployment', testNamespace, 300, 1000), 'replica count should be 0 after 5 minute')
 }