Add CI multicluster + OIDC environment #2619

antgamdia · 2021-03-29T14:32:47Z

Description of the change

(note: this PR supersedes #2610. It's better to split the multicluster environment and the general improvements)

This PR updates our integration tests to use the multicluster environment so that we can detect issues and errors (as proposed in #2376).

The environment is based on the one we already use for dev purposes, that is: dex/openldap + two kind clusters.
There are also a couple of tweaks, because the integration tests run inside a container: the ingress URL is not available and there are some cookie-related issues.

Benefits

Testing Kubeapps in multicluster environments with OIDC.

Possible drawbacks

As the complexity of the CI increases, the flakiness of the overall CI system might also increase.

Applicable issues

fixes CI doesn't test deployment to other clusters so it breaks. #2376

Additional information

This PR depends upon #2618 to be merged.

As an improvement, we can also consider including the OIDC via Pinniped. It shouldn't be a big deal since all the multicluster effort is already done in this PR.

andresmgot · 2021-03-29T15:55:25Z

integration/use-cases/create-private-registry.js

  );

  // wait for the loading msg to disappear
-  await page.waitForFunction(() => !document.querySelector(".margin-t-xxl"));
+  await page.waitForFunction(() => !document.querySelector(".margin-t-xxl cds-progress-circle"));


you don't need to use the class margin-t-xxl

I still think this is not the best way of doing this though; the progress circle may appear and disappear if things are re-requested.

The expect(page).toClick(... should already wait for the loading gif to disappear

andresmgot

Thanks @antgamdia. This is looking good, just some minor comments.

andresmgot · 2021-03-30T10:22:19Z

integration/use-cases/add-multicluster-deployment.js

+    page,
+    process.env.USE_MULTICLUSTER_OIDC_ENV,
+    "/",
+    process.env.ADMIN_TOKEN,


I would not add this process.env.ADMIN_TOKEN (just to make it clear that this test won't work with a token)

andresmgot · 2021-03-30T10:24:43Z

integration/use-cases/create-private-registry.js

+  // wait for the loading msg to disappear
+  await page.waitForFunction(() => !document.querySelector("cds-progress-circle"));


Unless there is a reason that waiting for the button doesn't work, I wouldn't add these kinds of waits; they just make the tests more difficult to maintain.

andresmgot · 2021-03-30T10:25:17Z

integration/use-cases/create-private-registry.js

  try {
    // TODO(andresmgot): Remove this line once 2.3 is released
    await expect(page).toClick("cds-button", { text: "Add new credentials" });
-  } catch(e) {
-    await expect(page).toClick(".btn-info-outline", { text: "Add new credentials" });
+  } catch (e) {
+    await expect(page).toClick(".btn-info-outline", {
+      text: "Add new credentials",
+    });
  }


we can remove this try-catch now

When testing in my fork I didn't build the new CI images (for saving time), and I would still need these lines. Now they can be safely removed, thanks for the reminder!

you forgot to remove the try-catch?

No, I did remove them. However, the CI started failing since the selectors weren't being found, so I reverted the change.

https://app.circleci.com/pipelines/github/kubeapps/kubeapps/2614/workflows/dee98a52-3d20-47a6-9a76-a1fa16351eaf/jobs/42596/artifacts

FAIL use-cases/operator-deployment.js (108.546 s)
RUNS ...
● Deploys an Operator
TimeoutError: Element div.modal-dialog.modal-md > div > div.modal-body > div > div > cds-button:nth-child(2) (text: "Delete") not found
waiting for function failed: timeout 30000ms exceeded

But, since the tests were failing for other reasons, I'm going to trigger a new build and see what happens.
If it is still failing, I guess it will be something related to the built image.

andresmgot · 2021-03-30T10:25:50Z

integration/use-cases/create-private-registry.js

  try {
    // TODO(andresmgot): Remove this line once 2.3 is released
-    await expect(page).toClick(".secondary-input cds-button", { text: "Submit" });
-  } catch(e) {
+    await expect(page).toClick(".secondary-input cds-button", {
+      text: "Submit",
+    });
+  } catch (e) {
    await expect(page).toClick(".btn-info-outline", { text: "Submit" });


same, we can remove the try-catch

andresmgot · 2021-03-30T10:30:17Z

integration/use-cases/missing-permissions.js

+  // wait until load
+  await page.evaluate(() => {
+    [...document.querySelectorAll(".kubeapps-dropdown-header")].find(element =>
+      element.outerText.includes("Current Context"),
+    );
+  });


it would be better for this to wait for the text "apache" to appear
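For context on this suggestion: the `page.evaluate` callback in the diff above has no `return`, and it runs only once, so it does not block until the page has loaded. A minimal sketch of a polling wait is shown below. `waitForText`, `getText`, and the option names are hypothetical helpers written for illustration and are not part of puppeteer or expect-puppeteer; in the actual test, the existing `expect(page).toMatch("apache")` matcher would likely be the natural choice.

```javascript
// Hypothetical helper: poll a text source until it contains `needle`.
// `getText` stands in for something that reads the current page text.
async function waitForText(getText, needle, { timeout = 5000, interval = 100 } = {}) {
  const deadline = Date.now() + timeout;
  for (;;) {
    const text = await getText();
    if (text.includes(needle)) {
      return true;
    }
    if (Date.now() >= deadline) {
      throw new Error(`Text "${needle}" not found within ${timeout}ms`);
    }
    // Sleep before the next attempt.
    await new Promise(resolve => setTimeout(resolve, interval));
  }
}

// Usage with a fake page whose content "loads" on the third read.
let reads = 0;
const fakeGetText = async () => (++reads >= 3 ? "apache 8.0.0" : "Loading...");
const textAppeared = waitForText(fakeGetText, "apache", { timeout: 1000, interval: 10 });
```

Waiting for content the test actually needs, rather than for a spinner to vanish, avoids the case where the spinner reappears because data is re-requested.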

andresmgot · 2021-03-30T10:31:15Z

integration/use-cases/operator-deployment.js

+  // wait for the loading msg to disappear
+  await page.waitForFunction(() => !document.querySelector("cds-progress-circle"));


same, wait for "prometheus" (not sure if the timeout below works).

andresmgot · 2021-03-30T10:32:09Z

integration/use-cases/operator-deployment.js

+  const isAlreadyDeployed = await page.evaluate(
+    () => document.querySelector("cds-button[disabled]") !== null,
+  );

-  await utils.retryAndRefresh(page, 2, async () => {
-    // The CSV takes a bit to get populated
-    await expect(page).toMatch("Installed");
-  });
+  if (!isAlreadyDeployed) {
+    // Deploy the Operator
+    await expect(page).toClick("cds-button", { text: "Deploy" });
+
+    await utils.retryAndRefresh(page, 4, async () => {
+      // The CSV takes a bit to get populated
+      await expect(page).toMatch("Installed", { timeout: 10000 });
+    });
+  } else {
+    console.log("Warning: the operator has already been deployed");
+  }


not sure if you already replied, why the isAlreadyDeployed?

Yep (sorry for dropping the former PR, but I wanted to split the changes to ease the review process)
Check it at #2610 (comment)
Since uninstalling the operator is not that easy, I left this if for those cases where we are executing the tests locally. I don't mind removing it if you want, though.

it should be fine (if for some reason this is bypassing the operator installation, it should fail when creating an instance).

andresmgot · 2021-03-30T10:34:11Z

integration/use-cases/operator-deployment.js

@@ -37,24 +47,31 @@ test("Deploys an Operator", async () => {
    await expect(page).toMatch("Operators", { timeout: 10000 });

    // Filter out charts to search only for the prometheus operator
-    await expect(page).toClick("label", { text: "Operators" });
+    await expect(page).toClick("label", { text: "Operators", timeout: 10000 });


the only timeout option working is for the toMatch function:

https://github.com/smooth-code/jest-puppeteer/blob/master/packages/expect-puppeteer/README.md#expectinstancetomatchmatcher-options

andresmgot · 2021-03-30T10:35:42Z

script/e2e-test.sh

  testsToIgnore=("operator-deployment.js" "${testsToIgnore[@]}")
+  testsToIgnore=("add-multicluster-deployment.js" "${testsToIgnore[@]}")


you can just use a single line for this

andresmgot · 2021-03-30T10:36:11Z

script/e2e-test.sh

@@ -339,6 +368,10 @@ kubectl create serviceaccount kubeapps-view -n kubeapps
 kubectl create role view-secrets --verb=get,list,watch --resource=secrets
 kubectl create rolebinding kubeapps-view-secret --role view-secrets --serviceaccount kubeapps:kubeapps-view
 kubectl create clusterrolebinding kubeapps-view --clusterrole=view --serviceaccount kubeapps:kubeapps-view
+## Create view user (oidc)


and the admin user?

It is already created with cluster-admin role with the default RBAC (kubeapps-operator@example.com).
This step just modifies the kubeapps-user@example.com user, but we can perform it in the same place as the operator user for cohesion, if you prefer.

I see, yes, better to have them in the same place.

antgamdia · 2021-03-30T22:32:11Z

integration/jest.sequencer.js

+class CustomSequencer extends Sequencer {
+  sort(tests) {
+    const copyTests = Array.from(tests);
+    return copyTests.sort((testA, testB) => (testA.path > testB.path ? 1 : -1));


Although the tests are usually executed in alphabetical order, there is no actual guarantee of that. It was causing a failure in the multicluster test because the secret created in the secret-repo test wasn't being found.
This sequencer just ensures this alphabetical order (I've pushed the integration img with this dep as well)
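As a rough illustration of the ordering logic, here is a standalone sketch of the comparison. It deliberately leaves out the Jest `Sequencer` class so that it runs on its own; `sortByPath` is a hypothetical name, and unlike the comparator in the diff it returns 0 for equal paths.

```javascript
// Sort test descriptors by path without mutating the input array.
// Each element only needs a `path` string, as in the diff above.
function sortByPath(tests) {
  return Array.from(tests).sort((a, b) => {
    if (a.path < b.path) return -1;
    if (a.path > b.path) return 1;
    return 0;
  });
}

// File names taken from this PR, listed out of order on purpose.
const tests = [
  { path: "use-cases/operator-deployment.js" },
  { path: "use-cases/add-multicluster-deployment.js" },
  { path: "use-cases/create-private-registry.js" },
];
const ordered = sortByPath(tests).map(t => t.path);
console.log(ordered);
```

Copying the array first keeps the caller's list untouched, which matches the `Array.from(tests)` call in the diff.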

You are saying that creating a private repository causes an error in the multicluster scenario? (That sounds like a bug we should fix). If that's the case, can you open an issue for it?

We don't yet support app repos (let alone secret ones) on other clusters (#1982) because the interface to sync with the asset svc is via the DB (#2394) (left over from the read-only monocular).

Thanks for the clarification, I always forget about this issue.
Perhaps we should name the tests like "01-xxxx.js", "00-xxx.js" just to make clear that they are being executed in order.
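One detail worth keeping in mind with numeric prefixes: paths are compared as strings, so the prefixes need a fixed width. A small sketch with made-up file names (none of them are from this repository):

```javascript
// Made-up file names: string comparison puts "10-..." before "2-...".
const unpadded = ["2-create-repo.js", "10-deploy.js", "1-login.js"];
const sortedUnpadded = [...unpadded].sort();

// Zero-padded prefixes keep lexical order equal to numeric order.
const padded = ["02-create-repo.js", "10-deploy.js", "01-login.js"];
const sortedPadded = [...padded].sort();

console.log(sortedUnpadded);
console.log(sortedPadded);
```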

I know that private repositories are not supported in the second cluster but what I don't understand is why the order of the tests affects the result. I thought that creating a private repo in the default cluster caused an error when trying to deploy in the second cluster but that may not be it.

Ok, ok, I see your point now and it may be an issue as you said:

When creating a private repo first, we are creating the docker credentials. Somehow, when deploying in the second cluster, it tries to retrieve these credentials (which are not available there) and therefore fails.

Let me replicate the issue in my dev environment and will file an issue with the details.

However, with regard to this PR, I think we should still define an explicit order for the test suite execution (for the sake of reproducibility).

I wasn't able to reproduce this issue in my dev environment. I set up the multicluster env and: 1) manually created the private repo and then a new deployment; 2) ran the e2e test directly from yarn start ....
In both cases, the credentials are being properly stored and the app deploys seamlessly as well.

That's odd... :S

You probably added a namespaced private repository. If you create it in the kubeapps namespace, you are likely able to reproduce the issue.

antgamdia · 2021-03-30T22:36:06Z

script/libtest.sh

@@ -41,12 +41,21 @@ k8s_wait_for_deployment() {

    debug "Waiting for deployment ${deployment} to be successfully rolled out..."
    # Avoid to exit the function if the rollout fails
-    silence kubectl rollout status --namespace "$namespace" deployment "$deployment" || exit_code=$?
+    silence kubectl rollout status --namespace "$namespace" deployment "$deployment" -w --timeout=60s || exit_code=$?


These changes are not strictly necessary (the main error was that I didn't pass the db password 🤦), but I think they make the function work as expected: it now performs an actual wait, rather than just capturing the exit_code for logging purposes.

antgamdia · 2021-03-30T22:38:34Z

script/e2e-test.sh

@@ -169,13 +174,34 @@ installOrUpgradeKubeapps() {
      --set kubeops.replicaCount=1 \
      --set assetsvc.replicaCount=1 \
      --set dashboard.replicaCount=1 \
+      --set postgresql.replication.enabled=false \
      --set postgresql.postgresqlPassword=password \


Mental note: don't forget this flag when upgrading kubeapps

antgamdia · 2021-03-30T22:43:18Z

LGTM when you fix the current error.

Despite my sorrow and regardless of how much I want to press the merge button :P, I would need another +1 just to make sure the new changes are ok with you as well.

absoludity · 2021-03-30T22:46:26Z

LGTM when you fix the current error.

Despite my sorrow and regardless of how much I want to press the merge button :P, I would need another +1 just to make sure the new changes are ok with you as well.

Hah! Andres can merge it if happy with the changes (I thought it might have just been a few, but I see lots of commits, so I'll let Andres continue it). Enjoy your break o/

andresmgot

Thanks @antgamdia. I have some (minor) comments

andresmgot · 2021-03-31T07:54:28Z

integration/jest.sequencer.js

+class CustomSequencer extends Sequencer {
+  sort(tests) {
+    const copyTests = Array.from(tests);
+    return copyTests.sort((testA, testB) => (testA.path > testB.path ? 1 : -1));


You are saying that creating a private repository causes an error in the multicluster scenario? (That sounds like a bug we should fix). If that's the case, can you open an issue for it?

andresmgot · 2021-03-31T07:57:15Z

integration/use-cases/create-private-registry.js

  try {
    // TODO(andresmgot): Remove this line once 2.3 is released
    await expect(page).toClick("cds-button", { text: "Add new credentials" });
-  } catch(e) {
-    await expect(page).toClick(".btn-info-outline", { text: "Add new credentials" });
+  } catch (e) {
+    await expect(page).toClick(".btn-info-outline", {
+      text: "Add new credentials",
+    });
  }


you forgot to remove the try-catch?

andresmgot · 2021-03-31T08:04:51Z

script/e2e-test.sh

+  # TODO(agamez): Remove these lines in the next version
+  kubectl delete  clusterrole -n kubapps kubeapps:controller:kubeops-ns-discovery-kubeapps kubeapps:controller:kubeops-operators-kubeapps kubeapps:kubeapps:apprepositories-read kubeapps:kubeapps:apprepositories-refresh kubeapps:kubeapps:apprepositories-write || true
+  kubectl delete  clusterrolebinding -n kubapps kubeapps:controller:kubeapps:apprepositories-read kubeapps:controller:kubeops-ns-discovery-kubeapps || true
+  kubectl delete apprepositories.kubeapps.com -n kubeapps bitnami || true


I am not sure you should need these?

The clusterrole/clusterrolebinding should be adopted by the new version and the apprepository should not exist because we are using --set apprepository.initialRepos=null

Unfortunately I think you saw a former commit, 7086d6f should fix it.
Thanks for the heads-up anyway :)

andresmgot · 2021-03-31T08:06:00Z

script/libtest.sh

      kubectl get pods --namespace "$namespace"
+      sleep 60


please don't add sleeps this long. I don't think we need to retry here.

andresmgot

LGTM, I just don't understand very well why the tests need to be ordered.

andresmgot · 2021-04-05T10:05:09Z

integration/jest.sequencer.js

+class CustomSequencer extends Sequencer {
+  sort(tests) {
+    const copyTests = Array.from(tests);
+    return copyTests.sort((testA, testB) => (testA.path > testB.path ? 1 : -1));


I know that private repositories are not supported in the second cluster but what I don't understand is why the order of the tests affects the result. I thought that creating a private repo in the default cluster caused an error when trying to deploy in the second cluster but that may not be it.

antgamdia added 6 commits March 29, 2021 15:41

Log wrong responses during the e2e tests

721a679

Use pipeline params in CI. Add names to the steps

b972339

Add multicluster w/ oidc inconditionally

66676ee

Refactor login logic in e2e test using the envar

1487874

Fix typo

7b6e377

Add missing import utils. Using imports

9a5ac98

antgamdia changed the title ~~Multicluster ci~~ Add CI multicluster + OIDC environment Mar 29, 2021

antgamdia added 2 commits March 29, 2021 17:36

Use same prettier configuration as in the dashboard

83836d8

Use a more specific selector for the loading spinner

10e06b5

antgamdia mentioned this pull request Mar 29, 2021

Add ODIC multicluster environment to our CI system #2610

Closed

Merge branch 'minor-dry-ci-fixes' into multicluster-ci

e7049d6

Base automatically changed from minor-dry-ci-fixes to master March 29, 2021 15:51

antgamdia changed the base branch from master to dependabot/npm_and_yarn/dashboard/ajv-8.0.1 March 29, 2021 15:55

antgamdia changed the base branch from dependabot/npm_and_yarn/dashboard/ajv-8.0.1 to master March 29, 2021 15:55

andresmgot reviewed Mar 29, 2021

antgamdia added 12 commits March 29, 2021 18:56

Use require instead of es6 imports

788381d

Use require instead of es6 imports

5f9c022

Remove unnecessary waits for the spinner to disappear

30686dc

Merge branch 'master' into multicluster-ci

03d3509

Remove unnecessary class

25d933a

Add http redirect URI for CI purposes

2cbaa70

Remove 'document' from utils lib, use 'page' instead

22664e4

Fix wrong url

80478e9

Ignore multicluster in gke

f74443f

Increase timeout

37350dd

Fix typos

fe581bb

Better handling of envars

9d67768

andresmgot reviewed Mar 30, 2021

antgamdia added 2 commits March 30, 2021 14:11

Add PR comments. Remove unused timeouts. Increase global timeout.

7cf84c0

Merge branch 'master' into multicluster-ci

830b574

antgamdia added 15 commits March 30, 2021 16:54

Temporary removal of the upgrade test

1db5a59

Remove some flags. Edit timeout

23ea30c

Add temporary manual deletion of some resources

7383fae

Add retry in k8s_wait_for_deployment

026b29d

Remove workaround

c1656c2

Add wait

79fa463

Add timeout to k8s_wait_for_deployment and remove sleep

264d8a2

Increase wait until retry

e6a1c1e

Add watch to k8s_wait_for_deployment

2775a31

Add temporary workaround

af61649

Fix workaround

386f27a

Add postgresqlPassword

0044b68

Add try/catch because of failing tests

13bdad3

Add test sequencer

732bb33

Fix sequencer

3320a08

antgamdia commented Mar 30, 2021

Remove unnecessary workaround

7086d6f

antgamdia commented Mar 30, 2021

antgamdia mentioned this pull request Mar 30, 2021

Bump @types/node from 14.14.36 to 14.14.37 in /dashboard #2625

Merged

andresmgot reviewed Mar 31, 2021

antgamdia added 3 commits April 5, 2021 11:14

Decrease sleep when retrying k8s_wait_for_deployment

3223042

Rename test cases to explicitize the order

12322e7

Remove try/catch after the release 2.3

a63c0e4

andresmgot approved these changes Apr 5, 2021

Change button selectors

bd1c0a1

antgamdia merged commit b29a824 into master Apr 5, 2021

antgamdia deleted the multicluster-ci branch April 5, 2021 13:51
