In cluster package webhooks via Pepr (alpha) (#1899)

Relates to #1594

This PR adapts the way that Zarf saves the package secrets so that the secrets are now updated more frequently and can be used as an indicator of what a package is about to deploy (and is in the process of deploying). These more frequently updated secrets can be viewed and acted upon by a webhook such as `Pepr`. As Zarf updates the secrets, it checks whether a webhook has mutated the component webhook `status` of the secret to `Running` and, if so, waits for the webhook to change the `status` back to indicate that whatever non-async work it was doing has completed.

The `--skip-webhooks` flag can be used to tell Zarf to skip checking/waiting for webhooks to complete during package deployments:

`zarf package deploy [package tarball] --skip-webhooks`

---------

Co-authored-by: Wayne Starr <Racer159@users.noreply.github.com>
Co-authored-by: Lucas Rodriguez <lucas.rodriguez@defenseunicorns.com>
Co-authored-by: Lucas Rodriguez <lucas.rodriguez9616@gmail.com>
zarf-dev · Oct 2, 2023 · 65b51e1
1 parent c04131f
commit 65b51e1
Showing 35 changed files with 7,534 additions and 204 deletions.
diff --git a/.gitignore b/.gitignore
@@ -13,6 +13,7 @@
 .zarf*
 *.bak
 *.key
+*.crt
 *.run.zstd
 *.tar
 *.tar.gz

diff --git a/Makefile b/Makefile
@@ -160,6 +160,8 @@ build-examples: ## Build all of the example packages
 
 	@test -s ./build/zarf-package-yolo-$(ARCH).tar.zst || $(ZARF_BIN) package create examples/yolo -o build -a $(ARCH) --confirm
 
+	@test -s ./build/zarf-package-component-webhooks-$(ARCH)-0.0.1.tar.zst || $(ZARF_BIN) package create examples/component-webhooks -o build -a $(ARCH) --confirm
+
 build-injector-linux: ## Build the Zarf injector for AMD64 and ARM64
 	docker run --rm --user "$(id -u)":"$(id -g)" -v $$PWD/src/injector:/usr/src/zarf-injector -w /usr/src/zarf-injector rust:1.71.0-bookworm make build-injector-linux
 
@@ -188,8 +190,9 @@ test-upgrade: ## Run the Zarf CLI E2E tests for an external registry and cluster
 	cd src/test/upgrade && go test -failfast -v -timeout 30m
 
 .PHONY: test-unit
-test-unit: ensure-ui-build-dir ## Run unit tests within the src/pkg and the bigbang extension directory
+test-unit: ensure-ui-build-dir ## Run unit tests
 	cd src/pkg && go test ./... -failfast -v -timeout 30m
+	cd src/internal && go test ./... -failfast -v -timeout 30m
 	cd src/extensions/bigbang && go test ./. -failfast -v timeout 30m
 
 .PHONY: test-ui

diff --git a/adr/0018-hooks.md b/adr/0018-hooks.md
@@ -1,21 +1,21 @@
 # 18. Zarf Hooks
 
-Date: 2023-06-13
+Date: 2023-09-20
 
 ## Status
 
-Pending
+Accepted
 
 ## Context
 
 The idea of `hooks` is to provide a way for cluster maintainers to register functionality that runs during the deployment lifecycle. Zarf packages already have the concept of `actions` that can execute commands on the host machine's shell during certain package lifecycle events. As `actions` gain more adoption, the team has noticed they are being used to add functionality to Zarf in unexpected ways. We want `actions` to be a tool that extends upon the functionality of Zarf and its packages, not a tool that works around missing or clunky functionality.
 
-
 We want package creators to be able to create system agnostic packages by leveraging core Zarf functionality. The following is one such scenario:
 
 - _IF_ ECR is chosen as the external registry during `zarf init` / cluster creation, _THEN_ Zarf will seamlessly leverage ECR without requiring advanced user effort.
 
 Using ECR as a remote registry creates 2 problems that Zarf will need to solve:
+
  1. ECR authentication tokens expire after 12 hours and need to be refreshed. This means the cluster will need to constantly be refreshing its tokens and the user deploying packages will need to make sure they have a valid token.
  2. ECR Image Repositories do not support 'push-to-create'. This means we will need to explicitly create an image repository for every image that is being pushed within the Zarf package.
 
@@ -28,91 +28,100 @@ Currently there are 2 solutions:
 
 Neither one of these current solutions are ideal. We don't want to require overly complex external + prior actions for Zarf package deployments, and we don't want package creators to have to create and distribute packages that are specific to ECR.
 
-## Approaches Considered
+Potential considerations:
 
 ### Internal Zarf Implementation
-Clusters that have hooks will have `zarf-hook-*` secret(s) in the 'zarf' namespace. This secret will contain the hook's configuration and any other required metadata. As part of the package deployment process, Zarf will check if the cluster has any hooks and run them if they exist. Given the scenario above, there is no longer a need for an ECR specific Zarf package to be created. An ECR hook would perform the proper configuration for any package deployed onto that cluster; thereby requiring no extra manual intervention from the package deployer.
-
-
-Zarf HookConfig state information struct:
-```go
-type HookConfig struct {
-	HookName     string                 `json:"hookName" jsonschema:"description=Name of the hook"`
-	Internal     bool                   `json:"internal" jsonschema:"description=Internal hooks are run by Zarf itself, not by a plugin"`
-	Lifecycle    HookLifecycle          `json:"lifecycle" jsonschema:"description=Lifecycle of the hook"`
-	HookData     map[string]interface{} `json:"hookData" jsonschema:"description=Generic data map used for the hook. The data is obtained from a secret in the Zarf namespace"`
-	OCIReference string                 `json:"ociReference" jsonschema:"description=Optional OCI reference to the hook image to run"`
-}
-```
-
-Example Secret Data:
-```yaml
-hookName: ecr-repository
-internal: true
-lifecycle: before-component
-hookData:
-  registryURL: public.ecr.aws/abcdefg/zarf-ecr-registry
-  region: us-east-1
-  repositoryPrefix: ecr-zarf-registry
-```
-
-For this solution, hooks have to be 'installed' onto a cluster before they are used. When Zarf is deploying a package onto a cluster, it will look for any secrets with the `zarf-hook` label in the `zarf` namespace.  If hooks are found, Zarf will run any 'package' level hooks before deploying a component and run any 'component' level hook for each component that is getting deployed. The hook lifecycle options will be:
-1. Before a package deployment
-2. After a package deployment
-3. Before a component deployment
-4. After a component deployment
- - NOTE: The order of hook execution is nearly random. If there are multiple hooks for a lifecycle there is no guarantee that they will be executed in a certain order.
- - NOTE: The `package` lifecycle might be changed to a `run-once` lifecycle. This would benefit packages that don't have kube context information when the deployment starts.
-
-Zarf hooks will have two forms of execution via `Internal` and `External` hooks.
-#### Internal Hooks
-Internal hooks will be hooks that are built into the Zarf CLI and run internal code when executed. The logic for these hooks would be built into the Zarf CLI and would be updated with new releases of the CLI.
-
-#### External Hooks
-There are a few approaches for external hooks.
-1. Have the hook metadata reference an OCI image that is downloaded and run.
- - The hook metadata can reference the shasum of the image to ensure the image is not tampered with.
- - We can pass metadata from the secret to the image.
-2. Have the hook metadata reference an image/endpoint that we call via a gRPC call.
- - This would require a lot of consideration to security since we will be executing code from an external source.
-3. Have the hook metadata contain a script or list of shell commands that can get run.
- - This would be the simplest solution but would require the most work from the hook creator. This also has the most potential security issues.
-
-
-
-**PROS**
- - Implementing Hooks internally means we don't have to deal with any bootstrapping issues.
- - Internally managed hooks can leverage Zarf internal code.
-
-**CONS**
- - Since 'Internal' hooks are built into the CLI, the only way to get updates for the hook is to update the CLI.
- - External hooks will have a few security concerns that we will have to work through.
- - Implementing hooks internally adds more complexity to the Zarf CLI. This is especially true if wwe end up using WASM as the execution engine for hooks.
-
 
+  Clusters that have hooks will have `zarf-hook-*` secret(s) in the 'zarf' namespace. This secret will contain the hook's configuration and any other required metadata. As part of the package deployment process, Zarf will check if the cluster has any hooks and run them if they exist. Given the scenario above, there is no longer a need for an ECR specific Zarf package to be created. An ECR hook would perform the proper configuration for any package deployed onto that cluster; thereby requiring no extra manual intervention from the package deployer.
+
+  Zarf HookConfig state information struct:
+
+  ```go
+  type HookConfig struct {
+    HookName     string                 `json:"hookName" jsonschema:"description=Name of the hook"`
+    Internal     bool                   `json:"internal" jsonschema:"description=Internal hooks are run by Zarf itself, not by a plugin"`
+    Lifecycle    HookLifecycle          `json:"lifecycle" jsonschema:"description=Lifecycle of the hook"`
+    HookData     map[string]interface{} `json:"hookData" jsonschema:"description=Generic data map used for the hook. The data is obtained from a secret in the Zarf namespace"`
+    OCIReference string                 `json:"ociReference" jsonschema:"description=Optional OCI reference to the hook image to run"`
+  }
+  ```
+
+  Example Secret Data:
+
+  ```yaml
+  hookName: ecr-repository
+  internal: true
+  lifecycle: before-component
+  hookData:
+    registryURL: public.ecr.aws/abcdefg/zarf-ecr-registry
+    region: us-east-1
+    repositoryPrefix: ecr-zarf-registry
+  ```
+
+  For this solution, hooks have to be 'installed' onto a cluster before they are used. When Zarf is deploying a package onto a cluster, it will look for any secrets with the `zarf-hook` label in the `zarf` namespace. If hooks are found, Zarf will run any 'package' level hooks before deploying a component and run any 'component' level hook for each component that is getting deployed. The hook lifecycle options will be:
+
+  1. Before a package deployment
+  2. After a package deployment
+  3. Before a component deployment
+  4. After a component deployment
+
+  NOTE: The order of hook execution is nearly random. If there are multiple hooks for a lifecycle there is no guarantee that they will be executed in a certain order.
+  NOTE: The `package` lifecycle might be changed to a `run-once` lifecycle. This would benefit packages that don't have kube context information when the deployment starts.
+
+  Zarf hooks will have two forms of execution via `Internal` and `External` hooks:
+
+  Internal Hooks:
+
+  Internal hooks will be hooks that are built into the Zarf CLI and run internal code when executed. The logic for these hooks would be built into the Zarf CLI and would be updated with new releases of the CLI.
+
+  External Hooks:
+
+  There are a few approaches for external hooks:
+
+  1. Have the hook metadata reference an OCI image that is downloaded and run.
+
+     - The hook metadata can reference the shasum of the image to ensure the image is not tampered with.
+     - We can pass metadata from the secret to the image.
+
+  1. Have the hook metadata reference an image/endpoint that we call via a gRPC call.
+     - This would require a lot of consideration to security since we will be executing code from an external source.
+
+  1. Have the hook metadata contain a script or list of shell commands that can get run.
+     - This would be the simplest solution but would require the most work from the hook creator. This also has the most potential security issues.
+
+  Pros:
+
+  - Implementing Hooks internally means we don't have to deal with any bootstrapping issues.
+  - Internally managed hooks can leverage Zarf internal code.
+
+  Cons:
+
+  - Since 'Internal' hooks are built into the CLI, the only way to get updates for the hook is to update the CLI.
+  - External hooks will have a few security concerns that we will have to work through.
+  - Implementing hooks internally adds more complexity to the Zarf CLI. This is especially true if we end up using WASM as the execution engine for hooks.
 
 ### Webhooks
-Webhooks, such as Pepr, can act as a K8s controller that enables Kubernetes mutations. We are (or will be) considering using Pepr to replace the `Zarf Agent`. Pepr is capable to accomplishing most of what Zarf wants to do with the concept of Hooks. Zarf hook configuration could be saved as secrets that Zarf will be able to use. As Zarf is deploying packages onto a cluster, it can check for secrets the represent hooks (as it would if hook execution is handled internally as stated above) and get information on how to run the webhook from the secret. This would likely mean that the secret that describes the hook would have a `URL` instead of an `OCIReference` as well as config information that it would pass through to the hook. With the webhook approach, lifecycle management is a lot more flexible as the webhook can operate on native kubernetes events such as a secret getting created / updated.
 
-**PROS**
- - Pepr as a solution would be more flexible than the internal Zarf implementation of Hooks since the webhook could be anywhere.
- - Using Pepr would reduce the complexity of Zarf's codebase.
- - It will be easier to secure third party hooks when Pepr is the one running them.
- - Lifecycle management would be a lot easier with a webhook solution like Pepr.
+  Webhooks, such as Pepr, can act as a K8s controller that enables Kubernetes mutations. We are (or will be) considering using Pepr to replace the `Zarf Agent`. Pepr is capable of accomplishing most of what Zarf wants to do with the concept of Hooks. Zarf hook configuration could be saved as secrets that Zarf will be able to use. As Zarf is deploying packages onto a cluster, it can check for secrets that represent hooks (as it would if hook execution is handled internally as stated above) and get information on how to run the webhook from the secret. This would likely mean that the secret that describes the hook would have a `URL` instead of an `OCIReference` as well as config information that it would pass through to the hook. With the webhook approach, lifecycle management is a lot more flexible as the webhook can operate on native Kubernetes events such as a secret getting created / updated.
 
-**CONS**
- - Pepr is a new project that hasn't been stress tested in production yet (but neither has Hooks).
- - The Pepr image needs to be pushed to an image registry before it is deployed. This will require a new bootstrapping solution to solve the ECR problem we identified above.
+  Pros:
 
+  - Pepr as a solution would be more flexible than the internal Zarf implementation of Hooks since the webhook could be anywhere.
+  - Using Pepr would reduce the complexity of Zarf's codebase.
+  - It will be easier to secure third party hooks when Pepr is the one running them.
+  - Lifecycle management would be a lot easier with a webhook solution like Pepr.
 
-## Consequences
+  Cons:
+
+  - Pepr is a new project that hasn't been stress tested in production yet (but neither has Hooks).
+  - The Pepr image needs to be pushed to an image registry before it is deployed. This will require a new bootstrapping solution to solve the ECR problem we identified above.
 
-- External hooks will likely not be implemented in the first pass of this feature. Handling external hooks will be an interesting challenge as we'll have to download the hook image and run / execute it. Security will be something we consider heavily when implementing this feature.
+## Decision
 
-- While hooks don't introduce raw schema changes to Zarf, it does add complexity where side affects are happening during package deployments that might not be obvious to the package deployer. This is especially the case if the person who deployed the hooks is different from the person who is deploying the subsequent packages.
+[Pepr](https://github.com/defenseunicorns/pepr) will be used to enable custom, or environment-specific, automation tasks to be integrated in the Zarf package deployment lifecycle. Pepr also allows the Zarf codebase to remain agnostic to any third-party APIs or dependencies that may be used.
 
-- At the current moment, we don't have a way to version hooks. This is something we should consider so we can update hooks that have been deployed onto a cluster.
+A `--skip-webhooks` flag has been added to `zarf package deploy` to allow users to opt out of Zarf checking and waiting for any webhooks to complete during package deployments.
 
-- At the current moment, there is no way to have a package opt out of running hooks. This means that someone who deploys a hook to a cluster effectively has a way to manipulate every other package deployment that will get deployed onto that cluster.
+## Consequences
 
-- Some situations will require hooks to be 'seeded' onto the cluster. For example, the ECR scenario we identified above would require hooks to exist before running `zarf init` on the EKS cluster.
+While hooks don't introduce raw schema changes to Zarf, they do add complexity: side effects can happen during package deployments that might not be obvious to the package deployer. This is especially the case if the person who deployed the hooks is different from the person who is deploying the subsequent packages.
diff --git a/docs/2-the-zarf-cli/100-cli-commands/zarf_package_deploy.md b/docs/2-the-zarf-cli/100-cli-commands/zarf_package_deploy.md
@@ -22,6 +22,7 @@ zarf package deploy [ PACKAGE ] [flags]
   -k, --key string                 Path to public key file for validating signed packages
       --set stringToString         Specify deployment variables to set on the command line (KEY=value) (default [])
       --shasum string              Shasum of the package to deploy. Required if deploying a remote package and "--insecure" is not provided
+      --skip-webhooks              [alpha] Skip waiting for external webhooks to execute as each package component is deployed
 ```
 
 ## Options inherited from parent commands

diff --git a/examples/component-webhooks/.eslintrc.json b/examples/component-webhooks/.eslintrc.json
@@ -0,0 +1,23 @@
+{
+    "env": {
+      "browser": false,
+      "es2021": true
+    },
+    "extends": [
+      "eslint:recommended",
+      "plugin:@typescript-eslint/recommended"
+    ],
+    "parser": "@typescript-eslint/parser",
+    "parserOptions": {
+      "ecmaVersion": 2022
+    },
+    "plugins": [
+      "@typescript-eslint"
+    ],
+    "ignorePatterns": [
+      "node_modules",
+      "dist",
+      "hack"
+    ],
+    "root": true
+}
diff --git a/examples/component-webhooks/.prettierrc b/examples/component-webhooks/.prettierrc
@@ -0,0 +1,13 @@
+{
+  "arrowParens": "avoid",
+  "bracketSameLine": false,
+  "bracketSpacing": true,
+  "embeddedLanguageFormatting": "auto",
+  "insertPragma": false,
+  "printWidth": 80,
+  "quoteProps": "as-needed",
+  "requirePragma": false,
+  "semi": true,
+  "tabWidth": 2,
+  "useTabs": false
+}