Updating rfc to reflect how webhooks promotion are gonna work #81

Open
wants to merge 1 commit into
base: main
127 changes: 64 additions & 63 deletions docs/rfcs/0003-pipelines-promotion/README.md
@@ -1,4 +1,4 @@
# RFC-0003 Pipeline promotions

**Status:** implementable

@@ -8,17 +8,17 @@

## Summary

Given a continuous delivery pipeline, the application goes through different environments on its way to production. We
need an action to signal the intent of deploying an application between environments. That concept is generally known as a
promotion. Current pipelines in weave gitops do not support promotion. This RFC addresses this gap
as specified in the [product initiative](#user-stories).

## Terminology

- **Pipeline**: a continuous delivery Pipeline declares a series of environments through which a given application is expected to be deployed.
- **Promotion**: the action of moving an application from a lower environment to a higher environment within a pipeline.
For example, promoting staging to production would attempt to deploy an application that exists in the staging environment to the production environment.
- **Promotion Strategy**: a concrete way of performing a promotion. For example, **create a pull request** could be a promotion strategy,
or promote by calling an external system.
- **Promotion Target**: the entity receiving the action of the promotion. For example, in the context of the strategy `create pull request`,
the promotion target will be the configuration git repo. In the example of promoting by calling an external system, a jenkins server
@@ -31,20 +31,20 @@ could be the promotion target.

Given a continuous delivery pipeline, the application goes through different environments on its way to production. We
need an action to signal the intent of deploying an application between environments. That concept is generally known as a
promotion. Current pipelines in weave gitops do not support promotion.

### Goals

- Design the e2e solution for promotions on weave gitops pipelines.
- Should support the [scenarios identified](https://www.notion.so/weaveworks/Pipeline-promotion-061bb790e2e345cbab09370076ff3258#5b514ad575544595b1028d73e5b6dd23)

### Non-Goals

- Anything beyond the scope of promotions.
- Scenarios other than those identified in the product initiative.

## Proposal
We propose the solution shown in the following diagram.

```mermaid
sequenceDiagram
@@ -58,62 +58,62 @@ We propose to use a solution as specified in the following diagram.
pc->>k8s: get pipeline
pc->>pc: is promotion required
participant k8s as Kubernetes Api
pc->>promotionTarget: execute promotion strategy
participant promotionTarget as Promotion Target
```

The solution has three main activities:

1. Detect deployment changes
2. Determine whether a promotion is needed
3. Execute the promotion

### Detect deployment changes

The solution leverages [flux native notification capabilities](https://fluxcd.io/flux/components/notification/) for this responsibility.
Notification controllers in leaf clusters notify the management cluster of deployment events via
a deployment webhook. The management cluster ingests and validates these deployment events.

A deeper look into this part of the solution can be found [here](detect-deployment-changes.md).

### Determine whether a promotion is needed

This responsibility is assumed by the `pipeline controller` running in the management cluster, which
determines whether, based on the deployment event and the pipeline definition, a promotion is required and, if so,
initialises the promotion.

A deeper look into this part of the solution can be found [here](determine-promotion-needs.md).
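
As a rough illustration of this check, the sketch below looks up the environment that follows the one named in a deployment event. The `Pipeline` and `Environment` types and the `nextEnvironment` function are hypothetical simplifications introduced for this example, not the controller's actual API. The real decision rules are described in the linked document.

```go
package main

import "fmt"

// Environment and Pipeline are hypothetical, simplified stand-ins
// for the pipeline spec; they are assumptions made for this sketch.
type Environment struct {
	Name    string
	Targets []string
}

type Pipeline struct {
	Environments []Environment
}

// nextEnvironment returns the environment that follows `deployed` in the
// pipeline. It reports false when `deployed` is the last environment or
// is not part of the pipeline, meaning no promotion is required.
func nextEnvironment(p Pipeline, deployed string) (Environment, bool) {
	for i, env := range p.Environments {
		if env.Name == deployed {
			if i+1 < len(p.Environments) {
				return p.Environments[i+1], true
			}
			return Environment{}, false
		}
	}
	return Environment{}, false
}

func main() {
	p := Pipeline{Environments: []Environment{
		{Name: "dev", Targets: []string{"dev-cluster"}},
		{Name: "prod", Targets: []string{"prod-cluster"}},
	}}
	if next, ok := nextEnvironment(p, "dev"); ok {
		fmt.Println("promote to", next.Name)
	}
}
```

A deployment event for the last environment, or for an environment the pipeline does not declare, yields no promotion in this sketch.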

### Execute the promotion

Once the previous evaluation determines that a promotion is required, the pipeline controller is in charge
of orchestrating and executing the promotion. The promotion configuration will be added as part of the pipeline spec.

A deeper look into this part of the solution can be found [here](execute-promotion.md).

### Non-functional requirements

This section takes a quick look at the non-functional requirements for the solution.

#### Security

The solution is secure by design because:

1. Communication between leaf and management clusters goes over an https channel, with endpoint authz via HMAC.
2. Deployment events are validated to reduce the risk of impersonation.
3. Each promotion strategy will have its own security configuration.
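
To make the HMAC check concrete, here is a minimal sketch of how a webhook receiver could verify a signed request body. It assumes HMAC-SHA256 with a hex-encoded signature; the actual algorithm, encoding and header used by the endpoint are not specified in this RFC, and the `sign` and `verifySignature` helpers are hypothetical names.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sign returns the hex-encoded HMAC-SHA256 of body under key.
// The choice of SHA-256 and hex encoding is an assumption of this sketch.
func sign(key, body []byte) string {
	mac := hmac.New(sha256.New, key)
	mac.Write(body)
	return hex.EncodeToString(mac.Sum(nil))
}

// verifySignature reports whether the hex-encoded signature matches the
// HMAC of body under key. hmac.Equal compares in constant time.
func verifySignature(key, body []byte, signature string) bool {
	got, err := hex.DecodeString(signature)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, key)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}

func main() {
	key := []byte("shared-secret")
	body := []byte(`{"involvedObject":{"name":"search"}}`)
	sig := sign(key, body)

	fmt.Println(verifySignature(key, body, sig))
	fmt.Println(verifySignature(key, []byte("tampered"), sig))
}
```

In this sketch the shared key would come from a secret known to both the leaf and the management cluster; how that secret is distributed is outside the scope of the example.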

#### Scalability

The solution is scalable by design because:

- It can scale horizontally by increasing the number of replicas of the pipeline controller.
- It can scale vertically by using `goroutines` to handle promotion requests concurrently.

#### Reliability

The pipeline controller will need to implement fault tolerance and reliability features within its business logic per
promotion strategy. For example, when opening a pr against github, it will need to manage retries to
recover from api rate limiting.

#### Monitoring
@@ -141,7 +141,7 @@ On the flip side, the solution has the following constraints:
## Alternatives

This solution is the result of evaluating two sets of alternatives:
1. Alternatives to detect deployment changes.
2. Alternatives to process and execute promotions.

### Alternatives to detect deployment changes
@@ -162,7 +162,7 @@ and take an action to start the next promotion based on the Pipeline definition.
participant PS as Promotion Strategy
API Server->>+PC: notifies
participant dt1 as dev/target 1

rect rgb(67, 207, 250)
note right of PC: setup phase
note right of PC: pipelines.wego.weave.works/name<br/>pipelines.wego.weave.works/env<br/>pipelines.wego.weave.works/target
@@ -172,7 +172,7 @@ and take an action to start the next promotion based on the Pipeline definition.
participant pt1 as prod/target 1
PC->>+pt1: label AppRef with metadata
end

rect rgb(50, 227, 221)
note right of PC: promotion phase
PC-->>+dt1: watches HelmRelease and Kustomizations changes
@@ -182,8 +182,8 @@ and take an action to start the next promotion based on the Pipeline definition.


dt1->>+PC: update events from AppRef
PC ->>PC: filter upgrade events
PC ->>PC: extract metadata
PC->>+PS: kicks off
```

@@ -202,7 +202,7 @@ and take an action to start the next promotion based on the Pipeline definition.

### Alternatives to process and execute promotions

The difference among them lies in
the component serving the promotion logic; the alternative names are therefore based on it.

- Alternative A: weave gitops backend
@@ -223,11 +223,11 @@ the component serving the promotion logic, therefore the alternatives names are
wge->>k8s: get pipeline
wge->>wge: is promotion required
participant k8s as Kubernetes Api
wge->>promotionTarget: execute promotion strategy
participant promotionTarget as Promotion Target
```

This solution differs from `pipeline controller` in that the three responsibilities:

1. Notify deployment changes
2. Determine whether a promotion is needed
@@ -236,13 +236,13 @@ This solution is different from `pipeline controller` in that the three responsi
are fulfilled within weave gitops backend app.

**Pro**
- Already set up and *should* be more easily exposed.
- No additional exposed surface to manage, therefore less to secure.

**Cons**
- Notifier service account needs permissions for promotion resources.
- Current api layer is designed as an experience layer for users (humans) while the promotion webhook is intended for machines.
- Extends the api layer with a rest api, so both grpc and rest apis would need to be managed, which would increase maintenance costs.


#### Alternative B: weave gitops api + pipeline controller + promotion executor
@@ -252,7 +252,7 @@ are fulfilled within weave gitops backend app.
participant F as Flux
participant LC as Leaf Cluster
F->>LC: deploy helm release
LC->>WGE: notify deployment via notification controller
participant WGE as Weave Gitops API
participant k8s as Kubernetes Api
WGE->>k8s: write deployment event
@@ -261,56 +261,56 @@ are fulfilled within weave gitops backend app.
pc->>pj: create promotion job
participant pj as promotion job
pj->>pj: promotion business logic
pj->>promotionTarget: execute promotion strategy
participant promotionTarget as Promotion Target
```

This solution differs from `pipeline controller` in that the three responsibilities are split:

1. Notify deployment changes: ingestion is done via weave gitops api. The event is written to the pipeline resource.
2. Determine whether a promotion is needed: pipeline controller watches for changes in the pipeline.
3. Execute the promotion: extracted to a kubernetes job layer.

**Pro**
- Reuses the existing ingestion layer, so operational costs do not increase.
- No need to generate TS client.
- Pipeline controller with reconcile loop, so canonical usage.
- Scalability and fault-tolerance by design.

**Cons**
- Needs to write to the pipeline resource.
- The most complex alternative.
- Extracting the promotion execution logic into an external component would also require creating a management layer
between the pipeline controller and the execution layer.

#### Alternative C: promotions service

This solution would create a new component with the promotions responsibility.

```mermaid
sequenceDiagram
participant F as Flux
participant LC as Notification Controller (Leaf)
F->>LC: deploy helm release
LC->>PS: notify deployment via notification controller
participant PS as Promotions Svc (Management)
PS->>PS: authz and validate event
participant k8s as Kubernetes Api
PS->>k8s: get pipeline
PS->>PS: is promotion required
PS->>promotionTarget: execute promotion strategy
participant promotionTarget as Promotion Target
```
**Pro**
- Easiest to develop against (vs. the api solution).
- No controller, so no reconcile loop is executed (vs. the pipeline controller solution).

**Cons**
- We would need to create it from scratch.
- One more component to manage.

## User Stories

This section shows how the current proposal addresses the different scenarios specified in the [product
initiative](https://www.notion.so/weaveworks/Pipeline-promotion-061bb790e2e345cbab09370076ff3258#5b514ad575544595b1028d73e5b6dd23).
@@ -354,8 +354,8 @@ spec:
namespace: flux-system
```

It is covered by [promotion business rules](determine-promotion-needs.md#promotion-decisions-business-logic) where
we want to execute promotions by environment and by deployment target. This is the base scenario.


### Promotion for a pipeline with multiple deployment target per environment
@@ -409,7 +409,7 @@ covered by [promotion business rules](determine-promotion-needs.md#promotion-dec

Original scenario specified [here](https://www.notion.so/weaveworks/Pipeline-promotion-061bb790e2e345cbab09370076ff3258#3ea85277de5543d69a9e19407e69c84b)

An example of a pipeline representing this scenario is shown below.

```yaml
apiVersion: pipelines.weave.works/v1alpha1
@@ -454,9 +454,9 @@ spec:
name: prod
namespace: flux-system
```
The particularity of this scenario is that we want to promote to production as soon as a deployment to test
has succeeded. This scenario is covered
by [promotion business rules](determine-promotion-needs.md#promotion-decisions-business-logic) rule #2

>2. Promotion between environment will happen when at least one of lower-environment deployment targets has been successfully deployed.

@@ -466,8 +466,8 @@ it will promote to prod as soon as a successful deployment to either `qa` or `pe

Original scenario specified [here](https://www.notion.so/weaveworks/Pipeline-promotion-061bb790e2e345cbab09370076ff3258#bd4524a6838742cfa254642c1b42443f)

This scenario is currently supported by having [Call Webhook promotion strategy](execute-promotion.md#call-a-webhook).
An example of pipeline for this story is shown below.
This scenario is currently supported by having [Notification promotion strategy](execute-promotion.md#call-a-webhook).

nit: will need to update the link

An example of pipeline for this story is shown below.

```yaml
apiVersion: pipelines.weave.works/v1alpha1
@@ -481,9 +481,7 @@ spec:
name: search-helmrelease
apiVersion: helm.toolkit.fluxcd.io/v2beta1
promotion:
webhook:
url: https://my-jenkins.prod/webhooks/XoLZfgK
secretRef: my-jenkins-promotion-secret
@sympatheticmoose sympatheticmoose Nov 2, 2022

@luizbafilho where would this secret/credentials get configured in the notification approach?

notification: {}
environments:
- name: dev
targets:
@@ -494,6 +492,9 @@ spec:
namespace: flux-system
```

In this strategy, the Pipeline Controller pushes an event to the Notification Controller to delegate the promotion action to an external system.


## Implementation History

- [Promotions Issue](https://github.com/weaveworks/weave-gitops-enterprise/issues/1589)