
NAP-360 #110

Merged
merged 30 commits into from Mar 25, 2022
Changes from 27 commits

30 commits
9adf5bb
Merge recently-synced upstream main into our release
chris-bridgett-nandos Feb 24, 2022
8e4fa5b
Align with upstream
chris-bridgett-nandos Feb 24, 2022
fdb7dbd
Remove duplicate resource
chris-bridgett-nandos Feb 24, 2022
d9e5b4c
Update queries path
chris-bridgett-nandos Feb 24, 2022
81ad199
Dependency
chris-bridgett-nandos Feb 24, 2022
269a05c
Dynamic call in for_each
chris-bridgett-nandos Feb 25, 2022
693ed17
Merge branch 'release' into nap-360
chris-bridgett-nandos Feb 25, 2022
4f27ea2
Remove itsdangerous
chris-bridgett-nandos Feb 25, 2022
499f30d
Linting
chris-bridgett-nandos Feb 25, 2022
421f182
Merge branch 'release' into nap-360
chris-bridgett-nandos Feb 28, 2022
e3f909c
Fix tests and lint
chris-bridgett-nandos Mar 2, 2022
30cea23
Merge branch 'release' into nap-360
chris-bridgett-nandos Mar 2, 2022
e4d4ae1
Merge release
chris-bridgett-nandos Mar 2, 2022
96c69e5
Made grafana dashboard optional
chris-bridgett-nandos Mar 2, 2022
6128960
Add dashboard provider to input tfvars
chris-bridgett-nandos Mar 2, 2022
5f1a3af
Merge branch 'release' into nap-360
chris-bridgett-nandos Mar 3, 2022
3c9d0f8
Lint failure
chris-bridgett-nandos Mar 3, 2022
db41e5b
Amend query merge bug and fully qualify event-handler secret reference.
chris-bridgett-nandos Mar 11, 2022
b61602a
Enable more required API's
chris-bridgett-nandos Mar 11, 2022
ce5a1bc
Refactoring to integrate upstream
chris-bridgett-nandos Mar 16, 2022
eb005d8
Update dashboard CloudBuild directory
chris-bridgett-nandos Mar 16, 2022
83ef153
Update dashboard repo path
chris-bridgett-nandos Mar 16, 2022
183ca7e
Update Dashboard cloudbuild.yaml
chris-bridgett-nandos Mar 16, 2022
aa5edb5
Update setup README with Terraform provisioning steps
chris-bridgett-nandos Mar 17, 2022
b5d75a2
Merge branch 'release' into nap-360
chris-bridgett-nandos Mar 17, 2022
d2859d9
Remove hardcoded dataset
chris-bridgett-nandos Mar 17, 2022
16e4da9
Linting
chris-bridgett-nandos Mar 17, 2022
95d8d59
Relative path correction
chris-bridgett-nandos Mar 17, 2022
bb3fe16
Linting
chris-bridgett-nandos Mar 25, 2022
c4e3814
Pin jinja2 due to deprecated function call required by flask 1.1.1
chris-bridgett-nandos Mar 25, 2022
17 changes: 17 additions & 0 deletions .github/workflows/test.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
on:
pull_request:
paths:
- '**.py'
push:
paths:
- '**.py'
branches:
- main

jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: excitedleigh/setup-nox@v2
- run: nox
3 changes: 3 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -109,6 +109,7 @@ venv/
ENV/
env.bak/
venv.bak/
env.sh

# Spyder project settings
.spyderproject
Expand Down Expand Up @@ -164,3 +165,5 @@ terraform.rc
# exclude terraform backend b/c different users will choose different backends
backend.tf

.idea
*.iml
30 changes: 29 additions & 1 deletion CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -30,4 +30,32 @@ This project follows

## Discussion

Feel free to join us on the #fourkeys channel on the [Google Cloud Platform Slack](https://join.slack.com/t/googlecloud-community/shared_invite/zt-m973j990-IMij2Xh8qKPu7SaHfOcCFg)!
Feel free to join us on the #fourkeys channel on the [Google Cloud Platform Slack](https://goo.gle/gcp-slack)!

## Office Hours

We'll be hosting office hours every two weeks at **11AM PT**. Please come if you need help or have general questions.

```
Four Keys Office Hours
Tuesday, January 4 · 11:00 – 11:30am PT
Google Meet joining info
Video call link: https://meet.google.com/rcc-qepb-jya
Or dial: (US) +1 316-512-1050 PIN: 132 793 518#
More phone numbers: https://tel.meet/rcc-qepb-jya?pin=3556509734950
```

Upcoming dates:
- January 18
- Feb 1
- Feb 15
- Mar 1
- Mar 15
- Mar 29

## Meetups

We'll be hosting quarterly meetups to discuss proposed changes, integrations, roadmaps, etc.

Next Meetup TBD

119 changes: 70 additions & 49 deletions METRICS.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ For each of the metrics, the dashboard shows a running daily calculation, as wel
This is the simplest of the charts to create, with a very straightforward script. We simply want the daily volume of distinct deployments.


``` sql
```sql
SELECT
TIMESTAMP_TRUNC(time_created, DAY) AS day,
COUNT(distinct deploy_id) AS deployments
Expand Down Expand Up @@ -119,34 +119,60 @@ If we have a list of all changes for every deployment, it is simple to calculate

```sql
SELECT
d.deploy_id,
TIMESTAMP_TRUNC(d.time_created, DAY) as day,
##### Time to Change
TIMESTAMP_DIFF(d.time_created, c.time_created, MINUTE) time_to_change_minutes
d.deploy_id,
TIMESTAMP_TRUNC(d.time_created, DAY) AS day,
#### Time to Change
TIMESTAMP_DIFF(d.time_created, c.time_created, MINUTE) AS time_to_change_minutes
FROM four_keys.deployments d, d.changes
LEFT JOIN four_keys.changes c ON changes = c.change_id;
```
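Conceptually, the join above pairs each deployment with every change it shipped and measures the gap in minutes. A minimal Python sketch of that calculation, using hypothetical sample data in place of the `four_keys.deployments` and `four_keys.changes` tables:

```python
from datetime import datetime

# Hypothetical sample rows standing in for the BigQuery tables.
deployments = [
    {"deploy_id": "d1", "time_created": datetime(2022, 3, 1, 12, 0), "changes": ["c1", "c2"]},
]
changes = {
    "c1": datetime(2022, 3, 1, 10, 30),
    "c2": datetime(2022, 3, 1, 11, 0),
}

def lead_times_minutes(deployments, changes):
    """For each (deployment, change) pair, minutes from change to deploy.

    Mirrors TIMESTAMP_DIFF(d.time_created, c.time_created, MINUTE) over the
    UNNESTed d.changes join in the query above.
    """
    rows = []
    for d in deployments:
        for change_id in d["changes"]:
            delta = d["time_created"] - changes[change_id]
            rows.append((d["deploy_id"], int(delta.total_seconds() // 60)))
    return rows

print(lead_times_minutes(deployments, changes))  # → [('d1', 90), ('d1', 60)]
```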

From this base, we want to extract the daily median lead time to change.

```sql
SElECT
day,
PERCENTILE_CONT(
# Ignore automated changes
IF(time_to_change_minutes > 0,time_to_change_minutes, NULL),
0.5) # Median
OVER (partition by day) median_time_to_change
FROM
(SELECT
d.deploy_id,
TIMESTAMP_TRUNC(d.time_created, DAY) as day,
# Time to Change
TIMESTAMP_DIFF(d.time_created, c.time_created, MINUTE) time_to_change_minutes
FROM four_keys.deployments d, d.changes
LEFT JOIN four_keys.changes c ON changes = c.change_id
)
GROUP BY day, time_to_change_minutes;
SELECT
day,
median_time_to_change
FROM (
SELECT
day,
PERCENTILE_CONT(
# Ignore automated changes
IF(time_to_change_minutes > 0, time_to_change_minutes, NULL),
0.5) # Median
OVER (partition by day) AS median_time_to_change
FROM (
SELECT
d.deploy_id,
TIMESTAMP_TRUNC(d.time_created, DAY) AS day,
# Time to Change
TIMESTAMP_DIFF(d.time_created, c.time_created, MINUTE) AS time_to_change_minutes
FROM four_keys.deployments d, d.changes
LEFT JOIN four_keys.changes c ON changes = c.change_id
)
)
GROUP BY day, median_time_to_change;
```

Here is how we write it more efficiently for the dashboard.

```sql
SELECT
day,
IFNULL(ANY_VALUE(med_time_to_change)/60, 0) AS median_time_to_change, # Hours
FROM (
SELECT
d.deploy_id,
TIMESTAMP_TRUNC(d.time_created, DAY) AS day,
PERCENTILE_CONT(
# Ignore automated pushes
IF(TIMESTAMP_DIFF(d.time_created, c.time_created, MINUTE) > 0, TIMESTAMP_DIFF(d.time_created, c.time_created, MINUTE), NULL),
0.5) # Median
OVER (PARTITION BY TIMESTAMP_TRUNC(d.time_created, DAY)) AS med_time_to_change, # Minutes
FROM four_keys.deployments d, d.changes
LEFT JOIN four_keys.changes c ON changes = c.change_id
)
GROUP BY day ORDER BY day;
```
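The core of the dashboard query above is a median over positive durations only, converted to hours, with `IFNULL(..., 0)` as a fallback. That logic can be sketched outside BigQuery as:

```python
import statistics

def daily_median_hours(durations_minutes):
    """Median lead time in hours for one day's deployments.

    Mirrors the query above: non-positive (automated) durations are ignored,
    like IF(... > 0, ..., NULL); the result is divided by 60 to get hours;
    and 0 is returned when nothing remains, like IFNULL(..., 0).
    """
    positive = [m for m in durations_minutes if m > 0]
    if not positive:
        return 0
    return statistics.median(positive) / 60

print(daily_median_hours([120, 0, 60, -5]))  # → 1.5
```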

Automated changes are excluded from this metric. This is a subject up for debate. Our rationale is that when we merge a Pull Request it creates a Push event in the main branch. This Push event is not its own distinct change, but rather a link in the workflow. If we trigger a deployment off of this push event, this artificially skews the metrics and does not give us a clear picture of developer velocity.
Expand All @@ -158,33 +184,28 @@ To get the buckets, rather than aggregating daily, we look at the last 3 months

```sql
SELECT
CASE WHEN median_time_to_change < 24 * 60 then "One day"
WHEN median_time_to_change < 168 * 60 then "One week"
WHEN median_time_to_change < 730 * 60 then "One month"
WHEN median_time_to_change < 730 * 6 * 60 then "Six months"
ELSE "One year"
END as lead_time_to_change,
FROM
(SElECT
PERCENTILE_CONT(
# Ignore automated changes
IF(time_to_change_minutes > 0,time_to_change_minutes, NULL),
0.5) # Median
OVER () median_time_to_change
FROM
(SELECT
d.deploy_id,
TIMESTAMP_TRUNC(d.time_created, DAY) as day,
# Time to Change
TIMESTAMP_DIFF(d.time_created, c.time_created, MINUTE) time_to_change_minutes
FROM four_keys.deployments d, d.changes
LEFT JOIN four_keys.changes c ON changes = c.change_id
# Limit to 3 months
WHERE d.time_created > TIMESTAMP(DATE_SUB(CURRENT_DATE(), INTERVAL 3 MONTH))
)
GROUP BY day, time_to_change_minutes
)
LIMIT 1;
CASE
WHEN median_time_to_change < 24 * 60 then "One day"
WHEN median_time_to_change < 168 * 60 then "One week"
WHEN median_time_to_change < 730 * 60 then "One month"
WHEN median_time_to_change < 730 * 6 * 60 then "Six months"
ELSE "One year"
END as lead_time_to_change
FROM (
SELECT
IFNULL(ANY_VALUE(med_time_to_change), 0) AS median_time_to_change
FROM (
SELECT
PERCENTILE_CONT(
# Ignore automated pushes
IF(TIMESTAMP_DIFF(d.time_created, c.time_created, MINUTE) > 0, TIMESTAMP_DIFF(d.time_created, c.time_created, MINUTE), NULL),
0.5) # Median
OVER () AS med_time_to_change, # Minutes
FROM four_keys.deployments d, d.changes
LEFT JOIN four_keys.changes c ON changes = c.change_id
WHERE d.time_created > TIMESTAMP(DATE_SUB(CURRENT_DATE(), INTERVAL 3 MONTH)) # Limit to 3 months
)
)
```
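The bucketing `CASE` expression above maps a single median (in minutes) onto the standard DORA bands. The same thresholds, as a small Python sketch:

```python
def lead_time_bucket(median_minutes):
    """Map a median lead time in minutes to the bucket used by the CASE above."""
    if median_minutes < 24 * 60:          # under a day
        return "One day"
    if median_minutes < 168 * 60:         # under a week
        return "One week"
    if median_minutes < 730 * 60:         # under ~a month (730 hours)
        return "One month"
    if median_minutes < 730 * 6 * 60:     # under ~six months
        return "Six months"
    return "One year"

print(lead_time_bucket(3 * 60))  # → One day
```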

## Time to Restore Services ##
Expand Down
22 changes: 10 additions & 12 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,9 +1,10 @@
# Four Keys
![Four Keys](images/fourkeys_wide.svg)

[![Four Keys YouTube Video](images/youtube-screenshot.png)](https://www.youtube.com/watch?v=2rzvIL29Nz0 "Measuring Devops: The Four Keys Project")

# Background

Through six years of research, the [DevOps Research and Assessment (DORA)](https://cloud.google.com/blog/products/devops-sre/the-2019-accelerate-state-of-devops-elite-performance-productivity-and-scaling) team has identified four key metrics that indicate the performance of a software development team. Four Keys allows you to collect data from your development environment (such as GitHub or GitLab) and compiles it into a dashboard displaying these key metrics.
Through six years of research, the [DevOps Research and Assessment (DORA)](https://cloud.google.com/blog/products/devops-sre/the-2019-accelerate-state-of-devops-elite-performance-productivity-and-scaling) team has identified four key metrics that indicate the performance of software delivery. Four Keys allows you to collect data from your development environment (such as GitHub or GitLab) and compiles it into a dashboard displaying these key metrics.

These four key metrics are:

Expand Down Expand Up @@ -33,7 +34,7 @@ For a quick baseline of your team's software delivery performance, you can also
1. Events are sent to a webhook target hosted on Cloud Run. Events are any occurrence in your development environment (for example, GitHub or GitLab) that can be measured, such as a pull request or new issue. Four Keys defines events to measure, and you can add others that are relevant to your project.
1. The Cloud Run target publishes all events to Pub/Sub.
1. A Cloud Run instance is subscribed to the Pub/Sub topics, does some light data transformation, and inputs the data into BigQuery.
1. Nightly scripts are scheduled in BigQuery to complete the data transformations and feed into the dashboard.
1. BigQuery views complete the data transformations and feed the dashboard.

This diagram shows the design of the Four Keys system:

Expand All @@ -51,7 +52,6 @@ This diagram shows the design of the Four Keys system:
* Contains the code for the `event_handler`, which is the public service that accepts incoming webhooks.
* `queries/`
* Contains the SQL queries for creating the derived tables.
* Contains a bash script for scheduling the queries.
* `setup/`
* Contains the code for setting up and tearing down the Four Keys pipeline. Also contains a script for extending the data sources.
* `shared/`
Expand All @@ -65,11 +65,10 @@ _The project uses Python 3 and supports data extraction for Cloud Build and GitH

1. Fork this project.
1. Run the automation scripts, which does the following (See the [setup README](setup/README.md) for more details):
1. Set up a new Google Cloud Project.
1. Create and deploy the Cloud Run webhook target and ETL workers.
1. Create the Pub/Sub topics and subscriptions.
1. Enable the Google Secret Manager and create a secret for your GitHub repo.
1. Create a BigQuery dataset and tables, and schedule the nightly scripts.
1. Create a BigQuery dataset, tables, and views.
1. Open up a browser tab to connect your data to a DataStudio dashboard template.
1. Set up your development environment to send events to the webhook created in the second step.
1. Add the secret to your GitHub webhook.
Expand Down Expand Up @@ -117,23 +116,22 @@ To run outside of the setup script:

The scripts consider some events to be “changes”, “deploys”, and “incidents.” You may want to reclassify one or more of these events, for example, if you want to use a label for your incidents other than “incident.” To reclassify one of the events in the table, no changes are required on the architecture or code of the project.

1. Update the nightly scripts in BigQuery for the following tables:
1. Update the views in BigQuery for the following tables:

* `four_keys.changes`
* `four_keys.deployments`
* `four_keys.incidents`

To update the scripts, we recommend that you update the `sql` files in the `queries` folder, rather than in the BigQuery UI.
To update the views, we recommend that you update the `sql` files in the `queries` folder, rather than in the BigQuery UI.

1. Once you've edited the SQL, run the `schedule.sh` script to update the scheduled query that populates the table. For example, if you wanted to update the `four_keys.changes` table, you'd run:
1. Once you've edited the SQL, run `terraform apply` to update the view that populates the table:

```sh
schedule.sh --query_file=changes.sql --table=changes --project_id=$FOURKEYS_PROJECT
cd ./setup && terraform apply
```

Notes:

* The `query_file` flag should contain the relative path of the file.
* To feed into the dashboard, the table name should be one of `changes`, `deployments`, `incidents`.


Expand Down Expand Up @@ -169,7 +167,7 @@ To run nox:

### Listing tests

To list all the test sesssions in the noxfile, use the following command:
To list all the test sessions in the noxfile, use the following command:

```sh
python3 -m nox -l
Expand Down
15 changes: 7 additions & 8 deletions ROADMAP.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,12 +21,13 @@ Non-goals:
## Roadmap

* Short Term
* Google verification on the [DataStudio Connector](https://github.com/GoogleCloudPlatform/fourkeys/tree/main/connector)
* Experimental
* Data modeling with [Grafeas](https://github.com/grafeas/grafeas)
* Terraform project setup
* [Experimental folder](https://github.com/GoogleCloudPlatform/fourkeys/tree/main/experimental/terraform)
* Enriching the dashboard
* [More data points](https://github.com/GoogleCloudPlatform/fourkeys/issues/77)
* New data views for drilling down into the metrics
* Long Term
* CloudEvents migration
* Migrate the `four_keys.events_raw` schema to [CloudEvents](https://github.com/cloudevents/spec) schema
* Use the CloudEvents adapters to do the ETL rather than the current [workers](bq-workers/)
* New Integrations
* CI/CD Tools
* [Jenkins](https://www.jenkins.io/)
Expand All @@ -45,6 +46,4 @@ Non-goals:
* Custom deployment events
* Support for different [deployment patterns](https://github.com/GoogleCloudPlatform/fourkeys/issues/46), eg multiple change sets in a single deployment
* Canary and Blue/Green deployments
* Enriching the dashboard
* [More data points](https://github.com/GoogleCloudPlatform/fourkeys/issues/77)
* New data views for drilling down into the metrics

2 changes: 1 addition & 1 deletion bq-workers/argocd-parser/cloudbuild.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ steps:

- # Deploy to Cloud Run
name: google/cloud-sdk
args: ['gcloud', 'run', 'deploy', 'argocd-worker',
args: ['gcloud', 'run', 'deploy', 'argocd-parser',
'--image', '${_FOURKEYS_GCR_DOMAIN}/$PROJECT_ID/default/argocd-worker:${_TAG}',
'--region', '${_FOURKEYS_REGION}',
'--platform', 'managed',
Expand Down
12 changes: 6 additions & 6 deletions bq-workers/argocd-parser/main_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -58,7 +58,7 @@ def test_missing_msg_attributes(client):


def test_argocd_event_processed(client):
data = json.dumps({"foo": "bar"}).encode("utf-8")
data = json.dumps({"foo": "bar", "id": "foo", "time": 0}).encode("utf-8")
pubsub_msg = {
"message": {
"data": base64.b64encode(data).decode("utf-8"),
Expand All @@ -68,13 +68,13 @@ def test_argocd_event_processed(client):
}

event = {
"event_type": "event_type",
"id": "e_id",
"metadata": '{"foo": "bar"}',
"event_type": "deployment",
"id": "foo",
"metadata": '{"foo": "bar", "id": "foo", "time": 0}',
"time_created": 0,
"signature": "signature",
"signature": "a424b5326ac45bde4c42c9b74dc878e56623d84f",
"msg_id": "foobar",
"source": "source",
"source": "argocd",
}

shared.insert_row_into_bigquery = mock.MagicMock()
Expand Down
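The test above delivers the event through a Pub/Sub push envelope, with the payload base64-encoded under `message.data`. A minimal, standard-library sketch of that wrapping (the field names mirror the test fixture):

```python
import base64
import json

def make_pubsub_envelope(payload, message_id="foobar"):
    """Wrap an event payload the way a Pub/Sub push request would."""
    data = json.dumps(payload).encode("utf-8")
    return {
        "message": {
            "data": base64.b64encode(data).decode("utf-8"),
            "message_id": message_id,
        },
        "subscription": "foobar",
    }

envelope = make_pubsub_envelope({"foo": "bar", "id": "foo", "time": 0})
# Decoding round-trips back to the original payload.
decoded = json.loads(base64.b64decode(envelope["message"]["data"]))
print(decoded)  # → {'foo': 'bar', 'id': 'foo', 'time': 0}
```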
1 change: 1 addition & 0 deletions bq-workers/argocd-parser/requirements.txt
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
Flask==1.1.1
gunicorn==19.9.0
google-cloud-bigquery==1.23.1
itsdangerous==2.0.1
git+https://github.com/GoogleCloudPlatform/fourkeys.git#egg=shared&subdirectory=shared
2 changes: 1 addition & 1 deletion bq-workers/azuredevops-parser/cloudbuild.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ steps:

- # Deploy to Cloud Run
name: google/cloud-sdk
args: ['gcloud', 'run', 'deploy', 'azuredevops-worker',
args: ['gcloud', 'run', 'deploy', 'azuredevops-parser',
'--image', '${_FOURKEYS_GCR_DOMAIN}/$PROJECT_ID/default/azuredevops-worker:${_TAG}',
'--region', '${_FOURKEYS_REGION}',
'--platform', 'managed',
Expand Down
8 changes: 4 additions & 4 deletions bq-workers/azuredevops-parser/main.py
Original file line number Diff line number Diff line change
Expand Up @@ -67,22 +67,22 @@ def process_azuredevops_event(msg):
print(f"Metadata after decoding {metadata}")
# Unique hash for the event
signature = shared.create_unique_id(msg)

event_type = metadata['eventType']
types = {"ms.vss-release.deployment-completed-event"}
if event_type not in types:
raise Warning("Unsupported AzureDevops event: '%s'" % event_type)

azuredevops_event = {
"event_type": event_type, # Event type,
"id": metadata['id'], # Event ID,
"metadata": json.dumps(metadata), # The body of the msg
"signature": signature, # The unique event signature
"msg_id": msg["message_id"], # The pubsub message id
"time_created" : metadata['createdDate'], # The timestamp of with the event resolved
"time_created" : metadata['createdDate'], # The timestamp of with the event resolved
"source": "azuredevops", # The name of the source, eg "azuredevops"
}

print(f"Azure Devops event to metrics--------> {azuredevops_event}")
return azuredevops_event

Expand Down
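The parser computes a unique `signature` via `shared.create_unique_id(msg)`. The 40-character hex value in the argocd-parser test above suggests a SHA-1 digest; a hypothetical sketch of such a helper (the real function's exact input and hashing scheme may differ):

```python
import hashlib
import json

def create_unique_id(msg):
    """Hypothetical stand-in: SHA-1 hex digest over the message's attributes."""
    data = json.dumps(msg.get("attributes", {}), sort_keys=True).encode("utf-8")
    return hashlib.sha1(data).hexdigest()

sig = create_unique_id({"attributes": {"foo": "bar"}})
print(len(sig))  # → 40
```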