k8s-monitoring 2.0 alpha #757

Draft: wants to merge 41 commits into main
41 commits
f3cbf82
Add cluster events feature chart and a new github action to test feat…
petewall Sep 30, 2024
647b173
Update test action
petewall Sep 30, 2024
4c9c162
Add a gitkeep to make the snapshot directory exist
petewall Sep 30, 2024
1ad2292
Add annotation feature chart
petewall Sep 30, 2024
753ddd4
Add cluster metrics feature
petewall Sep 30, 2024
a6a0618
Fix yaml lint
petewall Sep 30, 2024
de70c1f
Update linters and top-level makefile
petewall Oct 1, 2024
8b6fd31
More lint fixes
petewall Oct 1, 2024
0423ccf
Re-enable linting for test yamls
petewall Oct 1, 2024
56a7654
Make markdownlint happier
petewall Oct 1, 2024
746c1ab
Adding pod logs, profiling, and prom operator object feature charts
petewall Oct 1, 2024
9827964
remove -it from docker run
petewall Oct 1, 2024
d27ec5a
Update checkout action
petewall Oct 1, 2024
f00baa8
Simplify diff check
petewall Oct 1, 2024
4663de5
Add app o11y and frontend o11y feature charts
petewall Oct 1, 2024
def3112
Move codeowners into .github and break things down by directory
petewall Oct 1, 2024
91999f5
add integration feature
petewall Oct 1, 2024
37aced9
Don't delete Chart.lock, because it is not built deterministically, sin…
petewall Oct 1, 2024
4110ff7
Sort the integration values so the order is deterministic
petewall Oct 1, 2024
3b45e6a
Add main chart
petewall Oct 1, 2024
47d7326
Fix linters
petewall Oct 1, 2024
c6309f2
Make lint-md do more and catch more changes
petewall Oct 1, 2024
b5e56ad
ct lint in every chart
petewall Oct 1, 2024
553b20c
legacy test should only work on the v1 chart
petewall Oct 1, 2024
b6ccea9
Add ct to the github action
petewall Oct 1, 2024
0933bbb
add ct configs for charts that use dependencies
petewall Oct 1, 2024
5a27369
Include chart v2 in the chart test workflow
petewall Oct 1, 2024
e9b7dbb
Add clean target
petewall Oct 1, 2024
c61c4e7
Add file to make helm unittest work
petewall Oct 1, 2024
53204ee
Add a test chart and use that to parallelize integration tests
petewall Oct 2, 2024
fa05b29
inline the helm repo
petewall Oct 2, 2024
85f1ec7
Update generated files
petewall Oct 2, 2024
7971a83
Need alloy for some unittests
petewall Oct 2, 2024
fa07738
Fix v1 test and output helm commands in new integration tests
petewall Oct 2, 2024
a045dd4
Fix integration test script and don't check chart bundles for modific…
petewall Oct 2, 2024
3c3ce41
Move function up
petewall Oct 2, 2024
1973339
Use the pre-existing cluster
petewall Oct 2, 2024
e03a281
More prerequisites before deploying prometheus
petewall Oct 2, 2024
d691ffc
Update integration test scripts to work this time
petewall Oct 2, 2024
ee08902
further work on the test chart
petewall Oct 3, 2024
a739f35
Gotta get that chart dir
petewall Oct 3, 2024
35 changes: 35 additions & 0 deletions .configs/certificates.yaml
@@ -0,0 +1,35 @@
---
apiVersion: v1
kind: Namespace
metadata:
  name: prometheus
---
apiVersion: secretgen.k14s.io/v1alpha1
kind: Certificate
metadata:
  name: ca-cert
  namespace: prometheus
spec:
  isCA: true
---
apiVersion: secretgen.k14s.io/v1alpha1
kind: Certificate
metadata:
  name: prometheus-ssl
  namespace: prometheus
spec:
  alternativeNames:
    - prometheus-server.prometheus.svc.cluster.local
  caRef:
    name: ca-cert
---
apiVersion: secretgen.k14s.io/v1alpha1
kind: Certificate
metadata:
  name: prometheus-workload-ssl
  namespace: prometheus
spec:
  alternativeNames:
    - prometheus-workload-server.prometheus.svc.cluster.local
  caRef:
    name: ca-cert
File renamed without changes.
93 changes: 93 additions & 0 deletions .configs/loki.yaml
@@ -0,0 +1,93 @@
---
loki:
  commonConfig:
    replication_factor: 1
  schemaConfig:
    configs:
      - from: 2024-04-01
        store: tsdb
        object_store: s3
        schema: v13
        index:
          prefix: loki_index_
          period: 24h
  ingester:
    chunk_encoding: snappy
  querier:
    # Default is 4, if you have enough memory and CPU you can increase, reduce if OOMing
    max_concurrent: 2

test:
  enabled: false

lokiCanary:
  enabled: false

gateway:
  basicAuth:
    enabled: true
    username: loki
    password: lokipassword
  service:
    port: 8080

deploymentMode: SingleBinary
singleBinary:
  replicas: 1
  # resources:
  #   limits:
  #     cpu: 3
  #     memory: 4Gi
  #   requests:
  #     cpu: 2
  #     memory: 2Gi
  # extraEnv:
  #   # Keep a little bit lower than memory limits
  #   - name: GOMEMLIMIT
  #     value: 3750MiB

# Enable minio for storage
minio:
  enabled: true

# Zero out replica counts of other deployment modes
backend:
  replicas: 0
read:
  replicas: 0
write:
  replicas: 0

ingester:
  replicas: 0
querier:
  replicas: 0
queryFrontend:
  replicas: 0
queryScheduler:
  replicas: 0
distributor:
  replicas: 0
compactor:
  replicas: 0
indexGateway:
  replicas: 0
bloomCompactor:
  replicas: 0
bloomGateway:
  replicas: 0
resultsCache:
  enabled: false
chunksCache:
  enabled: false

monitoring:
  selfMonitoring:
    enabled: false
    grafanaAgent:
      installOperator: false
  serviceMonitor:
    enabled: true
    # This actually isn't recommended by Loki, the default is 15s for a reason, but we don't want to upset
    # our DPM test calculations.
    interval: 1m
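
The comment on `serviceMonitor.interval` mentions DPM without expanding it. Assuming it stands for data points per minute (the PR does not say so), the arithmetic behind the choice can be sketched as below. The helper name `dpm` is made up for this illustration: at a 1m scrape interval each series contributes one point per minute, while Loki's 15s default would contribute four.

```shell
# Hypothetical helper: data points per minute produced by one series
# when it is scraped every $1 seconds (integer division).
dpm() { echo $((60 / $1)); }

echo "15s interval: $(dpm 15) DPM per series"
echo "1m interval:  $(dpm 60) DPM per series"
```

If that reading is right, keeping every series at 1 DPM makes the expected totals in the tests a simple series count.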
49 changes: 49 additions & 0 deletions .configs/prometheus.yaml
@@ -0,0 +1,49 @@
---
server:
  extraFlags:
    - enable-feature=otlp-write-receiver
    - enable-feature=remote-write-receiver
    - web.config.file=/etc/config/web.yml

  extraSecretMounts:
    - name: prometheus-ssl
      mountPath: /etc/prometheus-ssl
      secretName: prometheus-ssl
      readOnly: true

  persistentVolume:
    enabled: false

  probeHeaders:
    - name: "Authorization"
      value: "Basic cHJvbXVzZXI6cHJvbWV0aGV1c3Bhc3N3b3Jk"
  probeScheme: HTTPS

  service:
    servicePort: 9090

serverFiles:
  prometheus.yml:
    scrape_configs: []
  web.yml:
    basic_auth_users:
      promuser: $2a$12$1UJsAG4QnhjjDzqcSVkZmeDxxjgIFOAmzfuVTybTuhhDnYgfuAbAq  # "prometheuspassword"
    tls_server_config:
      cert_file: /etc/prometheus-ssl/crt.pem
      key_file: /etc/prometheus-ssl/key.pem

configmapReload:
  prometheus:
    enabled: false

alertmanager:
  enabled: false

kube-state-metrics:
  enabled: false

prometheus-node-exporter:
  enabled: false

prometheus-pushgateway:
  enabled: false
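
The `probeHeaders` value has to stay in sync with the `basic_auth_users` entry, since the probes authenticate with the same credentials. A minimal sketch of how the header can be regenerated, assuming the user is `promuser` and the password is the one quoted in the `web.yml` comment:

```shell
# Credentials taken from the web.yml entry and its trailing comment.
credentials='promuser:prometheuspassword'

# printf avoids the trailing newline that echo would feed into base64.
encoded=$(printf '%s' "$credentials" | base64)

echo "Authorization: Basic ${encoded}"
```

The encoded part matches the value already in the diff, so the probe header and the bcrypt entry describe the same login.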
16 changes: 16 additions & 0 deletions .github/CODEOWNERS
@@ -0,0 +1,16 @@
# This file is used to define the owners of the code in this repository.
# https://help.github.com/articles/about-codeowners/

# Global owners
* @petewall

# Chart owners
charts/feature-annotation-autodiscovery @grafana/k8s-monitoring-dev
charts/feature-application-observability @rlankfo
charts/feature-cluster-events @grafana/k8s-monitoring-dev
charts/feature-cluster-metrics @grafana/k8s-monitoring-dev
charts/feature-frontend-observability @rlankfo
charts/feature-pod-logs @grafana/k8s-monitoring-dev
charts/feature-profiling @simonswine
charts/feature-prometheus-operator-objects @grafana/k8s-monitoring-dev
charts/k8s-monitoring-v1 @grafana/k8s-monitoring-dev
4 changes: 2 additions & 2 deletions .github/workflows/helm-test.yml
@@ -15,7 +15,7 @@ on:

env:
CT_CONFIGFILE: "${{ github.workspace }}/.github/configs/ct.yaml"
- LINT_CONFIGFILE: "${{ github.workspace }}/.github/configs/lintconf.yaml"
+ LINT_CONFIGFILE: "${{ github.workspace }}/.configs/lintconf.yaml"
GRAFANA_ALLOY_VALUES: "${{ github.workspace }}/.github/configs/alloy-config.yaml"
GRAFANA_ALLOY_LOKI_OTLP_VALUES: "${{ github.workspace }}/.github/configs/alloy-config-loki-otlp.yaml"
GRAFANA_ALLOY_RECEIVER_SERVICE: "${{ github.workspace }}/.github/configs/receiver-service.yaml"
@@ -212,4 +212,4 @@ jobs:
if: (steps.list-changed.outputs.changed == 'true') || (contains(github.event.pull_request.labels.*.name, 'full_test_required'))
run: |
latestRelease=$(git describe --abbrev=0 --tags)
- ct install --all --config "${CT_CONFIGFILE}" --since "${latestRelease}" --helm-extra-args "--timeout 10m"
+ ct install --config "${CT_CONFIGFILE}" --since "${latestRelease}" --helm-extra-args "--timeout 10m" --charts charts/k8s-monitoring-v1
56 changes: 56 additions & 0 deletions .github/workflows/integration-test.yml
@@ -0,0 +1,56 @@
---
name: Integration Test
# yamllint disable-line rule:truthy
on:
  push:
    branches: ["main"]
    paths:
      - 'charts/**'
      - '!charts/k8s-monitoring-v1/**'
  pull_request:
    paths:
      - 'charts/**'
      - '!charts/k8s-monitoring-v1/**'

  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

jobs:
  list-tests:
    name: List tests
    runs-on: ubuntu-latest
    outputs:
      tests: ${{ steps.list_tests.outputs.tests }}
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: List tests
        id: list_tests
        run: |
          tests=$(ls charts/k8s-monitoring/tests/integration)
          echo "Tests: ${tests}"
          echo "tests=$(echo "${tests}" | jq --raw-input --slurp --compact-output 'split("\n") | map(select(. != ""))')" >> "${GITHUB_OUTPUT}"

  run-tests:
    name: Integration Test
    needs: list-tests
    runs-on: ubuntu-latest
    strategy:
      matrix:
        test: ${{ fromJson(needs.list-tests.outputs.tests) }}
      fail-fast: false
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Helm
        uses: azure/setup-helm@v4

      - name: Create kind cluster
        uses: helm/kind-action@v1

      - name: Run test
        run: |
          echo "Testing ${{ matrix.test }}"
          CREATE_CLUSTER=false ./scripts/integration-test.sh "charts/k8s-monitoring/tests/integration/${{ matrix.test }}"
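
The `List tests` step turns the directory listing into a compact JSON array so that `fromJson` can expand it into the job matrix. As a rough illustration of the shape of that value, the pure-shell sketch below mimics what the `jq` filter `split("\n") | map(select(. != ""))` does to newline-separated names. The function name `to_json_array` and the sample test names are invented for this example, and unlike `jq` it performs no JSON escaping.

```shell
# Hypothetical helper: read newline-separated names on stdin and print a compact
# JSON array, skipping empty lines (the role of map(select(. != "")) in the jq filter).
# Assumption: names contain no quotes or backslashes, because nothing is escaped here.
to_json_array() {
  out=""
  while IFS= read -r line; do
    [ -n "$line" ] || continue
    out="${out:+$out,}\"$line\""
  done
  printf '[%s]\n' "$out"
}

# Sample names standing in for the output of `ls charts/k8s-monitoring/tests/integration`.
tests_json=$(printf 'cluster-events\npod-logs\n\n' | to_json_array)
echo "tests=${tests_json}"
```

Each array element becomes one `run-tests` job, and with `fail-fast: false` a failure in one test directory does not cancel the others.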
8 changes: 6 additions & 2 deletions .github/workflows/reviewdog.yml
@@ -2,9 +2,13 @@
name: ReviewDog
# yamllint disable-line rule:truthy
on:
pull_request:
push:
branches: ["main"]

pull_request:

workflow_dispatch:

jobs:
markdownlint:
name: runner / markdownlint
@@ -137,7 +141,7 @@ jobs:
- env:
REVIEWDOG_GITHUB_API_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
- npx textlint --format checkstyle --config ./.textlintrc --ignore-path ./.textlintignore $(find . -type f -name "*.md" -not \( -path "./node_modules/*" -o -path "./data-alloy/*" \)) | \
+ npx textlint --format checkstyle --config ./.textlintrc --ignore-path ./.textlintignore "$(find . -type f -name "*.md" -not \( -path "./node_modules/*" -o -path "./data-alloy/*" \))" | \
reviewdog -f=checkstyle -name="textlint" -reporter=github-check -level=info

alloy:
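
The textlint hunk wraps the `find` substitution in double quotes, presumably to satisfy a shell linter. One thing that may be worth checking (this is a reading of shell semantics, not something stated in the PR): quoting a command substitution suppresses word splitting, so the command could receive every path joined by newlines as a single argument instead of one argument per file. A small sketch with two scratch files; `count_args` is a helper invented for the demonstration:

```shell
# Create two scratch Markdown files in a temporary directory.
workdir=$(mktemp -d)
touch "$workdir/a.md" "$workdir/b.md"

# Hypothetical helper: print how many arguments it was called with.
count_args() { echo "$#"; }

# Unquoted: the shell splits find's output on whitespace, one argument per file.
unquoted=$(count_args $(find "$workdir" -type f -name '*.md'))
# Quoted: the whole output, newlines included, arrives as a single argument.
quoted=$(count_args "$(find "$workdir" -type f -name '*.md')")

echo "unquoted: ${unquoted}, quoted: ${quoted}"
rm -rf "$workdir"
```

If textlint does need one argument per file, alternatives such as `find ... -exec npx textlint ... {} +` keep the linter quiet without collapsing the list, though that is a suggestion rather than part of this change.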