k8s-monitoring 2.0 alpha #757

Merged 42 commits on Oct 3, 2024
42 commits
618b5c4
Add cluster events feature chart and a new github action to test feat…
petewall Sep 30, 2024
c3aa695
Update test action
petewall Sep 30, 2024
ada7950
Add a gitkeep to make the snapshot directory exist
petewall Sep 30, 2024
f2c55ca
Add annotation feature chart
petewall Sep 30, 2024
0d23025
Add cluster metrics feature
petewall Sep 30, 2024
8f9ce23
Fix yaml lint
petewall Sep 30, 2024
0099f17
Update linters and top-level makefile
petewall Oct 1, 2024
554a0c9
More lint fixes
petewall Oct 1, 2024
d286900
Re-enable linting for test yamls
petewall Oct 1, 2024
744aa6a
Make markdownlint happier
petewall Oct 1, 2024
0ba4c08
Adding pod logs, profiling, and prom operator object feature charts
petewall Oct 1, 2024
f07049f
remove -it from docker run
petewall Oct 1, 2024
4fda5c4
Update checkout action
petewall Oct 1, 2024
2667b85
Simplify diff check
petewall Oct 1, 2024
4884450
Add app o11y and frontend o11y feature charts
petewall Oct 1, 2024
3c75fa9
Move codeowners into .github and break things down by directory
petewall Oct 1, 2024
a533ccf
add integration feature
petewall Oct 1, 2024
76abacd
Don't delete Chart.lock, because it is not built deterministically, sin…
petewall Oct 1, 2024
e8a362e
Sort the integration values so the order is deterministic
petewall Oct 1, 2024
0397982
Add main chart
petewall Oct 1, 2024
ec52ef1
Fix linters
petewall Oct 1, 2024
f673d82
Make lint-md do more and catch more changes
petewall Oct 1, 2024
a32484d
ct lint in every chart
petewall Oct 1, 2024
4d75415
legacy test should only work on the v1 chart
petewall Oct 1, 2024
53d9420
Add ct to the github action
petewall Oct 1, 2024
d31b8cd
add ct configs for charts that use dependencies
petewall Oct 1, 2024
5d29033
Include chart v2 in the chart test workflow
petewall Oct 1, 2024
9329421
Add clean target
petewall Oct 1, 2024
c614935
Add file to make helm unittest work
petewall Oct 1, 2024
e32f0c4
Add a test chart and use that to parallelize integration tests
petewall Oct 2, 2024
b50278e
inline the helm repo
petewall Oct 2, 2024
3794691
Update generated files
petewall Oct 2, 2024
9ea8c1c
Need alloy for some unittests
petewall Oct 2, 2024
3e1c530
Fix v1 test and output helm commands in new integration tests
petewall Oct 2, 2024
0f4ea3d
Fix integration test script and don't check chart bundles for modific…
petewall Oct 2, 2024
211a839
Move function up
petewall Oct 2, 2024
34c1e42
Use the pre-existing cluster
petewall Oct 2, 2024
3dd00cc
More prerequisites before deploying prometheus
petewall Oct 2, 2024
1f7565b
Update integration test scripts to work this time
petewall Oct 2, 2024
569339e
further work on the test chart
petewall Oct 3, 2024
17bba77
Gotta get that chart dir
petewall Oct 3, 2024
00a9d0a
Remove ec lint, which we don't really need and also fix some lint issues
petewall Oct 3, 2024
35 changes: 35 additions & 0 deletions .configs/certificates.yaml
@@ -0,0 +1,35 @@
---
apiVersion: v1
kind: Namespace
metadata:
  name: prometheus
---
apiVersion: secretgen.k14s.io/v1alpha1
kind: Certificate
metadata:
  name: ca-cert
  namespace: prometheus
spec:
  isCA: true
---
apiVersion: secretgen.k14s.io/v1alpha1
kind: Certificate
metadata:
  name: prometheus-ssl
  namespace: prometheus
spec:
  alternativeNames:
    - prometheus-server.prometheus.svc.cluster.local
  caRef:
    name: ca-cert
---
apiVersion: secretgen.k14s.io/v1alpha1
kind: Certificate
metadata:
  name: prometheus-workload-ssl
  namespace: prometheus
spec:
  alternativeNames:
    - prometheus-workload-server.prometheus.svc.cluster.local
  caRef:
    name: ca-cert
File renamed without changes.
93 changes: 93 additions & 0 deletions .configs/loki.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,93 @@
---
loki:
  commonConfig:
    replication_factor: 1
  schemaConfig:
    configs:
      - from: 2024-04-01
        store: tsdb
        object_store: s3
        schema: v13
        index:
          prefix: loki_index_
          period: 24h
  ingester:
    chunk_encoding: snappy
  querier:
    # Default is 4, if you have enough memory and CPU you can increase, reduce if OOMing
    max_concurrent: 2

test:
  enabled: false

lokiCanary:
  enabled: false

gateway:
  basicAuth:
    enabled: true
    username: loki
    password: lokipassword
  service:
    port: 8080

deploymentMode: SingleBinary
singleBinary:
  replicas: 1
  # resources:
  #   limits:
  #     cpu: 3
  #     memory: 4Gi
  #   requests:
  #     cpu: 2
  #     memory: 2Gi
  # extraEnv:
  #   # Keep a little bit lower than memory limits
  #   - name: GOMEMLIMIT
  #     value: 3750MiB

# Enable minio for storage
minio:
  enabled: true

# Zero out replica counts of other deployment modes
backend:
  replicas: 0
read:
  replicas: 0
write:
  replicas: 0

ingester:
  replicas: 0
querier:
  replicas: 0
queryFrontend:
  replicas: 0
queryScheduler:
  replicas: 0
distributor:
  replicas: 0
compactor:
  replicas: 0
indexGateway:
  replicas: 0
bloomCompactor:
  replicas: 0
bloomGateway:
  replicas: 0
resultsCache:
  enabled: false
chunksCache:
  enabled: false

monitoring:
  selfMonitoring:
    enabled: false
    grafanaAgent:
      installOperator: false
  serviceMonitor:
    enabled: true
    # This actually isn't recommended by Loki, the default is 15s for a reason, but we don't want to upset
    # our DPM test calculations.
    interval: 1m
49 changes: 49 additions & 0 deletions .configs/prometheus.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
---
server:
  extraFlags:
    - enable-feature=otlp-write-receiver
    - enable-feature=remote-write-receiver
    - web.config.file=/etc/config/web.yml

  extraSecretMounts:
    - name: prometheus-ssl
      mountPath: /etc/prometheus-ssl
      secretName: prometheus-ssl
      readOnly: true

  persistentVolume:
    enabled: false

  probeHeaders:
    - name: "Authorization"
      value: "Basic cHJvbXVzZXI6cHJvbWV0aGV1c3Bhc3N3b3Jk"
  probeScheme: HTTPS

  service:
    servicePort: 9090

serverFiles:
  prometheus.yml:
    scrape_configs: []
  web.yml:
    basic_auth_users:
      promuser: $2a$12$1UJsAG4QnhjjDzqcSVkZmeDxxjgIFOAmzfuVTybTuhhDnYgfuAbAq  # "prometheuspassword"
    tls_server_config:
      cert_file: /etc/prometheus-ssl/crt.pem
      key_file: /etc/prometheus-ssl/key.pem

configmapReload:
  prometheus:
    enabled: false

alertmanager:
  enabled: false

kube-state-metrics:
  enabled: false

prometheus-node-exporter:
  enabled: false

prometheus-pushgateway:
  enabled: false
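The `probeHeaders` value in `.configs/prometheus.yaml` appears to be the HTTP Basic credential for the `promuser` account defined under `basic_auth_users`. A minimal sketch of how such a header value can be derived, assuming the test-only password from the inline comment in the diff and a `base64` utility on the PATH (the variable names are illustrative, not part of the PR):

```shell
# Test-only credentials taken from the diff (user name and the commented password).
user='promuser'
password='prometheuspassword'

# printf is used instead of echo so no trailing newline is encoded.
token=$(printf '%s:%s' "$user" "$password" | base64)

# The value that would go into an Authorization probe header.
header="Authorization: Basic ${token}"
echo "$header"
```

The resulting token can be compared against the `value` field in the config to confirm that the probe header and the bcrypt-hashed entry in `web.yml` refer to the same account.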
16 changes: 16 additions & 0 deletions .github/CODEOWNERS
@@ -0,0 +1,16 @@
# This file is used to define the owners of the code in this repository.
# https://help.github.com/articles/about-codeowners/

# Global owners
* @petewall

# Chart owners
charts/feature-annotation-autodiscovery @grafana/k8s-monitoring-dev
charts/feature-application-observability @rlankfo
charts/feature-cluster-events @grafana/k8s-monitoring-dev
charts/feature-cluster-metrics @grafana/k8s-monitoring-dev
charts/feature-frontend-observability @rlankfo
charts/feature-pod-logs @grafana/k8s-monitoring-dev
charts/feature-profiling @simonswine
charts/feature-prometheus-operator-objects @grafana/k8s-monitoring-dev
charts/k8s-monitoring-v1 @grafana/k8s-monitoring-dev
4 changes: 2 additions & 2 deletions .github/workflows/helm-test.yml
@@ -15,7 +15,7 @@ on:

 env:
   CT_CONFIGFILE: "${{ github.workspace }}/.github/configs/ct.yaml"
-  LINT_CONFIGFILE: "${{ github.workspace }}/.github/configs/lintconf.yaml"
+  LINT_CONFIGFILE: "${{ github.workspace }}/.configs/lintconf.yaml"
   GRAFANA_ALLOY_VALUES: "${{ github.workspace }}/.github/configs/alloy-config.yaml"
   GRAFANA_ALLOY_LOKI_OTLP_VALUES: "${{ github.workspace }}/.github/configs/alloy-config-loki-otlp.yaml"
   GRAFANA_ALLOY_RECEIVER_SERVICE: "${{ github.workspace }}/.github/configs/receiver-service.yaml"
@@ -212,4 +212,4 @@ jobs:
         if: (steps.list-changed.outputs.changed == 'true') || (contains(github.event.pull_request.labels.*.name, 'full_test_required'))
         run: |
           latestRelease=$(git describe --abbrev=0 --tags)
-          ct install --all --config "${CT_CONFIGFILE}" --since "${latestRelease}" --helm-extra-args "--timeout 10m"
+          ct install --config "${CT_CONFIGFILE}" --since "${latestRelease}" --helm-extra-args "--timeout 10m" --charts charts/k8s-monitoring-v1
56 changes: 56 additions & 0 deletions .github/workflows/integration-test.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
---
name: Integration Test
# yamllint disable-line rule:truthy
on:
  push:
    branches: ["main"]
    paths:
      - 'charts/**'
      - '!charts/k8s-monitoring-v1/**'
  pull_request:
    paths:
      - 'charts/**'
      - '!charts/k8s-monitoring-v1/**'

  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

jobs:
  list-tests:
    name: List tests
    runs-on: ubuntu-latest
    outputs:
      tests: ${{ steps.list_tests.outputs.tests }}
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: List tests
        id: list_tests
        run: |
          tests=$(ls charts/k8s-monitoring/tests/integration)
          echo "Tests: ${tests}"
          echo "tests=$(echo "${tests}" | jq --raw-input --slurp --compact-output 'split("\n") | map(select(. != ""))')" >> "${GITHUB_OUTPUT}"

  run-tests:
    name: Integration Test
    needs: list-tests
    runs-on: ubuntu-latest
    strategy:
      matrix:
        test: ${{ fromJson(needs.list-tests.outputs.tests) }}
      fail-fast: false
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Helm
        uses: azure/setup-helm@v4

      - name: Create kind cluster
        uses: helm/kind-action@v1

      - name: Run test
        run: |
          echo "Testing ${{ matrix.test }}"
          CREATE_CLUSTER=false ./scripts/integration-test.sh "charts/k8s-monitoring/tests/integration/${{ matrix.test }}"
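The `List tests` step turns a newline-separated directory listing into a compact JSON array that `fromJson` can expand into the job matrix. A rough sketch of that conversion in isolation, assuming `jq` is installed; the directory names `alpha` and `beta` are hypothetical stand-ins for the real test directories:

```shell
# Hypothetical listing; in the workflow this comes from `ls` on the integration test directory.
tests=$(printf 'alpha\nbeta\n')

# --raw-input --slurp reads all of stdin as one string. Splitting on newlines leaves an
# empty trailing element (echo appends a newline), which the select() filter drops.
json=$(echo "${tests}" | jq --raw-input --slurp --compact-output 'split("\n") | map(select(. != ""))')

# In the workflow, this line is appended to "${GITHUB_OUTPUT}" rather than printed.
echo "tests=${json}"
```

Compact output matters here because the `name=value` form written to `GITHUB_OUTPUT` expects the value on a single line.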
18 changes: 6 additions & 12 deletions .github/workflows/reviewdog.yml
@@ -2,9 +2,13 @@
 name: ReviewDog
 # yamllint disable-line rule:truthy
 on:
-  pull_request:
   push:
     branches: ["main"]
+
+  pull_request:
+
+  workflow_dispatch:
+
 jobs:
   markdownlint:
     name: runner / markdownlint
@@ -117,16 +121,6 @@ jobs:
           github_token: ${{ secrets.github_token }}
           reporter: github-check

-  eclint:
-    name: runner / eclint
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v4
-      - uses: reviewdog/action-eclint@v1
-        with:
-          github_token: ${{ secrets.github_token }}
-          reporter: github-check
-
   textlint:
     name: runner / textlint
     runs-on: ubuntu-latest
@@ -137,7 +131,7 @@ jobs:
       - env:
           REVIEWDOG_GITHUB_API_TOKEN: ${{ secrets.GITHUB_TOKEN }}
         run: |
-          npx textlint --format checkstyle --config ./.textlintrc --ignore-path ./.textlintignore $(find . -type f -name "*.md" -not \( -path "./node_modules/*" -o -path "./data-alloy/*" \)) | \
+          npx textlint --format checkstyle --config ./.textlintrc --ignore-path ./.textlintignore "$(find . -type f -name "*.md" -not \( -path "./node_modules/*" -o -path "./data-alloy/*" \))" | \
             reviewdog -f=checkstyle -name="textlint" -reporter=github-check -level=info

   alloy:
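The textlint change wraps the `find` command substitution in double quotes. In POSIX shells, a quoted command substitution is passed as a single argument, while an unquoted one is split on whitespace. A small sketch of the difference, assuming `find` returns more than one path; `count_args` and `list_files` are hypothetical helpers used only for illustration:

```shell
# Prints how many arguments it received.
count_args() { echo "$#"; }

# Hypothetical stand-in for the find command: prints two paths, one per line.
list_files() { printf '%s\n' ./README.md ./docs/Guide.md; }

# Unquoted: the output is word-split, so each path becomes its own argument.
unquoted=$(count_args $(list_files))

# Quoted: the whole output, newlines included, arrives as one argument.
quoted=$(count_args "$(list_files)")

echo "unquoted: ${unquoted} argument(s), quoted: ${quoted} argument(s)"
```

With the quoted form, a command such as textlint would receive one newline-joined string rather than a list of files, which may not be what the linter expects. Piping `find` output through `xargs`, or using `find -exec`, are common alternatives that keep paths separate without relying on word splitting.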