diff --git a/monitoring/README.md b/monitoring/README.md
index ee0c5e5301a..1e0eb95abad 100644
--- a/monitoring/README.md
+++ b/monitoring/README.md
@@ -12,6 +12,10 @@ This directory contains chaos interleaved grafana dashboards along with the util
> Contains utilities required to setup monitoring infrastructure on a kubernetes cluster.
+- [Tutorials](./tutorials)
+
+ > Contains tutorials for users on monitoring target applications under chaos using various tools.
+
## Setup the LitmusChaos Infrastructure
- Install the litmus chaos operator and CRDs
diff --git a/monitoring/tutorials/README.md b/monitoring/tutorials/README.md
new file mode 100644
index 00000000000..e092e6fc1ad
--- /dev/null
+++ b/monitoring/tutorials/README.md
@@ -0,0 +1,7 @@
+# Tutorials
+
+This directory contains tutorials for users on monitoring target applications under chaos using various tools.
+
+- [Otel-demo](./otel-demo)
+
+ > Contains a tutorial on injecting chaos into target applications using LitmusChaos and observing the chaos with OpenTelemetry.
diff --git a/monitoring/tutorials/otel-demo/README.md b/monitoring/tutorials/otel-demo/README.md
new file mode 100644
index 00000000000..48c6d5b9e5e
--- /dev/null
+++ b/monitoring/tutorials/otel-demo/README.md
@@ -0,0 +1,95 @@
+# Otel-demo tutorial
+
+This tutorial provides a step-by-step guide for injecting chaos into target applications using LitmusChaos and observing the chaos with OpenTelemetry.
+
+
+
+### 0. Prerequisites
+- Kubernetes 1.24+
+- 8 GB of free RAM
+- Helm 3.9+
+
+### 1. Install Litmus
+1. Create the `litmus` namespace.
+ ```bash
+ kubectl create ns litmus
+ ```
+2. Add the Litmus Helm repository.
+ ```bash
+ helm repo add litmuschaos https://litmuschaos.github.io/litmus-helm/
+ ```
+3. Install Litmus using Helm.
+ ```bash
+ helm install chaos litmuschaos/litmus \
+ --namespace=litmus \
+ --set portal.frontend.service.type=NodePort \
+ --set mongodb.image.registry=ghcr.io/zcube \
+ --set mongodb.image.repository=bitnami-compat/mongodb \
+ --set mongodb.image.tag=6.0.5
+ ```
+4. Verify the installation.
+ ```bash
+ kubectl get all -n litmus
+ ```
+5. Forward the Litmus frontend service port.
+ ```bash
+ kubectl port-forward svc/chaos-litmus-frontend-service 9091:9091 -n litmus
+ ```
+ Access the Litmus frontend at [http://localhost:9091](http://localhost:9091) and log in with `admin` / `litmus`.
+
+### 2. Set Up Litmus Environment
+1. Create a new environment.
+ - Environment Name: `local`
+ - Environment Type: `Production`
+2. Configure a new chaos infrastructure.
+ - Name: `local`
+ - Chaos Components Installation: `Cluster-wide access`
+ - Installation Location (Namespace): `litmus`
+ - Service Account Name: `litmus`
+3. Deploy the new chaos infrastructure by applying the downloaded manifest.
+ ```bash
+ cd ~/Downloads
+ kubectl apply -f local-litmus-chaos-enable.yml
+ ```
+ Wait until the status shows `CONNECTED`.
+
+### 3. Install Otel-demo microservices & Observability tools
+1. Create the `otel-demo` namespace.
+ ```bash
+ kubectl create ns otel-demo
+ ```
+2. Add the OpenTelemetry Helm repository.
+ ```bash
+ helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
+ ```
+3. Install Otel-demo microservices and Observability tools using Helm.
+ ```bash
+ cd litmus/monitoring/tutorials/otel-demo
+ helm install my-otel-demo open-telemetry/opentelemetry-demo --namespace otel-demo --values custom_otel_demo_values.yml
+ ```
+ The chart installs the Otel-demo microservices, OpenTelemetry (with chaos metrics), Prometheus, Jaeger, and Grafana.
+4. Verify the installation.
+ ```bash
+ kubectl get all -n otel-demo
+ ```
+5. Forward the Otel-demo frontend proxy port.
+ ```bash
+ kubectl port-forward svc/my-otel-demo-frontendproxy 8080:8080 -n otel-demo
+ ```
+6. Access the following services.
+ - Web store: [http://localhost:8080/](http://localhost:8080/)
+ - Grafana: [http://localhost:8080/grafana/](http://localhost:8080/grafana/)
+ - Load Generator UI: [http://localhost:8080/loadgen/](http://localhost:8080/loadgen/)
+ - Jaeger UI: [http://localhost:8080/jaeger/ui/](http://localhost:8080/jaeger/ui/)
+
+### 4. Add Grafana Panel
+Import the `chaos-exporter-dashboard.json` file into Grafana to visualize the results of chaos experiments.
+
+### 5. Observe chaos
+Explore the following experiments to observe chaos on the Otel-demo microservices.
+
+- [Pod Network Latency](./cart-service)
+ > Performs a pod network latency experiment on the cart service.
+
+- [Pod Delete](./recommendation-service)
+ > Performs a pod delete experiment on the recommendation service.
diff --git a/monitoring/tutorials/otel-demo/cart-service/README.md b/monitoring/tutorials/otel-demo/cart-service/README.md
new file mode 100644
index 00000000000..335aa14e98d
--- /dev/null
+++ b/monitoring/tutorials/otel-demo/cart-service/README.md
@@ -0,0 +1,26 @@
+# Cart service pod network latency
+## Description
+- This experiment injects network latency into the cart service pod.
+- The probe checks the 99th-percentile latency of cart service requests, as reported by Prometheus metrics.
+## Steps
+### 1. Probe Settings
+- probe type: `Prometheus Probe`
+- name: `cart-service-pod-network-latency-probe`
+- timeout: `3s`
+- interval: `3s`
+- prometheus endpoint: `http://my-otel-demo-prometheus-server.otel-demo:9090`
+- prometheus query: `histogram_quantile(0.99, sum(rate(duration_milliseconds_bucket{service_name=\"cartservice\"}[5m])) by (le))/1000`
+- Data Comparison:
+ - Type: Float
+ - Criteria: `<`
+ - Value: `3.0`
+### 2. Create Experiment
+1. Create a new experiment.
+2. Complete the overview.
+3. Start off by uploading the YML file (`cart-service-pod-network-latency.yml`).
+### 3. Run Experiment
+1. Click on the `Run` button
+2. Check Experiment Status and Logs
+3. Check the Resilience Score
+4. Check the Chaos Exporter metrics in Grafana and confirm whether the experiment failed. ![cart_service_pod_network_latency_experiment_result_dashboard.png](../screenshots/cart_service_pod_network_latency_experiment_result_dashboard.png)
+5. Check the cart service span metrics in Grafana. ![cartservice_spanmetrics.png](../screenshots/cartservice_spanmetrics.png)
\ No newline at end of file
diff --git a/monitoring/tutorials/otel-demo/cart-service/cart-service-pod-network-latency.yml b/monitoring/tutorials/otel-demo/cart-service/cart-service-pod-network-latency.yml
new file mode 100644
index 00000000000..5aa913a3356
--- /dev/null
+++ b/monitoring/tutorials/otel-demo/cart-service/cart-service-pod-network-latency.yml
@@ -0,0 +1,315 @@
+kind: Workflow
+apiVersion: argoproj.io/v1alpha1
+metadata:
+ name: cart-service-pod-network-latency
+ namespace: litmus
+ creationTimestamp: null
+ labels:
+ infra_id: 5b9be872-6396-4ad1-b64a-ed4b25edd516
+ revision_id: bd738dca-14f0-4145-8f67-afb3d8c17991
+ workflow_id: 1912f522-5197-4bd5-8854-732ccf1882bb
+ workflows.argoproj.io/controller-instanceid: 5b9be872-6396-4ad1-b64a-ed4b25edd516
+spec:
+ templates:
+ - name: test
+ inputs: {}
+ outputs: {}
+ metadata: {}
+ steps:
+ - - name: install-chaos-faults
+ template: install-chaos-faults
+ arguments: {}
+ - - name: pod-network-latency-pok
+ template: pod-network-latency-pok
+ arguments: {}
+ - - name: cleanup-chaos-resources
+ template: cleanup-chaos-resources
+ arguments: {}
+ - name: install-chaos-faults
+ inputs:
+ artifacts:
+ - name: pod-network-latency-pok
+ path: /tmp/pod-network-latency-pok.yaml
+ raw:
+ data: >
+ apiVersion: litmuschaos.io/v1alpha1
+
+ description:
+ message: |
+ Injects network latency on pods belonging to an app deployment
+ kind: ChaosExperiment
+
+ metadata:
+ name: pod-network-latency
+ labels:
+ name: pod-network-latency
+ app.kubernetes.io/part-of: litmus
+ app.kubernetes.io/component: chaosexperiment
+ app.kubernetes.io/version: ci
+ spec:
+ definition:
+ scope: Namespaced
+ permissions:
+ - apiGroups:
+ - ""
+ resources:
+ - pods
+ verbs:
+ - create
+ - delete
+ - get
+ - list
+ - patch
+ - update
+ - deletecollection
+ - apiGroups:
+ - ""
+ resources:
+ - events
+ verbs:
+ - create
+ - get
+ - list
+ - patch
+ - update
+ - apiGroups:
+ - ""
+ resources:
+ - configmaps
+ verbs:
+ - get
+ - list
+ - apiGroups:
+ - ""
+ resources:
+ - pods/log
+ verbs:
+ - get
+ - list
+ - watch
+ - apiGroups:
+ - ""
+ resources:
+ - pods/exec
+ verbs:
+ - get
+ - list
+ - create
+ - apiGroups:
+ - apps
+ resources:
+ - deployments
+ - statefulsets
+ - replicasets
+ - daemonsets
+ verbs:
+ - list
+ - get
+ - apiGroups:
+ - apps.openshift.io
+ resources:
+ - deploymentconfigs
+ verbs:
+ - list
+ - get
+ - apiGroups:
+ - ""
+ resources:
+ - replicationcontrollers
+ verbs:
+ - get
+ - list
+ - apiGroups:
+ - argoproj.io
+ resources:
+ - rollouts
+ verbs:
+ - list
+ - get
+ - apiGroups:
+ - batch
+ resources:
+ - jobs
+ verbs:
+ - create
+ - list
+ - get
+ - delete
+ - deletecollection
+ - apiGroups:
+ - litmuschaos.io
+ resources:
+ - chaosengines
+ - chaosexperiments
+ - chaosresults
+ verbs:
+ - create
+ - list
+ - get
+ - patch
+ - update
+ - delete
+ image: docker.io/litmuschaos/go-runner:latest
+ imagePullPolicy: Always
+ args:
+ - -c
+ - ./experiments -name pod-network-latency
+ command:
+ - /bin/bash
+ env:
+ - name: TARGET_CONTAINER
+ value: ""
+ - name: NETWORK_INTERFACE
+ value: eth0
+ - name: LIB_IMAGE
+ value: docker.io/litmuschaos/go-runner:latest
+ - name: TC_IMAGE
+ value: gaiadocker/iproute2
+ - name: NETWORK_LATENCY
+ value: "2000"
+ - name: TOTAL_CHAOS_DURATION
+ value: "60"
+ - name: RAMP_TIME
+ value: ""
+ - name: JITTER
+ value: "0"
+ - name: PODS_AFFECTED_PERC
+ value: ""
+ - name: TARGET_PODS
+ value: ""
+ - name: CONTAINER_RUNTIME
+ value: containerd
+ - name: DEFAULT_HEALTH_CHECK
+ value: "false"
+ - name: DESTINATION_IPS
+ value: ""
+ - name: DESTINATION_HOSTS
+ value: ""
+ - name: SOCKET_PATH
+ value: /run/containerd/containerd.sock
+ - name: NODE_LABEL
+ value: ""
+ - name: SEQUENCE
+ value: parallel
+ labels:
+ name: pod-network-latency
+ app.kubernetes.io/part-of: litmus
+ app.kubernetes.io/component: experiment-job
+ app.kubernetes.io/runtime-api-usage: "true"
+ app.kubernetes.io/version: ci
+ outputs: {}
+ metadata: {}
+ container:
+ name: ""
+ image: litmuschaos/k8s:2.11.0
+ command:
+ - sh
+ - -c
+ args:
+ - kubectl apply -f /tmp/ -n {{workflow.parameters.adminModeNamespace}}
+ && sleep 30
+ resources: {}
+ - name: cleanup-chaos-resources
+ inputs: {}
+ outputs: {}
+ metadata: {}
+ container:
+ name: ""
+ image: litmuschaos/k8s:2.11.0
+ command:
+ - sh
+ - -c
+ args:
+ - kubectl delete chaosengine -l workflow_run_id={{workflow.uid}} -n
+ {{workflow.parameters.adminModeNamespace}}
+ resources: {}
+ - name: pod-network-latency-pok
+ inputs:
+ artifacts:
+ - name: pod-network-latency-pok
+ path: /tmp/chaosengine-pod-network-latency-pok.yaml
+ raw:
+ data: >
+ apiVersion: litmuschaos.io/v1alpha1
+
+ kind: ChaosEngine
+
+ metadata:
+ namespace: "{{workflow.parameters.adminModeNamespace}}"
+ labels:
+ workflow_run_id: "{{ workflow.uid }}"
+ workflow_name: cart-service-pod-network-latency
+ annotations:
+ probeRef: '[{"name":"cart-service-pod-network-latency-probe","mode":"EOT"}]'
+ generateName: pod-network-latency-pok
+ spec:
+ engineState: active
+ appinfo:
+ appns: otel-demo
+ applabel: app.kubernetes.io/component=cartservice
+ appkind: deployment
+ chaosServiceAccount: litmus-admin
+ experiments:
+ - name: pod-network-latency
+ spec:
+ components:
+ env:
+ - name: TARGET_CONTAINER
+ value: ""
+ - name: NETWORK_INTERFACE
+ value: eth0
+ - name: LIB_IMAGE
+ value: docker.io/litmuschaos/go-runner:latest
+ - name: TC_IMAGE
+ value: gaiadocker/iproute2
+ - name: NETWORK_LATENCY
+ value: "2000"
+ - name: TOTAL_CHAOS_DURATION
+ value: "150"
+ - name: RAMP_TIME
+ value: ""
+ - name: JITTER
+ value: "0"
+ - name: PODS_AFFECTED_PERC
+ value: ""
+ - name: TARGET_PODS
+ value: ""
+ - name: CONTAINER_RUNTIME
+ value: containerd
+ - name: DEFAULT_HEALTH_CHECK
+ value: "false"
+ - name: DESTINATION_IPS
+ value: ""
+ - name: DESTINATION_HOSTS
+ value: ""
+ - name: SOCKET_PATH
+ value: /run/containerd/containerd.sock
+ - name: NODE_LABEL
+ value: ""
+ - name: SEQUENCE
+ value: parallel
+ outputs: {}
+ metadata:
+ labels:
+ weight: "10"
+ container:
+ name: ""
+ image: docker.io/litmuschaos/litmus-checker:2.11.0
+ args:
+ - -file=/tmp/chaosengine-pod-network-latency-pok.yaml
+ - -saveName=/tmp/engine-name
+ resources: {}
+ entrypoint: test
+ arguments:
+ parameters:
+ - name: adminModeNamespace
+ value: litmus
+ serviceAccountName: argo-chaos
+ podGC:
+ strategy: OnWorkflowCompletion
+ securityContext:
+ runAsUser: 1000
+ runAsNonRoot: true
+status:
+ startedAt: null
+ finishedAt: null
diff --git a/monitoring/tutorials/otel-demo/chaos-exporter-dashboard.json b/monitoring/tutorials/otel-demo/chaos-exporter-dashboard.json
new file mode 100644
index 00000000000..0586abffe08
--- /dev/null
+++ b/monitoring/tutorials/otel-demo/chaos-exporter-dashboard.json
@@ -0,0 +1,647 @@
+{
+ "annotations": {
+ "list": [
+ {
+ "builtIn": 1,
+ "datasource": {
+ "type": "grafana",
+ "uid": "-- Grafana --"
+ },
+ "enable": true,
+ "hide": true,
+ "iconColor": "rgba(0, 211, 255, 1)",
+ "name": "Annotations & Alerts",
+ "type": "dashboard"
+ }
+ ]
+ },
+ "editable": true,
+ "fiscalYearStartMonth": 0,
+ "graphTooltip": 0,
+ "id": 5,
+ "links": [],
+ "panels": [
+ {
+ "collapsed": false,
+ "gridPos": {
+ "h": 1,
+ "w": 24,
+ "x": 0,
+ "y": 0
+ },
+ "id": 8,
+ "panels": [],
+ "title": "Chaos Exporter Dashboard",
+ "type": "row"
+ },
+ {
+ "datasource": {
+ "type": "prometheus",
+ "uid": "webstore-metrics"
+ },
+ "fieldConfig": {
+ "defaults": {
+ "color": {
+ "mode": "palette-classic"
+ },
+ "custom": {
+ "fillOpacity": 50,
+ "hideFrom": {
+ "legend": false,
+ "tooltip": false,
+ "viz": false
+ },
+ "insertNulls": false,
+ "lineWidth": 0,
+ "spanNulls": false
+ },
+ "mappings": [],
+ "thresholds": {
+ "mode": "absolute",
+ "steps": [
+ {
+ "color": "transparent",
+ "value": null
+ }
+ ]
+ }
+ },
+ "overrides": []
+ },
+ "gridPos": {
+ "h": 7,
+ "w": 20,
+ "x": 0,
+ "y": 1
+ },
+ "id": 1,
+ "options": {
+ "alignValue": "center",
+ "legend": {
+ "displayMode": "list",
+ "placement": "bottom",
+ "showLegend": false
+ },
+ "mergeValues": true,
+ "rowHeight": 0.7,
+ "showValue": "auto",
+ "tooltip": {
+ "mode": "single",
+ "sort": "none"
+ }
+ },
+ "pluginVersion": "10.4.1",
+ "targets": [
+ {
+ "datasource": {
+ "type": "prometheus",
+ "uid": "webstore-metrics"
+ },
+ "disableTextWrap": false,
+ "editorMode": "builder",
+ "expr": "litmuschaos_experiment_total_duration",
+ "format": "time_series",
+ "fullMetaSearch": false,
+ "includeNullMetadata": true,
+ "legendFormat": "{{chaosengine_name}}",
+ "range": true,
+ "refId": "A",
+ "useBackend": false
+ }
+ ],
+ "title": "Chaos Experiments Duration",
+ "type": "state-timeline"
+ },
+ {
+ "datasource": {
+ "type": "prometheus",
+ "uid": "webstore-metrics"
+ },
+ "fieldConfig": {
+ "defaults": {
+ "mappings": [],
+ "thresholds": {
+ "mode": "absolute",
+ "steps": [
+ {
+ "color": "green",
+ "value": null
+ },
+ {
+ "color": "red",
+ "value": 80
+ }
+ ]
+ }
+ },
+ "overrides": []
+ },
+ "format": "short",
+ "gridPos": {
+ "h": 6,
+ "w": 5,
+ "x": 0,
+ "y": 8
+ },
+ "id": 2,
+ "max": 100,
+ "min": 0,
+ "options": {
+ "minVizHeight": 75,
+ "minVizWidth": 75,
+ "orientation": "auto",
+ "reduceOptions": {
+ "calcs": [
+ "lastNotNull"
+ ],
+ "fields": "",
+ "values": false
+ },
+ "showThresholdLabels": false,
+ "showThresholdMarkers": true,
+ "sizing": "auto"
+ },
+ "pluginVersion": "11.1.0",
+ "targets": [
+ {
+ "datasource": {
+ "type": "prometheus",
+ "uid": "webstore-metrics"
+ },
+ "disableTextWrap": false,
+ "editorMode": "builder",
+ "expr": "litmuschaos_cluster_scoped_experiments_installed_count",
+ "format": "time_series",
+ "fullMetaSearch": false,
+ "includeNullMetadata": true,
+ "legendFormat": "Total Experiments",
+ "range": true,
+ "refId": "A",
+ "useBackend": false
+ }
+ ],
+ "thresholds": "0,50,100",
+ "title": "Total Experiments",
+ "type": "gauge",
+ "valueMaps": [
+ {
+ "text": "No Data",
+ "value": "null"
+ }
+ ],
+ "valueName": "current"
+ },
+ {
+ "datasource": {
+ "type": "prometheus",
+ "uid": "webstore-metrics"
+ },
+ "fieldConfig": {
+ "defaults": {
+ "color": {
+ "fixedColor": "dark-yellow",
+ "mode": "fixed"
+ },
+ "mappings": [],
+ "thresholds": {
+ "mode": "absolute",
+ "steps": [
+ {
+ "color": "green",
+ "value": null
+ },
+ {
+ "color": "red",
+ "value": 80
+ }
+ ]
+ }
+ },
+ "overrides": []
+ },
+ "format": "short",
+ "gridPos": {
+ "h": 6,
+ "w": 5,
+ "x": 5,
+ "y": 8
+ },
+ "id": 5,
+ "max": 100,
+ "min": 0,
+ "options": {
+ "minVizHeight": 75,
+ "minVizWidth": 75,
+ "orientation": "auto",
+ "reduceOptions": {
+ "calcs": [
+ "lastNotNull"
+ ],
+ "fields": "",
+ "values": false
+ },
+ "showThresholdLabels": false,
+ "showThresholdMarkers": true,
+ "sizing": "auto"
+ },
+ "pluginVersion": "11.1.0",
+ "targets": [
+ {
+ "datasource": {
+ "type": "prometheus",
+ "uid": "webstore-metrics"
+ },
+ "disableTextWrap": false,
+ "editorMode": "code",
+ "expr": "sum(litmuschaos_awaited_experiments)",
+ "format": "time_series",
+ "fullMetaSearch": false,
+ "includeNullMetadata": true,
+ "legendFormat": "Queued Experiments",
+ "range": true,
+ "refId": "A",
+ "useBackend": false
+ }
+ ],
+ "thresholds": "0,50,100",
+ "title": "Awaited Experiments",
+ "type": "gauge",
+ "valueMaps": [
+ {
+ "text": "No Data",
+ "value": "null"
+ }
+ ],
+ "valueName": "current"
+ },
+ {
+ "alert": {
+ "alertRuleTags": {},
+ "conditions": [
+ {
+ "evaluator": {
+ "params": [
+ 0.99
+ ],
+ "type": "gt"
+ },
+ "operator": {
+ "type": "and"
+ },
+ "query": {
+ "params": [
+ "A",
+ "5s",
+ "now"
+ ]
+ },
+ "reducer": {
+ "params": [],
+ "type": "max"
+ },
+ "type": "query"
+ }
+ ],
+ "executionErrorState": "alerting",
+ "for": "1s",
+ "frequency": "1s",
+ "handler": 1,
+ "message": "Chaos Probe Failed !!!\n\n
\n
Chaos Details:-
\n
App Details:-
\n