---
title: Observability
reviewers:
weight: 55
content_type: concept
description: >
  Understand how to gain end-to-end visibility of a Kubernetes cluster through the collection of metrics, logs, and traces.
no_list: true
card:
  name: setup
  weight: 60
  anchors:
  - anchor: "#metrics"
    title: Metrics
  - anchor: "#logs"
    title: Logs
  - anchor: "#traces"
    title: Traces
---

<!-- overview -->

In Kubernetes, observability is the process of collecting and analyzing metrics, logs, and traces (often called the three pillars of observability) to better understand the internal state, performance, and health of the cluster.

Kubernetes control plane components, as well as many add-ons, generate and emit these signals. By aggregating and correlating them, you can gain a unified picture of the control plane, add-ons, and applications across the cluster.

Figure 1 outlines how cluster components emit the three primary signal types.

{{< mermaid >}}
flowchart LR
  A[Cluster components] --> M[Metrics pipeline]
  A --> L[Log pipeline]
  A --> T[Trace pipeline]
  M --> S[(Storage and analysis)]
  L --> S
  T --> S
  S --> O[Operators and automation]
{{< /mermaid >}}

*Figure 1. High-level signals emitted by cluster components and their consumers.*

<!-- body -->
## Metrics

Kubernetes components emit metrics in [Prometheus format](https://prometheus.io/docs/instrumenting/exposition_formats/) from their `/metrics` endpoints, including:

- kube-controller-manager
- kube-proxy
- kube-apiserver
- kube-scheduler
- kubelet

The kubelet also exposes metrics at `/metrics/cadvisor`, `/metrics/resource`, and `/metrics/probes`, and add-ons such as [kube-state-metrics](/docs/concepts/cluster-administration/kube-state-metrics/) enrich those control plane signals with Kubernetes object status.
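
Given sufficient RBAC permissions, you can inspect these endpoints directly through the API server. A quick sketch (`<node-name>` is a placeholder for one of your nodes):

```shell
# Metrics exposed by the kube-apiserver itself
kubectl get --raw /metrics | head

# Resource metrics from one node's kubelet, proxied through the API server
kubectl get --raw /api/v1/nodes/<node-name>/proxy/metrics/resource | head
```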

A typical Kubernetes metrics pipeline periodically scrapes these endpoints and stores the samples in a time series database (for example, Prometheus).
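
For example, an in-cluster Prometheus server can discover every node and scrape its kubelet using built-in Kubernetes service discovery. A minimal sketch of such a scrape configuration; the job name is arbitrary, and the credential paths assume Prometheus runs in a pod:

```yaml
# prometheus.yml (fragment): scrape the kubelet on every node
scrape_configs:
  - job_name: kubernetes-nodes      # illustrative name
    scheme: https
    kubernetes_sd_configs:
      - role: node                  # one target per cluster node
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    authorization:
      credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
```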

See the [system metrics guide](/docs/concepts/cluster-administration/system-metrics/) for details and configuration options.

Figure 2 outlines a common Kubernetes metrics pipeline.

{{< mermaid >}}
flowchart LR
  C[Cluster components] --> P[Prometheus scraper]
  P --> TS[(Time series storage)]
  TS --> D[Dashboards and alerts]
  TS --> A[Automated actions]
{{< /mermaid >}}

*Figure 2. Components of a typical Kubernetes metrics pipeline.*

For multi-cluster or multi-cloud visibility, distributed time series databases (for example Thanos or Cortex) can complement Prometheus.

See [Common observability tools - metrics tools](#metrics-tools) for metrics scrapers and time series databases.

#### {{% heading "seealso" %}}

- [System metrics for Kubernetes components](/docs/concepts/cluster-administration/system-metrics/)
- [Resource usage monitoring with metrics-server](/docs/tasks/debug/debug-cluster/resource-usage-monitoring/)
- [kube-state-metrics concept](/docs/concepts/cluster-administration/kube-state-metrics/)
- [Resource metrics pipeline overview](/docs/tasks/debug/debug-cluster/resource-metrics-pipeline/)

## Logs

Logs provide a chronological record of events inside applications, Kubernetes system components, and security-related activities such as audit logging.
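
Audit logging, for instance, is driven by a policy file that tells the API server which events to record and at what level of detail. A minimal sketch that records request metadata for everything (a deliberately coarse rule, chosen for illustration):

```yaml
# audit-policy.yaml: passed to the kube-apiserver via --audit-policy-file
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: Metadata   # record who did what and when, but not request bodies
```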

Container runtimes capture a containerized application’s output from standard output (`stdout`) and standard error (`stderr`) streams. While runtimes implement this differently, the integration with the kubelet is standardized through the _CRI logging format_, and the kubelet makes these logs available through `kubectl logs`.
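
For example (the Deployment, container, and Pod names below are placeholders):

```shell
# Stream logs from one container of a Deployment, starting one hour back
kubectl logs deployment/my-app -c app --since=1h -f

# Read logs from the previous, crashed instance of a Pod's container
kubectl logs my-app-12345 --previous
```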

*Figure 3. Node-level logging architecture.*

System component logs capture events from the cluster and are often useful for debugging and troubleshooting. These components fall into two groups: those that run in a container and those that do not. For example, the `kube-scheduler` and `kube-proxy` usually run in containers, whereas the `kubelet` and the container runtime run directly on the host.

- On machines with `systemd`, the kubelet and container runtime write to journald; you can read those entries as shown below. Otherwise, they write to `.log` files in the `/var/log` directory.
- System components that run inside containers always write to `.log` files in `/var/log`, bypassing the default container logging mechanism.
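
A quick sketch of reading the kubelet's journald entries on a `systemd`-based node:

```shell
# Follow the kubelet's logs on a systemd-based node
journalctl -u kubelet -f

# Show only entries from the current boot
journalctl -u kubelet -b
```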

System component and container logs stored under `/var/log` require log rotation to prevent uncontrolled growth. Some cluster provisioning scripts install log rotation by default; verify your environment and adjust as needed. See the [system logs reference](/docs/concepts/cluster-administration/system-logs/) for details on locations, formats, and configuration options.
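
For container logs specifically, the kubelet performs the rotation itself; the thresholds are set in the kubelet configuration file. A minimal sketch, with illustrative values:

```yaml
# Kubelet configuration fragment: rotate each container log at 10Mi, keep 5 files
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
containerLogMaxSize: 10Mi
containerLogMaxFiles: 5
```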

Most clusters run a node-level logging agent (for example, Fluent Bit or Fluentd) that tails these files and forwards entries to a central log store. The [logging architecture guidance](/docs/concepts/cluster-administration/logging/) explains how to design such pipelines, apply retention, and route log flows to backends.
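
As an illustration, a node-level Fluent Bit agent that tails the CRI-format container logs and ships them to an Elasticsearch-compatible store might be configured roughly like this (shown in Fluent Bit's YAML configuration format; the host name is an assumption for the sketch):

```yaml
# fluent-bit.yaml: tail container logs on the node and forward them
pipeline:
  inputs:
    - name: tail
      path: /var/log/containers/*.log
      parser: cri                      # parse the CRI logging format
  outputs:
    - name: es
      match: "*"
      host: elasticsearch.logging.svc  # assumed in-cluster log store
      port: 9200
```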

Figure 4 outlines a common log aggregation pipeline.

{{< mermaid >}}
flowchart LR
  subgraph Sources
    A[Application stdout / stderr]
    B[Control plane logs]
    C[Audit records]
  end
  A --> N[Node log agent]
  B --> N
  C --> N
  N --> L[Central log store]
  L --> Q[Dashboards, alerting, SIEM]
{{< /mermaid >}}

*Figure 4. Components of a typical Kubernetes logs pipeline.*

See [Common observability tools - logging tools](#logging-tools) for logging agents and central log stores.

#### {{% heading "seealso" %}}

- [Logging architecture](/docs/concepts/cluster-administration/logging/)
- [System logs](/docs/concepts/cluster-administration/system-logs/)
- [Logging tasks and tutorials](/docs/tasks/debug/logging/)
- [Configure audit logging](/docs/tasks/debug/debug-cluster/audit/)

## Traces

Traces capture how requests move across Kubernetes components and applications, linking latency, timing, and the relationships between operations. By collecting traces, you can visualize end-to-end request flow, diagnose performance issues, and identify bottlenecks or unexpected interactions in the control plane, add-ons, or applications.

Kubernetes {{< skew currentVersion >}} can export spans over the [OpenTelemetry Protocol](/docs/concepts/cluster-administration/system-traces/) (OTLP), either directly via built-in gRPC exporters or by forwarding them through an OpenTelemetry Collector.
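
For example, the kubelet can be told to emit spans to a local OTLP endpoint through its configuration file. A minimal sketch, with an illustrative sampling rate; depending on the Kubernetes version, the `KubeletTracing` feature gate may also need to be enabled:

```yaml
# Kubelet configuration fragment: send spans to a local OTLP gRPC endpoint
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
tracing:
  endpoint: localhost:4317        # OTLP gRPC receiver, e.g. an OpenTelemetry Collector
  samplingRatePerMillion: 10000   # sample roughly 1% of spans
```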

The OpenTelemetry Collector receives spans from components and applications, processes them (for example by applying sampling or redaction), and forwards them to a tracing backend for storage and analysis.
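
A minimal sketch of a Collector pipeline that receives OTLP spans, batches them, and forwards them to a tracing backend (the backend address and plaintext transport are assumptions):

```yaml
# otel-collector-config.yaml: receive OTLP spans and forward them to a backend
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
processors:
  batch: {}                            # group spans before export
exporters:
  otlp:
    endpoint: tempo.tracing.svc:4317   # assumed tracing backend address
    tls:
      insecure: true                   # plaintext inside the cluster (assumption)
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
```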

Figure 5 outlines a typical distributed tracing pipeline.

{{< mermaid >}}
flowchart LR
  subgraph Sources
    A[Control plane spans]
    B[Application spans]
  end
  A --> X[OTLP exporter]
  B --> X
  X --> COL[OpenTelemetry Collector]
  COL --> TS[(Tracing backend)]
  TS --> V[Visualization and analysis]
{{< /mermaid >}}

*Figure 5. Components of a typical Kubernetes traces pipeline.*

See [Common observability tools - tracing tools](#tracing-tools) for tracing collectors and backends.

#### {{% heading "seealso" %}}

- [System traces for Kubernetes components](/docs/concepts/cluster-administration/system-traces/)
- [OpenTelemetry Collector getting started guide](https://opentelemetry.io/docs/collector/getting-started/)
- [Monitoring and tracing tasks](/docs/tasks/debug/monitoring/)

## Common observability tools

{{% thirdparty-content %}}

Note: This section links to third-party projects that provide observability capabilities for Kubernetes.
The Kubernetes project authors aren't responsible for these projects, which are listed alphabetically. To add a
project to this list, read the [content guide](/docs/contribute/style/content-guide/) before submitting a change.

### Metrics tools

- [Cortex](https://cortexmetrics.io/) offers horizontally scalable, long-term Prometheus storage.
- [Grafana Mimir](https://grafana.com/oss/mimir/) is a Grafana Labs project that provides multi-tenant, horizontally scalable Prometheus-compatible storage.
- [Prometheus](https://prometheus.io/) is a monitoring system that scrapes and stores metrics from Kubernetes components.
- [Thanos](https://thanos.io/) extends Prometheus with global querying, downsampling, and object storage support.

### Logging tools

- [Elasticsearch](https://www.elastic.co/elasticsearch/) delivers distributed log indexing and search.
- [Fluent Bit](https://fluentbit.io/) collects and forwards container and node logs with a low resource footprint.
- [Fluentd](https://www.fluentd.org/) routes and transforms logs to multiple destinations.
- [Grafana Loki](https://grafana.com/oss/loki/) stores logs in a Prometheus-inspired, label-based format.
- [OpenSearch](https://opensearch.org/) provides open source log indexing and search compatible with Elasticsearch APIs.

### Tracing tools

- [Grafana Tempo](https://grafana.com/oss/tempo/) offers scalable, low-cost distributed tracing storage.
- [Jaeger](https://www.jaegertracing.io/) captures and visualizes distributed traces for microservices.
- [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/) receives, processes, and exports telemetry data including traces.
- [Zipkin](https://zipkin.io/) provides distributed tracing collection and visualization.

## {{% heading "whatsnext" %}}

- Learn how to [collect resource usage metrics with metrics-server](/docs/tasks/debug/debug-cluster/resource-usage-monitoring/)
- Explore [logging tasks and tutorials](/docs/tasks/debug/logging/)
- Follow the [monitoring and tracing task guides](/docs/tasks/debug/monitoring/)
- Review the [system metrics guide](/docs/concepts/cluster-administration/system-metrics/) for component endpoints and stability
- Review the [common observability tools](#common-observability-tools) section for vetted third-party options