From 8741ae68003556e007c9aa49c9a783b5f13c86dc Mon Sep 17 00:00:00 2001
From: Francesco Sbaraglia <23255586+fsbaraglia@users.noreply.github.com>
Date: Sun, 6 Mar 2022 15:46:44 +0100
Subject: [PATCH] Add Chaos Engineering to the main Glossary - english version (#498)

* Add Security Chaos Engineering lang-en
* added first draft of chaos engineering
* security chaos engineering concept version 2
* Links + edits
* Some edits
* removed Chaos Engineering concept
* Chaos Engineering concept
* fixed high-frequency deployments in chaos engineering concept
* split concepts
* rephrase concept
* Small edit

Changed "people working in SRE and DevOps" to "SRE and DevOps engineers." Less words, same meaning :)

* new version with last changes
* Added SRE and DevOps links
* link

Co-authored-by: Francesco Sbaraglia
Co-authored-by: Catherine Paganini <74001907+CathPag@users.noreply.github.com>
---
 content/en/chaos_engineering.md | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
 create mode 100644 content/en/chaos_engineering.md

diff --git a/content/en/chaos_engineering.md b/content/en/chaos_engineering.md
new file mode 100644
index 0000000000..61837f0ed6
--- /dev/null
+++ b/content/en/chaos_engineering.md
@@ -0,0 +1,14 @@
+---
+title: Chaos Engineering
+status: Completed
+category: concept
+---
+
+## What it is
+Chaos Engineering, or CE, is the discipline of experimenting on a [distributed system](https://glossary.cncf.io/distributed_systems/) in production to build confidence in the system's capability to withstand turbulent and unexpected conditions.
+
+## Problem it addresses
+[SRE](https://glossary.cncf.io/site_reliability_engineering/) and [DevOps](https://glossary.cncf.io/devops/) practices focus on techniques to increase product resiliency and [reliability](https://glossary.cncf.io/reliability/). A system's ability to tolerate failures while ensuring adequate service quality is typically a software development requirement. Several factors can cause an application outage, such as failures in the infrastructure, the platform, or other moving parts of a ([microservice](https://glossary.cncf.io/microservices/)-based) application. High-frequency deployment of new features to the production environment can increase the probability of downtime and critical incidents, with considerable consequences for the business.
+
+## How it helps
+Chaos engineering is a technique to meet resilience requirements. It is used to achieve resilience against infrastructure, platform, and application failures. Chaos engineers use chaos experiments to proactively inject random failures and verify that an application, infrastructure, or platform can self-heal and that failures do not noticeably impact customers. Chaos experiments aim to discover blind spots (e.g., in monitoring or autoscaling techniques) and to improve communication between teams during critical incidents. This approach helps increase resiliency and the team's confidence in complex systems, particularly in production.
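
Reviewer's illustration (not part of the patch): the chaos experiment described under "How it helps" can be sketched in a few lines of Python. This is a minimal toy model under stated assumptions, not how any particular chaos tool works. `FlakyService`, `call_with_retry`, `run_experiment`, the retry count, and the 99% success threshold are all hypothetical names and values chosen for illustration. The sketch injects random failures into a simulated service and checks whether a simple self-healing mechanism (retries) keeps the customer-visible success rate above the threshold.

```python
import random


class FlakyService:
    """Hypothetical toy service that fails at random with a given probability."""

    def __init__(self, failure_rate, rng):
        self.failure_rate = failure_rate  # probability that a single call fails
        self.rng = rng

    def handle(self, request):
        # Fault injection: fail this call at random.
        if self.rng.random() < self.failure_rate:
            raise ConnectionError("injected failure")
        return f"ok:{request}"


def call_with_retry(service, request, attempts=5):
    """Stand-in for self-healing: retry a failed call up to `attempts` times."""
    for _ in range(attempts):
        try:
            return service.handle(request)
        except ConnectionError:
            continue
    return None  # customer-visible failure


def run_experiment(failure_rate, requests=1000, attempts=5, seed=0, slo=0.99):
    """Inject failures and test the hypothesis that the success rate stays >= slo."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    service = FlakyService(failure_rate, rng)
    successes = sum(
        call_with_retry(service, i, attempts) is not None for i in range(requests)
    )
    success_rate = successes / requests
    return {"success_rate": success_rate, "hypothesis_holds": success_rate >= slo}


if __name__ == "__main__":
    for rate in (0.0, 0.3, 1.0):
        print(rate, run_experiment(failure_rate=rate))
```

When the hypothesis fails (for example, at a failure rate of 1.0, where retries cannot help), the experiment has exposed a blind spot worth investigating. Real experiments target actual infrastructure and limit the blast radius rather than simulating failures in-process.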