Add Chaos Engineering to the main Glossary - english version #498

Merged: 14 commits, Mar 6, 2022
14 changes: 14 additions & 0 deletions content/en/chaos_engineering.md
@@ -0,0 +1,14 @@
---
title: Chaos Engineering
status: Completed
category: concept
---

## What it is
Chaos Engineering, or CE, is the discipline of experimenting on a [distributed system](https://glossary.cncf.io/distributed_systems/) in production to build confidence in the system's capability to withstand turbulent and unexpected conditions.

## Problem it addresses
People working in [SRE](https://glossary.cncf.io/site_reliability_engineering/) and [DevOps](https://glossary.cncf.io/devops/) continuously look for new techniques to increase product resiliency and [reliability](https://glossary.cncf.io/reliability/). A system's ability to tolerate failures while maintaining adequate service quality is typically a software development requirement. However, thorough testing of a production system is not always possible because of the complexity of its architecture. At the same time, frequent deployment of new features to production increases the probability of downtime and critical incidents, with considerable consequences for the business.

## How it helps
Chaos Engineering is a technique for meeting resilience requirements. It is used to achieve resilience against infrastructure, platform, and application failures. Chaos engineers use chaos experiments to proactively inject random failures and verify that an application, infrastructure, or platform can self-heal and that the failure does not noticeably impact customers. Chaos experiments aim to discover blind spots (for example, in monitoring or autoscaling techniques) and to improve communication between teams during critical incidents. This approach helps increase resiliency and the team's confidence in complex systems, particularly in production.
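
As a rough illustration of the experiment loop described above, the following minimal sketch injects random replica failures into a toy, in-memory service and measures how many requests still succeed. Everything here is hypothetical and chosen for illustration only: `ToyService`, `run_experiment`, the one-restart-per-tick healing rule, and the kill probability are assumptions, not part of any real chaos engineering tool.

```python
import random


class ToyService:
    """Hypothetical in-memory stand-in for a replicated service."""

    def __init__(self, replicas: int) -> None:
        self.replicas = replicas
        self.healthy = [True] * replicas

    def handle_request(self) -> bool:
        # A request succeeds if at least one replica is healthy.
        return any(self.healthy)

    def self_heal(self) -> None:
        # Simplified supervisor (an assumption): restart one failed replica per tick.
        for i, ok in enumerate(self.healthy):
            if not ok:
                self.healthy[i] = True
                break


def run_experiment(service: ToyService, ticks: int = 100,
                   kill_probability: float = 0.3, seed: int = 42) -> float:
    """Inject random replica failures and return the observed success rate."""
    rng = random.Random(seed)  # fixed seed keeps the experiment repeatable
    successes = 0
    for _ in range(ticks):
        # Fault injection: occasionally kill one randomly chosen replica.
        if rng.random() < kill_probability:
            victim = rng.randrange(service.replicas)
            service.healthy[victim] = False
        # Steady-state probe: does the service still answer?
        if service.handle_request():
            successes += 1
        service.self_heal()
    return successes / ticks


if __name__ == "__main__":
    print("3 replicas:", run_experiment(ToyService(replicas=3)))
    print("1 replica: ", run_experiment(ToyService(replicas=1)))
```

In this toy model the three-replica service keeps answering every request, while the single-replica service drops requests whenever its only replica is killed. That gap is the kind of blind spot a chaos experiment is meant to surface before customers notice it.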