103 changes: 54 additions & 49 deletions config/_default/menus/main.en.yaml
@@ -3101,251 +3101,256 @@ menu:
parent: containers
identifier: containers_autoscaling
weight: 2
- name: Remediation
url: containers/bits_ai_kubernetes_remediation
parent: containers
identifier: containers_autoscaling
weight: 3
- name: Docker and other runtimes
url: containers/docker/
parent: containers
identifier: containers_docker
weight: 3
weight: 4
- name: APM
url: containers/docker/apm/
parent: containers_docker
identifier: containers_docker_apm
weight: 301
weight: 401
- name: Log collection
url: containers/docker/log/
parent: containers_docker
identifier: containers_docker_log
weight: 302
weight: 402
- name: Tag extraction
url: containers/docker/tag/
parent: containers_docker
identifier: containers_docker_tag
weight: 303
weight: 403
- name: Integrations
url: containers/docker/integrations/
parent: containers_docker
identifier: containers_docker_integrations
weight: 304
weight: 404
- name: Prometheus
url: containers/docker/prometheus/
parent: containers_docker
identifier: containers_docker_prometheus
weight: 305
weight: 405
- name: Data Collected
url: containers/docker/data_collected/
parent: containers_docker
identifier: containers_docker_data_collected
weight: 306
weight: 406
- name: Kubernetes
url: containers/kubernetes/
parent: containers
identifier: containers_kubernetes
weight: 4
weight: 5
- name: Installation
url: containers/kubernetes/installation
parent: containers_kubernetes
identifier: containers_kubernetes_installation
weight: 401
weight: 501
- name: Further Configuration
url: containers/kubernetes/configuration
parent: containers_kubernetes
identifier: containers_kubernetes_configuration
weight: 402
weight: 502
- name: Distributions
url: containers/kubernetes/distributions
parent: containers_kubernetes
identifier: containers_kubernetes_distributions
weight: 403
weight: 503
- name: APM
url: containers/kubernetes/apm/
parent: containers_kubernetes
identifier: containers_kubernetes_apm
weight: 404
weight: 504
- name: Log collection
url: containers/kubernetes/log/
parent: containers_kubernetes
identifier: containers_kubernetes_log
weight: 405
weight: 505
- name: Tag extraction
url: containers/kubernetes/tag/
parent: containers_kubernetes
identifier: containers_kubernetes_tag
weight: 406
weight: 506
- name: Integrations
url: containers/kubernetes/integrations/
parent: containers_kubernetes
identifier: containers_kubernetes_integrations
weight: 407
weight: 507
- name: Prometheus & OpenMetrics
url: containers/kubernetes/prometheus/
parent: containers_kubernetes
identifier: containers_kubernetes_prometheus
weight: 408
weight: 508
- name: Control plane monitoring
url: containers/kubernetes/control_plane/
parent: containers_kubernetes
identifier: containers_kubernetes_control_plane
weight: 409
weight: 509
- name: Data collected
url: containers/kubernetes/data_collected/
parent: containers_kubernetes
identifier: containers_kubernetes_data_collected
weight: 410
weight: 510
- name: Datadog CSI Driver
url: containers/kubernetes/csi_driver
parent: containers_kubernetes
identifier: csi_driver
weight: 411
weight: 511
- name: Data security
url: data_security/kubernetes
parent: containers_kubernetes
identifier: container_kubernetes_data_security
weight: 412
weight: 512
- name: Cluster Agent
url: containers/cluster_agent/
parent: containers
identifier: containers_cluster
weight: 5
weight: 6
- name: Setup
url: containers/cluster_agent/setup/
parent: containers_cluster
identifier: cluster_agent_setup
weight: 501
weight: 601
- name: Commands & Options
url: containers/cluster_agent/commands/
identifier: cluster_agent_commands
parent: containers_cluster
weight: 502
weight: 602
- name: Cluster Checks
identifier: containers_cluster_agent_clusterchecks
url: containers/cluster_agent/clusterchecks/
parent: containers_cluster
weight: 503
weight: 603
- name: Endpoint Checks
identifier: containers_cluster_agent_endpoint_checks
url: containers/cluster_agent/endpointschecks/
parent: containers_cluster
weight: 504
weight: 604
- name: Admission Controller
identifier: containers_cluster_agent_admission_controller
url: containers/cluster_agent/admission_controller/
parent: containers_cluster
weight: 505
weight: 605
- name: Amazon ECS
url: containers/amazon_ecs/
parent: containers
identifier: containers_amazon_ecs
weight: 6
weight: 7
- name: APM
url: containers/amazon_ecs/apm/
parent: containers_amazon_ecs
identifier: containers_amazon_ecs_apm
weight: 601
weight: 701
- name: Log collection
url: containers/amazon_ecs/logs/
parent: containers_amazon_ecs
identifier: containers_amazon_ecs_logs
weight: 602
weight: 702
- name: Tag extraction
url: containers/amazon_ecs/tags/
parent: containers_amazon_ecs
identifier: containers_amazon_ecs_tags
weight: 603
weight: 703
- name: Data collected
url: containers/amazon_ecs/data_collected/
parent: containers_amazon_ecs
identifier: containers_amazon_ecs_data_collected
weight: 604
weight: 704
- name: AWS Fargate
url: integrations/ecs_fargate/
parent: containers
identifier: ecs_fargate
weight: 7
weight: 8
- name: Datadog Operator
url: containers/datadog_operator
identifier: containers_datadog_operator
parent: containers
weight: 8
weight: 9
- name: Advanced Install
url: containers/datadog_operator/advanced_install
identifier: containers_datadog_operator_installation
parent: containers_datadog_operator
weight: 801
weight: 901
- name: Configuration
url: containers/datadog_operator/config
identifier: containers_datadog_operator_configuration
parent: containers_datadog_operator
weight: 802
weight: 902
- name: Custom Checks
url: containers/datadog_operator/custom_check
identifier: containers_datadog_operator_customchecks
parent: containers_datadog_operator
weight: 803
weight: 903
- name: Data Collected
url: containers/datadog_operator/data_collected
identifier: containers_datadog_operator_datacollected
parent: containers_datadog_operator
weight: 804
weight: 904
- name: kubectl Plugin
url: containers/datadog_operator/kubectl_plugin
identifier: containers_datadog_operator_kubectlplugin
parent: containers_datadog_operator
weight: 805
weight: 905
- name: Secret Management
url: containers/datadog_operator/secret_management
identifier: containers_datadog_operator_secretmanagement
parent: containers_datadog_operator
weight: 806
weight: 906
- name: DatadogDashboard CRD
url: containers/datadog_operator/crd_dashboard
identifier: containers_datadog_operator_crd_dashboard
parent: containers_datadog_operator
weight: 807
weight: 907
- name: DatadogMonitor CRD
url: containers/datadog_operator/crd_monitor
identifier: containers_datadog_operator_crd_monitor
parent: containers_datadog_operator
weight: 808
weight: 908
- name: DatadogSLO CRD
url: containers/datadog_operator/crd_slo
identifier: containers_datadog_operator_crd_slo
parent: containers_datadog_operator
weight: 809
weight: 909
- name: Troubleshooting
url: containers/troubleshooting/
parent: containers
identifier: containers_troubleshooting
weight: 9
weight: 10
- name: Duplicate hosts
url: containers/troubleshooting/duplicate_hosts
parent: containers_troubleshooting
identifier: containers_troubleshooting_duplicate_hosts
weight: 901
weight: 1001
- name: Cluster Agent
url: containers/troubleshooting/cluster-agent
parent: containers_troubleshooting
identifier: containers_troubleshooting_cluster_agent
weight: 902
weight: 1002
- name: Cluster Checks
url: containers/troubleshooting/cluster-and-endpoint-checks
parent: containers_troubleshooting
identifier: containers_troubleshooting_cluster_and_endpoint_checks
weight: 903
weight: 1003
- name: HPA and Metrics Provider
url: containers/troubleshooting/hpa
parent: containers_troubleshooting
identifier: containers_troubleshooting_hpa
weight: 904
weight: 1004
- name: Admission Controller
url: containers/troubleshooting/admission-controller
parent: containers_troubleshooting
identifier: containers_troubleshooting_admission_controller
weight: 905
weight: 1005
- name: Guides
url: containers/guide
parent: containers
identifier: containers_guide
weight: 10
weight: 11
- name: Processes
url: infrastructure/process
identifier: process
3 changes: 2 additions & 1 deletion content/en/bits_ai/bits_ai_dev_agent/_index.md
@@ -27,7 +27,7 @@ Bits AI Dev Agent is available for the following Datadog products:
| [Code Security][2] | Preview | Remediates code vulnerabilities individually or in bulk |
| [Test Optimization][4] | Preview | Provides code fixes for flaky tests and verifies that tests remain stable |
| [Continuous Profiler][3] | Preview | Provides code changes for [Automated Analysis][10] insights |
| [Containers][12] | Preview | Provides code changes for Container Recommendations |
| [Containers][12] | Preview | Provides code changes for [Kubernetes Remediations][13] |

**Note**: Enabling Bits AI Dev Agent is product-specific. Even if it's active for one Datadog product, it must be separately enabled for each additional product you use.

@@ -119,3 +119,4 @@ To enable Bits AI Dev Agent, see [Setup][6].
[10]: /profiler/automated_analysis/
[11]: /tracing/trace_explorer/
[12]: /containers/
[13]: /containers/bits_ai_kubernetes_remediation
59 changes: 59 additions & 0 deletions content/en/containers/bits_ai_kubernetes_remediation.md
@@ -0,0 +1,59 @@
---
title: Bits AI Kubernetes Remediation
description: Discover and automatically remediate Kubernetes errors with Bits AI Kubernetes Remediation
further_reading:
- link: 'https://www.datadoghq.com/blog/kubernetes-active-remediation-ai/'
tag: 'blog'
text: 'Accelerate Kubernetes issue resolution with AI-powered guided remediation'
---

Bits AI Kubernetes Remediation analyzes and fixes Kubernetes errors in your infrastructure.

The following Kubernetes errors are supported:
- `CrashLoopBackOff`
- `ErrImagePull`
- `ImagePullBackOff`
- `OOMKilled`

I know these are alphabetical, but I would lead with the four issue types below if possible, as they're the most common.

- `CreateContainerError`
- `CreateContainerConfigError`
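
As a rough illustration of where these error names appear, the following is a trimmed sketch of a pod's `status` block, similar to what `kubectl get pod <POD_NAME> -o yaml` returns. The container name and restart count are hypothetical. Which fields Bits AI Kubernetes Remediation actually reads is not described here, so treat this only as a guide to recognizing the errors yourself.

```yaml
# Sketch of a pod status fragment (illustrative values only)
status:
  containerStatuses:
    - name: example-app          # hypothetical container name
      restartCount: 7            # hypothetical value
      state:
        waiting:
          # Waiting reasons include CrashLoopBackOff, ErrImagePull,
          # ImagePullBackOff, CreateContainerError, and CreateContainerConfigError
          reason: CrashLoopBackOff
      lastState:
        terminated:
          # OOMKilled is reported as a termination reason, not a waiting reason
          reason: OOMKilled
          exitCode: 137
```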

## Usage

You can launch Bits AI Kubernetes Remediation from multiple locations within Datadog:
- **From a Kubernetes monitor**: In the _Troubleshooting_ section, select a workload under _Problematic Workloads_.
- **From [Kubernetes Explorer][2]**: Hover over an error to see more information about the alert and the affected workload(s), and click _Start Remediation_.
- **From the [Kubernetes Remediation][1] tab**: Select a workload from the list.

Any one of these actions opens a Remediation side panel that displays:

- An AI-powered explanation of the root cause, based on collected telemetry and known patterns
- Recommended next steps, which you may be able to [perform directly from Datadog](#remediate-from-datadog)
- Related information, such as recent deployments, error logs, and Kubernetes events

{{< img src="containers/remediation/side_panel2.png" alt="Remediation side panel opened for a workload with a CrashLoopBackOff error. Displays a What Happened section with a Bits AI-powered explanation of the error's root cause. Below, a Recommended Next Steps section where the user can inspect the workload manifest. Step-by-step instructions for a suggested fix are also displayed." style="width:80%;" >}}

### Remediate from Datadog

If your repositories are [connected to Datadog][4], and an error can be fixed by changing code in one of these connected repositories, then you can use Bits AI to perform the remediation action directly from Datadog. For other problem scenarios, Bits AI provides a detailed list of remediation steps to follow.

{{% collapse-content title="Example: Increasing memory limit for a deployment" level="h4" expanded=true id="example-pr" %}}

{{< img src="containers/remediation/bitsai_action2.mp4" alt="In a Remediation side panel, the Recommended Next Steps section suggests that the user 'Increase memory limit'. The user enters a new value for the memory limit, increasing it from 10 mebibytes to 20 mebibytes. Clicking Fix with Bits AI brings up a dialog that prompts the user to select a connected repository." video="true" style="width:80%;" >}}

When a pod is terminated because a container's memory usage exceeded its limit (`OOMKilled`), you may be able to fix the error by increasing that container's memory limit.

1. Click **Edit Memory Limit**.
2. Adjust your limit so that it is higher than what your container normally uses.
3. Click **Fix with Bits AI**.
4. On the next page, select the repository where your deployment is defined, and review the proposed changes. Click **Fix with Bits** to create a pull request.
5. You are redirected to a Bits [Code Session][3], where you can verify that the Bits AI Dev Agent identified the specific configuration file where your memory limits are defined. Click **View Pull Request** to view the pull request in GitHub.
{{% /collapse-content %}}
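
For reference, the change proposed in the example above might resemble the following Deployment manifest. This is a minimal sketch, not the exact output of Bits AI: the Deployment name, labels, image, and memory request are hypothetical, and only the increase from `10Mi` to `20Mi` comes from the example.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app                # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
        - name: example-app
          image: registry.example.com/example-app:1.0.0   # hypothetical image
          resources:
            requests:
              memory: "10Mi"       # hypothetical request
            limits:
              memory: "20Mi"       # raised from 10Mi, as in the example
```

Choose a limit that leaves headroom above the container's normal memory usage. A limit set too close to typical usage can cause the container to be terminated again.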

## Further reading

{{< partial name="whats-next/whats-next.html" >}}

[1]: https://app.datadoghq.com/orchestration/remediation
[2]: https://app.datadoghq.com/orchestration/explorer/pod
[3]: https://app.datadoghq.com/code?tab=my-sessions
[4]: https://docs.datadoghq.com/integrations/guide/source-code-integration/?tab=githubsaasonprem#connect-your-git-repositories-to-datadog
Binary file not shown.