diff --git a/content/en/docs/concepts/traffic-management/index.md b/content/en/docs/concepts/traffic-management/index.md
index 85953ae3a2fc4..c3bc57e042e49 100644
--- a/content/en/docs/concepts/traffic-management/index.md
+++ b/content/en/docs/concepts/traffic-management/index.md
@@ -742,6 +742,12 @@ error conditions. Using fault injection can be particularly useful to ensure
 that your failure recovery policies aren’t incompatible or too restrictive,
 potentially resulting in critical services being unavailable.
 
+{{< warning >}}
+Currently, fault injection configuration cannot be combined with retry or timeout configuration
+on the same virtual service. See
+[Traffic Management Problems](/docs/ops/common-problems/network-issues/#virtual-service-with-fault-injection-and-retry-timeout-policies-not-working-as-expected).
+{{< /warning >}}
+
 Unlike other mechanisms for introducing errors such as delaying packets or
 killing pods at the network layer, Istio’ lets you inject faults at the
 application layer. This lets you inject more relevant failures, such as HTTP
diff --git a/content/en/docs/ops/common-problems/network-issues/index.md b/content/en/docs/ops/common-problems/network-issues/index.md
index cede0e249d710..6c4916092696a 100644
--- a/content/en/docs/ops/common-problems/network-issues/index.md
+++ b/content/en/docs/ops/common-problems/network-issues/index.md
@@ -223,6 +223,79 @@ server {
 }
 {{< /text >}}
 
+## Virtual service with fault injection and retry/timeout policies not working as expected
+
+Currently, Istio does not support configuring fault injection and retry or timeout policies on the
+same `VirtualService`. Consider the following configuration:
+
+{{< text yaml >}}
+apiVersion: networking.istio.io/v1alpha3
+kind: VirtualService
+metadata:
+  name: helloworld
+spec:
+  hosts:
+  - "*"
+  gateways:
+  - helloworld-gateway
+  http:
+  - match:
+    - uri:
+        exact: /hello
+    fault:
+      abort:
+        httpStatus: 500
+        percentage:
+          value: 50
+    retries:
+      attempts: 5
+      retryOn: 5xx
+    route:
+    - destination:
+        host: helloworld
+        port:
+          number: 5000
+{{< /text >}}
+
+You would expect that, given the five configured retry attempts, the user would almost never see any
+errors when calling the `helloworld` service. However, since both fault and retries are configured on
+the same `VirtualService`, the retry configuration does not take effect, resulting in a 50% failure
+rate. To work around this issue, you can remove the fault configuration from your `VirtualService` and
+inject the fault into the upstream Envoy proxy using an `EnvoyFilter` instead:
+
+{{< text yaml >}}
+apiVersion: networking.istio.io/v1alpha3
+kind: EnvoyFilter
+metadata:
+  name: hello-world-filter
+spec:
+  workloadSelector:
+    labels:
+      app: helloworld
+  configPatches:
+  - applyTo: HTTP_FILTER
+    match:
+      context: SIDECAR_INBOUND # matches inbound listeners of the selected workloads' sidecars
+      listener:
+        filterChain:
+          filter:
+            name: "envoy.filters.network.http_connection_manager"
+    patch:
+      operation: INSERT_BEFORE
+      value:
+        name: envoy.fault
+        typed_config:
+          "@type": "type.googleapis.com/envoy.extensions.filters.http.fault.v3.HTTPFault"
+          abort:
+            http_status: 500
+            percentage:
+              numerator: 50
+              denominator: HUNDRED
+{{< /text >}}
+
+This works because the retry policy is then configured on the client proxy, while the fault
+injection is configured on the upstream proxy.
+
 ## TLS configuration mistakes
 
 Many traffic management problems