Add docs about configuring faults and retry/timeout on the same VirtualService

Signed-off-by: Dennis Effing <dennis.effing@codecentric.de>
istio · Jan 13, 2022 · 001d303
1 parent d4cd6b2
commit 001d303
Showing 2 changed files with 79 additions and 0 deletions.
diff --git a/content/en/docs/concepts/traffic-management/index.md b/content/en/docs/concepts/traffic-management/index.md
@@ -742,6 +742,12 @@ error conditions. Using fault injection can be particularly useful to ensure
 that your failure recovery policies aren’t incompatible or too restrictive,
 potentially resulting in critical services being unavailable.
 
+{{< warning >}}
+Currently, fault injection configuration cannot be combined with retry or timeout configuration
+on the same virtual service. For details, see
+[Traffic Management Problems](/docs/ops/common-problems/network-issues/#virtual-service-with-fault-injection-and-retry-timeout-policies-not-working-as-expected).
+{{< /warning >}}
+
 Unlike other mechanisms for introducing errors such as delaying packets or
 killing pods at the network layer, Istio’ lets you inject faults at the
 application layer. This lets you inject more relevant failures, such as HTTP

diff --git a/content/en/docs/ops/common-problems/network-issues/index.md b/content/en/docs/ops/common-problems/network-issues/index.md
@@ -223,6 +223,79 @@ server {
 }
 {{< /text >}}
 
+## Virtual service with fault injection and retry/timeout policies not working as expected
+
+Currently, Istio does not support configuring fault injection and retry or timeout policies on the
+same `VirtualService`. Consider the following configuration:
+
+{{< text yaml >}}
+apiVersion: networking.istio.io/v1alpha3
+kind: VirtualService
+metadata:
+  name: helloworld
+spec:
+  hosts:
+  - "*"
+  gateways:
+  - helloworld-gateway
+  http:
+  - match:
+    - uri:
+        exact: /hello
+    fault:
+      abort:
+        httpStatus: 500
+        percentage:
+          value: 50 # abort 50% of matching requests
+    retries:
+      attempts: 5 # retry a failed request up to five times
+      retryOn: 5xx
+    route:
+    - destination:
+        host: helloworld
+        port:
+          number: 5000
+{{< /text >}}
+
+You might expect that, with five retry attempts configured, the user would almost never see an
+error when calling the `helloworld` service. However, because both the fault and the retries are
+configured on the same `VirtualService`, the retry configuration does not take effect, resulting in
+a 50% failure rate. To work around this issue, remove the fault configuration from your
+`VirtualService` and inject the fault into the upstream Envoy proxy using an `EnvoyFilter` instead:
+
+{{< text yaml >}}
+apiVersion: networking.istio.io/v1alpha3
+kind: EnvoyFilter
+metadata:
+  name: hello-world-filter
+spec:
+  workloadSelector:
+    labels:
+      app: helloworld
+  configPatches:
+    - applyTo: HTTP_FILTER
+      match:
+        context: SIDECAR_INBOUND # matches inbound listeners of the sidecars selected by workloadSelector
+        listener:
+          filterChain:
+            filter:
+              name: "envoy.filters.network.http_connection_manager"
+      patch:
+        operation: INSERT_BEFORE
+        value:
+          name: envoy.filters.http.fault
+          typed_config:
+            "@type": "type.googleapis.com/envoy.extensions.filters.http.fault.v3.HTTPFault"
+            abort:
+              http_status: 500
+              percentage:
+                numerator: 50
+                denominator: HUNDRED
+{{< /text >}}
+
+This works because the retry policy is now applied by the client proxy, while the fault is
+injected by the upstream proxy, so a request aborted upstream is retried by the client proxy.
+
 ## TLS configuration mistakes
 
 Many traffic management problems