From 0c7ecd9d08315bc3bc94930bf11cb0d0b47e4d55 Mon Sep 17 00:00:00 2001
From: Arko Dasgupta
Date: Wed, 6 Nov 2024 06:28:25 -0800
Subject: [PATCH] docs: Active Passive Failover (#4637)

Fixes: https://github.com/envoyproxy/gateway/issues/4501

Signed-off-by: Arko Dasgupta
---
 .../en/latest/tasks/traffic/failover.md | 566 ++++++++++++++++++
 1 file changed, 566 insertions(+)
 create mode 100644 site/content/en/latest/tasks/traffic/failover.md

diff --git a/site/content/en/latest/tasks/traffic/failover.md b/site/content/en/latest/tasks/traffic/failover.md
new file mode 100644
index 00000000000..625d5e2afcd
--- /dev/null
+++ b/site/content/en/latest/tasks/traffic/failover.md
@@ -0,0 +1,566 @@
+---
+title: Failover
+---
+
+Active-passive failover in an API gateway setup is like having a backup plan in place to keep things
+running smoothly if something goes wrong. Here’s why it’s valuable:
+
+* Staying Online: When the main (or "active") backend has issues or goes offline,
+the fallback (or "passive") backend is ready to step in.
+This helps keep your API accessible and your services running, so users see little or no interruption.
+
+* Automatic Switch Over: If a problem occurs, the system can automatically switch traffic over to the fallback backend.
+This avoids needing someone to jump in and fix things manually, which could take time and might even lead to mistakes.
+
+* Lower Costs: In an active-passive setup, the fallback backend doesn’t need to work all the time; it’s just on standby.
+This can save on costs (like cloud egress costs) compared to setups where both backends are running at full capacity.
+
+* Peace of Mind with Redundancy: Although the fallback backend isn’t handling traffic daily, it's there as a safety net.
+If something happens with the primary backend, the backup can take over immediately, ensuring your service doesn’t skip a beat.
+
+## Prerequisites
+
+{{< boilerplate prerequisites >}}
+
+## Test
+
+* We'll first create two services & deployments, called `active` and `passive`, representing an `active` and `passive` backend application.
+
+{{< tabpane text=true >}}
+{{% tab header="Apply from stdin" %}}
+
+```shell
+cat <}}
+
+
+* Follow the instructions [here](./../../tasks/traffic/backend/#enable-backend) to enable the Backend API.
+
+* Create two Backend resources that represent the `active` backend and the `passive` backend.
+Note that we've set `fallback: true` for the `passive` backend to indicate it's a passive backend.
+
+
+{{< tabpane text=true >}}
+{{% tab header="Apply from stdin" %}}
+
+```shell
+cat <}}
+
+* Let's create an HTTPRoute that can route to both of these backends.
+
+{{< tabpane text=true >}}
+{{% tab header="Apply from stdin" %}}
+
+```shell
+cat <}}
+
+* Let's configure a `BackendTrafficPolicy` with a passive health check setting to detect transient errors.
+
+{{< tabpane text=true >}}
+{{% tab header="Apply from stdin" %}}
+
+```shell
+cat <}}
+
+
+
+* Let's send 10 requests. You should see that they all go to the `active` backend.
+
+```shell
+for i in {1..10}; do curl --verbose --header "Host: www.example.com" http://$GATEWAY_HOST/test 2>/dev/null | jq .pod; done
+```
+
+```console
+"active-5bb896774f-lz8s9"
+"active-5bb896774f-lz8s9"
+"active-5bb896774f-lz8s9"
+"active-5bb896774f-lz8s9"
+"active-5bb896774f-lz8s9"
+"active-5bb896774f-lz8s9"
+"active-5bb896774f-lz8s9"
+"active-5bb896774f-lz8s9"
+"active-5bb896774f-lz8s9"
+"active-5bb896774f-lz8s9"
+```
+
+* Let's simulate a failure in the `active` backend by changing the server listening port to `5000`.
+
+{{< tabpane text=true >}}
+{{% tab header="Apply from stdin" %}}
+
+```shell
+cat <}}
+
+* Let's send 10 requests again.
+You should see them all being sent to the `passive` backend.
+
+```shell
+for i in {1..10}; do curl --verbose --header "Host: www.example.com" http://$GATEWAY_HOST/test 2>/dev/null | jq .pod; done
+```
+
+```console
+parse error: Invalid numeric literal at line 1, column 9
+"passive-7ddbf945c9-wkc4f"
+"passive-7ddbf945c9-wkc4f"
+"passive-7ddbf945c9-wkc4f"
+"passive-7ddbf945c9-wkc4f"
+"passive-7ddbf945c9-wkc4f"
+"passive-7ddbf945c9-wkc4f"
+"passive-7ddbf945c9-wkc4f"
+"passive-7ddbf945c9-wkc4f"
+"passive-7ddbf945c9-wkc4f"
+```
+
+The first request is still sent to the `active` backend before the passive health check ejects it, so `jq` fails to parse the non-JSON error response.
+This first error can be avoided by configuring [retries](./../../tasks/traffic/retry.md).
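
When comparing runs before and after the simulated failure, it can help to count responses per backend rather than reading the pod names by eye. The following is a minimal sketch, not part of Envoy Gateway: `tally_backends` is a hypothetical helper, and it assumes pod names begin with the deployment name (`active-` or `passive-`), as in the sample output above. The sample input is captured output, so no live gateway is needed to try it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count responses per backend from `jq .pod` output.
# Assumes pod names start with "active-" or "passive-"; other lines are ignored.
tally_backends() {
  tr -d '"' |
    awk -F- '/^(active|passive)-/ { count[$1]++ }
             END { for (b in count) print b, count[b] }' |
    sort
}

# Try it on captured output instead of a live gateway.
printf '%s\n' \
  '"active-5bb896774f-lz8s9"' \
  '"passive-7ddbf945c9-wkc4f"' \
  '"passive-7ddbf945c9-wkc4f"' | tally_backends

# Against a running gateway, the request loop could be piped in instead:
#   for i in {1..10}; do
#     curl --silent --header "Host: www.example.com" "http://${GATEWAY_HOST}/test" | jq .pod
#   done 2>/dev/null | tally_backends
```

Lines that do not look like a pod name are skipped, so a failed request lowers the total count instead of breaking the summary.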