Add troubleshooting for allocation gRPC request #1878

Merged (2 commits) on Nov 3, 2020
112 changes: 87 additions & 25 deletions site/content/en/docs/Advanced/allocator-service.md
@@ -6,7 +6,7 @@ description: >
Agones provides an mTLS based allocator service that is accessible from outside the cluster using a load balancer. The service is deployed and scales independently of the Agones controller.
---

To allocate a game server, Agones in addition to {{< ghlink href="pkg/apis/allocation/v1/gameserverallocation.go" >}}GameServerAllocations{{< /ghlink >}}, provides a gRPC service with mTLS authentication, called agones-allocator, which is on {{< ghlink href="proto/allocation" >}}stable version{{< /ghlink >}}, starting on agones v1.6.
In addition to {{< ghlink href="pkg/apis/allocation/v1/gameserverallocation.go" >}}GameServerAllocations{{< /ghlink >}}, Agones provides a gRPC service with mTLS authentication, called `agones-allocator`, for allocating game servers.

The gRPC service is accessible through a Kubernetes service that is externalized using a load balancer. For the gRPC request to succeed, a client certificate must be provided that is in the authorization list of the allocator service.

@@ -30,9 +30,16 @@ agones-allocator LoadBalancer 10.55.251.73 <b>34.82.195.204</b>

## Server TLS certificate

If the `agones-allocator` service is installed as a `LoadBalancer` [using a static IP]({{< relref "/docs/Installation/Install Agones/helm.md#reserved-allocator-load-balancer-ip" >}}), a valid self-signed server TLS certificate is generated using the IP provided. Otherwise, the server TLS certificate should be replaced.
If the `agones-allocator` service is installed as a `LoadBalancer` [using a reserved IP]({{< relref "/docs/Installation/Install Agones/helm.md#reserved-allocator-load-balancer-ip" >}}), a valid self-signed server TLS certificate is generated using the IP provided. Otherwise, the server TLS certificate should be replaced. If you installed Agones using [helm]({{< relref "/docs/Installation/Install Agones/helm.md" >}}), you can reconfigure the allocator service with a preset IP address by setting the `agones.allocator.http.loadBalancerIP` parameter to the address that was automatically assigned to the service and running `helm upgrade`:

Replace the default server TLS certificate with a certificate with CN and subjectAltName. There are multiple approaches to generate a certificate. Agones recommends using [cert-manager.io](https://cert-manager.io/) solution for cluster level certificate management.
```bash
EXTERNAL_IP=$(kubectl get services agones-allocator -n agones-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
helm upgrade --install --wait \
--set agones.allocator.http.loadBalancerIP=${EXTERNAL_IP} \
...
```
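
The lookup above may return an empty value while the load balancer is still waiting for an address, or when the provider exposes a hostname instead of an IP. As a minimal sketch, the check below validates the value before it is passed to `helm upgrade`. The `is_ipv4` helper is hypothetical and not part of Agones or Helm, and the sample address is the one shown in the service listing earlier.

```bash
#!/bin/bash

# Hypothetical helper (not part of Agones): succeeds only for a dotted-quad IPv4 address.
is_ipv4() {
  # Reject empty input, anything other than digits and dots, and a trailing dot.
  case "$1" in
    ''|*[!0-9.]*|*.) return 1 ;;
  esac
  # Split on dots: a valid address has exactly four octets.
  old_ifs=$IFS
  IFS=.
  set -- $1
  IFS=$old_ifs
  [ "$#" -eq 4 ] || return 1
  for octet in "$@"; do
    # Each octet must be one to three digits and no larger than 255.
    case "$octet" in
      ''|????*) return 1 ;;
    esac
    [ "$octet" -le 255 ] || return 1
  done
}

# Sample value; in practice this would be the EXTERNAL_IP read with kubectl.
EXTERNAL_IP="34.82.195.204"
if is_ipv4 "${EXTERNAL_IP}"; then
  echo "using ${EXTERNAL_IP} as the load balancer IP"
else
  echo "no IPv4 address assigned to agones-allocator yet" >&2
fi
```

Running the check first avoids upgrading the release with an empty `loadBalancerIP` value.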

Another approach is to replace the default server TLS certificate with one that has the CN and subjectAltName set. There are multiple ways to generate such a certificate. Agones recommends the [cert-manager.io](https://cert-manager.io/) solution for cluster-level certificate management.

In order to use the cert-manager solution, first [install cert-manager](https://cert-manager.io/docs/installation/kubernetes/) on the cluster.
Then, [configure](https://cert-manager.io/docs/configuration/) an `Issuer`/`ClusterIssuer` resource and
@@ -53,10 +60,10 @@ spec:
selfSigned: {}
EOF

EXTERNAL_IP=`kubectl get services agones-allocator -n agones-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}'`
EXTERNAL_IP=$(kubectl get services agones-allocator -n agones-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

# for EKS use hostname
# HOST_NAME=`kubectl get services agones-allocator -n agones-system -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'`
# HOST_NAME=$(kubectl get services agones-allocator -n agones-system -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')

# Create a Certificate with IP for the allocator-tls secret
cat <<EOF | kubectl apply -f -
@@ -75,54 +82,56 @@ spec:
kind: ClusterIssuer
EOF

# Optional: Store the secret ca.crt in a file to be used by the client for the server authentication
TLS_CA_FILE=ca.crt
TLS_CA_VALUE=`kubectl get secret allocator-tls -n agones-system -ojsonpath='{.data.ca\.crt}'`
echo ${TLS_CA_VALUE} | base64 -d > ${TLS_CA_FILE}

# In case of MacOS
# echo ${TLS_CA_VALUE} | base64 -D > ${TLS_CA_FILE}

# Add ca.crt to the allocator-tls-ca Secret
TLS_CA_VALUE=$(kubectl get secret allocator-tls -n agones-system -ojsonpath='{.data.ca\.crt}')
kubectl get secret allocator-tls-ca -o json -n agones-system | jq '.data["tls-ca.crt"]="'${TLS_CA_VALUE}'"' | kubectl apply -f -
```
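
The script above reads the `ip` field of the load balancer and notes that EKS exposes a `hostname` instead. A minimal sketch of choosing whichever one is populated could look like the following. `pick_endpoint` is a hypothetical helper, and which field of the Certificate receives the result is outside the scope of this sketch.

```bash
#!/bin/bash

# Hypothetical helper (not part of Agones): print the IP if one is set,
# otherwise the hostname, and fail if neither is available.
pick_endpoint() {
  ip=$1
  hostname=$2
  if [ -n "$ip" ]; then
    printf '%s\n' "$ip"
  elif [ -n "$hostname" ]; then
    printf '%s\n' "$hostname"
  else
    echo "load balancer has neither an IP nor a hostname yet" >&2
    return 1
  fi
}

# In practice both arguments would come from the two kubectl lookups shown above:
#   pick_endpoint "${EXTERNAL_IP}" "${HOST_NAME}"
pick_endpoint "34.82.195.204" ""   # prints 34.82.195.204
```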

## Client Certificate

Because agones-allocator uses an mTLS authentication mechanism, client must provide a certificate that is accepted by the server. Here is an example of generating a client certificate. For the agones-allocator service to accept the newly generate client certificate, the generated client certificate CA or public portion of the certificate must be added to a kubernetes secret called `allocator-client-ca`.
Because agones-allocator uses an mTLS authentication mechanism, a client must provide a certificate that is accepted by the server.

If Agones is installed using Helm, you can leverage a default client secret, `allocator-client.default`, created in the game server namespace and allowlisted in the `allocator-client-ca` Kubernetes secret. You can extract that secret and use it for client-side authentication by following [the allocation example]({{< relref "#send-allocation-request" >}}).

Otherwise, here is an example of generating a client certificate using openssl.

```bash
#!/bin/bash

KEY_FILE=client.key
CERT_FILE=client.crt

openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout ${KEY_FILE} -out ${CERT_FILE}
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout client.key -out client.crt

CERT_FILE_VALUE=`cat ${CERT_FILE} | base64 -w 0`
# Falls back to client.crt (the file written by openssl above) if CERT_FILE is not set
CERT_FILE_VALUE=$(base64 -w 0 < "${CERT_FILE:-client.crt}")

# In case of MacOS
# CERT_FILE_VALUE=`cat ${CERT_FILE} | base64`
# CERT_FILE_VALUE=$(cat ${CERT_FILE} | base64)

# white-list client certificate
# allowlist client certificate
kubectl get secret allocator-client-ca -o json -n agones-system | jq '.data["client_trial.crt"]="'${CERT_FILE_VALUE}'"' | kubectl apply -f -
```

The last command creates a new entry in the secret data map called `client_trial.crt` for `allocator-client-ca` and stores it. You can also achieve this by `kubectl edit secret allocator-client-ca -n agones-system`, and then add the entry.
The last command adds a new entry, `client_trial.crt`, to the data map of the `allocator-client-ca` secret. The `agones-allocator` service accepts client certificates that are allowlisted in that secret, so this entry is what lets it accept the newly generated client certificate.

## Send allocation request

Now the service is ready to accept requests from the client with the generated certificates. Create a [fleet]({{< ref "/docs/Getting Started/create-fleet.md" >}}) and send a gRPC request to agones-allocator. To start, take a look at the allocation gRPC client examples in {{< ghlink href="examples/allocator-client/main.go" >}}golang{{< /ghlink >}} and {{< ghlink href="examples/allocator-client-csharp/Program.cs" >}}C#{{< /ghlink >}} languages. In the following, the {{< ghlink href="examples/allocator-client/main.go" >}}golang gRPC client example{{< /ghlink >}} is used to allocate a Game Server in the default namespace.
After `agones-allocator` is set up with a server certificate and the client certificate is allowlisted, the service can be used to allocate game servers. To start, take a look at the allocation gRPC client examples in {{< ghlink href="examples/allocator-client/main.go" >}}golang{{< /ghlink >}} and {{< ghlink href="examples/allocator-client-csharp/Program.cs" >}}C#{{< /ghlink >}}. In the following, the {{< ghlink href="examples/allocator-client/main.go" >}}golang gRPC client example{{< /ghlink >}} is used to allocate a game server in the `default` namespace.

Make sure you have a [fleet]({{< ref "/docs/Getting Started/create-fleet.md" >}}) with ready game servers in the game server namespace. Then run the following script.

```bash
#!/bin/bash

NAMESPACE=default # replace with any namespace
EXTERNAL_IP=`kubectl get services agones-allocator -n agones-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}'`
EXTERNAL_IP=$(kubectl get services agones-allocator -n agones-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
KEY_FILE=client.key
CERT_FILE=client.crt
TLS_CA_FILE=ca.crt

# allocator-client.default secret is created only when using helm installation. Otherwise generate the client certificate and replace the following.
# In case of MacOS replace "base64 -d" with "base64 -D"
kubectl get secret allocator-client.default -n "${NAMESPACE}" -ojsonpath="{.data.tls\.crt}" | base64 -d > "${CERT_FILE}"
kubectl get secret allocator-client.default -n "${NAMESPACE}" -ojsonpath="{.data.tls\.key}" | base64 -d > "${KEY_FILE}"
kubectl get secret allocator-tls-ca -n agones-system -ojsonpath="{.data.tls-ca\.crt}" | base64 -d > "${TLS_CA_FILE}"

go run examples/allocator-client/main.go --ip ${EXTERNAL_IP} \
--port 443 \
--namespace ${NAMESPACE} \
@@ -131,4 +140,57 @@ go run examples/allocator-client/main.go --ip ${EXTERNAL_IP} \
--cacert ${TLS_CA_FILE}
```
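
The script above asks macOS users to swap `base64 -d` for `base64 -D` by hand. As a rough sketch, the flag can instead be detected once and wrapped in a function. `b64decode` and `B64_DECODE_FLAG` are hypothetical names introduced here for illustration.

```bash
#!/bin/bash

# Probe which decode flag the local base64 accepts ("dGVzdA==" is "test").
if printf 'dGVzdA==' | base64 -d >/dev/null 2>&1; then
  B64_DECODE_FLAG="-d"
else
  B64_DECODE_FLAG="-D"
fi

# Hypothetical wrapper: decode base64 from stdin using the detected flag.
b64decode() {
  base64 "${B64_DECODE_FLAG}"
}

# Self-check with a literal value; a secret would be piped in the same way.
printf 'aGVsbG8=' | b64decode   # prints "hello"
```

With this in place, each `| base64 -d >` in the script could become `| b64decode >` without any platform-specific edits.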

If your matchmaker is external to the cluster on which your game servers are hosted, the `agones-allocator` provides the gRPC API to allocate game services using mTLS authentication, which can scale independently to the Agones controller.
## Secrets Explained

`agones-allocator` has a dependency on three Kubernetes secrets:

1. `allocator-tls` - stores the server certificate.
2. `allocator-client-ca` - stores the client CA certificates (or the public portion of the client certificates) that `agones-allocator` trusts for mTLS; adding an entry allowlists the matching client certificates.
3. `allocator-tls-ca` (optional) - stores the CA of the `allocator-tls` certificate.

The CA is kept in a separate secret from the server's private key for security reasons: the allocation client needs the allocator CA to validate the server, and it can retrieve that CA from `allocator-tls-ca` without reading the private `allocator-tls` secret. Setting or maintaining the `allocator-tls-ca` secret is optional.

## Troubleshooting

If you encounter problems, explore the following potential root causes:

1. Check server certificate - Using openssl you can get the certificate chain for the server.

```bash
EXTERNAL_IP=$(kubectl get services agones-allocator -n agones-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
openssl s_client -connect ${EXTERNAL_IP}:443
```

- Inspect the server certificate by saving the certificate returned under `Server certificate` to a file (for example `tls.crt`) and examining it with `openssl x509 -in tls.crt -text -noout`.
- Make sure the certificate has not expired and that the Subject Alternative Name is set.
- If the issuer is `CN = allocation-ca`, the certificate was generated by the Agones helm installation.

2. Check client certificate

- If you get an error such as `rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed`, make sure your client certificate is allowlisted by checking that it has been added to `allocator-client-ca`.

```bash
kubectl get secret allocator-client-ca -o json -n agones-system
```

- If the server certificate is not accepted by the client, you may get an error such as `rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority"`, depending on the client. In this case, verify that the TLS CA file matches the server certificate.

```bash
kubectl get secret allocator-tls -n agones-system -ojsonpath="{.data.tls\.crt}" | base64 -d > tls.crt
openssl verify -verbose -CAfile ca.crt tls.crt
tls.crt: OK
```

3. Make sure the service is up and running.

```bash
kubectl get pod -n agones-system | grep agones-allocator
agones-allocator-59b4f6b5c6-86j62 1/1 Running 0 6m36s
agones-allocator-59b4f6b5c6-kbqrq 1/1 Running 0 6m45s
agones-allocator-59b4f6b5c6-trbkl 1/1 Running 0 6m28s
```

```bash
kubectl get service agones-allocator -n agones-system
agones-allocator LoadBalancer 10.55.248.14 34.82.195.204 443:32468/TCP 6d23h
```
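
The expiry check in step 1 can also be scripted. The sketch below assumes the server certificate has been saved as `tls.crt`, as in step 2, and uses the `-checkend` option of `openssl x509`. `cert_valid_for` is a hypothetical helper name, and the 30 day window is an arbitrary example.

```bash
#!/bin/bash

# Hypothetical helper: succeeds if the PEM certificate in $1 can be parsed
# and does not expire within the next $2 seconds.
cert_valid_for() {
  openssl x509 -in "$1" -noout -checkend "$2" >/dev/null 2>&1
}

# Warn if tls.crt expires within 30 days; skipped when the file is absent.
if [ -f tls.crt ] && ! cert_valid_for tls.crt $((30 * 24 * 3600)); then
  echo "tls.crt expires within 30 days or could not be parsed" >&2
fi
```

Passing `0` as the second argument checks only whether the certificate has already expired.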
43 changes: 38 additions & 5 deletions site/content/en/docs/Advanced/multi-cluster-allocation.md
@@ -11,7 +11,7 @@ This feature is in a pre-release state and might change.
{{< /alert >}}
{{% /feature %}}

There may be different types of clusters, such as on-premise, and Google Kubernetes Engine (GKE), used by a game to help with the cost-saving and availability.
There may be different types of clusters, such as on-premise, and Google Kubernetes Engine (GKE), used by a game to help with the cost-saving and availability.
For this purpose, Agones provides a mechanism to define priorities on the clusters. Priorities are defined on {{< ghlink href="pkg/apis/multicluster/v1/gameserverallocationpolicy.go" >}}GameServerAllocationPolicy{{< /ghlink >}} agones CRD. A matchmaker can enable the multi-cluster rules on a request and target [agones-allocator]({{< relref "allocator-service.md">}}) endpoint in any of the clusters and get resources allocated on the cluster with the highest priority. If the cluster with the highest priority is overloaded, the allocation request is redirected to the cluster with the next highest priority.

The remainder of this article describes how to enable multi-cluster allocation.
@@ -97,16 +97,49 @@ EOF

To enable multi-cluster allocation, set `multiClusterSetting.enabled` to `true` in {{< ghlink href="proto/allocation/allocation.proto" >}}allocation.proto{{< /ghlink >}} and send allocation requests. For more information visit [agones-allocator]({{< relref "allocator-service.md">}}). In the following, using {{< ghlink href="examples/allocator-client/main.go" >}}allocator-client sample{{< /ghlink >}}, a multi-cluster allocation request is sent to the agones-allocator service.

Follow [agones-allocator]({{< relref "allocator-service.md#send-allocation-request">}}) to set the environment variables.

```bash
#!/bin/bash
EXTERNAL_IP=`kubectl get services agones-allocator -n agones-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}'`

NAMESPACE=default # replace with any namespace

go run examples/allocator-client/main.go --ip ${EXTERNAL_IP} \
--namespace ${NAMESPACE} \
--key ${KEY_FILE} \
--cert ${CERT_FILE} \
--cacert ${TLS_CA_FILE} \
--multicluster true
```
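
The command above relies on variables that are set in another section, so a missing one only surfaces as a confusing client error. A minimal sketch of failing early is shown below. `require_vars` is a hypothetical helper, not something provided by Agones or the example client.

```bash
#!/bin/bash

# Hypothetical helper: report every named variable that is unset or empty.
# Returns non-zero if at least one is missing.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "required variable ${name} is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage, placed before the go run command:
#   require_vars EXTERNAL_IP NAMESPACE KEY_FILE CERT_FILE TLS_CA_FILE || exit 1
```

The helper uses `eval` only on variable names written in the script itself, so it should not be fed untrusted input.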

## Troubleshooting

If you encounter problems, explore the following potential root causes:

1. Make sure single cluster allocation works for each cluster using [this troubleshooting]({{< relref "allocator-service.md#troubleshooting">}}).

2. For each cluster, make sure there is a `GameServerAllocationPolicy` resource defined in the game server cluster.

3. Inspect the `.spec.connectionInfo` field of the `GameServerAllocationPolicy` for each cluster, and use the connection information in that field to verify that single cluster allocation works:

```bash
POLICY_NAME=<policy-name>
POLICY_NAMESPACE=<policy-namespace>

NAMESPACE=$(kubectl get gameserverallocationpolicy ${POLICY_NAME} -n ${POLICY_NAMESPACE} -ojsonpath={.spec.connectionInfo.namespace})
EXTERNAL_IP=$(kubectl get gameserverallocationpolicy ${POLICY_NAME} -n ${POLICY_NAMESPACE} -ojsonpath={.spec.connectionInfo.allocationEndpoints\[0\]})
CLIENT_SECRET_NAME=$(kubectl get gameserverallocationpolicy ${POLICY_NAME} -n ${POLICY_NAMESPACE} -ojsonpath={.spec.connectionInfo.secretName})

KEY_FILE=client.key
CERT_FILE=client.crt
TLS_CA_FILE=ca.crt

# In case of MacOS replace "base64 -d" with "base64 -D"
kubectl get secret "${CLIENT_SECRET_NAME}" -n "${POLICY_NAMESPACE}" -ojsonpath="{.data.tls\.crt}" | base64 -d > "${CERT_FILE}"
kubectl get secret "${CLIENT_SECRET_NAME}" -n "${POLICY_NAMESPACE}" -ojsonpath="{.data.tls\.key}" | base64 -d > "${KEY_FILE}"
kubectl get secret "${CLIENT_SECRET_NAME}" -n "${POLICY_NAMESPACE}" -ojsonpath="{.data.ca\.crt}" | base64 -d > "${TLS_CA_FILE}"

go run examples/allocator-client/main.go --ip ${EXTERNAL_IP} \
--port 443 \
--namespace ${NAMESPACE} \
--key ${KEY_FILE} \
--cert ${CERT_FILE} \
--cacert ${TLS_CA_FILE}
```