tests/graceful-recovery/graceful-recovery.md (+19 -30)
@@ -34,18 +34,18 @@ Ensure that NGF can recover gracefully from container failures without any user
 3. Check out the latest tag (unless you are installing the edge version from the main branch).
 4. Go into `deploy/manifests/nginx-gateway.yaml` and change `runAsNonRoot` from `true` to `false`.
 This allows us to insert our ephemeral container as root which enables us to restart the nginx-gateway container.
-5. Follow the [installation instructions](https://github.com/nginxinc/nginx-gateway-fabric/blob/main/docs/installation.md)
+5. Follow the [installation instructions](https://github.com/nginxinc/nginx-gateway-fabric/blob/main/site/content/installation/installing-ngf/manifests.md)
 to deploy NGINX Gateway Fabric using manifests and expose it through a LoadBalancer Service.
Known issue: https://github.com/nginxinc/nginx-gateway-fabric/issues/1108
+
+
+### Restart Node with draining
+
+Previous NGF container error:
+
+```json
+{
+"level": "error",
+"ts": "2023-12-05T21:43:31Z",
+"logger": "eventLoop.eventHandler",
+"msg": "Failed to update NGINX configuration",
+"batchID": 11,
+"error": "failed to reload NGINX: could not get expected config version 7: error getting client: Get \"http://config-version/version\": dial unix /var/run/nginx/nginx-config-version.sock: connect: no such file or directory",
This error is likely due to NGINX terminating during a reload attempt and does not consistently occur on a node restart.
+
+No errors in previous NGINX container.
+No errors in new NGF/NGINX containers.
+
+### Restart Node without draining
+
+The NGF Pod was unable to recover the majority of times after running `docker restart kind-control-plane`.
+
+The following appeared in the NGINX logs:
+
+```text
+2023/12/05 21:53:51 [emerg] 29#29: bind() to unix:/var/run/nginx/nginx-status.sock failed (98: Address in use)
+2023/12/05 21:53:51 [notice] 29#29: try again to bind() after 500ms
+2023/12/05 21:53:51 [emerg] 29#29: bind() to unix:/var/run/nginx/nginx-status.sock failed (98: Address in use)
+2023/12/05 21:53:51 [notice] 29#29: try again to bind() after 500ms
+2023/12/05 21:53:51 [emerg] 29#29: bind() to unix:/var/run/nginx/nginx-status.sock failed (98: Address in use)
+2023/12/05 21:53:51 [notice] 29#29: try again to bind() after 500ms
+2023/12/05 21:53:51 [emerg] 29#29: bind() to unix:/var/run/nginx/nginx-status.sock failed (98: Address in use)
+2023/12/05 21:53:51 [notice] 29#29: try again to bind() after 500ms
+2023/12/05 21:53:51 [emerg] 29#29: bind() to unix:/var/run/nginx/nginx-status.sock failed (98: Address in use)
+2023/12/05 21:53:51 [notice] 29#29: try again to bind() after 500ms
+2023/12/05 21:53:51 [emerg] 29#29: still could not bind()
+```
+
+The following appeared in the NGF logs:
+
+```text
+failed to start control loop: cannot create nginx metrics collector: failed to get http://config-status/stub_status: Get "http://config-status/stub_status": dial unix /var/run/nginx/nginx-status.sock: connect: connection refused
+```
+
+Known issue: https://github.com/nginxinc/nginx-gateway-fabric/issues/1108
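The three socket errors quoted in these logs (`no such file or directory`, `98: Address in use`, and `connection refused`) correspond to standard Unix domain socket failure modes on Linux. The sketch below reproduces each one in isolation using only Python's standard library. It is an illustration, not NGF or NGINX code: the helpers `try_bind` and `try_connect` and the file name `demo-status.sock` are hypothetical, and the idea that a socket file left behind by the previous container explains the `bind()` failures is an assumption that the linked issue would need to confirm.

```python
import errno
import os
import socket
import tempfile


def try_bind(path):
    """Bind a Unix stream socket to path; return 0 on success, else the errno."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
        return 0
    except OSError as exc:
        return exc.errno
    finally:
        sock.close()  # closing the socket does NOT delete the socket file


def try_connect(path):
    """Connect a Unix stream socket to path; return 0 on success, else the errno."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
        return 0
    except OSError as exc:
        return exc.errno
    finally:
        sock.close()


def name(code):
    """Symbolic errno name, or "OK" for success."""
    return errno.errorcode.get(code, "OK")


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo-status.sock")  # hypothetical socket path

    missing = try_connect(path)   # no socket file exists yet
    first_bind = try_bind(path)   # creates the file, then the socket is closed
    stale = try_connect(path)     # file exists, but nothing is listening on it
    rebind = try_bind(path)       # file is still on disk from the first bind
    os.unlink(path)               # remove the leftover file
    clean_bind = try_bind(path)   # binding works again once the file is gone

for label, code in [
    ("connect, file missing", missing),
    ("first bind", first_bind),
    ("connect, stale file", stale),
    ("bind, stale file", rebind),
    ("bind after unlink", clean_bind),
]:
    print(f"{label}: {name(code)}")
```

On Linux, `EADDRINUSE` is errno 98, which matches the code in the NGINX `[emerg]` lines. That a bind succeeds again after the path is unlinked is a general property of Unix sockets; it says nothing about how the known issue is or should be resolved in NGF.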