Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
feat(runloop) execute plugins on Nginx-produced errors
Context ------- Before this patch, several scenarios would make Kong/Nginx produce errors that did not execute plugins: 1. Nginx-produced client errors (HTTP 4xx) on invalid requests per the Nginx settings (e.g. headers too large, URI too large, etc...) 2. Proxy TCP/HTTP errors on the last balancer try (e.g. connection refused, connection timeout, read/write timeout, etc...) 3. Any other Nginx-produced server errors (HTTP 5xx), if any Because plugins are not executed in such cases, logging plugins in particular do not get a chance to report such errors, leaving Kong administrator blind when relying on bundled or homegrown logging plugins. Additionally, other plugins, e.g. transformation, do not get a chance to run either and could lead to scenarios where those errors produce headers or response bodies that are undesired by the administrator. This issue is similar to that fixed by #3079, which handled the following case: 4. Requests short-circuited by Kong itself (vs Nginx in 1, 2, and 3.) See also #490 & #892 for issues related to 4. This patches addresses 1, 2, and 3. Problem ------- Nginx-produced errors (4xx and 5xx) are redirected to the `/kong_error_handler` location by way of the `error_page` directive, but the runloop is not given a chance to run, since the error handler is only instructed to override the Nginx-defined HTML responses, and produces Kong-friendly ones. In short, a patch solving this issue must enable the runloop within the error handler. Proposed solution ----------------- Several (many) possibilities were evaluated and explored to fix this. The proposed patch is the result of many attempts, observations, and adjustments to initial proposals. In a gist, it proposes: * Executing the response part of the runloop in the error_page location. * Ensure that if the request part of the runloop did not have a chance to run, we do build the list of plugins to execute (thus, global plugins only since the request was not processed). * Ensure that if the request part of the runloop did run, we preserve its `ngx.ctx` values. A few catches: * Currently, `ngx.req.get_method()` in the `error_page` location block will return `GET` even if the request method was different (only `HEAD` is preserved). A follow up patch for Nginx is being prepared. As such, several tests are currently marked as "Pending". * When the `rewrite` phase did not get a chance to run, some parts of the request may not have been parsed by Nginx (e.g. on HTTP 414), thus some variables, such as `$request_uri` or request headers may not have been parsed. Code should be written defensively against such possibilities, and logging plugins may report incomplete request data (but will still report the produced error). * The `uninitialized_variable_warn` directive must be disabled in the `error_page` location block for cases when the `rewrite` phase did not get a chance to run (Nginx-produced 4xx clients errors). * The `ngx.ctx` variables must be copied by reference for plugins to be able to access it in the response phases when the request was short-circuited by Nginx (e.g. HTTP 502/504 from the upstream module). * Using a named location for the `error_page` handler is not possible, since in some cases the URI might not be parsed by Nginx, and thus Nginx refuses to call named locations in such cases (e.g. on HTTP 414). Resulting behavior ------------------ The `03-plugins_triggering_spec.lua` test suite is now a good reference for the behavior observed upon Nginx-produced errors. This behavior can be classified in 2 categories: 1. When the request is not parsed by Nginx (HTTP 4xx) a. The `error_page` directive is called (`rewrite` and `access` are skipped) b. The runloop retrieves the list of **global** plugins to run (it cannot retrieve more specific plugins since the request could not be parsed) c. The error content handler overrides the Nginx body with a Kong-friendly one (unchanged from previous behavior) d. It executes the `header_filter` phase of each global plugin. e. It executes the `body_filter` phase of each global plugin. f. It executes the `log` phase of each global plugin. 2. When the error was produced during upstream module execution (HTTP 5xx) a. The `rewrite` and `access` phases ran normally (we stashed the `ngx.ctx` reference) b. The `error_page` directive is called c. The error content handler applies the original `ngx.ctx` by reference d. It does **not** run the plugins loop, since `ngx.ctx` contains the list of plugins to run (global **and** specific plugins in this case) e. The error content handler overrides the Nginx body with a Kong-friendly one (unchanged from previous behavior) f. It executes the `header_filter` phase of each global plugin. g. It executes the `body_filter` phase of each global plugin. h. It executes the `log` phase of each global plugin. Fix #3193 From #3533
- Loading branch information