Support Nginx monitoring. #11558
Merged: wu-sheng merged 8 commits into apache:master from weixiang1862:feature/nginx-monitoring on Nov 17, 2023
Commits
61921a5
Support Nginx monitoring.
1c44fa2
update docker build context path.
9f55c43
Merge branch 'master' into feature/nginx-monitoring
wu-sheng c197067
Use MQE in e2e test.
e4a8abf
Merge branch 'master' into feature/nginx-monitoring
wu-sheng 6049ee2
extract level for error.log & http status code for access.log
0b99168
Merge remote-tracking branch 'forked/feature/nginx-monitoring' into f…
04e2551
Fix wrong link in doc.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,141 @@ | ||
# Nginx monitoring | ||
## Nginx performance from nginx-lua-prometheus | ||
The [nginx-lua-prometheus](https://github.com/knyar/nginx-lua-prometheus) is a Lua library that can be used with Nginx to collect metrics | |
and expose them on a separate web page. | |
To use this library, you will need Nginx with [lua-nginx-module](https://github.com/openresty/lua-nginx-module), or [OpenResty](https://openresty.org/) directly. | |
|
||
SkyWalking leverages OpenTelemetry Collector to transfer the metrics to [OpenTelemetry receiver](opentelemetry-receiver.md) and into the [Meter System](./../../concepts-and-designs/meter.md). | ||
|
||
### Data flow | ||
1. [nginx-lua-prometheus](https://github.com/knyar/nginx-lua-prometheus) collects metrics from Nginx and exposes them on an endpoint. | |
2. OpenTelemetry Collector fetches metrics from the endpoint exposed above via the Prometheus Receiver and pushes them to the SkyWalking OAP Server via the OpenTelemetry gRPC exporter. | |
3. The SkyWalking OAP Server parses the expression with [MAL](../../concepts-and-designs/mal.md) to filter/calculate/aggregate and store the results. | |
|
||
### Set up | ||
1. Collect Nginx metrics and expose the following four metrics with [nginx-lua-prometheus](https://github.com/knyar/nginx-lua-prometheus). For details on the metric definitions, refer to [here](../../../../test/e2e-v2/cases/nginx/nginx.conf). | |
- histogram: nginx_http_latency | ||
- gauge: nginx_http_connections | ||
- counter: nginx_http_size_bytes | ||
- counter: nginx_http_requests_total | ||
|
||
2. Set up the [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/getting-started/#docker). For details on the Prometheus Receiver in OpenTelemetry Collector, refer to [here](../../../../test/e2e-v2/cases/nginx/otel-collector-config.yaml). | |
3. Configure the SkyWalking [OpenTelemetry receiver](opentelemetry-receiver.md). | |
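
Before wiring up the collector, it can help to confirm that the Nginx metrics endpoint really declares the four metrics listed in step 1. The following is a minimal Python sketch, not part of SkyWalking: it scans Prometheus text-format output for `# TYPE` lines and reports which expected metrics are missing. The `SAMPLE` text, its label names, and the helper names are made up for illustration.

```python
import re

# The four metrics and types that step 1 asks Nginx to expose.
EXPECTED = {
    "nginx_http_latency": "histogram",
    "nginx_http_connections": "gauge",
    "nginx_http_size_bytes": "counter",
    "nginx_http_requests_total": "counter",
}

# Matches Prometheus text-format type declarations, e.g. "# TYPE foo counter".
TYPE_LINE = re.compile(r"^# TYPE (\S+) (\S+)$")


def declared_types(exposition_text):
    """Return {metric_name: type} taken from the '# TYPE' lines."""
    types = {}
    for line in exposition_text.splitlines():
        match = TYPE_LINE.match(line.strip())
        if match:
            types[match.group(1)] = match.group(2)
    return types


def missing_metrics(exposition_text):
    """Return the expected metrics that are absent or declared with another type."""
    types = declared_types(exposition_text)
    return sorted(name for name, kind in EXPECTED.items() if types.get(name) != kind)


# Hypothetical scrape output; label names here are assumptions.
SAMPLE = """\
# HELP nginx_http_connections Number of HTTP connections
# TYPE nginx_http_connections gauge
nginx_http_connections{state="reading"} 0
# TYPE nginx_http_requests_total counter
nginx_http_requests_total{host="localhost",status="200"} 3
"""

if __name__ == "__main__":
    print(missing_metrics(SAMPLE))
```

With the sample above, the two metrics that are not declared (`nginx_http_latency` and `nginx_http_size_bytes`) are reported as missing.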
|
||
### Nginx Monitoring | ||
|
||
SkyWalking observes the status, payload, and latency of the Nginx server. The server is cataloged as a `LAYER: Nginx` `Service` in the OAP, and its instances are recognized as `LAYER: Nginx` `instance`. | |
|
||
What is reported as a `LAYER: Nginx` `endpoint` depends on how precisely you want to monitor Nginx. | |
We do not recommend exposing metrics for every request path, because doing so causes an explosion of endpoint metrics data. | |
|
||
You can collect host metrics: | ||
``` | ||
http { | ||
log_by_lua_block { | ||
metric_bytes:inc(tonumber(ngx.var.request_length), {"request", ngx.var.host}) | ||
metric_bytes:inc(tonumber(ngx.var.bytes_sent), {"response", ngx.var.host}) | |
metric_requests:inc(1, {ngx.var.status, ngx.var.host}) | ||
metric_latency:observe(tonumber(ngx.var.request_time), {ngx.var.host}) | ||
} | ||
} | ||
``` | ||
or grouped URL and upstream metrics: | |
``` | ||
upstream backend { | ||
server ip:port; | ||
} | ||
|
||
server { | ||
|
||
location /test { | ||
default_type application/json; | ||
return 200 '{"code": 200, "message": "success"}'; | ||
|
||
log_by_lua_block { | ||
metric_bytes:inc(tonumber(ngx.var.request_length), {"request", "/test/**"}) | ||
metric_bytes:inc(tonumber(ngx.var.bytes_sent), {"response", "/test/**"}) | |
metric_requests:inc(1, {ngx.var.status, "/test/**"}) | ||
metric_latency:observe(tonumber(ngx.var.request_time), {"/test/**"}) | ||
} | ||
} | ||
|
||
location /test_upstream { | ||
|
||
proxy_pass http://backend; | ||
|
||
log_by_lua_block { | ||
metric_bytes:inc(tonumber(ngx.var.request_length), {"request", "upstream/backend"}) | ||
metric_bytes:inc(tonumber(ngx.var.bytes_sent), {"response", "upstream/backend"}) | |
metric_requests:inc(1, {ngx.var.status, "upstream/backend"}) | ||
metric_latency:observe(tonumber(ngx.var.request_time), {"upstream/backend"}) | ||
} | ||
} | ||
} | ||
``` | ||
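
The reason for hard-coding a group label such as `"/test/**"` instead of the raw request path is that the number of distinct label sets stays bounded however many paths are requested. The Python sketch below illustrates that idea only; it is not how Nginx or SkyWalking implement it. The `GROUPS` table, the `"other"` fallback label, and the sample traffic are hypothetical.

```python
from collections import Counter

# Hypothetical grouping rules (path prefix -> endpoint label); the labels
# mirror the ones hard-coded in the log_by_lua_block examples.
GROUPS = {
    "/test": "/test/**",
    "/test_upstream": "upstream/backend",
}


def endpoint_label(path, default="other"):
    """Pick the label of the longest matching prefix, like nginx prefix locations."""
    matches = [prefix for prefix in GROUPS if path.startswith(prefix)]
    if not matches:
        return default  # "other" is an assumption made for this sketch
    return GROUPS[max(matches, key=len)]


def count_requests(requests):
    """Count requests per (status, endpoint label), as a labelled counter would."""
    counts = Counter()
    for path, status in requests:
        counts[(status, endpoint_label(path))] += 1
    return counts


if __name__ == "__main__":
    traffic = [
        ("/test/1", "200"), ("/test/2", "200"), ("/test/3", "404"),
        ("/test_upstream/a", "200"), ("/test_upstream/b", "502"),
    ]
    for labels, n in sorted(count_requests(traffic).items()):
        print(labels, n)
```

Five different request paths collapse into four label sets here; adding more paths under `/test/` would not add any new series.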
|
||
#### Nginx Service Supported Metrics | ||
| Monitoring Panel | Unit | Metric Name | Catalog | Description | Data Source | | ||
|-------------------------|------|-----------------------------------------------------------------------------------------------|---------|------------------------------------------------------|--------------------------------| | ||
| HTTP Request Trend | | meter_nginx_service_http_requests | Service | The increment rate of HTTP requests | nginx-lua-prometheus | | ||
| HTTP Latency | ms | meter_nginx_service_http_latency | Service | The increment rate of the latency of HTTP requests | nginx-lua-prometheus | | ||
| HTTP Bandwidth | KB | meter_nginx_service_bandwidth | Service | The increment rate of the bandwidth of HTTP requests | nginx-lua-prometheus | | ||
| HTTP Connections | | meter_nginx_service_http_connections | Service | The avg number of the connections | nginx-lua-prometheus | | ||
| HTTP Status Trend | | meter_nginx_service_http_status | Service | The increment rate of the status of HTTP requests | nginx-lua-prometheus | | ||
| HTTP Status 4xx Percent | % | meter_nginx_service_http_4xx_requests_increment / meter_nginx_service_http_requests_increment | Service | The percentage of 4xx status of HTTP requests | nginx-lua-prometheus | | ||
| HTTP Status 5xx Percent | % | meter_nginx_service_http_5xx_requests_increment / meter_nginx_service_http_requests_increment | Service | The percentage of 5xx status of HTTP requests | nginx-lua-prometheus | | |
|
||
#### Nginx Instance Supported Metrics | ||
| Monitoring Panel | Unit | Metric Name | Catalog | Description | Data Source | | ||
|---------------------------|------|-------------------------------------------------------------------------------------------------|----------|------------------------------------------------------|--------------------------------| | ||
| HTTP Request Trend | | meter_nginx_instance_http_requests | Instance | The increment rate of HTTP requests | nginx-lua-prometheus | | ||
| HTTP Latency | ms | meter_nginx_instance_http_latency | Instance | The increment rate of the latency of HTTP requests | nginx-lua-prometheus | | ||
| HTTP Bandwidth | KB | meter_nginx_instance_bandwidth | Instance | The increment rate of the bandwidth of HTTP requests | nginx-lua-prometheus | | ||
| HTTP Connections | | meter_nginx_instance_http_connections | Instance | The avg number of the connections | nginx-lua-prometheus | | ||
| HTTP Status Trend | | meter_nginx_instance_http_status | Instance | The increment rate of the status of HTTP requests | nginx-lua-prometheus | | ||
| HTTP Status 4xx Percent | % | meter_nginx_instance_http_4xx_requests_increment / meter_nginx_instance_http_requests_increment | Instance | The percentage of 4xx status of HTTP requests | nginx-lua-prometheus | | ||
| HTTP Status 5xx Percent | % | meter_nginx_instance_http_5xx_requests_increment / meter_nginx_instance_http_requests_increment | Instance | The percentage of 5xx status of HTTP requests | nginx-lua-prometheus | | |
|
||
#### Nginx Endpoint Supported Metrics | ||
| Monitoring Panel | Unit | Metric Name | Catalog | Description | Data Source | | ||
|-------------------------|------|-------------------------------------------------------------------------------------------------|----------|------------------------------------------------------|----------------------| | ||
| HTTP Request Trend | | meter_nginx_endpoint_http_requests | Endpoint | The increment rate of HTTP requests | nginx-lua-prometheus | | ||
| HTTP Latency | ms | meter_nginx_endpoint_http_latency | Endpoint | The increment rate of the latency of HTTP requests | nginx-lua-prometheus | | ||
| HTTP Bandwidth | KB | meter_nginx_endpoint_bandwidth | Endpoint | The increment rate of the bandwidth of HTTP requests | nginx-lua-prometheus | | ||
| HTTP Status Trend | | meter_nginx_endpoint_http_status | Endpoint | The increment rate of the status of HTTP requests | nginx-lua-prometheus | | ||
| HTTP Status 4xx Percent | % | meter_nginx_endpoint_http_4xx_requests_increment / meter_nginx_endpoint_http_requests_increment | Endpoint | The percentage of 4xx status of HTTP requests | nginx-lua-prometheus | | ||
| HTTP Status 5xx Percent | % | meter_nginx_endpoint_http_5xx_requests_increment / meter_nginx_endpoint_http_requests_increment | Endpoint | The percentage of 5xx status of HTTP requests | nginx-lua-prometheus | | |
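
The 4xx and 5xx percentage panels in the service, instance, and endpoint tables are each a ratio of two increment metrics. As a rough illustration of that arithmetic, here is a small Python sketch. The function name is hypothetical, and both the scaling by 100 and the handling of a zero denominator are assumptions rather than the dashboard's confirmed behaviour.

```python
def status_percent(status_increment, requests_increment):
    """Share of requests with a given status class, as a percentage.

    status_increment:   e.g. the value of ..._http_4xx_requests_increment
    requests_increment: e.g. the value of ..._http_requests_increment
    """
    if requests_increment == 0:
        # Assumption: report 0% when no requests were seen in the window.
        return 0.0
    return 100.0 * status_increment / requests_increment


if __name__ == "__main__":
    # 5 client errors out of 200 requests in the same time window.
    print(status_percent(5, 200))
```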
|
||
### Customizations | ||
You can customize your own metrics/expression/dashboard panel. | ||
|
||
The metrics definition and expression rules are found in `/config/otel-rules/nginx-service.yaml`, `/config/otel-rules/nginx-instance.yaml`, and `/config/otel-rules/nginx-endpoint.yaml`. | |
|
||
The Nginx dashboard panel configurations are found in `/config/ui-initialized-templates/nginx`. | ||
|
||
## Collect nginx access and error log | ||
SkyWalking leverages [fluentbit](https://fluentbit.io/) or other log agents to collect the access log and error log of Nginx. | |
|
||
### Data flow | ||
1. The fluentbit agent collects the access log and error log from Nginx. | |
2. The fluentbit agent sends data to the SkyWalking OAP Server using native meter APIs via HTTP. | |
3. The SkyWalking OAP Server parses the expression with [LAL](../../concepts-and-designs/lal.md) to parse/extract and store the results. | ||
|
||
### Set up | ||
1. Install [fluentbit](https://docs.fluentbit.io/manual/installation/docker). | ||
2. Configure fluentbit with `fluent-bit.conf`; refer to [here](../../../../test/e2e-v2/cases/nginx/fluent-bit.conf). | |
|
||
### Error Log Monitoring | ||
Error Log monitoring provides monitoring of the error.log of the Nginx server. | ||
|
||
#### Supported Metrics | ||
| Monitoring Panel | Metric Name | Catalog | Description | Data Source | | |
|--------------------------|--------------------------------------|----------|--------------------------------------------------|-------------| | |
| Service Error Log Count | meter_nginx_service_error_log_count | Service | The count of nginx error.log entries by log level | fluent bit | | |
| Instance Error Log Count | meter_nginx_instance_error_log_count | Instance | The count of nginx error.log entries by log level | fluent bit | | |
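
To make the error-log counting more concrete, the Python sketch below mirrors the regular expression used in the `nginx-error-log` LAL rule from this pull request (translated to Python's `(?P<name>...)` group syntax) and counts entries per level. It approximates the idea and is not the OAP implementation; the sample log lines and function names are made up.

```python
import re
from collections import Counter
from datetime import datetime

# Same pattern as the LAL rule, using Python's named-group syntax.
# Note: `.+` is greedy, so a line containing a later "]" would capture more
# than the level; the sample lines below avoid that case.
ERROR_LINE = re.compile(
    r"(?P<time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[(?P<level>.+)].*"
)


def parse_error_line(line):
    """Return (timestamp, level) for an error.log line, or None if it does not match."""
    match = ERROR_LINE.fullmatch(line.rstrip("\n"))
    if match is None:
        return None
    timestamp = datetime.strptime(match.group("time"), "%Y/%m/%d %H:%M:%S")
    return timestamp, match.group("level")


def count_by_level(lines):
    """Count matching lines per log level, one increment per line."""
    counts = Counter()
    for line in lines:
        parsed = parse_error_line(line)
        if parsed is not None:
            counts[parsed[1]] += 1
    return counts


# Hypothetical error.log excerpt.
SAMPLE_LINES = [
    '2023/11/17 08:15:30 [error] 7#7: *1 open() "/usr/share/nginx/html/missing" failed (2: No such file or directory)',
    "2023/11/17 08:15:31 [warn] 7#7: *2 upstream server temporarily disabled",
    "2023/11/17 08:15:32 [error] 7#7: *3 connect() failed (111: Connection refused)",
]

if __name__ == "__main__":
    print(count_by_level(SAMPLE_LINES))
```

Each matching line contributes a value of 1 under its level, which corresponds to the `value 1` and `labels level: parsed.level` settings in the rule.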
|
||
### Customizations | ||
You can customize your own metrics/expression/dashboard panel. | ||
|
||
The log collection and analysis rules are found in `/config/lal/nginx.yaml` and `/config/log-mal-rules/nginx.yaml`. | |
|
||
The Nginx dashboard panel configurations are found in `/config/ui-initialized-templates/nginx`. |
49 changes: 49 additions & 0 deletions
oap-server/server-starter/src/main/resources/lal/nginx.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
# Licensed to the Apache Software Foundation (ASF) under one or more | ||
# contributor license agreements. See the NOTICE file distributed with | ||
# this work for additional information regarding copyright ownership. | ||
# The ASF licenses this file to You under the Apache License, Version 2.0 | ||
# (the "License"); you may not use this file except in compliance with | ||
# the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
rules: | ||
- name: nginx-access-log | ||
layer: NGINX | ||
dsl: | | ||
filter { | ||
if (tag("LOG_KIND") == "NGINX_ACCESS_LOG") { | ||
sink { | ||
} | ||
} | ||
} | ||
- name: nginx-error-log | ||
layer: NGINX | ||
dsl: | | ||
filter { | ||
if (tag("LOG_KIND") == "NGINX_ERROR_LOG") { | ||
text { | ||
regexp $/(?<time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[(?<level>.+)].*/$ | ||
} | ||
|
||
extractor { | ||
timestamp parsed.time as String, "yyyy/MM/dd HH:mm:ss" | ||
|
||
metrics { | ||
timestamp log.timestamp as Long | ||
labels level: parsed.level, service: log.service, service_instance_id: log.serviceInstance | ||
name "nginx_error_log_count" | ||
value 1 | ||
} | ||
} | ||
|
||
sink { | ||
} | ||
} | ||
} |
36 changes: 36 additions & 0 deletions
oap-server/server-starter/src/main/resources/log-mal-rules/nginx.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
# Licensed to the Apache Software Foundation (ASF) under one or more | ||
# contributor license agreements. See the NOTICE file distributed with | ||
# this work for additional information regarding copyright ownership. | ||
# The ASF licenses this file to You under the Apache License, Version 2.0 | ||
# (the "License"); you may not use this file except in compliance with | ||
# the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
# This will parse a textual representation of a duration. The formats | ||
# accepted are based on the ISO-8601 duration format {@code PnDTnHnMn.nS} | ||
# with days considered to be exactly 24 hours. | ||
# <p> | ||
# Examples: | ||
# <pre> | ||
# "PT20.345S" -- parses as "20.345 seconds" | ||
# "PT15M" -- parses as "15 minutes" (where a minute is 60 seconds) | ||
# "PT10H" -- parses as "10 hours" (where an hour is 3600 seconds) | ||
# "P2D" -- parses as "2 days" (where a day is 24 hours or 86400 seconds) | ||
# "P2DT3H4M" -- parses as "2 days, 3 hours and 4 minutes" | ||
# "P-6H3M" -- parses as "-6 hours and +3 minutes" | ||
# "-P6H3M" -- parses as "-6 hours and -3 minutes" | ||
# "-P-6H+3M" -- parses as "+6 hours and -3 minutes" | ||
# </pre> | ||
metricPrefix: meter_nginx | ||
metricsRules: | ||
- name: service_error_log_count | ||
exp: nginx_error_log_count.sum(['level','service']).downsampling(SUM).service(['service'], Layer.NGINX) | ||
- name: instance_error_log_count | ||
exp: nginx_error_log_count.sum(['level','service','service_instance_id']).downsampling(SUM).instance(['service'],['service_instance_id'], Layer.NGINX) |
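
The two expressions above differ only in which labels are kept when summing `nginx_error_log_count`. The Python sketch below illustrates just that grouping step under simplified assumptions: samples are plain `(labels, value)` pairs, and `downsampling(SUM)` and the binding to a service or instance entity are not modelled. The helper name and sample data are hypothetical.

```python
from collections import defaultdict


def sum_by(samples, label_names):
    """Sum sample values grouped by the given label names, dropping other labels."""
    totals = defaultdict(float)
    for labels, value in samples:
        key = tuple(labels.get(name, "") for name in label_names)
        totals[key] += value
    return dict(totals)


# Hypothetical samples as produced by the LAL rule: one per error.log line.
SAMPLES = [
    ({"level": "error", "service": "nginx", "service_instance_id": "a"}, 1),
    ({"level": "error", "service": "nginx", "service_instance_id": "b"}, 1),
    ({"level": "warn", "service": "nginx", "service_instance_id": "a"}, 1),
]

if __name__ == "__main__":
    # Service-level rule: group by level and service.
    print(sum_by(SAMPLES, ["level", "service"]))
    # Instance-level rule: additionally keep the instance id.
    print(sum_by(SAMPLES, ["level", "service", "service_instance_id"]))
```

At service level the two `error` samples from different instances merge into one series, while the instance-level grouping keeps them apart.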
The link is wrong?
Yes, thanks. I will fix this.