Merge branch 'test'
neophyte57 committed May 24, 2024
2 parents 8338580 + abe0b6f commit 7ff8df8
Showing 18 changed files with 18,758 additions and 8 deletions.
33 changes: 33 additions & 0 deletions infrastructure/fluentbit/README.md
@@ -0,0 +1,33 @@
# A Sidecar for collecting/processing Logs

[Fluent-bit](https://docs.fluentbit.io/manual/about/what-is-fluent-bit) is a log forwarder that can run as a sidecar container (a Docker image) in each pod that contains one of our apps. It can also be deployed as a stand-alone service that serves multiple apps. Fluent-bit can forward logs to many different outputs, for example HTTP, OpenSearch, Slack, and AWS Lambda (see https://docs.fluentbit.io/manual/pipeline/outputs for the full list). The steps for deploying Fluent-bit as a sidecar are shown below.

The BCGov common-service-showcase has deployed a Fluent-bit sidecar for the CDOGS Node application; you can read about it [here](https://github.com/bcgov/common-service-showcase/wiki/Logging-to-a-Sidecar).

## Implementing Fluent-bit for PRIME application on OpenShift

We deploy a Fluent-bit sidecar container in both the backend/webapi and frontend/nginx applications to collect, process, and monitor their logs and to alert the PRIME team when certain keywords or regular expressions are matched in the log stream. Each release of Fluent-bit comes with a debug version (for example, `fluent-bit:2.X-debug`) that includes additional Linux tools such as busybox and bash, which makes testing the installation easier. In this example, we use `fluent-bit:3.0.3`. We forward alerts to a private Slack channel through a webhook, so the overall flow of logs is:

`logs from frontend and backend app > fluent-bit sidecar > webhook (Slack channel)`.


### Fluent-bit for frontend/nginx app

Our nginx app writes logs to a configurable file path (`/tmp/error_fluentbit.log`) that can be set in [nginx.configmap](../prime-app-template.yml). The Fluent-bit container mounts the `/tmp/` directory shared with the nginx app and reads the log file `error_fluentbit.log`.
Fluent-bit has its own configuration file, which we create from an OpenShift [configmap template](./fluentbit-configmap.yaml). This configuration uses the `tail` plugin in the `INPUT` section to read logs from the nginx log file (`/tmp/error_fluentbit.log`). The `SERVICE` section points to a separate parsers file (`Parsers_File parsers.conf`) that defines the log format. The `grep` plugin in the `FILTER` section filters logs based on specific keywords (e.g. `error`). Finally, the `slack` plugin in the `OUTPUT` section sends error notifications to the PRIME Team's Slack channel through our Slack webhook.
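
The filter and parser steps for the nginx log can be approximated in a few lines of Python. This is a minimal sketch for reasoning about which lines would reach Slack, not how Fluent-bit is implemented: the patterns are copied from the frontend `fluent-bit.conf` and `parsers.conf` (with the named groups rewritten in Python's `(?P<name>...)` syntax), while the `process` helper and the sample lines are hypothetical.

```python
import re

# grep filter patterns copied from the frontend fluent-bit.conf.
KEEP = re.compile(r"error")
DROP = re.compile(r"No\ssuch\sfile\sor\sdirectory")

# Parser regex from parsers.conf, with (?<name>...) groups rewritten
# in Python's (?P<name>...) syntax.
NGINX_ERROR = re.compile(
    r"(?P<timestamp>\d{4}\/\d{2}\/\d{2} \d{2}:\d{2}:\d{2}) \[(?P<level>.*)\]\ "
    r"(?P<process_id>\d*)#(?P<thread_id>\d*): (?P<message>.*)"
)

# Hypothetical sample lines in the nginx error-log layout.
SAMPLES = [
    "2024/05/24 10:15:02 [error] 31#31: *7 upstream timed out",
    '2024/05/24 10:15:03 [error] 31#31: *8 open() "/x" failed (2: No such file or directory)',
    "2024/05/24 10:15:04 [notice] 1#1: start worker processes",
]


def process(line):
    """Return the parsed record for a line that passes both grep filters, else None."""
    if not KEEP.search(line) or DROP.search(line):
        return None
    record = {"log": line}  # the original line is kept, as with Preserve_Key On
    match = NGINX_ERROR.search(line)
    if match is not None:
        record.update(match.groupdict())
        # Mirrors "Types thread_id:integer process_id:integer".
        for key in ("process_id", "thread_id"):
            if record[key]:
                record[key] = int(record[key])
    return record


if __name__ == "__main__":
    for sample in SAMPLES:
        print(process(sample))
```

Only the first sample produces a record: the second is dropped by the `Exclude` rule and the third never matches the `error` keyword.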

### Fluent-bit for backend/webapi app

Our webapi app writes logs to a directory (`/opt/app-root/app/logs/`), which the Fluent-bit container mounts through the shared `log-storage` volume. A [Fluent-bit configmap](./fluentbit-configmap.yaml) with the same structure reads the `*.log` files from the mounted path, filters the log stream on the keyword `ERR`, and sends the matching logs to the PRIME Team's Slack channel.
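
As a rough illustration of the webapi flow, the sketch below reads every `*.log` file in a directory, keeps the lines that contain `ERR`, and applies the parser regex from the webapi `parsers.conf` (rewritten in Python's named-group syntax). It is a simplification under stated assumptions: it reads whole files rather than following them as `tail` does, and the `collect_errors` and `demo` helpers, the file names, and the sample log lines are all hypothetical.

```python
import re
import tempfile
from pathlib import Path

# grep keyword from the webapi fluent-bit.conf.
KEEP = re.compile(r"ERR")

# Parser regex from the webapi parsers.conf, in Python's (?P<name>...) syntax.
WEBAPI_LINE = re.compile(
    r"\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})(?P<level>.*)]\ (?P<message>.*)"
)


def collect_errors(log_dir):
    """Return parsed records for every ERR line found in log_dir/*.log."""
    records = []
    for path in sorted(Path(log_dir).glob("*.log")):
        for line in path.read_text().splitlines():
            if not KEEP.search(line):
                continue
            record = {"log": line}
            match = WEBAPI_LINE.search(line)
            if match is not None:
                record.update(match.groupdict())
            records.append(record)
    return records


def demo():
    """Write hypothetical log files to a temporary directory and scan them."""
    with tempfile.TemporaryDirectory() as log_dir:
        Path(log_dir, "webapi.log").write_text(
            "[2024-05-24 10:15:01 INF] Request finished\n"
            "[2024-05-24 10:15:02 ERR] Unhandled exception while processing request\n"
        )
        # Not matched by the *.log pattern, so it is never read.
        Path(log_dir, "notes.txt").write_text("ERR not a log file\n")
        return collect_errors(log_dir)


if __name__ == "__main__":
    for record in demo():
        print(record)
```

Note that the `level` group starts immediately after the seconds, so for a line in this layout it captures the separating space as well (`" ERR"`).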

### Create the configmaps for nginx and webapi apps using the [fluentbit-configmap template](./fluentbit-configmap.yaml)

```
oc process -n $NAMESPACE -f fluentbit-configmap.yaml \
-p NAMESPACE=$NAMESPACE \
-p OC_ENV=$OC_ENV \
-p SLACK_ERROR_NOTIFICATION_WEBHOOK=$SLACK_ERROR_NOTIFICATION_WEBHOOK \
-o yaml | oc -n $NAMESPACE apply -f -
```
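
To see which ConfigMap names this command produces, the `${...}` substitution that `oc process` performs can be approximated with Python's `string.Template`, which uses the same placeholder syntax. This is only an illustration of the substitution, not how `oc` works internally. The defaults are copied from the `parameters` section of `fluentbit-configmap.yaml`, and `resolve` is a hypothetical helper.

```python
from string import Template

# Default parameter values from the `parameters` section of fluentbit-configmap.yaml.
DEFAULTS = {"APP_NAME": "fluentbit", "NAMESPACE": "9c33a9-dev", "OC_ENV": "dev"}

# ConfigMap name fields as written in the template objects.
NAME_FIELDS = ["frontend-${APP_NAME}-config", "webapi-${APP_NAME}-config"]


def resolve(fields, overrides=None):
    """Substitute template parameters, letting -p style overrides win over defaults."""
    params = {**DEFAULTS, **(overrides or {})}
    return [Template(field).substitute(params) for field in fields]


if __name__ == "__main__":
    for name in resolve(NAME_FIELDS):
        print(name)
```

Because the command does not pass `APP_NAME`, the default `fluentbit` applies. The resulting names, `frontend-fluentbit-config` and `webapi-fluentbit-config`, are the ones the `fluentbit-config` volumes in `prime-app-template.yml` refer to.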

You can find the value of the `SLACK_ERROR_NOTIFICATION_WEBHOOK` parameter in a secret called `slack-error-notification-webhook` on OpenShift.
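
Before wiring the webhook into the sidecar, it can help to confirm that the URL accepts messages. Slack incoming webhooks accept an HTTP POST with a JSON body containing a `text` field, so a test request can be sketched as follows. The `build_slack_request` helper and the placeholder URL are hypothetical, and the sketch only builds the request; it sends nothing.

```python
import json
import urllib.request


def build_slack_request(webhook_url, text):
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder URL; the real value comes from the OpenShift secret.
    request = build_slack_request(
        "https://example.invalid/services/PLACEHOLDER",
        "fluent-bit sidecar test message",
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` with the real webhook URL would post the message to the channel.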
181 changes: 181 additions & 0 deletions infrastructure/fluentbit/fluentbit-configmap.yaml
@@ -0,0 +1,181 @@
apiVersion: template.openshift.io/v1
kind: Template
labels:
  build: ${APP_NAME}
  template: ${APP_NAME}-template-bc
metadata:
  name: ${APP_NAME}-template-bc
objects:
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: frontend-${APP_NAME}-config
      namespace: ${NAMESPACE}
      labels:
        app.kubernetes.io/part-of: ${OC_ENV}
    data:
      fluent-bit.conf: |
        [SERVICE]
          Flush 5
          Daemon Off
          # define the log format (see additional config map key/value)
          Parsers_File parsers.conf
          Log_Level info
          HTTP_Server On
          HTTP_Listen 0.0.0.0
          HTTP_Port 2020
          Health_Check On
        [INPUT]
          # get logs from file written by nginx app
          Name tail
          Path /tmp/error_fluentbit.log
          Tag app
        [FILTER]
          # filter logs based on certain keyword(s)
          name grep
          match app
          regex log error
        [FILTER]
          # Ignore "No such file or directory" error, exclude from fluentbit output
          name grep
          match app
          Exclude log No\ssuch\sfile\sor\sdirectory
        [FILTER]
          name parser
          match app
          Key_Name log
          Parser parser
          Preserve_Key On
        [FILTER]
          # modify log entry to include namespace and container name
          name record_modifier
          match app
          # add namespace
          Record namespace ${NAMESPACE}
          # add container name
          Record product ${OC_ENV}-frontend
        [FILTER]
          Name rewrite_tag
          Match app
          Rule $level ([a-zA-Z]*)$ $TAG.$level true
          Emitter_Name re_emitted
        [OUTPUT]
          name slack
          match app
          webhook ${SLACK_ERROR_NOTIFICATION_WEBHOOK}
        [OUTPUT]
          Name stdout
          Match app
          Format json_lines
      parsers.conf: |
        [PARSER]
          Name parser
          Format regex
          Regex (?<timestamp>\d{4}\/\d{2}\/\d{2} \d{2}:\d{2}:\d{2}) \[(?<level>.*)\]\ (?<process_id>\d*)#(?<thread_id>\d*): (?<message>.*)
          Time_Key time
          Time_Format %d/%b/%Y:%H:%M:%S %z
          Types thread_id:integer process_id:integer
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: webapi-${APP_NAME}-config
      namespace: ${NAMESPACE}
      labels:
        app.kubernetes.io/part-of: ${OC_ENV}
    data:
      fluent-bit.conf: |
        [SERVICE]
          Flush 5
          Daemon Off
          # define the log format (see additional config map key/value)
          Parsers_File parsers.conf
          Log_Level info
          HTTP_Server On
          HTTP_Listen 0.0.0.0
          HTTP_Port 2020
          Health_Check On
        [INPUT]
          # get logs from file written by webapi app
          Name tail
          Path /opt/app-root/app/logs/*.log
          Tag app
        [FILTER]
          # filter logs based on certain keyword(s)
          name grep
          match app
          regex log ERR
        [FILTER]
          name parser
          match app
          Key_Name log
          Parser parser
          Preserve_Key On
        [FILTER]
          # modify log entry to include namespace and container name
          name record_modifier
          match app
          # add namespace
          Record namespace ${NAMESPACE}
          # add container name
          Record product ${OC_ENV}-webapi
        [FILTER]
          Name rewrite_tag
          Match app
          Rule $level ([a-zA-Z]*)$ $TAG.$level true
          Emitter_Name re_emitted
        [OUTPUT]
          name slack
          match app
          webhook ${SLACK_ERROR_NOTIFICATION_WEBHOOK}
        [OUTPUT]
          Name stdout
          Match app
          Format json_lines
      parsers.conf: |
        [PARSER]
          Name parser
          Format regex
          Regex \[(?<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})(?<level>.*)]\ (?<message>.*)
          Time_Key time
          Time_Format %H:%M:%S
parameters:
  - name: APP_NAME
    description: Application name
    displayName: Application name
    required: true
    value: fluentbit
  - name: NAMESPACE
    description: Namespace
    displayName: Namespace
    required: true
    value: 9c33a9-dev
  - name: OC_ENV
    description: OpenShift Environment
    displayName: OpenShift Environment
    required: true
    value: dev
  - name: SLACK_ERROR_NOTIFICATION_WEBHOOK
    description: Slack error notification Webhook URL
    displayName: Slack Webhook URL
    required: true
    value: ""
115 changes: 112 additions & 3 deletions infrastructure/prime-app-template.yml
@@ -122,15 +122,17 @@ objects:
- name: nginx-config
configMap:
name: ${SVC_NAME}-nginx-config
defaultMode: 420
- name: plr-integration-volume
secret:
secretName: plr-integration
defaultMode: 420
- name: env-config
configMap:
name: ${SVC_NAME}-env-config
defaultMode: 420
- name: log-storage
emptyDir: {}
- name: fluentbit-config
configMap:
name: frontend-fluentbit-config
containers:
- name: ${SVC_NAME}-frontend
image: >-
@@ -161,6 +163,8 @@ objects:
readOnly: true
mountPath: /opt/app-root/src/assets/config-map.json
subPath: config-map.json
- name: log-storage
mountPath: /tmp/
resources:
limits:
cpu: 50m
@@ -174,6 +178,8 @@ objects:
env:
- name: DOCUMENT_MANAGER_URL
value: https://${VANITY_URL}/api/docman
- name: SERVER_LOGFILE
value: /tmp/error_fluentbit.log
readinessProbe:
httpGet:
path: /
@@ -187,6 +193,53 @@ objects:
initialDelaySeconds: 5
failureThreshold: 1
periodSeconds: 5
- name: fluent-bit
image: 'docker.io/fluent/fluent-bit:3.0.3'
imagePullPolicy: IfNotPresent
resources:
limits:
cpu: 100m
memory: 64Mi
requests:
cpu: 10m
memory: 16Mi
readinessProbe:
httpGet:
path: /
port: 2020
scheme: HTTP
initialDelaySeconds: 10
timeoutSeconds: 1
periodSeconds: 60
successThreshold: 1
failureThreshold: 3
livenessProbe:
httpGet:
path: /
port: 2020
scheme: HTTP
initialDelaySeconds: 10
timeoutSeconds: 1
periodSeconds: 60
successThreshold: 1
failureThreshold: 3
env:
- name: SERVER_LOGFILE
value: /tmp/error_fluentbit.log
ports:
- name: metrics
containerPort: 2020
protocol: TCP
- name: http-plugin
containerPort: 80
protocol: TCP
volumeMounts:
- name: fluentbit-config
mountPath: /fluent-bit/etc
- name: log-storage
mountPath: /tmp/
terminationMessagePolicy: File

triggers:
- type: ConfigChange
- type: ImageChange
@@ -214,6 +267,7 @@ objects:
worker_processes auto;
error_log "/opt/bitnami/nginx/logs/error.log";
error_log "/tmp/error_fluentbit.log";
pid "/opt/bitnami/nginx/tmp/nginx.pid";
events {
@@ -623,6 +677,8 @@ objects:
key: app-db-name
- name: DB_CONNECTION_STRING
value: "host=$(DB_HOST);port=5432;database=$(POSTGRESQL_DATABASE);username=$(POSTGRESQL_USER);password=$(POSTGRESQL_PASSWORD)"
- name: SERVER_LOGFILE
value: /opt/app-root/app/logs/*.log
ports:
- containerPort: 1025
protocol: TCP
@@ -699,10 +755,63 @@ objects:
- name: cert-volume
mountPath: /opt/app-root/etc/certs
readOnly: true
- name: log-storage
mountPath: /opt/app-root/app/logs
- name: fluent-bit
image: 'docker.io/fluent/fluent-bit:3.0.3'
resources:
limits:
cpu: 100m
memory: 64Mi
requests:
cpu: 10m
memory: 16Mi
readinessProbe:
httpGet:
path: /
port: 2020
scheme: HTTP
initialDelaySeconds: 10
timeoutSeconds: 1
periodSeconds: 60
successThreshold: 1
failureThreshold: 3
livenessProbe:
httpGet:
path: /
port: 2020
scheme: HTTP
initialDelaySeconds: 10
timeoutSeconds: 1
periodSeconds: 60
successThreshold: 1
failureThreshold: 3
env:
- name: SERVER_LOGFILE
value: /opt/app-root/app/logs/*.log
ports:
- name: metrics
containerPort: 2020
protocol: TCP
- name: http-plugin
containerPort: 80
protocol: TCP
imagePullPolicy: IfNotPresent
volumeMounts:
- name: fluentbit-config
mountPath: /fluent-bit/etc
- name: log-storage
mountPath: /opt/app-root/app/logs
terminationMessagePolicy: File
volumes:
- name: cert-volume
secret:
secretName: pharmanet-api-ssl-certs
- name: log-storage
emptyDir: {}
- name: fluentbit-config
configMap:
name: webapi-fluentbit-config
triggers:
- type: ConfigChange
- type: ImageChange