Skip to content

Commit

Permalink
Merge pull request #49 from br3ndonland/logging-filter
Browse files Browse the repository at this point in the history
Add configurable logging filter
  • Loading branch information
br3ndonland authored Apr 2, 2022
2 parents 80cbaf4 + fba2fd2 commit 97e35b7
Show file tree
Hide file tree
Showing 6 changed files with 233 additions and 2 deletions.
13 changes: 13 additions & 0 deletions docs/environment.md
Original file line number Diff line number Diff line change
Expand Up @@ -309,6 +309,19 @@ The idea here is to allow a catch-all Uvicorn config variable in the spirit of `
- `LOG_COLORS="true"`
- `LOG_COLORS="false"`
`LOG_FILTERS`
- Comma-separated string identifying log records to filter out. The string will be split on commas and converted to a set. Each log message will then be checked for each filter in the set. If any matches are present in the log message, the logger will not log that message.
- Default: `None` (don't filter out any log records, just log every record)
- Custom: `LOG_FILTERS="/health, /heartbeat"` (filter out log messages that contain either the string `"/health"` or the string `"/heartbeat"`, to avoid logging health checks)
- See also:
- [AWS Builders' Library: Implementing health checks](https://aws.amazon.com/builders-library/implementing-health-checks/)
- [AWS Elastic Load Balancing docs: Target groups - Health checks for your target groups](https://docs.aws.amazon.com/elasticloadbalancing/latest/application/target-group-health-checks.html)
- [benoitc/gunicorn#1781](https://github.com/benoitc/gunicorn/issues/1781)
- [Python 3 docs: How-To - Logging Cookbook - Using Filters to impart contextual information](https://docs.python.org/3/howto/logging-cookbook.html#using-filters-to-impart-contextual-information)
- [Python 3 docs: What's new in Python 3.2 - logging](https://docs.python.org/3/whatsnew/3.2.html#logging)
- [Django 4.0 docs: Topics - Logging](https://docs.djangoproject.com/en/4.0/topics/logging/)

`LOG_FORMAT`

- [Python logging format](https://docs.python.org/3/library/logging.html#formatter-objects).
Expand Down
76 changes: 76 additions & 0 deletions docs/logging.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,82 @@ See [environment variable reference](environment.md).
- When running Uvicorn alone, logging is configured programmatically from within the [`start.py` start script](https://github.com/br3ndonland/inboard/blob/HEAD/inboard/start.py), by passing the `LOGGING_CONFIG` dictionary to `uvicorn.run()`.
- When running Gunicorn with the Uvicorn worker, the logging configuration dictionary is specified within the [`gunicorn_conf.py`](https://github.com/br3ndonland/inboard/blob/HEAD/inboard/gunicorn_conf.py) configuration file.

## Filtering log messages

[Filters](https://docs.python.org/3/howto/logging-cookbook.html#using-filters-to-impart-contextual-information) identify log messages to filter out, so that the logger does not log messages containing any of the filters. If any matches are present in a log message, the logger will not output the message.

One of the primary use cases for log message filters is health checks. When applications with APIs are deployed, it is common to perform "health checks" on them. Health checks are usually performed by making HTTP requests to a designated API endpoint. These checks are made at frequent intervals, and so they can fill up the access logs with large numbers of unnecessary log records. To avoid logging health checks, add those endpoints to the `LOG_FILTERS` environment variable.

The `LOG_FILTERS` environment variable can be used to specify filters as a comma-separated string, like `LOG_FILTERS="/health, /heartbeat"`. To then add the filters to a class instance, the `LogFilter.set_filters()` method can produce the set of filters from the environment variable value.

```py
# start a REPL session in a venv in which inboard is installed
.venv ❯ python
>>> import os
>>> import inboard
>>> os.environ["LOG_FILTERS"] = "/health, /heartbeat"
>>> inboard.LogFilter.set_filters()
{'/heartbeat', '/health'}
```

inboard will do this automatically by reading the `LOG_FILTERS` environment variable.

Let's see this in action by using the [VSCode debugger](https://code.visualstudio.com/docs/python/debugging), with the configuration in _inboard/.vscode/launch.json_. We'll have one terminal instance open to see the server logs from Uvicorn, and another one open to make client HTTP requests.

Start the server:

```log
/path/to/inboard
❯ export LOG_FILTERS="/health, /heartbeat"
/path/to/inboard
❯ /usr/bin/env /path/to/inboard/.venv/bin/python \
/path/to/the/python/vscode/extension/pythonFiles/lib/python/debugpy/launcher \
61527 -- -m inboard.start
DEBUG: Logging dict config loaded from inboard.logging_conf.
DEBUG: Checking for pre-start script.
DEBUG: Running pre-start script with python /path/to/inboard/inboard/app/prestart.py.
[prestart] Hello World, from prestart.py! Add database migrations and other scripts here.
DEBUG: Ran pre-start script with python /path/to/inboard/inboard/app/prestart.py.
DEBUG: App module set to inboard.app.main_fastapi:app.
DEBUG: Running Uvicorn without Gunicorn.
INFO: Will watch for changes in these directories: ['/path/to/inboard/inboard']
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
INFO: Started reloader process [69531] using watchgod
INFO: Started server process [69552]
INFO: Waiting for application startup.
INFO: Application startup complete.
```

Make a request to an endpoint that should be logged, using an HTTP client like [HTTPie](https://httpie.io/) or the [HTTPX CLI](https://www.python-httpx.org/):

```sh
❯ http :8000 -b
{
"Hello": "World"
}

```

The request will be logged through `uvicorn.access`:

```log
INFO: 127.0.0.1:61575 - "GET / HTTP/1.1" 200
```

Next, make a request to an endpoint that should be filtered out of the logs. The username and password you see here are just test values set in the _launch.json_.

```sh
❯ http :8000/health -a test_user:r4ndom_bUt_memorable -b
{
"application": "inboard",
"message": null,
"status": "active"
}
```

The server does not display a log message.

## Extending the logging config

If inboard is installed from PyPI with `poetry add inboard` or `pip install inboard`, the logging configuration can be easily customized as explained in the [Python logging configuration docs](https://docs.python.org/3/library/logging.config.html).
Expand Down
3 changes: 2 additions & 1 deletion inboard/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -13,10 +13,11 @@
from .app.utilities_starlette import BasicAuth as StarletteBasicAuth
except ImportError: # pragma: no cover
pass
from .logging_conf import LOGGING_CONFIG, configure_logging
from .logging_conf import LOGGING_CONFIG, LogFilter, configure_logging

__all__ = (
"LOGGING_CONFIG",
"LogFilter",
"StarletteBasicAuth",
"configure_logging",
"fastapi_basic_auth",
Expand Down
63 changes: 62 additions & 1 deletion inboard/logging_conf.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@
import os
import sys
from pathlib import Path
from typing import Optional
from typing import Optional, Set


def find_and_load_logging_conf(logging_conf: str) -> dict:
Expand Down Expand Up @@ -48,18 +48,78 @@ def configure_logging(
raise


class LogFilter(logging.Filter):
"""Subclass of `logging.Filter` used to filter log messages.
---
Filters identify log messages to filter out, so that the logger does not log
messages containing any of the filters. If any matches are present in a log
message, the logger will not output the message.
The environment variable `LOG_FILTERS` can be used to specify filters as a
comma-separated string, like `LOG_FILTERS="/health, /heartbeat"`. To then
add the filters to a class instance, the `LogFilter.set_filters()`
method can produce the set of filters from the environment variable value.
"""

__slots__ = "name", "nlen", "filters"

def __init__(
self,
name: str = "",
filters: Optional[Set[str]] = None,
) -> None:
"""Initialize a filter."""
self.name = name
self.nlen = len(name)
self.filters = filters

def filter(self, record: logging.LogRecord) -> bool:
"""Determine if the specified record is to be logged.
Returns True if the record should be logged, or False otherwise.
"""
if self.filters is None:
return True
message = record.getMessage()
return all(match not in message for match in self.filters)

@staticmethod
def set_filters(input_filters: Optional[str] = None) -> Optional[Set[str]]:
"""Set log message filters.
Filters identify log messages to filter out, so that the logger does not
log messages containing any of the filters. The argument to this method
should be supplied as a comma-separated string. The string will be split
on commas and converted to a set of strings.
This method is provided as a `staticmethod`, instead of as part of `__init__`,
so that it only runs once when setting the `LOG_FILTERS` module-level constant.
In contrast, the `__init__` method runs each time a logger is instantiated.
"""
return (
{log_filter.strip() for log_filter in str(log_filters).split(sep=",")}
if (log_filters := input_filters or os.getenv("LOG_FILTERS"))
else None
)


LOG_COLORS = (
True
if (value := os.getenv("LOG_COLORS")) and value.lower() == "true"
else False
if value and value.lower() == "false"
else sys.stdout.isatty()
)
LOG_FILTERS = LogFilter.set_filters()
LOG_FORMAT = str(os.getenv("LOG_FORMAT", "simple"))
LOG_LEVEL = str(os.getenv("LOG_LEVEL", "info")).upper()
LOGGING_CONFIG: dict = {
"version": 1,
"disable_existing_loggers": False,
"filters": {
"filter_log_message": {"()": LogFilter, "filters": LOG_FILTERS},
},
"formatters": {
"simple": {
"class": "logging.Formatter",
Expand Down Expand Up @@ -87,6 +147,7 @@ def configure_logging(
"handlers": {
"default": {
"class": "logging.StreamHandler",
"filters": ["filter_log_message"],
"formatter": LOG_FORMAT,
"level": LOG_LEVEL,
"stream": "ext://sys.stdout",
Expand Down
75 changes: 75 additions & 0 deletions tests/test_logging_conf.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
import os
from pathlib import Path
from typing import Set

import pytest
from pytest_mock import MockerFixture
Expand Down Expand Up @@ -145,6 +146,9 @@ class TestLoggingOutput:
---
"""

def _uvicorn_access_log_args(self, path: str) -> tuple:
return ('%s - "%s %s HTTP/%s" %d', "127.0.0.1:60364", "GET", path, "1.1", 200)

def test_logging_output_default(self, capfd: pytest.CaptureFixture) -> None:
"""Test logger output with default format."""
logger = logging_conf.logging.getLogger()
Expand Down Expand Up @@ -177,3 +181,74 @@ def test_logging_output_custom_format(
assert log_level_output in captured.out
assert f"Logging dict config loaded from {logging_conf_file}." in captured.out
assert "Hello, Customized World!" in captured.out

@pytest.mark.parametrize(
"log_filters_input,log_filters_output",
(
("/health", {"/health"}),
("/health, /heartbeat", {"/health", "/heartbeat"}),
("foo, bar, baz", {"foo", "bar", "baz"}),
),
)
def test_logging_filters(
self,
capfd: pytest.CaptureFixture,
log_filters_input: str,
log_filters_output: Set[str],
mocker: MockerFixture,
monkeypatch: pytest.MonkeyPatch,
) -> None:
"""Test that log message filters are applied as expected."""
monkeypatch.setenv("LOG_FILTERS", log_filters_input)
mocker.patch.object(
logging_conf, "LOG_FILTERS", logging_conf.LogFilter.set_filters()
)
mocker.patch.dict(
logging_conf.LOGGING_CONFIG["filters"]["filter_log_message"],
{"()": logging_conf.LogFilter, "filters": logging_conf.LOG_FILTERS},
clear=True,
)
path_to_log = "/status"
logger = logging_conf.logging.getLogger("test.logging_conf.output.filters")
logging_conf.configure_logging(logger=logger)
logger.info(*self._uvicorn_access_log_args(path_to_log))
logger.info(log_filters_input)
for log_filter in log_filters_output:
logger.info(*self._uvicorn_access_log_args(log_filter))
captured = capfd.readouterr()
assert logging_conf.LOG_FILTERS == log_filters_output
assert path_to_log in captured.out
for log_filter in log_filters_output:
assert log_filter not in captured.out

@pytest.mark.parametrize(
"log_filters_input", ("/health", "/healthy /heartbeat", "/healthy foo bar")
)
def test_logging_filters_with_known_limitations(
self,
capfd: pytest.CaptureFixture,
log_filters_input: str,
mocker: MockerFixture,
) -> None:
"""Test known limitations of log message filters.
- Filters in the input string should be separated with commas, not spaces.
- Filters are applied with string matching, so a filter of `/health` will
also filter out messages including `/healthy`.
"""
filters = logging_conf.LogFilter.set_filters(log_filters_input)
mocker.patch.dict(
logging_conf.LOGGING_CONFIG["filters"]["filter_log_message"],
{"()": logging_conf.LogFilter, "filters": filters},
clear=True,
)
logger = logging_conf.logging.getLogger("test.logging_conf.output.filtererrors")
logging_conf.configure_logging(logger=logger)
logger.info(log_filters_input)
logger.info("/healthy")
captured = capfd.readouterr()
assert log_filters_input not in captured.out
if log_filters_input == "/health":
assert "/healthy" not in captured.out
else:
assert "/healthy" in captured.out
5 changes: 5 additions & 0 deletions tests/test_start.py
Original file line number Diff line number Diff line change
Expand Up @@ -258,11 +258,16 @@ def test_set_uvicorn_options_default_from_json(
def test_set_uvicorn_options_custom_from_json(
self,
uvicorn_options_custom: dict,
mocker: MockerFixture,
monkeypatch: pytest.MonkeyPatch,
) -> None:
"""Assert that the correct Uvicorn server options are set, in the correct order,
when the `UVICORN_CONFIG_OPTIONS` environment variable is also set.
The `LOGGING_CONFIG` dictionary can't be encoded as a JSON string when it has
a class definition (the filter class), so the "filters" dict has to be removed.
"""
mocker.patch.dict(uvicorn_options_custom["log_config"]["filters"], clear=True)
uvicorn_options_json = start.json.dumps(uvicorn_options_custom)
monkeypatch.setenv("UVICORN_CONFIG_OPTIONS", uvicorn_options_json)
monkeypatch.setenv("WITH_RELOAD", "false")
Expand Down

0 comments on commit 97e35b7

Please sign in to comment.