Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvement for log_format directive configuration #845

Merged
merged 1 commit into from
Feb 21, 2020

Conversation

alxmsl
Copy link

@alxmsl alxmsl commented Feb 10, 2020

Proposed changes

I've found configuration of log_format directive doesn't support escape parameter clearly at this moment.
For example, if I'd like to have json escaping I have to create ConfigMap like this, using fake space-apostrophe substring ' ':

kind: ConfigMap
apiVersion: v1
metadata:
  ...
data:
  log-format: escape=json' '{"request":"$request"}

This PR contains improvement to solve this problem. Now log_format can be configured this way:

kind: ConfigMap
apiVersion: v1
metadata:
  ...
data:
  log-format: |
    escape=json
    {"request":"$request"}

Additionally, I've kept back-compatibility with previous solution, if somebody is using it...

Checklist

Before creating a PR, run through this checklist and mark each as complete.

  • I have read the CONTRIBUTING doc
  • I have added tests that prove my fix is effective or that my feature works
  • I have checked that all unit tests pass after adding my changes
  • I have updated necessary documentation
  • I have rebased my branch onto master
  • I will ensure my PR is targeting the master branch and pulling from my branch from my own fork

@Rulox
Copy link
Contributor

Rulox commented Feb 10, 2020

Hi @alxmsl

Thanks for the PR, we'll review it shortly and get back to you.

@pleshakov pleshakov added the enhancement Pull requests for new features/feature enhancements label Feb 11, 2020
Copy link
Contributor

@pleshakov pleshakov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alxmsl thanks for the PR!

It's nice that now it is possible to support the json escaping without any workaround.

Additionally, I like how now we can split the log format definition across multiple lines.

However, let's consider the following case, where the user wants to split the definition across multiple lines (so that the log format is easier to read/maintain):

  log-format: |
    remote_addr - $remote_user [$time_local] "$request"
    $status $body_bytes_sent "$http_referer"
    "$http_user_agent" "$http_x_forwarded_for"

As a result, the following config will be generated:

    log_format  main 'remote_addr - $remote_user [$time_local] "$request"'
                     '$status $body_bytes_sent "$http_referer"'
                     '"$http_user_agent" "$http_x_forwarded_for"'
                     ''
                     ;

Now if the user looks at the logs, they will see:

. . .  "POST /coffee HTTP/1.1"200 158 "-""curl/7.54.0" "-"

As you can see, there are no space character between the request and the status, the http refer and the user agent.

Would it be possible to ensure that the IC insert the spaces?

Please also see additional comments.

Also, could you rebase against the master?

@@ -34,7 +34,11 @@ http {
{{- end}}

{{if .LogFormat -}}
log_format main '{{.LogFormat}}';
log_format main {{if .LogFormat -}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the {{if .LogFormat -}} ... {{end}} is not required here - using just range is enough.
the same applies for the stream log format.

@alxmsl
Copy link
Author

alxmsl commented Feb 13, 2020

Thanks for your fast reply, @pleshakov!
I took a look for this case. And I found it very interesting
It looks like this configuration - with lines remote_addr - $remote_user [$time_local] "$request" and
$status $body_bytes_sent "$http_referer" - can be not so obvious as users may expect, because trailing spaces are pretty invisible

And I think it will be better to pipe log format configuration "as is". For example, if ConfigMap contains:

  log-format: |
    'remote_addr - $remote_user [$time_local] "$request" '
    '$status $body_bytes_sent "$http_referer" '
    '"$http_user_agent" "$http_x_forwarded_for"'

nginx.conf contains:

    log_format  main 'remote_addr - $remote_user [$time_local] "$request" '
                     '$status $body_bytes_sent "$http_referer" '
                     '"$http_user_agent" "$http_x_forwarded_for"'
                     ;

This sample looks much obvious. And additionally, users are allowed to use all power of string values. For example, log format could be defined using these ways:

    'remote_addr - "$remote_user" [$time_local]'
    "remote_addr - '$remote_user' [$time_local]"

In this solution I can see just one problem - it doesn't have back-compatibility: users who defined log format like presented below will have error

data:
  log-format: '{ "@timestamp": "$time_iso8601"}'

I don't like that users will have an error and I can omit it using small hack: if len(LogFormat) eq 1 and LogFormat is not quoted -> wrap LogFormat by single quotes
What do you think?

@pleshakov
Copy link
Contributor

Hi @alxmsl

It looks like this configuration - with lines remote_addr - $remote_user [$time_local] "$request" and
$status $body_bytes_sent "$http_referer" - can be not so obvious as users may expect, because trailing spaces are pretty invisible

Agree with that.

And I think it will be better to pipe log format configuration "as is". For example, if ConfigMap contains:
. . .
In this solution I can see just one problem - it doesn't have back-compatibility: users who defined log format like presented below will have error
. . .
I don't like that users will have an error and I can omit it using small hack: if len(LogFormat) eq 1 and LogFormat is not quoted -> wrap LogFormat by single quotes

While this option gives users the maximum control, I don't think it is necessary and also could lead to errors because of hunting for unmatched ' or ".

If we go back to the original problem -- supporting json escaping - perhaps a slightly different way of implementing that can do that job. What if we have a ConfigMap key log-format-escaping that could take two string arguments - json or none (default). If the user set log-format-escaping: json, than the Ingress Controller puts escape=json in the log-format in the NGINX config. What do you think?

The option of supporting multiple lines in the log format is a nice bonus, but considering the original problem, it could be left out of the scope. At the same time, I wonder what you think about a simpler solution for handling multi-line strings- concatenating the strings with the space character. For example (note there are no trailing spaces here):

  log-format: |
    $remote_addr - $remote_user [$time_local] "$request"
    $status $body_bytes_sent "$http_referer"
    "$http_user_agent" "$http_x_forwarded_for"

Becomes

log_format  main '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" "$http_x_forwarded_for"';

Thanks

@alxmsl
Copy link
Author

alxmsl commented Feb 18, 2020

Hi @pleshakov

If we go back to the original problem -- supporting json escaping - perhaps a slightly different way of implementing that can do that job. What if we have a ConfigMap key log-format-escaping that could take two string arguments - json or none (default). If the user set log-format-escaping: json, than the Ingress Controller puts escape=json in the log-format in the NGINX config. What do you think?

I found this solution as a little bit redundant, but I agree it, because - you're completely right - it solves initial problem and it keeps back-compatibility without any code which can be unclear. So, let me modify this PR to support log-format-escaping and stream-log-format-escaping

The option of supporting multiple lines in the log format is a nice bonus, but considering the original problem, it could be left out of the scope. At the same time, I wonder what you think about a simpler solution for handling multi-line strings- concatenating the strings with the space character. For example (note there are no trailing spaces here):

I thought about this simple solution before in flow of the maximum control for users. At this moment, I think, this is also well 👍

@alxmsl alxmsl force-pushed the logformat branch 3 times, most recently from af835e6 to c0664de Compare February 18, 2020 16:48
@alxmsl
Copy link
Author

alxmsl commented Feb 18, 2020

@pleshakov , I've completed suggested changes. Commits also have been rebased.
Now if ingress controller is configured using this way:

kind: ConfigMap
apiVersion: v1
metadata:
  ...
data:
  log-format-escaping: "json"
  log-format: |
    {
     "remote_addr": "$remote_addr",
     "remote_user": "$remote_user",
     "request": "$request"
    }

Nginx log module is configured like this:

    log_format  main escape=json '{ '
                     ' "remote_addr": "$remote_addr", '
                     ' "remote_user": "$remote_user", '
                     ' "request": "$request" '
                     '} '
                     ;

And log message looks like:

{  "remote_addr": "127.0.0.1",  "remote_user": "",  "request": "GET /\" HTTP/1.1" }

Copy link
Contributor

@pleshakov pleshakov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alxmsl

Thanks for the changes!

I left a small comment about readability of the template.

The PR looks good!

I have a small concern regarding the trailing space that will always be present in the logs now, because of the space in the template:

 {{with $value }}'{{ $value }} '

However, I can't think how the new behavior can break any existing users. I wonder what you think about it?

@@ -35,7 +35,9 @@ http {
{{- end}}

{{if .LogFormat -}}
log_format main '{{.LogFormat}}';
log_format main {{if .LogFormatEscaping}}escape={{ .LogFormatEscaping }} {{end}}{{range $value := .LogFormat -}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for readability of the template, could you put the range loop one line below. For example:

    log_format  main {{if .LogFormatEscaping}}escape={{ .LogFormatEscaping }} {{end}}
                     {{range $value := .LogFormat -}}
                     {{with $value }}'{{ $value }} '{{end}}
                     {{end}};

@alxmsl
Copy link
Author

alxmsl commented Feb 20, 2020

Thank you, @pleshakov
I've improved readability of the template as you suggested before

About the trailing space I can see exactly one much serious problem: now logs is bigger at 1 byte per line. Because I don't see any commonly used cases of calling, for example, awk or grep or any other pattern-processing tool which could be affected.
But, I have to tell, that problem may happen on using subset of strict regular expressions based on $ metacharacter, for example. Because string {"request":"GET /\x22 HTTP/1.1"} is matched the regexp ^({.*})$, but string {"request":"GET /\x22 HTTP/1.1"} is not

And I have made small improvement which looks gracefully, I think:

    log_format  main {{if .LogFormatEscaping}}escape={{ .LogFormatEscaping }} {{end}}
                     {{range $i, $value := .LogFormat -}}
                     {{with $value }}'{{if $i}} {{end}}{{ $value }}'
                     {{end}}{{end}};

If you think, this is over-engineering, of course, I can revert it back

Copy link
Contributor

@pleshakov pleshakov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alxmsl
The PR looks good! Thanks for the additional investigation around the trailing space problem.

I will ask a coworker to take a final look and we should merge it shortly!

@pleshakov pleshakov requested a review from Rulox February 21, 2020 02:28
Copy link
Contributor

@Rulox Rulox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, did a few tests, couldn't find any problem.

Kudos for the addition of the fields in template_tests

Also it seems this doesn't introduce any breaking change 🎉

@pleshakov pleshakov merged commit 24dc092 into nginxinc:master Feb 21, 2020
pleshakov added a commit that referenced this pull request Mar 25, 2020
pleshakov added a commit that referenced this pull request Mar 30, 2020
pleshakov added a commit that referenced this pull request Mar 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Pull requests for new features/feature enhancements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants