Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for structured logs #624

Closed
1 task done
TrevinTeacutter opened this issue Aug 13, 2018 · 36 comments · Fixed by #1517
Closed
1 task done

Add support for structured logs #624

TrevinTeacutter opened this issue Aug 13, 2018 · 36 comments · Fixed by #1517
Assignees
Labels
kind/feature Categorizes issue or PR as related to a new feature. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.
Milestone

Comments

@TrevinTeacutter
Copy link

TrevinTeacutter commented Aug 13, 2018

It would be nice to have access logs that are structured as we use splunk for our logging aggregation and extracting key/value pairs is much nicer in splunk with structured logs than it is with regex. Could this be some sort of flag to enable this or did I just miss that this option is already enabled somewhere?

Blocked

@davecheney davecheney added this to the 0.7.0 milestone Aug 14, 2018
@davecheney davecheney added kind/feature Categorizes issue or PR as related to a new feature. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. labels Aug 14, 2018
@davecheney
Copy link
Contributor

Thank you for raising this issue. The access logs are generated directly from envoy. At the moment we have flags to control their output destination, but their format is currently defaulting to whatever envoy uses by default -- something akin to apache2 log.

https://github.com/heptio/contour/blob/master/internal/contour/listener.go#L338

Looking at

https://www.envoyproxy.io/docs/envoy/v1.6.0/configuration/access_log#config-access-log-format

It looks like its possible to redefine the output of the logging format.

I'm wary of exposing the full configuration syntax of this to end users, it's too close to showing how the sausage is made. With that said, when you say structured logging, do you mean JSON?

@davecheney davecheney removed the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Aug 14, 2018
@TrevinTeacutter
Copy link
Author

Yes, I mean JSON. We recently turned JSON access logs on for Traefik in our old DC/OS clusters and they have been useful as well as capable of including/redacting/excluding headers of requests coming through the ingress making correlation really easy, stuff that normally doesn't get included in apache2 logs for brevity's sake.

@yinsenyan
Copy link

Maybe support define log format by configmap is a good way , just like nginx-ingress-controller use log-format-upstream parameter, also istio use logentry is a good practice.

@davecheney davecheney removed this from the 0.7.0 milestone Oct 8, 2018
@davecheney
Copy link
Contributor

We're not going to get to this for the 0.7 release. Removing milestone.

@davecheney davecheney added blocked/needs-info Categorizes the issue or PR as blocked because there is insufficient information to advance it. and removed priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Apr 9, 2019
@azuretek
Copy link

Has there been any agreement on this? Right now this is making Contour a less usable replacement for Nginx, having Envoy metrics available is great but getting full access log data would go a long way to making this a viable option.

@davecheney
Copy link
Contributor

davecheney commented Apr 24, 2019 via email

@azuretek
Copy link

azuretek commented Apr 24, 2019

For our use case it would be nice if we could add an annotation or even just load a ConfigMap that will add a structure like so to every listener filter chain.

access_log: [
  {
    name: "envoy.file_access_log",
    config: {
      json_format: {
        downstream_remote_address: "%DOWNSTREAM_REMOTE_ADDRESS%",
        path: "%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%",
        authority: "%REQ(:AUTHORITY)%",
        protocol: "%PROTOCOL%",
        upstream_service_time: "%RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)%",
        upstream_local_address: "%UPSTREAM_LOCAL_ADDRESS%",
        duration: "%DURATION%",
        downstream_local_address: "%DOWNSTREAM_LOCAL_ADDRESS%",
        user_agent: "%REQ(USER-AGENT)%",
        response_code: "%RESPONSE_CODE%",
        response_flags: "%RESPONSE_FLAGS%",
        start_time: "%START_TIME%",
        method: "%START_TIME%",
        request_id: "%REQ(X-REQUEST-ID)%",
        upstream_host: "%UPSTREAM_HOST%",
        x_forwarded_for: "%REQ(X-FORWARDED-FOR)%",
        requested_server_name: "%REQUESTED_SERVER_NAME%",
        bytes_received: "%BYTES_RECEIVED%",
        bytes_sent: "%BYTES_SENT%",
        upstream_cluster: "%UPSTREAM_CLUSTER%"
      },
      path: "/dev/stdout"
    },
  },
]

It looks like we can probably add it into this function with minimal effort https://github.com/heptio/contour/blob/master/internal/envoy/listener.go#L189

The configuration is defined here (and doesn't support nesting) for the json_format dictionary:
https://www.envoyproxy.io/docs/envoy/latest/configuration/access_log#format-dictionaries

Support for the format key should probably also be included in case users don't need json formatted output, though I don't know what would happen if both keys are defined.

I've validated this works for the static resources, but for LDS it looks like Contour is the only way to configure it.

@davecheney
Copy link
Contributor

davecheney commented Apr 24, 2019 via email

@azuretek
Copy link

azuretek commented Apr 24, 2019

I agree with that approach, there doesn't seem to be a standard for JSON access logs but this structure appears to have everything that Envoy reports (other than custom headers)

json_format: {
  @timestamp: "%START_TIME%",
  downstream_remote_address: "%DOWNSTREAM_REMOTE_ADDRESS%",
  path: "%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%",
  authority: "%REQ(:AUTHORITY)%",
  protocol: "%PROTOCOL%",
  upstream_service_time: "%RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)%",
  upstream_local_address: "%UPSTREAM_LOCAL_ADDRESS%",
  duration: "%DURATION%",
  downstream_local_address: "%DOWNSTREAM_LOCAL_ADDRESS%",
  user_agent: "%REQ(USER-AGENT)%",
  response_code: "%RESPONSE_CODE%",
  response_flags: "%RESPONSE_FLAGS%",
  method: "%REQ(:METHOD)%",
  request_id: "%REQ(X-REQUEST-ID)%",
  upstream_host: "%UPSTREAM_HOST%",
  x_forwarded_for: "%REQ(X-FORWARDED-FOR)%",
  requested_server_name: "%REQUESTED_SERVER_NAME%",
  bytes_received: "%BYTES_RECEIVED%",
  bytes_sent: "%BYTES_SENT%",
  upstream_cluster: "%UPSTREAM_CLUSTER%"
}

@azuretek
Copy link

azuretek commented Apr 29, 2019

So I went ahead and did a build against the 0.11.0 code base with the changes we discussed.

#1053

The dynamic listener logs now look like this:

{"upstream_cluster":"default/hello-world/8080/da39a3ee5e","downstream_remote_address":"1.2.3.4:49286","path":"/?abc=123","authority":"hello.test.foo.bar","protocol":"HTTP/2","upstream_service_time":"1","@timestamp":"2019-04-25T06:31:07.162Z","upstream_local_address":"-","duration":"1","downstream_local_address":"10.220.3.107:443","response_code":"200","user_agent":"curl/7.54.0","response_flags":"-","method":"GET","request_id":"b4ecadf1-8fc0-4e87-89d8-a3aa851196d4","upstream_host":"10.141.181.15:8080","x_forwarded_for":"1.2.3.4","requested_server_name":"hello.test.foo.bar","bytes_received":"0","uber_trace_id":"-","bytes_sent":"17"}

I also added the same format to the static listener configs, this is the log for the health check endpoiont:

{"response_code":"200","user_agent":"kube-probe/1.11","response_flags":"-","method":"GET","request_id":"aadded21-6b3c-4fa4-bc3a-de62b6b4c2ec","upstream_host":"-","x_forwarded_for":"-","requested_server_name":"-","bytes_received":"0","uber_trace_id":"-","bytes_sent":"0","upstream_cluster":"-","downstream_remote_address":"10.220.40.142:43216","path":"/healthz","authority":"10.141.207.73:8002","protocol":"HTTP/1.1","upstream_service_time":"-","@timestamp":"2019-04-25T06:31:09.869Z","upstream_local_address":"-","duration":"0","downstream_local_address":"10.141.207.73:8002"}

Let me know what needs to be done to get this approved. I was thinking it might make sense to add a flag to enable or disable JSON access logs, but as you said the preferred approach is to create sane defaults which does seem like a good approach.

@bgagnon
Copy link
Contributor

bgagnon commented May 2, 2019

@azuretek is there a reason or restriction for outputting numerical values as strings in the JSON objects?

Bytes and duration are things one would want to do aggregations and calculate metrics on (assuming the logs end up in Elasticsearch or similar database). For the response code, it's convenient for range queries (<100, 200-299, 300-399, 400-499,>500).

It would be nice to avoid an ETL for such simple JSON type coercions.

EDIT: just found the relevant part of the Envoy docs:

Format dictionaries have the following restrictions:

The dictionary must map strings to strings (specifically, strings to command operators). Nesting is not currently supported.

That's too bad :/

@azuretek
Copy link

azuretek commented May 2, 2019

@bgagnon definitely ran into that in my PoC, it's annoying but ultimately it's able to be handled by adjusting the mappings on the elastic search side. To make it report correctly we'd have to make changes to envoy (which should probably be done).

@davecheney
Copy link
Contributor

davecheney commented May 4, 2019 via email

@azuretek
Copy link

azuretek commented May 4, 2019

@davecheney I need the default access log format to be structured, preferably JSON to more easily ingest into storage backends like Elasticsearch. This is a requirement for us to move away from using NGINX as our ingress.

Currently access logs use the default format string, as defined in the envoy documentation: https://www.envoyproxy.io/docs/envoy/latest/configuration/access_log#default-format-string

[%START_TIME%] "%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%"
%RESPONSE_CODE% %RESPONSE_FLAGS% %BYTES_RECEIVED% %BYTES_SENT% %DURATION%
%RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)% "%REQ(X-FORWARDED-FOR)%" "%REQ(USER-AGENT)%"
"%REQ(X-REQUEST-ID)%" "%REQ(:AUTHORITY)%" "%UPSTREAM_HOST%"\n

The desired format is JSON specifically with these keys and values. As defined in the envoy documentation https://www.envoyproxy.io/docs/envoy/latest/configuration/access_log#format-dictionaries

{
  "@timestamp": "%START_TIME%",
  "authority": "%REQ(:AUTHORITY)%",
  "bytes_received": "%BYTES_RECEIVED%",
  "bytes_sent": "%BYTES_SENT%",
  "downstream_local_address": "%DOWNSTREAM_LOCAL_ADDRESS%",
  "downstream_remote_address": "%DOWNSTREAM_REMOTE_ADDRESS%",
  "duration": "%DURATION%",
  "method": "%REQ(:METHOD)%",
  "path": "%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%",
  "protocol": "%PROTOCOL%",
  "request_id": "%REQ(X-REQUEST-ID)%",
  "requested_server_name": "%REQUESTED_SERVER_NAME%",
  "response_code": "%RESPONSE_CODE%",
  "response_flags": "%RESPONSE_FLAGS%",
  "uber_trace_id": "%REQ(UBER-TRACE-ID)%",
  "upstream_cluster": "%UPSTREAM_CLUSTER%",
  "upstream_host": "%UPSTREAM_HOST%",
  "upstream_local_address": "%UPSTREAM_LOCAL_ADDRESS%",
  "upstream_service_time": "%RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)%",
  "user_agent": "%REQ(USER-AGENT)%",
  "x_forwarded_for": "%REQ(X-FORWARDED-FOR)%"
},

This is an example of how the access log should look

{
  "upstream_cluster": "default/hello-world/8080/da39a3ee5e",
  "downstream_remote_address": "1.2.3.4:49286",
  "path": "/?abc=123",
  "authority": "hello.test.foo.bar",
  "protocol": "HTTP/2",
  "upstream_service_time": "1",
  "@timestamp": "2019-04-25T06:31:07.162Z",
  "upstream_local_address": "-",
  "duration": "1",
  "downstream_local_address": "10.220.3.107:443",
  "response_code": "200",
  "user_agent": "curl/7.54.0",
  "response_flags": "-",
  "method": "GET",
  "request_id": "b4ecadf1-8fc0-4e87-89d8-a3aa851196d4",
  "upstream_host": "10.141.181.15:8080",
  "x_forwarded_for": "1.2.3.4",
  "requested_server_name": "hello.test.foo.bar",
  "bytes_received": "0",
  "uber_trace_id": "-",
  "bytes_sent": "17"
}

@davecheney
Copy link
Contributor

davecheney commented May 4, 2019 via email

@davecheney davecheney added blocked/needs-design Categorizes the issue or PR as blocked because it needs a design document. and removed blocked/needs-info Categorizes the issue or PR as blocked because there is insufficient information to advance it. labels May 6, 2019
@davecheney
Copy link
Contributor

Thank you to everyone who has contributed to this issue over the last few weeks.

I want to make it clear to all readers that I am not opposed to adding this feature. In fact I want to see it adopted as default.

What I do not want to see happen is JSON or NCSA become selectable especially if there are options to make each of those formats configurable. If this is the outcome we will stick with the default NCSA format.

Please do not send any PRs until there is agreement on which format will be used.

@davecheney
Copy link
Contributor

#1130 landed in 0.14. Moving to 0.15 to implement.

@davecheney davecheney removed the blocked/needs-design Categorizes the issue or PR as blocked because it needs a design document. label Jul 19, 2019
@davecheney davecheney self-assigned this Jul 19, 2019
@davecheney davecheney modified the milestones: 0.14.0, 0.15.0 Jul 19, 2019
@davecheney davecheney removed their assignment Aug 21, 2019
@davecheney davecheney modified the milestones: 0.15.0, 1.0.0-beta.1 Aug 21, 2019
@davecheney
Copy link
Contributor

Sorry, we ran out of time in 0.15 for this. @youngnick has a plan for how to implement this in beta1.

@youngnick youngnick self-assigned this Sep 11, 2019
youngnick pushed a commit to youngnick/contour that referenced this issue Sep 11, 2019
Updates projectcontour#624

Signed-off-by: Nick Young <ynick@vmware.com>
youngnick pushed a commit to youngnick/contour that referenced this issue Sep 11, 2019
Updates projectcontour#624

Signed-off-by: Nick Young <ynick@vmware.com>
@youngnick
Copy link
Member

I've published #1485 with our current design ideas around this feature. I'm hoping to get this reviewed and agreed soon with the aim to land this in beta1.

youngnick pushed a commit to youngnick/contour that referenced this issue Sep 13, 2019
Updates projectcontour#624

Signed-off-by: Nick Young <ynick@vmware.com>
@youngnick
Copy link
Member

After internal review, I've merged #1485 with the team's preferred design for JSON logging. Please review either #1485 or envoy-json-logging.md for the details. Comments are welcome here.

youngnick pushed a commit to youngnick/contour that referenced this issue Sep 16, 2019
Updates projectcontour#624

Signed-off-by: Nick Young <ynick@vmware.com>
youngnick pushed a commit to youngnick/contour that referenced this issue Sep 16, 2019
Updates projectcontour#624

Signed-off-by: Nick Young <ynick@vmware.com>
youngnick pushed a commit to youngnick/contour that referenced this issue Sep 16, 2019
Updates projectcontour#624

Signed-off-by: Nick Young <ynick@vmware.com>
youngnick pushed a commit to youngnick/contour that referenced this issue Sep 16, 2019
Updates projectcontour#624

Signed-off-by: Nick Young <ynick@vmware.com>
youngnick pushed a commit that referenced this issue Sep 17, 2019
Updates #624

Signed-off-by: Nick Young <ynick@vmware.com>
youngnick pushed a commit to youngnick/contour that referenced this issue Sep 17, 2019
Fixes projectcontour#624.

Also update a small bug in config file definition that can lead to parsing weirdness.

Signed-off-by: Nick Young <ynick@vmware.com>
youngnick pushed a commit to youngnick/contour that referenced this issue Sep 17, 2019
Fixes projectcontour#624.

Also update a small bug in config file definition that can lead to parsing weirdness.

Signed-off-by: Nick Young <ynick@vmware.com>
youngnick pushed a commit that referenced this issue Sep 17, 2019
Fixes #624.

Also update a small bug in config file definition that can lead to parsing weirdness.

Signed-off-by: Nick Young <ynick@vmware.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants