Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Surfacing deprecations with rich context from ES warning header #120044

Merged
merged 23 commits into from
Dec 8, 2021

Conversation

rudolf
Copy link
Contributor

@rudolf rudolf commented Nov 30, 2021

Summary

Related #120043

To provide maximum context to developers about the origin of a deprecation warning this instruments the core elasticsearch client to be able to log deprecation warnings, a stack trace and the query that caused the deprecation.

This will create a log entry like:

[2021-12-01T00:24:02.108+01:00][DEBUG][elasticsearch.deprecation] ES DEPRECATION: 299 Elasticsearch-8.1.0-SNAPSHOT-fe7c2d59d80ae9f75c513f7d89879acef665ebe0 "this request accesses system indices: [.kibana_task_manager_8.1.0_001], but in a future major version, direct access to system indices will be prevented by default"
Origin: kibana
Stack trace:
    at Indices.putMapping (/Users/rudolf/dev/kibana/node_modules/@elastic/elasticsearch/src/api/api/indices.ts:1085:12)
    at stateActionMachine (/Users/rudolf/dev/kibana/src/core/server/saved_objects/migrations/state_action_machine.ts:70:28)
    at migrationStateActionMachine (/Users/rudolf/dev/kibana/src/core/server/saved_objects/migrations/migrations_state_action_machine.ts:105:24)
    at async Promise.all (index 1)
    at SavedObjectsService.start (/Users/rudolf/dev/kibana/src/core/server/saved_objects/saved_objects_service.ts:434:9)
    at Server.start (/Users/rudolf/dev/kibana/src/core/server/server.ts:300:31)
    at Root.start (/Users/rudolf/dev/kibana/src/core/server/root/index.ts:62:14)
    at bootstrap (/Users/rudolf/dev/kibana/src/core/server/bootstrap.ts:102:5)
    at Command.<anonymous> (/Users/rudolf/dev/kibana/src/cli/serve/serve.js:244:5)
Query:
200
PUT /.kibana_task_manager_8.1.0_001/_mapping?timeout=60s

Deprecations caused by user requests are logged on the debug level, to see these use the following config:

logging.loggers:
  - name: elasticsearch.deprecation
    level: all

On CI (7.16 branch) it produces deprecation warnings inline with the test results making it easy to track the source of the deprecation:

2021-12-03 21:15:48 CEST	           └-: unknown search field
2021-12-03 21:15:48 CEST	             └-> "before all" hook: beforeTestSuite.trigger for "should return 200 with empty response"
2021-12-03 21:15:48 CEST	             └-> should return 200 with empty response
2021-12-03 21:15:48 CEST	               └-> "before each" hook: global before each for "should return 200 with empty response"
2021-12-03 21:15:48 CEST	               │ proc [kibana]   log   [20:15:48.568] [error][deprecation][elasticsearch] ES DEPRECATION: 299 Elasticsearch-7.16.0-SNAPSHOT-34fb605e5a51f9ecbf4a6ad3d3b910402c5b2121 "this request accesses system indices: [.kibana_7.16.0_001], but in a future major version, direct access to system indices will be prevented by default"
2021-12-03 21:15:48 CEST	               └- ✓ pass  (21ms)
2021-12-03 21:15:48 CEST	             └-> "after all" hook: afterTestSuite.trigger for "should return 200 with empty response"
2021-12-03 21:15:48 CEST	           └-: unknown namespace
2021-12-03 21:15:48 CEST	             └-> "before all" hook: beforeTestSuite.trigger for "should return 200 with empty response"
2021-12-03 21:15:48 CEST	             └-> should return 200 with empty response
2021-12-03 21:15:48 CEST	               └-> "before each" hook: global before each for "should return 200 with empty response"
2021-12-03 21:15:48 CEST	               │ proc [kibana]   log   [20:15:48.588] [error][deprecation][elasticsearch] ES DEPRECATION: 299 Elasticsearch-7.16.0-SNAPSHOT-34fb605e5a51f9ecbf4a6ad3d3b910402c5b2121 "this request accesses system indices: [.kibana_7.16.0_001], but in a future major version, direct access to system indices will be prevented by default"
2021-12-03 21:15:48 CEST	               └- ✓ pass  (18ms)

Checklist

Delete any items that are not applicable to this PR.

Risk Matrix

Delete this section if it is not applicable to this PR.

Before closing this PR, invite QA, stakeholders, and other developers to identify risks that should be tested prior to the change/feature release.

When forming the risk matrix, consider some of the following examples and how they may potentially impact the change:

Risk Probability Severity Mitigation/Notes
Multiple Spaces—unexpected behavior in non-default Kibana Space. Low High Integration tests will verify that all features are still supported in non-default Kibana Space and when user switches between spaces.
Multiple nodes—Elasticsearch polling might have race conditions when multiple Kibana nodes are polling for the same tasks. High Low Tasks are idempotent, so executing them multiple times will not result in logical error, but will degrade performance. To test for this case we add plenty of unit tests around this logic and document manual testing procedure.
Code should gracefully handle cases when feature X or plugin Y are disabled. Medium High Unit tests will verify that any feature flag or plugin combination still results in our service operational.
See more potential risk examples

For maintainers

@rudolf rudolf added the Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc label Nov 30, 2021
@rudolf rudolf force-pushed the es-log-deprecations branch from 2739916 to 4a5efc1 Compare December 3, 2021 14:41
@rudolf rudolf marked this pull request as ready for review December 3, 2021 17:49
@rudolf rudolf requested review from a team as code owners December 3, 2021 17:49
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-core (Team:Core)

@rudolf rudolf added 8.0.0 v7.16.0 release_note:skip Skip the PR/issue when compiling release notes labels Dec 3, 2021
@@ -65,7 +65,13 @@ export const config = {
defaultValue: new Map<string, AppenderConfigType>(),
}),
loggers: schema.arrayOf(loggerSchema, {
defaultValue: [],
defaultValue: [
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

realised that configuring any loggers would remove this default and thereby switch the elasticsearch.deprecation logger level to info...

So I need to have logic that's a bit cleverer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless we decide to enable this logging context by default, might be useful information to see how many logs get triggered on Cloud

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, what you are running into here is somewhat related to this enhancement issue we have for default appender configs: #92082

What would be useful here is to be able to do something similar for loggers.

Unless we decide to enable this logging context by default, might be useful information to see how many logs get triggered on Cloud

FWIW I would be in support of this, but if we went down this path I would vote that we clean up the structure of the log messages just a bit (see my other comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You chose the hard path by not just using the debug log level 😅

Even if I think this approach to use more normal log levels such as warning and error, and try to disable the elasticsearch.deprecation by default is nice, what is the main reason to not use debug instead?

  • It requires a lot less changes
  • It's more consistent with the other 'disabled by default' logger we currently have, such as elasticsearch.query or http.server.response

src/core/server/elasticsearch/client/configure_client.ts Outdated Show resolved Hide resolved
src/core/server/elasticsearch/client/configure_client.ts Outdated Show resolved Hide resolved
@@ -65,7 +65,13 @@ export const config = {
defaultValue: new Map<string, AppenderConfigType>(),
}),
loggers: schema.arrayOf(loggerSchema, {
defaultValue: [],
defaultValue: [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, what you are running into here is somewhat related to this enhancement issue we have for default appender configs: #92082

What would be useful here is to be able to do something similar for loggers.

Unless we decide to enable this logging context by default, might be useful information to see how many logs get triggered on Cloud

FWIW I would be in support of this, but if we went down this path I would vote that we clean up the structure of the log messages just a bit (see my other comment)

packages/kbn-pm/dist/index.js Outdated Show resolved Hide resolved
@jportner jportner added v8.0.0 and removed 8.0.0 labels Dec 4, 2021
@rudolf rudolf added v8.1.0 auto-backport Deprecated - use backport:version if exact versions are needed labels Dec 6, 2021

queryLogger.debug(queryMsg, meta);

if (event.warnings && event.warnings.length > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure that warnings header contains only deprecations? Maybe the deprecation warning should have a structure to allow Kibana to distinguish deprecations from other warnings?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible that an intermediate proxy injects warning headers here but it won't be the end of the world if we log them. All warnings include Elasticsearch-${semver} so we could use that to filter out noise but I think that's better done in the Elasticsearch-js client https://github.com/elastic/elastic-transport-js/blob/main/src/Transport.ts#L333

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I added a filter to Kibana, I will create an upstream issue so that we can remove this from Kibana long term

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(event.meta.request.params.headers != null &&
(event.meta.request.params.headers[
'x-elastic-product-origin'
] as unknown as string)) === 'kibana'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what cases it doesn't contain kibana? Why it's always user? Can it have a deprecation originating from a logstash request, for example?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ES team decided that each product should have it's own origin header like {'x-elastic-product-origin': 'kibana'}. ES does validation on this header for known Elastic products. They will include this header in the deprecation document created in the deprecations index. Then upgrade assistant will then filter out any deprecation documents with an 'x-elastic-product-origin' header an not show this to users because if it's our products then users can't fix it.

When a request comes from a user (e.g. when they use the console app) then we don't include this header so that users can see that they are using deprecated APIs. We haven't gotten as far yet... but once #120201 is done, we can ask teams to go through all their APIs, if there's an API that contains 100% user controlled content then they should mark that API call as coming from the user by not sending the x-elastic-product-origin header.

So from Kibana's perspective if we get a deprecation warning, it's either because us developers wrote a deprecated query or because users wrote it.


const deprecationMsg = `Elasticsearch deprecation: ${event.warnings}\nOrigin:${requestOrigin}\nStack trace:\n${stackTrace}\nQuery:\n${queryMsg}`;
if (requestOrigin === 'kibana') {
deprecationLogger.warn(deprecationMsg);
Copy link
Contributor

@mshustov mshustov Dec 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can an end-user fix the problem? If not, maybe it shouldn't have the warning level? End users are worried about warning and errors for a good reason - they want their deployment to be healthy. If they can't fix a problem, they will contact us. #35004 is a reminder that we should pick a log level carefully

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a good point, we want these logs to surface by default on cloud so we can monitor and fix any deprecations we caused... but info would be fine.

Copy link
Contributor

@mshustov mshustov Dec 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but info would be fine

Maybe we can use debug and configure Cloud to always produce deprecation logs?

logging.loggers:
  - name: elasticsearch.deprecation
    level: all

but info would be fine.

info is logged by default on-prem version, so it would confuse end users

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can use debug and configure Cloud to always produce deprecation logs?

++ Hadn't considered this, but if the main reason for configuring a higher log level was to ensure they are surfaced on Cloud, then I agree this would simplify things greatly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we wouldn't want to surface deprecations caused by users on cloud by default since this could add a lot of volume. So we would need two logger contexts, elasticsearch.deprecation.kibana and elasticsearch.deprecation.user so that cloud could only enable debug logs for elasticsearch.deprecation.kibana.

In terms of priorities, it's critical that we backport this to 7.16 and give teams a heads-up if they need to action anything. So I'll create an follow-up issue that can be done later to lower the level and change the configuration on cloud.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

? 'kibana'
: 'user';

// Strip the first 5 stack trace lines as these are irrelavent to finding the call site
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it really show a place where a request was initiated? I suspect https://github.com/elastic/kibana/pull/120044/files#diff-b5f83498b6f6c9bb148d4ca6705a2b2a3757d0f8c177ae6a56e8236817eb4702R88 doesn't keep the async stack trace.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On main/8.0 yes:

Stack trace:
    at Xpack.info (/Users/rudolf/dev/kibana/node_modules/@elastic/elasticsearch/src/api/api/xpack.ts:66:12)
    at LicensingPlugin.fetchLicense (/Users/rudolf/dev/kibana/x-pack/plugins/licensing/server/plugin.ts:180:34)

But on 7.16 where it's more critical to trace deprecations it doesn't :(

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC it does not always work on 8.0 either. But let's keep it for now as is. What are you going to do for 7.16?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove stack traces for 7.16 since it's never useful, but having the full query is usually enough to track down where the request was made.

@rudolf rudolf requested a review from mshustov December 7, 2021 16:16
Copy link
Contributor

@mshustov mshustov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd lower the log level for the deprecation logs, LGTM otherwise

Comment on lines +59 to +62
'--logging.appenders.deprecation.type=console',
'--logging.appenders.deprecation.layout.type=json',
'--logging.loggers[0].name=elasticsearch.deprecation',
'--logging.loggers[0].appenders[0]=deprecation',
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deprecation messages contain newlines and then tooling_log_text_writer prints only the first line but ignores subsequent lines with the query and stack trace. Having this in JSON isn't ideal but it's the best I can think of 🤷

@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@rudolf rudolf removed the v7.17.0 label Dec 8, 2021
@rudolf rudolf merged commit 1a6a9ae into elastic:main Dec 8, 2021
@rudolf rudolf deleted the es-log-deprecations branch December 8, 2021 17:32
kibanamachine added a commit to kibanamachine/kibana that referenced this pull request Dec 8, 2021
…tic#120044)

* First stab at surfacing deprecations from warning header

* Log deprecations with error level but disable logger context by default

* Don't filter out error logs from ProcRunner

* Another try at not having messages ignored on CI

* Log deprecation logs with warn not info

* Tests

* Let write() do it's writing

* Commit pre-built @kbn/pm package

* Second try to commit pre-built @kbn/pm package

* Enable deprecation logger for jest_integration even though logs aren't interleaved

* Apply suggestions from code review

Co-authored-by: Luke Elmers <lukeelmers@gmail.com>

* deprecations logger: warn for kibana and debug for users

* Refactor split query and deprecation logger out of configure_client

* Unit test for tooling_log_text_writer

* Fix TS

* Use event.meta.request.params.headers to include Client constructor headers

* Fix tests

* Ignore deprecation warnings not from Elasticsearch

* Log on info level

* Log in JSON so that entire deprecation message is on one line

* commit built kbn-pm package

Co-authored-by: Luke Elmers <lukeelmers@gmail.com>
Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
@kibanamachine
Copy link
Contributor

💔 Backport failed

Status Branch Result
8.0
7.16 Commit could not be cherrypicked due to conflicts

Successful backport PRs will be merged automatically after passing CI.

To backport manually run:
node scripts/backport --pr 120044

kibanamachine added a commit that referenced this pull request Dec 8, 2021
) (#120800)

* First stab at surfacing deprecations from warning header

* Log deprecations with error level but disable logger context by default

* Don't filter out error logs from ProcRunner

* Another try at not having messages ignored on CI

* Log deprecation logs with warn not info

* Tests

* Let write() do it's writing

* Commit pre-built @kbn/pm package

* Second try to commit pre-built @kbn/pm package

* Enable deprecation logger for jest_integration even though logs aren't interleaved

* Apply suggestions from code review

Co-authored-by: Luke Elmers <lukeelmers@gmail.com>

* deprecations logger: warn for kibana and debug for users

* Refactor split query and deprecation logger out of configure_client

* Unit test for tooling_log_text_writer

* Fix TS

* Use event.meta.request.params.headers to include Client constructor headers

* Fix tests

* Ignore deprecation warnings not from Elasticsearch

* Log on info level

* Log in JSON so that entire deprecation message is on one line

* commit built kbn-pm package

Co-authored-by: Luke Elmers <lukeelmers@gmail.com>
Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>

Co-authored-by: Rudolf Meijering <rudolf.meijering@elastic.co>
Co-authored-by: Luke Elmers <lukeelmers@gmail.com>
rudolf referenced this pull request in rudolf/kibana Dec 9, 2021
…tic#120044)

* First stab at surfacing deprecations from warning header

* Log deprecations with error level but disable logger context by default

* Don't filter out error logs from ProcRunner

* Another try at not having messages ignored on CI

* Log deprecation logs with warn not info

* Tests

* Let write() do it's writing

* Commit pre-built @kbn/pm package

* Second try to commit pre-built @kbn/pm package

* Enable deprecation logger for jest_integration even though logs aren't interleaved

* Apply suggestions from code review

Co-authored-by: Luke Elmers <lukeelmers@gmail.com>

* deprecations logger: warn for kibana and debug for users

* Refactor split query and deprecation logger out of configure_client

* Unit test for tooling_log_text_writer

* Fix TS

* Use event.meta.request.params.headers to include Client constructor headers

* Fix tests

* Ignore deprecation warnings not from Elasticsearch

* Log on info level

* Log in JSON so that entire deprecation message is on one line

* commit built kbn-pm package
rudolf added a commit that referenced this pull request Dec 9, 2021
#120044) (#120882)

* Surfacing deprecations with rich context from ES warning header (elastic#120044)

* First stab at surfacing deprecations from warning header

* Log deprecations with error level but disable logger context by default

* Don't filter out error logs from ProcRunner

* Another try at not having messages ignored on CI

* Log deprecation logs with warn not info

* Tests

* Let write() do it's writing

* Commit pre-built @kbn/pm package

* Second try to commit pre-built @kbn/pm package

* Enable deprecation logger for jest_integration even though logs aren't interleaved

* Apply suggestions from code review

Co-authored-by: Luke Elmers <lukeelmers@gmail.com>

* deprecations logger: warn for kibana and debug for users

* Refactor split query and deprecation logger out of configure_client

* Unit test for tooling_log_text_writer

* Fix TS

* Use event.meta.request.params.headers to include Client constructor headers

* Fix tests

* Ignore deprecation warnings not from Elasticsearch

* Log on info level

* Log in JSON so that entire deprecation message is on one line

* commit built kbn-pm package

* Remove stack traces as these are useless on 7.x
TinLe pushed a commit to TinLe/kibana that referenced this pull request Dec 22, 2021
…tic#120044)

* First stab at surfacing deprecations from warning header

* Log deprecations with error level but disable logger context by default

* Don't filter out error logs from ProcRunner

* Another try at not having messages ignored on CI

* Log deprecation logs with warn not info

* Tests

* Let write() do it's writing

* Commit pre-built @kbn/pm package

* Second try to commit pre-built @kbn/pm package

* Enable deprecation logger for jest_integration even though logs aren't interleaved

* Apply suggestions from code review

Co-authored-by: Luke Elmers <lukeelmers@gmail.com>

* deprecations logger: warn for kibana and debug for users

* Refactor split query and deprecation logger out of configure_client

* Unit test for tooling_log_text_writer

* Fix TS

* Use event.meta.request.params.headers to include Client constructor headers

* Fix tests

* Ignore deprecation warnings not from Elasticsearch

* Log on info level

* Log in JSON so that entire deprecation message is on one line

* commit built kbn-pm package

Co-authored-by: Luke Elmers <lukeelmers@gmail.com>
Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Deprecated - use backport:version if exact versions are needed release_note:skip Skip the PR/issue when compiling release notes Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc v7.16.1 v8.0.0 v8.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants