[Telemetry] [Monitoring] Only retry fetching usage once monitoring bulk upload is successful #54294

Closed
Bamieh opened this issue Jan 8, 2020 · 4 comments · Fixed by #54309
Labels
bug Fixes for quality problems that affect the customer experience Feature:Stack Monitoring Feature:Telemetry Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc v7.5.2 v7.6.0 v8.0.0

Comments

@Bamieh
Member

Bamieh commented Jan 8, 2020

The bulk uploader in monitoring attempts to bulk insert data into Elasticsearch every 10 seconds (defined by the flag xpack.monitoring.kibana.collection.interval).

To avoid performance issues, we have throttled fetching telemetry usage data to once every 24 hours in the bulk uploader when monitoring is enabled.

The current behavior is to keep fetching usage data and trying to insert it until the bulk insert into ES succeeds. Once it succeeds, we fetch usage only every 24 hours.

When monitoring is not enabled, the bulk uploader will keep on retrying since ES returns ignored: true (the index does not exist), rendering the operation unsuccessful and hence fetching usage again.

This is happening on all 7.x and master. It was discovered when running a backport against 7.5 branch. (#54055)

To improve performance when monitoring is not enabled we can start fetching usage data once the bulk uploader gets a success on the bulk insert from ES.

The tiny downside to this approach is that we will not be getting usage data on the first successful insert after enabling monitoring. We will be getting this data on the second tick (in less than 20 seconds).
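
A minimal sketch of the gating described above, to make the tick-by-tick behavior concrete. It is not the bulk uploader's actual code: the class UsageFetchGate, its method names, and the choice to track only the previous tick's upload result are assumptions made for illustration. Only the 24-hour throttle comes from this issue.

```typescript
// Hypothetical sketch; none of these names come from the Kibana code base.
const USAGE_INTERVAL_MS = 24 * 60 * 60 * 1000; // 24-hour usage throttle

class UsageFetchGate {
  // Assumption: only the most recent upload result is tracked.
  private lastUploadSucceeded = false;
  private lastUsageFetchMs: number | null = null;

  // Decide whether the current tick should also collect usage data.
  shouldFetchUsage(nowMs: number): boolean {
    // No usage fetch until the previous bulk upload was accepted by ES.
    if (!this.lastUploadSucceeded) return false;
    // First usage fetch after a successful upload happens right away.
    if (this.lastUsageFetchMs === null) return true;
    // Afterwards, usage is throttled to once per 24 hours.
    return nowMs - this.lastUsageFetchMs >= USAGE_INTERVAL_MS;
  }

  // Record the outcome of one tick of the uploader.
  recordTick(nowMs: number, fetchedUsage: boolean, uploadSucceeded: boolean): void {
    this.lastUploadSucceeded = uploadSucceeded;
    // The 24-hour clock only starts once usage was actually inserted.
    if (fetchedUsage && uploadSucceeded) this.lastUsageFetchMs = nowMs;
  }
}
```

With this gate, a response that counts as unsuccessful (such as ignored: true) never triggers a usage fetch on the following tick, and the first usage fetch lands on the tick after the first successful insert, which is the downside described above.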

CC @aaronjcaldwell

@elasticmachine
Contributor

Pinging @elastic/pulse (Team:Pulse)

@Bamieh Bamieh changed the title from "[7.5] [Monitoring] [Telemetry] Bulk upload is failing to insert telemetry payload into ES" to "[Telemetry] [Monitoring] Only retry fetching usage once monitoring bulk upload is successful" Jan 8, 2020
@TinaHeiligers
Contributor

@Bamieh I could reproduce the ES response having ignored: true in 7.5 and also in master.

@Bamieh
Member Author

Bamieh commented Jan 8, 2020

@TinaHeiligers you are correct. I have updated the issue with a more accurate description after debugging this further. I've submitted a PR to fix this, so please have a look there as well 🙂.

@Bamieh Bamieh added the bug Fixes for quality problems that affect the customer experience label Jan 8, 2020
@timroes timroes added v7.5.2 and removed 7.5.2 labels Jan 10, 2020
@lukeelmers lukeelmers added the Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc label Oct 1, 2021
@elasticmachine
Contributor

Pinging @elastic/kibana-core (Team:Core)

5 participants