Enhance search telemetry to avoid high load to OpenSearch cluster #928

Closed
seraphjiang opened this issue Nov 9, 2021 · 3 comments · Fixed by #1427
Labels: enhancement, help wanted, v2.0.0

Comments

@seraphjiang
Member

seraphjiang commented Nov 9, 2021

Is your feature request related to a problem? Please describe.
Search telemetry is used to analyze the performance of successful and failed search requests in Dashboards. It saves telemetry data into the .kibana_1 index. When there are thousands of concurrent search requests from Dashboards, this causes significant load on the OpenSearch cluster. We have seen cases where one third of all Search/Update requests come from search telemetry.

Describe the solution you'd like
There are multiple options to enhance this and avoid high load on the cluster:

  • Option A: Provide an option to disable search telemetry (preferred)
  • Option B: Reduce how often telemetry data is saved, e.g. aggregate and save every 5 minutes
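
Option B could be sketched roughly as follows: counters are accumulated in memory and written out with a single call per interval instead of one update per search. This is only an illustration under stated assumptions. The names `BufferedSearchTelemetry`, `SearchTelemetryDelta`, and `FlushFn` are hypothetical and are not taken from usage.ts, and the `flushFn` callback stands in for whatever saved-object update the collector issues today.

```typescript
// Hypothetical sketch of Option B; none of these names exist in usage.ts.
interface SearchTelemetryDelta {
  successCount: number;
  errorCount: number;
  totalDurationMs: number;
}

// Stand-in for the real write to the .kibana index (assumption).
type FlushFn = (delta: SearchTelemetryDelta) => Promise<void>;

class BufferedSearchTelemetry {
  private pending: SearchTelemetryDelta = {
    successCount: 0,
    errorCount: 0,
    totalDurationMs: 0,
  };

  constructor(private readonly flushFn: FlushFn) {}

  // Record a successful search in memory only; no request is sent.
  trackSuccess(durationMs: number): void {
    this.pending.successCount += 1;
    this.pending.totalDurationMs += durationMs;
  }

  // Record a failed search in memory only.
  trackError(): void {
    this.pending.errorCount += 1;
  }

  // Write all buffered counters with one call. Resolves to false when
  // there was nothing to write.
  async flush(): Promise<boolean> {
    const delta = this.pending;
    if (delta.successCount === 0 && delta.errorCount === 0) {
      return false;
    }
    this.pending = { successCount: 0, errorCount: 0, totalDurationMs: 0 };
    try {
      await this.flushFn(delta);
      return true;
    } catch (err) {
      // Merge the failed batch back so the counts are retried on the next flush.
      this.pending.successCount += delta.successCount;
      this.pending.errorCount += delta.errorCount;
      this.pending.totalDurationMs += delta.totalDurationMs;
      throw err;
    }
  }

  // Flush on a fixed interval; returns a function that stops the timer.
  start(intervalMs: number): () => void {
    const timer = setInterval(() => {
      // Errors are ignored here because flush() keeps the counts for a retry.
      this.flush().catch(() => undefined);
    }, intervalMs);
    return () => clearInterval(timer);
  }
}
```

With the 5-minute example from Option B, a caller would use `telemetry.start(5 * 60 * 1000)` and call the returned function on shutdown. Counts buffered since the last flush are lost if the process exits before they are written, which is the trade-off for fewer updates.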

Code:
https://github.com/opensearch-project/OpenSearch-Dashboards/blob/9a2d3e6918760c4bba8b44b9e8dfcd14e459a916/src/plugins/data/server/search/collectors/usage.ts

Data Saved in .kibana

curl -s "http://localhost:9200/.kibana/_doc/search-telemetry:search-telemetry" | jq
{
  "_index": ".kibana_1",
  "_type": "_doc",
  "_id": "search-telemetry:search-telemetry",
  "_version": 196824640,
  "_seq_no": 197841597,
  "_primary_term": 18,
  "found": true,
  "_source": {
    "search-telemetry": {
      "successCount": 90694131,
      "errorCount": 397950,
      "averageDuration": 6.615643114854054e-08
    },
    "type": "search-telemetry",
    "references": [],
    "updated_at": "2021-11-08T17:38:54.877Z"
  }
}
@seraphjiang seraphjiang added the enhancement New feature or request label Nov 9, 2021
@seanneumann seanneumann changed the title Enhance search telemetry to avoid high load to opensearch culster Enhance search telemetry to avoid high load to OpenSearch cluster Nov 10, 2021
@ahopp
Contributor

ahopp commented Nov 11, 2021

Without a deeper dive I wouldn't know the second-order effects, but at face value I'd prefer Option A. There doesn't seem to be anything wrong with letting admins/users configure this based on their preferences.

EDIT: There is an Option C, which would be to have configuration for both on/off and frequency.

@tmarkley tmarkley added help wanted Community development is encouraged and removed untriaged labels Nov 16, 2021
@manojfaria

+1 for Option C. One of the OpenSearch 1.1 deployments that I am working with has hundreds of Kibana dashboard end users who rely on rich Kibana dashboards (with ~80 visualizations per dashboard), and the dashboards are set to auto-refresh every 3 to 10 seconds. This cluster observes approximately 600k to 1 million search-telemetry requests per hour, i.e. ~600k to ~1 million updates per hour posted to the .kibana index. The current observation is that the number of "POST /.kibana/_update/search-telemetry%3Asearch-telemetry" requests maps to the number of _search requests that originate from Kibana dashboards. Hence, as the number of Kibana dashboard users grows, more search-telemetry requests are logged to the .kibana index. This raises the IndexRate for the .kibana index and with it the resource consumption (e.g. high CPU) on 2 nodes (since the .kibana index is set up by default with 1 primary and 1 replica shard), which eventually leads to node drops and cluster instability.

At the moment it is not yet clear to us what kind of search-telemetry information is being logged, how one is expected to leverage this info, and what the next steps would be. Hence it would help if OpenSearch could provide an option to disable the search-telemetry info logged to the .kibana index, as well as an option to collect search-telemetry info less aggressively when a need to record it arises. Option C seems like a good path forward.
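
To make the effect of Option C concrete, here is a rough estimate of telemetry writes per hour under an on/off switch combined with a save interval. It is a sketch under stated assumptions, not an existing setting. The `SearchTelemetryConfig` shape and the `estimateWritesPerHour` helper are hypothetical, the estimate is per Dashboards server process, and it relies on the observation above that one update is currently issued per search.

```typescript
// Hypothetical config shape for Option C; not an existing OpenSearch Dashboards setting.
interface SearchTelemetryConfig {
  enabled: boolean;        // on/off switch (Option A)
  flushIntervalMs: number; // save interval (Option B); 0 means write on every search
}

const MS_PER_HOUR = 60 * 60 * 1000;

// Rough upper bound on telemetry updates per hour for one Dashboards process.
function estimateWritesPerHour(
  config: SearchTelemetryConfig,
  searchesPerHour: number
): number {
  if (!config.enabled) {
    return 0; // telemetry disabled: nothing is written
  }
  if (config.flushIntervalMs <= 0) {
    return searchesPerHour; // one update per search, as observed in this thread
  }
  const flushesPerHour = Math.ceil(MS_PER_HOUR / config.flushIntervalMs);
  // There are never more writes than searches, even with a very short interval.
  return Math.min(searchesPerHour, flushesPerHour);
}
```

For the cluster described above, `estimateWritesPerHour({ enabled: true, flushIntervalMs: 5 * 60 * 1000 }, 1000000)` gives 12 updates per hour per process instead of roughly 1 million.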

@kavilla
Copy link
Member

kavilla commented Apr 5, 2022

@Flyingliuhub and @sichend, thanks for picking this up!
