[APM] Collect telemetry about data/queries #50757

@dgieselaar

We currently have little insight into how much data our customers have and how long our ES queries take to process. This makes it hard to judge the scale at which current and new functionality needs to operate. For instance, in some cases it might be reasonable to process things in memory on the Node server rather than in ES, which might simplify our implementation. Optimizing is also hard right now because we don't know where our users are experiencing slowness.

Ideally we would have telemetry about:

  • The data volume (how many errors/error groups? how many transactions? how many transactions/spans per trace? how many services? etc.). This could be collected with a Kibana task that queries the data indices at a set interval and sends the results back home.
  • Query response times. We could instrument our ES client facade to store telemetry about ES response times. There's also the possibility of using the Node.js agent to instrument Kibana, but the ongoing efforts are explicitly scoped to non-production usage: Instrument Kibana with Elastic APM #43548
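
To make the second point more concrete, here is a rough sketch of what timing instrumentation around the ES client facade could look like. Everything in it is hypothetical: `QueryTimingCollector`, `withTiming`, `SearchFn` and the operation names are made up for illustration and do not correspond to existing Kibana or Elasticsearch client APIs. It only assumes the facade exposes some promise-returning search function.

```typescript
// Hypothetical sketch: none of these names exist in Kibana or the ES client.
type SearchFn<P, R> = (params: P) => Promise<R>;

interface TimingSummary {
  count: number; // number of queries recorded for the operation
  totalMs: number; // sum of all durations, to derive an average later
  maxMs: number; // slowest query seen
}

class QueryTimingCollector {
  private readonly summaries = new Map<string, TimingSummary>();

  // Fold one measurement into the running summary for an operation.
  record(operation: string, durationMs: number): void {
    const current = this.summaries.get(operation) ?? { count: 0, totalMs: 0, maxMs: 0 };
    this.summaries.set(operation, {
      count: current.count + 1,
      totalMs: current.totalMs + durationMs,
      maxMs: Math.max(current.maxMs, durationMs),
    });
  }

  // Return the aggregated summaries and reset, e.g. when a telemetry
  // task picks them up at a set interval.
  flush(): Record<string, TimingSummary> {
    const snapshot: Record<string, TimingSummary> = {};
    this.summaries.forEach((summary, operation) => {
      snapshot[operation] = summary;
    });
    this.summaries.clear();
    return snapshot;
  }
}

// Wrap a search function so that every call is timed, including failed ones.
function withTiming<P, R>(
  operation: string,
  search: SearchFn<P, R>,
  collector: QueryTimingCollector,
  now: () => number = () => Date.now()
): SearchFn<P, R> {
  return async (params: P) => {
    const start = now();
    try {
      return await search(params);
    } finally {
      collector.record(operation, now() - start);
    }
  };
}
```

Only aggregates per operation are kept here (count, total and max duration), so what gets sent home stays small and contains no query bodies. Whether that level of detail is enough to spot slow queries is an open question.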

@graphaelli: any idea if the monitoring data that we have provides an answer to any of these questions?
