Investigate moving sourcemapping to an enrich processor #3606
Comments
There are some ramifications to this change that I had not previously considered, as there are several things that depend on the stacktrace:
Regarding Error culprit identification should be straightforward.
In Painless, regex support is limited to constant patterns. At the top of https://www.elastic.co/guide/en/elasticsearch/painless/7.10/painless-regexes.html:
So, the only way we could match filenames against a configurable regular expression would be by recreating the pipeline each time the config changes. Alternatively, we could switch the config over to using wildcard patterns like we use in the agents, such as in sanitize_field_names. Then we could inject the config into events, pick that up in the pipeline, and apply the patterns with wildcard-matching logic written in Painless. I'm leaning towards the latter at the moment. It'll mean a more complicated pipeline script, but it will provide more consistency in configuration across APM.
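To illustrate the wildcard approach, here is a minimal sketch of matching without dynamic regular expressions, written in Python rather than Painless. It assumes `*` matches any run of characters and that matching is case-insensitive by default; the function names and example patterns are hypothetical and not taken from the branch.

```python
from typing import Iterable


def wildcard_match(pattern: str, text: str, case_sensitive: bool = False) -> bool:
    """Return True if text matches pattern, where '*' matches any run of characters."""
    if not case_sensitive:
        pattern, text = pattern.lower(), text.lower()
    p = t = 0
    star = -1  # index of the most recent '*' seen in pattern
    mark = 0   # position in text when that '*' was seen
    while t < len(text):
        if p < len(pattern) and pattern[p] == "*":
            star, mark = p, t
            p += 1
        elif p < len(pattern) and pattern[p] == text[t]:
            p += 1
            t += 1
        elif star != -1:
            # Backtrack: let the last '*' absorb one more character.
            mark += 1
            p, t = star + 1, mark
        else:
            return False
    # Any remaining pattern characters must all be '*'.
    while p < len(pattern) and pattern[p] == "*":
        p += 1
    return p == len(pattern)


def is_library_frame(filename: str, patterns: Iterable[str]) -> bool:
    """A frame counts as a library frame if any configured pattern matches its filename."""
    return any(wildcard_match(pat, filename) for pat in patterns)


# Hypothetical patterns, for illustration only.
patterns = ["*node_modules*", "*bower_components*"]
print(is_library_frame("webpack:///./node_modules/react/index.js", patterns))
print(is_library_frame("src/app.js", patterns))
```

The loop uses only indexing and comparisons, so the same logic should translate to a Painless script without relying on non-constant regex patterns.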
Apart from error grouping key calculation (elastic/apm-data#146), I've got everything (I think?) working in master...axw:sourcemap-enrich. It's a little bit messy, but should demonstrate how things would work. There's a substantial amount of Painless. This includes: sourcemapping, identifying library frames (using wildcard matching, see #3606 (comment)), and identifying the error culprit.
I've created a new branch rebased on master, moving all the ingest node stuff into the pipeline we install: master...axw:sourcemap-enrich-take2 In that branch the pipeline is defined in The pipeline uses the fingerprint processor to compute
The other issue that we'll have is that since the fingerprint processor was only added in 7.12, we can't just introduce it to the pipeline as that would break compatibility with older versions of Elasticsearch. So if we are doing this, I think we can only add it in the integration package. Seeing as the hashes will change, we may as well:
I'm running some performance tests. Sending 1000x RUM errors and comparing methods (server vs. ingest).
I ran the test three times each and checked node stats each time, looking at the time spent in the "apm" ingest pipeline. Over each set of three runs, the pipeline averages ~0.1ms per event for in-server source mapping, and ~1ms per event for ingest source mapping. I also instrumented the time spent in the server applying sourcemaps, and it works out to ~0.1ms per error, each with 6 stack frames = 60000 frames. So the ingest approach is considerably slower, with a worst-case 10x slowdown for ingestion. Now we need to answer:
For (1): I'm tending towards a no. The performance loss may not be apparent with a single APM Server given its current ingestion rate (in)capability, but with a cluster of APM Servers handling heavy RUM traffic we could end up bottlenecked on ingest node. On top of all that, moving the process to ingest node carries some risk, and requires breaking changes. For (2): knowing what we know now about Fleet hooks, we could do something like the following:
apm-server:
  rum:
    source_maps:
      - service.name: opbeans-rum
        service.version: 1.2.3
        bundle.filepath: /test/e2e/general-usecase/bundle.js.map
        sourcemap.url: http://somewhere.com/bundle.js.map
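
As a rough sketch of how a server could consume entries like the ones above, the following Python resolves a source map URL from the service name, service version, and bundle path. The config is represented as a plain dict mirroring the YAML; the class and function names are hypothetical, and exact-match lookup is an assumption rather than settled behaviour.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SourceMapRef:
    """One entry under apm-server.rum.source_maps (field names are illustrative)."""
    service_name: str
    service_version: str
    bundle_filepath: str
    sourcemap_url: str


def parse_source_maps(config: dict) -> List[SourceMapRef]:
    """Extract source map references from a config dict shaped like the YAML above."""
    entries = config.get("apm-server", {}).get("rum", {}).get("source_maps", [])
    return [
        SourceMapRef(
            service_name=e["service.name"],
            service_version=e["service.version"],
            bundle_filepath=e["bundle.filepath"],
            sourcemap_url=e["sourcemap.url"],
        )
        for e in entries
    ]


def find_source_map_url(
    refs: List[SourceMapRef], service_name: str, service_version: str, bundle_filepath: str
) -> Optional[str]:
    """Return the URL of the first reference matching all three properties exactly, or None."""
    for ref in refs:
        if (ref.service_name, ref.service_version, ref.bundle_filepath) == (
            service_name,
            service_version,
            bundle_filepath,
        ):
            return ref.sourcemap_url
    return None


config = {
    "apm-server": {
        "rum": {
            "source_maps": [
                {
                    "service.name": "opbeans-rum",
                    "service.version": "1.2.3",
                    "bundle.filepath": "/test/e2e/general-usecase/bundle.js.map",
                    "sourcemap.url": "http://somewhere.com/bundle.js.map",
                }
            ]
        }
    }
}
refs = parse_source_maps(config)
print(find_source_map_url(refs, "opbeans-rum", "1.2.3", "/test/e2e/general-usecase/bundle.js.map"))
```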
This performance difference is a bit unexpected; great that you measured it. The alternative of injecting a reference to the artifact sounds like a good approach.
One thing to note: if we don't go with ingest node, then we don't get to fix #2724. We shouldn't kill performance in pursuit of that goal though.
https://github.com/axw/kibana/tree/apm-sourcemap-routes is a hackish POC which adds source map upload, list, and delete routes to Kibana, storing sourcemaps as fleet artifacts. On upload/delete, references to the artifacts are injected into APM policies.
@vigneshshanmugam is going to put together some thoughts on how we can improve our source maps experience overall, so I'll wait until that's available before we make a final call and open implementation issues. At this stage it looks likely that we will move ahead with the artifacts approach rather than ingest node.
I'm going to open implementation issues now covering the creation of a new Kibana endpoint with minimal differences compared to the existing APM Server source map upload endpoint, to simplify migration. We can always introduce another, simpler, endpoint later on.
Closing this in favour of #5002 and elastic/kibana#95393. @vigneshshanmugam when you have time to write up your proposal, please share it with me and we can create new issues.
I would like us to investigate moving sourcemapping logic out of apm-server, and into an ingest node pipeline. This would enable us to fix #2724, and would likely also speed things up by doing everything in Elasticsearch, where the sourcemaps are stored.
In order to do this, I think we could use the Enrich processor to enrich ingested documents with the sourcemap, followed by a script processor which adjusts/enriches stacktrace fields, and finally removes the sourcemap field.
To use the Enrich processor, we would need to store a field which concatenates the properties we use to match them into one field: service name, service version, and file/URL path.
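A minimal sketch of the idea above, written as Python dicts that mirror the Elasticsearch request bodies. The index name, policy name, field names, and the `|` separator are all assumptions made for illustration, and the script source is a placeholder rather than a working Painless script.

```python
def sourcemap_match_key(service_name: str, service_version: str, path: str, sep: str = "|") -> str:
    """Concatenate the matching properties into a single field value (separator is an assumption)."""
    return sep.join([service_name, service_version, path])


# Enrich policy body: match documents on the concatenated key (all names hypothetical).
ENRICH_POLICY = {
    "match": {
        "indices": "apm-sourcemap",
        "match_field": "sourcemap.match_key",
        "enrich_fields": ["sourcemap.sourcemap"],
    }
}

# Ingest pipeline body: enrich, adjust the stacktrace, then drop the temporary fields.
PIPELINE = {
    "processors": [
        {
            "enrich": {
                "policy_name": "apm-sourcemap-policy",
                "field": "sourcemap_match_key",
                "target_field": "_sourcemap",
                "ignore_missing": True,
            }
        },
        {
            "script": {
                "lang": "painless",
                # Placeholder only: the real script would rewrite stacktrace fields
                # using the enriched source map.
                "source": "/* hypothetical: apply ctx._sourcemap to the stack frames */",
            }
        },
        {"remove": {"field": ["sourcemap_match_key", "_sourcemap"], "ignore_missing": True}},
    ]
}

print(sourcemap_match_key("opbeans-rum", "1.2.3", "/test/e2e/general-usecase/bundle.js.map"))
```

The same key function would need to be applied both when indexing source maps and when preparing events, so that the enrich processor's match field lines up on both sides.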