Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

model: sanitize/de-dot keys for labels, custom, and marks #4465

Merged
merged 7 commits into from
Dec 1, 2020

Conversation

axw
Copy link
Member

@axw axw commented Nov 30, 2020

Motivation/summary

Sanitize/de-dot keys for labels, custom, and marks, in the server when transforming to documents. Reserved characters (*, ., and ") will be replaced with _.

Remove the validation from the modeldecoder types, as agents will no longer be required to sanitize the keys. This changes various patternProperties to additionalProperties in the generated JSON Schema.

Finally, remove the model.Labels and models.Custom types and use common.MapStr directly (and without an extra level of pointer indirection). Use the knowledge that label keys cannot have dots to avoid the need for deep updates, replacing the relatively expensive utility.DeepUpdate calls with shallow map merging.

Checklist

I have considered changes for:
- [ ] documentation
- [ ] logging (add log lines, choose appropriate log selector, etc.)
- [ ] metrics and monitoring (create issue for Kibana team to add metrics to visualizations, e.g. Kibana#44001)

How to test these changes

make test

Related issues

Closes #4225

axw added 3 commits November 30, 2020 17:02
Replace reserved label key characters with '_', alleviating clients
from having to do this.

Also, optimise how we merge global and event-specific labels. Instead
of adding all global labels and then using utility.DeepUpdate, de-dot
the label keys first and range over the two maps to create a second map.
Because the labels keys cannot have dots, we do not need to deep merge.
Sanitize the keys of "custom" context using the same
logic applied to labels. Also, drop the model.Custom
type and just use common.MapStr. This eliminates some
unnecessary pointer indirection.
@apmmachine
Copy link
Contributor

apmmachine commented Nov 30, 2020

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview

Expand to view the summary

Build stats

  • Build Cause: Pull request #4465 updated

  • Start Time: 2020-12-01T00:17:28.311+0000

  • Duration: 44 min 36 sec

Test stats 🧪

Test Results
Failed 0
Passed 4614
Skipped 141
Total 4755

Steps errors 4

Expand to view the steps failures

Run Window tests

  • Took 9 min 9 sec . View more details on here

Compress

  • Took 0 min 0 sec . View more details on here
  • Description: tar --exclude=coverage-files.tgz -czf coverage-files.tgz coverage

Compress

  • Took 0 min 0 sec . View more details on here
  • Description: tar --exclude=system-tests-linux-files.tgz -czf system-tests-linux-files.tgz system-tests

Test Sync

  • Took 3 min 7 sec . View more details on here
  • Description: ./.ci/scripts/sync.sh

Copy link
Contributor

@simitt simitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great improvement!

(just realized the PR is still in draft, looked ready to me ;) )

@axw axw marked this pull request as ready for review December 1, 2020 00:17
@axw axw merged commit b0e6d02 into elastic:master Dec 1, 2020
@axw axw deleted the labels-dedot branch December 1, 2020 01:01
axw added a commit to axw/apm-server that referenced this pull request Dec 14, 2020
* model: sanitize label keys

Replace reserved label key characters with '_', alleviating clients
from having to do this.

Also, optimise how we merge global and event-specific labels. Instead
of adding all global labels and then using utility.DeepUpdate, de-dot
the label keys first and range over the two maps to create a second map.
Because the labels keys cannot have dots, we do not need to deep merge.

* model: sanitize "custom" attribute keys

Sanitize the keys of "custom" context using the same
logic applied to labels. Also, drop the model.Custom
type and just use common.MapStr. This eliminates some
unnecessary pointer indirection.

* model: sanitize "marks" keys

* model/modeldecoder: drop label key regexp rules

* Add changelog entry
axw added a commit that referenced this pull request Dec 14, 2020
)

* model: sanitize label keys

Replace reserved label key characters with '_', alleviating clients
from having to do this.

Also, optimise how we merge global and event-specific labels. Instead
of adding all global labels and then using utility.DeepUpdate, de-dot
the label keys first and range over the two maps to create a second map.
Because the labels keys cannot have dots, we do not need to deep merge.

* model: sanitize "custom" attribute keys

Sanitize the keys of "custom" context using the same
logic applied to labels. Also, drop the model.Custom
type and just use common.MapStr. This eliminates some
unnecessary pointer indirection.

* model: sanitize "marks" keys

* model/modeldecoder: drop label key regexp rules

* Add changelog entry
@simitt simitt self-assigned this Dec 23, 2020
@simitt
Copy link
Contributor

simitt commented Dec 23, 2020

Tested with BC1 - sent key "gr.o*u\"p_" was indexed as "gr_o_u_p_".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add de-dot logic to server
3 participants