Fix priority sampling mishandling agent response #591

delner · 2018-10-17T14:05:59Z

While adding debugging metrics to the tracer, I discovered that when priority sampling is enabled, the writer mishandles the JSON response it receives from the agent when it uses that response to update its sampling rates.

It was receiving a response formatted like:

{"rate_by_service":{"service:,env:":1,"service:rspec,env:none":1}}

However, the logic did not expect a top-level key "rate_by_service", so it treated that key as a service name and threw an error when it tried to process the nested Hash as a Float.
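To illustrate the failure mode, here is a minimal sketch. It is not the tracer's actual code: `RateTable`, `update_rates_buggy`, and `update_rates_fixed` are hypothetical names used only for this example. It assumes the sampler expects a flat Hash of service keys to numeric rates.

```ruby
require 'json'

# Hypothetical stand-in for the sampler whose rates the writer updates.
class RateTable
  attr_reader :rates

  def initialize
    @rates = {}
  end

  # Expects a flat Hash of "service:...,env:..." => numeric rate.
  def update(rate_by_service)
    rate_by_service.each do |key, rate|
      # Float() raises TypeError if rate is a Hash instead of a number.
      @rates[key] = Float(rate)
    end
  end
end

# Roughly what the bug amounted to: the whole response is passed along,
# so "rate_by_service" itself is treated as a service key.
def update_rates_buggy(table, body)
  table.update(JSON.parse(body))
end

# Read the rates from under the top-level "rate_by_service" key instead.
def update_rates_fixed(table, body)
  table.update(JSON.parse(body).fetch('rate_by_service', {}))
end
```

With the response shown above, `update_rates_buggy` raises a `TypeError`, while `update_rates_fixed` stores one rate per service key.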

This bug was hidden because the callback that did this was wrapped in exception handling that silently appended to a debug log instead of raising an exception or incrementing the internal error metric.

This pull request fixes the bug by consuming the value under "rate_by_service" properly. The test now also asserts that the internal error metric does not increase, which should prevent a similar bug from going unnoticed in the future.
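As a rough sketch of why counting errors makes the failure visible to a test, assume a hypothetical `CallbackRunner` wrapper (not the tracer's real class or metric API) that both logs and counts any exception raised by a callback:

```ruby
# Hypothetical wrapper around a response callback. The class name and
# attributes are illustrative only.
class CallbackRunner
  attr_reader :internal_errors, :debug_log

  def initialize
    @internal_errors = 0
    @debug_log = []
  end

  # Runs the block. On failure, logs the error and also bumps a counter,
  # so a test can assert that the counter did not increase.
  def run
    yield
  rescue StandardError => e
    @debug_log << "callback failed: #{e.class}: #{e.message}"
    @internal_errors += 1
    nil
  end
end
```

A test can then run the callback and check that `internal_errors` is unchanged. If the handler only wrote to the debug log, the same test would pass even when the callback fails.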

pawelchcki

Would be great to cover this with tests, but it can be done in a separate PR.

Otherwise looks good!

delner · 2018-10-17T14:34:29Z

@pawelchcki It should be covered by both the RSpec test that was updated and the integration test for priority sampling. The integration test wasn't changed here, but it was already asserting that the internal error metric did not increase; by adding this metric to the error handler, we're making this test pass/fail properly now.

delner added the bug and core labels Oct 17, 2018

delner self-assigned this Oct 17, 2018

delner requested a review from pawelchcki October 17, 2018 14:06

delner force-pushed the fix/priority_sampling_response_parsing branch from 1b93248 to baf393e October 17, 2018 14:07

Fixed: Priority sampling mishandling agent response.

8ae1836

delner force-pushed the fix/priority_sampling_response_parsing branch from baf393e to 8ae1836 October 17, 2018 14:12

delner added this to the 0.16.1 milestone Oct 17, 2018

pawelchcki approved these changes Oct 17, 2018

delner merged commit 3be61a8 into master Oct 17, 2018

delner deleted the fix/priority_sampling_response_parsing branch October 17, 2018 14:41
