[auto-discovery] re-pipe configurations after JMXFetch restart #3415
It seems that SD-based JMX metric collection is borked at the moment on Kubernetes. Friendly question: is there any ETA on merging this?
@@ -69,6 +69,8 @@
'list_limited_attributes': "List attributes that do match one of your instances configuration but that are not being collected because it would exceed the number of metrics that can be collected",
JMX_COLLECT_COMMAND: "Start the collection of metrics based on your current configuration and display them in the console"}

JMX_LAUNCH_FILE = 'jmx.launch'
maybe it should be a hidden file.
since it's in jmxfetch's temp directory I'd say a regular file is okay
@bai this just barely missed the window for the upcoming |
Left a comment, let me know what you think
agent.py
try:
    jmx_launch = JMXFetch._get_jmx_launchtime()
    if self.last_jmx_piped and self.last_jmx_piped < jmx_launch:
        self.sd_backend.reload_check_configs = True
I'm not sure this JMXFetch-specific code block should be here exactly: JMXFetch restarts aren't really related to the updates in the config template store (which is what the parent block is about).
On top of that, when JMXFetch restarts, it'd be nice to reload only the JMX-related check configs instead of all the check configs.
So:
- Could we move this logic outside this if block (i.e. right after self.reload_configs_flag = False)?
- To avoid reloading all the check configs, we could do what's done here: https://github.com/DataDog/dd-agent/blob/5.15.0/agent.py#L281-L285 (which could be pulled out into a separate function) instead of setting self.sd_backend.reload_check_configs = True
Let me know how that sounds to you, I might have missed something.
Thanks for the review @olivielpeau, insightful comments. I scrambled to put this together ASAP and wasn't as careful as I should have been. Let me know if you prefer the current logical path.
LGTM!
What does this PR do?
If JMXFetch happens to go down, use the atime stat date on the JMX launch file to decide if the auto-discovery configurations should be re-piped to JMX. JMXFetch will touch that file when coming up, so we should have a relatively good idea of when JMXFetch started. Alternatively, we could look at the process table to figure this out, but that would be computationally more expensive.
Motivation
Customer bug.
Testing Guidelines
An overview on testing is available in our contribution guidelines.
Additional Notes
Sister PR: DataDog/jmxfetch#143
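For readers skimming the thread, here is a minimal sketch of the launch-file check described under "What does this PR do?". It is an illustration only, not the code in this PR: get_launch_time and should_repipe are hypothetical helper names, and the sketch assumes the caller knows the path to the jmx.launch file and keeps its own timestamp of the last time configurations were piped.

```python
import os


def get_launch_time(launch_file):
    """Return the atime of the launch file, or None if it cannot be stat'ed.

    Hypothetical helper: the real agent resolves the file inside JMXFetch's
    temp directory; here the caller passes the full path.
    """
    try:
        return os.stat(launch_file).st_atime
    except OSError:
        return None


def should_repipe(last_piped, launch_time):
    """Return True when configs were last piped before JMXFetch last launched.

    Mirrors the condition in the diff: nothing is re-piped if configs were
    never piped, or if the launch time is unknown.
    """
    return bool(last_piped and launch_time and last_piped < launch_time)
```

A caller would compare its stored timestamp against get_launch_time(path) on each loop iteration and re-pipe only when should_repipe returns True. A single stat call is what keeps this cheaper than scanning the process table.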