Improve Robustness of Agent Foreground and Background Execution Modes #288
Comments
12/11/2024
I've started reproducing the problem and researching different approaches to solve this issue.
13/11/2024
I was testing other platforms to learn how they work. I'm trying to do some tests using
19/11/2024
I've changed the approach. I started implementing the solution using a lock file with the
20/11/2024
I have completed the development for Linux and performed tests on both Linux and macOS. Some adjustments were necessary to ensure compatibility with macOS. I need to update the service-related files to reflect the changes made to the execution modes. Draft PR opened.
21/11/2024
I have been testing on macOS and fixing some details for the systemd service.
22/11/2024
I've finished the tests and closed the issue.
OpenSearch
During the testing of OpenSearch, it was observed that it is possible to execute another instance of the executable while the OpenSearch service is already running. This behavior appears to create an additional node, which aligns with the fact that OpenSearch is designed as a cluster-based system.
However, it was noticed that the directory containing the PID file becomes empty after launching the second OpenSearch instance. This behavior raises questions about how the process manages resources and whether this is expected in cluster configurations.
To clarify, OpenSearch uses the same executable for all node types, including:
Each OpenSearch instance runs as an independent Java process and can be configured for different roles.
Filebeat
Testing Filebeat revealed that it does not allow multiple instances to run simultaneously with the same configuration. When attempting to execute a second instance of Filebeat while the service is already running, the following error was encountered:
This error indicates that Filebeat locks the path.data directory, preventing concurrent executions unless a separate data path is specified. Further tests showed that it is possible to launch multiple instances if distinct path.data configurations are provided.
While it is technically feasible to run multiple Filebeat instances on the same server, this practice is uncommon. Typically, a single instance is configured to handle data ingestion from multiple sources, streamlining operations.
Conclusions
After analyzing these two products, I believe that wazuh-agent should not follow either model directly, as it is designed to run as a single instance. Tests could be done to observe what happens with data persistence when two instances are running simultaneously. Alternatively, it could be worth considering running two instances with different data paths to avoid conflicts over shared data. If the idea of executing a new instance with
Hi @sdvendramini,
Thank you for the detailed analysis. Based on your findings and further discussions, we propose the following adjustments to streamline the behavior of the
Proposal
To reliably detect whether another process is running, we suggest implementing a robust mechanism using lockfiles. This approach addresses scenarios where PID files or lockfiles might remain stale, such as:
Proposed Lockfile Implementation
Let me know if you agree with this approach or have further suggestions. Once finalized, we can proceed with implementing and testing these changes.
Best regards.
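As an illustration of the lockfile idea discussed in this thread, here is a minimal Python sketch. It is not the agent's actual implementation: the function name `acquire_instance_lock` and the lock path are hypothetical, and it assumes a POSIX system (Linux or macOS) where `fcntl.flock` is available. The point it shows is that an advisory lock held on an open file is released by the kernel when the owning process exits, even after a crash, so a leftover file on disk does not block the next start.

```python
import fcntl
import os


def acquire_instance_lock(lock_path):
    """Try to take an exclusive, non-blocking advisory lock on lock_path.

    Returns an open file descriptor that holds the lock, or None if another
    open file description (normally another process) already holds it.
    The name and signature are illustrative assumptions.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Someone else holds the lock: an instance is really running.
        os.close(fd)
        return None
    # We own the lock. Any previous content (for example a stale PID left
    # by a crashed process) is overwritten with our own PID.
    os.ftruncate(fd, 0)
    os.write(fd, str(os.getpid()).encode())
    return fd


if __name__ == "__main__":
    import tempfile

    path = os.path.join(tempfile.mkdtemp(), "agent.lock")  # hypothetical path
    held = acquire_instance_lock(path)
    print("lock acquired" if held is not None else "another instance is running")
```

Because the check relies on the lock itself rather than on the file's existence or on the PID it contains, a file left behind by an unexpected termination is simply reused on the next start.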
Parent Issue: #241
Description
The Wazuh agent currently has issues handling its execution modes when run with the --run (foreground) or --start (background) flags. Specifically, launching the agent in foreground with ./wazuh-agent --run can sometimes print the following message:
This message typically indicates that an instance of the agent is already running. However, it may also appear if the agent's previous process terminated unexpectedly, which leads to unreliable behavior.
Proposed Solution
--run and --start behavior:
--run: Should only launch the agent in the foreground without checking if an instance is already running.
--start: Should launch the agent in the background and include checks to ensure no other instance of the agent is running.
When using --start, the agent should:
--run).