[Agent] Enable logging introspection of a specific beat #14437

Closed
elasticmachine opened this issue Apr 15, 2019 · 13 comments

@elasticmachine
Collaborator

Original comment by @michalpristas:

When commanded from the UI, the agent should enable real-time logging for a specific beat.
As stated in LINK REDACTED, this can be achieved in multiple ways:

  • we use the supervisor to read the logs from stdout/stderr, and the supervisor redirects them to the backend
  • we have a filebeat subprocess that the supervisor starts to watch the logs of the other beats.
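
To make the first option concrete, here is a minimal sketch of a supervisor that reads a child's stdout/stderr line by line and hands each line to a forwarding callback. The `supervise` function, the `forward` callback, and the stand-in child command are all hypothetical names for illustration; the real Agent is not implemented this way as far as this thread shows.

```python
import subprocess
import sys
from typing import Callable, List


def supervise(cmd: List[str], forward: Callable[[str, str], None], name: str) -> int:
    """Run cmd, pass every output line to forward(name, line), return the exit code."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout so one reader is enough
        text=True,
    )
    assert proc.stdout is not None
    for line in proc.stdout:
        # A real supervisor would ship this to the backend instead of a callback.
        forward(name, line.rstrip("\n"))
    return proc.wait()


if __name__ == "__main__":
    # Stand-in child process; a real supervisor would launch a beat binary here.
    child = [sys.executable, "-c", "print('starting'); print('ready')"]
    supervise(child, lambda beat, line: print(f"[{beat}] {line}"), "demo-beat")
```

The second option moves this reading loop out of the supervisor and into a separate filebeat process that tails the log files instead.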
@elasticmachine
Collaborator Author

Original comment by @michalpristas:

We need to bring this issue up to date.
So far we have discussed launching an additional filebeat/metricbeat, configured to monitor the output of the running beats and report it to a pre-configured output location.

open questions:

  • will filebeat be started
    • automatically or
    • the desire for a beat to be monitored will be part of the config or
    • will there be a trigger (command [rest/grpc/something else])
  • will we use
    • one beat to monitor multiple beats with compact(merged) configuration or
    • one beat per each monitored beat (each major beat will have its own sidecars)

cc @ph

@elasticmachine
Collaborator Author

Original comment by @ph:

@michalpristas

I think we have three somewhat related introspection mechanisms for a beat.

  1. Internal monitoring: We send events to the appropriate fleet api or we log to the console.
  2. Logging: Send logs of running processes to Elasticsearch.
  3. Metrics: Send monitoring information to Elasticsearch.

For 1, I think it will be automatic.

For 2 and 3, I think they should be an actual configuration.

When Agent is run in standalone mode, we expect users to turn on that option and point the logs and metrics to the appropriate place. Note that these could be different clusters.

When Agent is run with Fleet, Fleet should provide that monitoring information, and it will be set per configuration.

one beat to monitor multiple beats with compact(merged) configuration or
one beat per each monitored beat (each major beat will have its own sidecars)

For the two points above, I think we should have only a single sidecar of each type per running Agent instance, where the types are Metricbeat and Filebeat in this case. Meaning yes, we should merge that configuration.
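
As a rough illustration of what "merging" could mean for the Filebeat sidecar, the sketch below folds the log paths of several monitored beats into one configuration with one input per beat. The output shape is loosely modeled on Filebeat's `filebeat.inputs` list; the `merge_log_inputs` function and the `monitored_beat` field name are assumptions made for this example, not anything agreed in this issue.

```python
from typing import Dict, List


def merge_log_inputs(beat_log_paths: Dict[str, List[str]]) -> dict:
    """Build a single sidecar config containing one log input per monitored beat."""
    inputs = []
    for beat_name in sorted(beat_log_paths):  # sorted for a stable, diffable result
        inputs.append({
            "type": "log",
            "paths": list(beat_log_paths[beat_name]),
            # Hypothetical tag so events can be traced back to the beat they came from.
            "fields": {"monitored_beat": beat_name},
        })
    return {"filebeat.inputs": inputs}
```

With this shape, adding or removing a monitored beat only changes the list of inputs, so one Filebeat process per Agent instance is enough.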

Now, with that in mind, the HOW to monitor the process should be defined in the program specs.

We will need to define the options that will be added to the agent.yml.

cc @mattapperson @urso

@elasticmachine
Collaborator Author

Original comment by @michalpristas:

thanks @ph
I agree on the how. I drew some diagrams of possible solutions and came to the same conclusion: defining the output should be part of the agent spec, but defining the inputs for the MB sidecar and FB sidecar should be part of the program spec.

@elasticmachine
Collaborator Author

Original comment by @urso:

@elastic/stack-monitoring I guess the stack monitoring team might want to chime in here. There is still the question about a minified Beat focusing on stack monitoring. Would it make sense to use this one? Will our current solution on collecting logs/metrics be temporary?

@elasticmachine
Collaborator Author

Original comment by @ycombinator:

Just to provide a bit of background on what Stack Monitoring is currently thinking of doing (this is far from set in stone, it's more of a proposal that's gathering some momentum at this point):

We're imagining that every monitorable stack product (Elasticsearch, Kibana, Logstash, the various Beats, and APM Server) will bundle within its own package a lightweight Metricbeat and possibly Filebeat as well. The product instance will be responsible for configuring and managing the runtime lifecycle of the bundled Metricbeat/Filebeat.

In the context of the Fleet project, what I believe this would mean is that the Agent would start the Beats it needs to per user-provided configuration. Those Beats would then start their bundled lightweight Metricbeat/Filebeat instances if so configured. As @ph mentioned above, that configuration might get injected either by the user (if Agent is running standalone) or from Fleet (if the Agent is working with the Fleet UI).


Personally, I don't think we (Stack Monitoring and Fleet projects) need to block each other on progress here. The good news is that the monitoring bits are more of an implementation detail so if we decide to go one way now and then change it later, the user shouldn't be affected.

Concretely, what I mean is that I imagine there will be some configuration settings in the Agent related to monitoring of the Beats processes being managed by that Agent. Whether the Agent then chooses to apply this configuration to a) sidecar Metricbeat and Filebeat processes, or b) pass it down to the Beats processes for further configuring their bundled Metricbeat or Filebeat processes, or c) something else, it's all hidden away from the user. Thoughts?

@elasticmachine
Collaborator Author

Original comment by @ph:

@michalpristas One thing we didn't discuss is logging when the agent is run in standalone mode. Where are we sending logs in that scenario? I think the following are the current options.

  1. Each beat can write its own logs. (/var/log/$BEATNAME.log)
  2. We combine all the individual logs of the beats into the Agent's logs.
  3. We might need to be able to configure the agent to send the logs of all the running processes to ES.

Option 2 seems like a way to really mess up the event logs / ordering and would probably make things really hard to follow.

Option 1 and Option 3 seem like what we should do, WDYT?

@elasticmachine
Collaborator Author

Original comment by @michalpristas:

Update after last sync:

  1. We use files to monitor logs (later moving to pipes with a valid dropping strategy)
  2. Monitoring is defined on a global level
  3. Monitoring keeps pipelines isolated (sidecar per pipeline)
  4. Each beat within a pipeline logs into: /var/log/{pipeline}/{beat_name}
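
The per-pipeline layout in point 4 could be computed with a small helper like the one below. This is only a sketch: `beat_log_path` and its `base` parameter are hypothetical, and the validation of path components is an added precaution rather than something discussed in the sync.

```python
from pathlib import PurePosixPath


def beat_log_path(pipeline: str, beat_name: str, base: str = "/var/log") -> PurePosixPath:
    """Return {base}/{pipeline}/{beat_name} after rejecting unsafe components."""
    for part in (pipeline, beat_name):
        # Refuse anything that could escape the per-pipeline directory.
        if not part or "/" in part or part in (".", ".."):
            raise ValueError(f"invalid path component: {part!r}")
    return PurePosixPath(base) / pipeline / beat_name
```

Keeping the base directory a parameter makes it easy to move the whole tree under a different root without touching the callers.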

@elasticmachine
Collaborator Author

Original comment by @ph:

  1. /var/log/agent/pipeline/{beat_name}

We also need to set up the rotation strategy when we start it up.
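
The thread does not say which rotation strategy is meant. As one possible reading, the sketch below uses size-based rotation with a fixed number of backups, via Python's standard `logging.handlers.RotatingFileHandler`. The `make_rotating_logger` helper and the size and backup values are assumptions for illustration only.

```python
import logging
import logging.handlers
import os
import tempfile


def make_rotating_logger(path: str, max_bytes: int, backups: int) -> logging.Logger:
    """Create a logger writing to path, rotating by size and keeping `backups` old files."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups
    )
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger = logging.getLogger(path)  # one logger per log file
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep beat output out of the root logger
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    # Demo in a temporary directory; values are deliberately tiny to force rotation.
    demo_path = os.path.join(tempfile.mkdtemp(), "pipeline", "demo-beat")
    demo = make_rotating_logger(demo_path, max_bytes=200, backups=2)
    for i in range(100):
        demo.info("event %04d", i)
    print(sorted(os.listdir(os.path.dirname(demo_path))))
```

Setting the limits at startup, as suggested above, means the monitored files can never grow without bound even if the sidecar falls behind.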

@elasticmachine
Collaborator Author

Original comment by @michalpristas:

We were discussing the right place for the sockets/pipes to live (the ones through which a beat exposes stats to metricbeat).

We agreed that for Linux it is most likely /var/run/fleet/{pipeline}/{beat}.sock
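
A small helper for this layout might look like the sketch below. One practical detail worth checking is path length: on Linux the `sun_path` field of a UNIX socket address holds 108 bytes including the terminating NUL, and some other systems allow less. The `beat_socket_path` function and the `SUN_PATH_MAX` constant are hypothetical names for this example.

```python
SUN_PATH_MAX = 108  # size of sockaddr_un.sun_path on Linux, including the trailing NUL


def beat_socket_path(pipeline: str, beat: str, base: str = "/var/run/fleet") -> str:
    """Return {base}/{pipeline}/{beat}.sock, rejecting paths too long to bind."""
    for part in (pipeline, beat):
        if not part or "/" in part or part in (".", ".."):
            raise ValueError(f"invalid path component: {part!r}")
    path = f"{base}/{pipeline}/{beat}.sock"
    # Leave room for the NUL terminator required by the socket address structure.
    if len(path.encode("utf-8")) >= SUN_PATH_MAX:
        raise ValueError(f"socket path too long for a UNIX socket: {path}")
    return path
```

Long pipeline identifiers are the most likely way to hit the limit, so failing early is cheaper than a bind error at startup.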

I need to check the correct place for these on Windows.

cc @ph

@elasticmachine
Collaborator Author

Original comment by @ph:

Looking at the FHS (Filesystem Hierarchy Standard) at LINK REDACTED:

System programs that maintain transient UNIX-domain sockets must place them in this directory.

Not sure about named pipes under Windows, since I've never worked with them.

@elasticmachine
Collaborator Author

Original comment by @ph:

@michalpristas I just had a quick chat with @graphaelli concerning UNIX sockets and APM-Server. There is one big configuration difference between beats and APM-Server: the process is not run as root. So we will make sure that, when we start the process, the beat can create the UNIX socket on the filesystem and the agent can read it correctly.

With that in mind, we might want to take a closer look at the Windows implementation and see what the requirements are for a named pipe. WDYT?

@elasticmachine
Collaborator Author

Original comment by @michalpristas:

My thought was to create the socket and set its owner to the same user/group as the one running the beat, so the beat should be able to manage it correctly.
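
On a POSIX system that idea could be sketched roughly as follows: bind the socket, then hand ownership to the user/group the beat runs as and restrict the mode. The `create_owned_socket` helper and the `0o660` default mode are assumptions for illustration. Changing the owner to a different user generally requires elevated privileges, so the demo uses the current user.

```python
import os
import socket
import tempfile


def create_owned_socket(path: str, uid: int, gid: int, mode: int = 0o660) -> socket.socket:
    """Bind a UNIX stream socket at path and give it the requested owner and mode."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    if os.path.exists(path):
        os.unlink(path)  # remove a stale socket left behind by a previous run
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(path)
    # Ownership and mode are applied after bind, so there is a short window
    # where the defaults apply; a restrictive parent directory closes that gap.
    os.chown(path, uid, gid)
    os.chmod(path, mode)
    sock.listen(1)
    return sock


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "pipeline", "demo-beat.sock")
    server = create_owned_socket(demo_path, os.getuid(), os.getgid())
    print(demo_path, oct(os.stat(demo_path).st_mode & 0o777))
    server.close()
```

None of this carries over to Windows named pipes, which use a different access-control model.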

Windows is one big question and I will need to look into it, as I also have no experience with named pipes.

@elasticmachine
Collaborator Author

Original comment by @ph:

I've created #13577

@ph added the Agent label Nov 11, 2019
@ph changed the title from "[fleet] Enable logging introspection of a specific beat" to "[Agent] Enable logging introspection of a specific beat" Nov 19, 2019
@michalpristas assigned ph and unassigned michalpristas Nov 26, 2019