[Meta][Feature] Enable filebeat and metricbeat to publish data to the shipper #8
@cmacknz I'm a bit confused about this sentence. When we talked 1 on 1, we agreed that during the very first iteration the gRPC server will be one of the output options along with Elasticsearch, file output, Kafka, etc. Later at the team call I asked the same question to widen the discussion circle, but you answered something different about having a feature flag and switching some logic in the code. I think we have some miscommunication here. I see two options for approaching this task:

**Option 1.** We have it as a new experimental output type which we could configure like this:

```yaml
output:
  shipper:
    server: "localhost:50051"                   # The server address in the format of host:port
    tls: true                                   # Connection uses TLS if true, else plain TCP
    ca_file: "/home/cert"                       # The file containing the CA root cert file
    server_host_override: "x.test.example.com"  # The server name used to verify the hostname returned by the TLS handshake
```

This can be achieved with the following steps:
In this case, changes to the existing code are none or minimal, and we can start working with the new setup, debug, and perform tests. The new output type can be excluded from the documentation if needed. Later we can replace the whole pipeline implementation when we feel the shipper is ready.

**Option 2.** We have a feature flag that switches the pipeline to a separate implementation which starts sending events to the shipper instead of the configured outputs. This will require us:

```yaml
shipper:
  server: "localhost:50051"                   # The server address in the format of host:port
  tls: true                                   # Connection uses TLS if true, else plain TCP
  ca_file: "/home/cert"                       # The file containing the CA root cert file
  server_host_override: "x.test.example.com"  # The server name used to verify the hostname returned by the TLS handshake
```

The major drawback here is that we would need more time, and we would have to make many changes to the existing code (which can affect stability) instead of just adding new code. On the other hand, we would need to do that at some point anyway.
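To make the option 1 shape concrete, here is a minimal sketch of what such an experimental output could look like. All type and method names (`EventSender`, `shipperOutput`, `Publish`) are hypothetical stand-ins, not the real beats API; a real implementation would wrap the generated gRPC client stub instead of the mock used here.

```go
package main

import (
	"errors"
	"fmt"
)

// Event is a stand-in for a beat event (the real pipeline uses beat.Event).
type Event struct {
	Fields map[string]interface{}
}

// EventSender abstracts the shipper's publish call; in a real implementation
// this would be backed by the shipper's generated gRPC client.
type EventSender interface {
	Send(events []Event) error
}

// shipperOutput is an option-1 style output: it takes a batch from the
// beat pipeline and forwards it to the shipper unchanged.
type shipperOutput struct {
	client EventSender
}

// Publish forwards one batch; retry/ack handling is deliberately omitted.
func (s *shipperOutput) Publish(batch []Event) error {
	if len(batch) == 0 {
		return nil
	}
	if err := s.client.Send(batch); err != nil {
		return fmt.Errorf("shipper output: publish failed: %w", err)
	}
	return nil
}

// mockSender records events so the sketch can run without a gRPC server.
type mockSender struct{ sent int }

func (m *mockSender) Send(events []Event) error {
	if events == nil {
		return errors.New("nil batch")
	}
	m.sent += len(events)
	return nil
}

func main() {
	sender := &mockSender{}
	out := &shipperOutput{client: sender}
	_ = out.Publish([]Event{{Fields: map[string]interface{}{"message": "hello"}}})
	fmt.Println("events sent:", sender.sent) // prints "events sent: 1"
}
```

The point of the sketch is that, as an output type, the shipper connection slots into the existing pipeline without touching queueing or processing code.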
I recommend option 1, as it will be simpler to implement and maintain in the long term. It follows the model currently used by Elastic Agent to configure outputs for beats.
I also prefer option 1, so we don't have a special case or transformation to do.
I'm not sure how option 1 fits with the other pending pieces. I think perhaps there's been some confusion with the "output" language that is being used for two different stages of processing: (1) sending data from the input to the processor / shipper before it enters the queue, and (2) sending final event data from the shipper to the upstream target (Elasticsearch, Logstash, etc.) after it exits the queue. So I'm not sure how option 1 would fit right now. I wonder if the confusion about approaches comes from the use of "output" to refer to both of those components? Because option 1 sounds to me like a reasonable sketch of the output of the shipper, but as I understand it, in the first pass we're just handling that with a placeholder raw-file output.
Yes, the language isn't precise enough, and the fact that the beat pipeline and the shipper will have overlapping functionality doesn't help. My view is that the development needs to be an iterative process where we start with some duplication between the beat and the shipper just to get them connected to each other, and then slowly migrate functionality from the beat side into the shipper when run under agent.

I think initially we start with option 1, where we just make it possible for a beat to communicate with the shipper over gRPC. Both the beat and the shipper at this stage have a memory queue, and the processors only exist on the beat side. This is what the diagram in the issue description is trying to show :) Once we have that, we next work on trying to remove the queuing from the beat side, followed by processing. At this point we may need to consider something like option 2 to strip down what the beat/input needs to run.

I like starting with Denis' option 1 to get a faster end-to-end prototype. Once we have that and can test the interaction between the beats and the shipper, we will likely need to consider something like option 2. I think we'll be better positioned to make design adjustments after we have a quick prototype than by pursuing larger changes from the beginning. I could be convinced otherwise though.
Ah ok, so the redundancy in the memory queue is an intentional temporary workaround? In that case fair enough, let's continue :-)
Does adding a feature flag make sense in beats? It is basically just a setting that enables or disables features. How is that different from setting
I've updated the description and added a checklist for tracking the progress. One thing which is not 100% clear to me is the input and data stream options. I could not find a simple way to propagate these parameters through the event batches, so I'm going to address this as a separate issue after the initial implementation is there; that way it's not blocking any experiments with the new shipper architecture. The same goes for the integration tests; they will be implemented separately.
Thanks! I have a separate issue already for returning acknowledgements from the shipper: #9. I expected that would be too much work to fold into this issue. The input and data stream will have to be propagated from the agent policy, which we may not do yet. We may not need the data stream until we implement processors in the shipper, at which point we'll need a way to apply the correct processors to events based on the input and data stream.
Added #34 as part of this work.
All tasks complete, closing. |
This is a feature meta issue to allow filebeat and metricbeat to publish data to the shipper when run under Elastic agent. All other beats are out of scope.
An output for existing beats should be implemented that publishes to the shipper gRPC interface. When the shipper gRPC output is used, the beat output pipeline should be configured to be as simple as possible. Using a per-beat disk queue with the shipper is forbidden. A memory queue may be used with the shipper output, but how it should be configured by users will require careful consideration. Ideally any necessary queue configuration can be made automatic.
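A minimal sketch of what that simplified pipeline configuration could look like. The option names below are illustrative assumptions, not a final design; in particular the memory queue sizing is something we'd ideally derive automatically rather than expose.

```yaml
# Hypothetical sketch only: key names and values are illustrative.
queue:
  mem:
    events: 4096   # small in-memory buffer; ideally sized automatically
# No disk queue: queue.disk must not be combined with the shipper output.
output:
  shipper:
    server: "localhost:50051"
```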
Removing processors from beats is out of scope for this issue. Processors will be removed in a later issue.
This feature is considered complete when at least the following criteria are satisfied for both filebeat and metricbeat:
The assignee of this issue is expected to create the development plan with all child issues for this feature. The following set of tasks should be included in the initial issues at a minimum:
UPD by @rdner
I split this in the following steps:

- `shipper` output type #22
- `ResourceExhausted` code is returned from the gRPC server; TTL does not decrease in this case