Synchronize the shipper configuration model with what is implemented in the agent #161

cmacknz · 2022-11-04T19:15:53Z

The shipper currently defines a configuration file with the following format: https://github.com/elastic/elastic-agent-shipper/blob/main/elastic-agent-shipper.yml

This configuration file was implemented before the design decisions needed to integrate the shipper into the agent were finalized. We need to update it to match the final design decisions. Specifically it was decided that:

The agent will provide the shipper with a copy of the entire agent policy, so that it can parse out the fields it needs to implement features that depend on input configuration, like processing. See: Provide the relevant subset of the agent policy to the shipper as its configuration, elastic-agent#617 (comment).
The shipper will be enabled on a per-output basis by including a shipper: sub-object in the configuration for each agent output. All new shipper-specific configuration should be nested under this shipper: configuration key, including queue configuration. See: Define a feature flag for enabling the shipper in the agent policy, defaulting to false, elastic-agent#217 (comment). For example:

outputs:
  default:
    type: elasticsearch
    hosts: https://localhost:9200
    shipper:
      enabled: true
      queue: {}

The shipper's control protocol connection will receive both input and output units. The output units will match the output section of an agent policy. There will be one input unit per connected component (process or Beat) configuring the gRPC connection between that component and the shipper, including the mTLS certificate to use.
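
As a rough illustration of the per-output shipper: key described above, here is a minimal Go sketch of how such a policy could be modelled and checked. The type names (Policy, OutputConfig, ShipperConfig), the UsesShipper helper, and the "legacy" output are assumptions made for this example, not the shipper's actual types. JSON is used only because the Go standard library has no YAML decoder; the structure mirrors the YAML example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ShipperConfig is a hypothetical model of the `shipper:` sub-object.
type ShipperConfig struct {
	Enabled bool                   `json:"enabled"`
	Queue   map[string]interface{} `json:"queue"`
}

// OutputConfig is a hypothetical model of one entry under `outputs:`.
// Shipper is a pointer so that an absent `shipper:` key can be detected.
type OutputConfig struct {
	Type    string         `json:"type"`
	Shipper *ShipperConfig `json:"shipper"`
}

// Policy holds only the part of the agent policy this sketch cares about.
type Policy struct {
	Outputs map[string]OutputConfig `json:"outputs"`
}

// UsesShipper reports whether the output opts in to the shipper.
// A missing `shipper:` key or a missing `enabled` field both yield false.
func (o OutputConfig) UsesShipper() bool {
	return o.Shipper != nil && o.Shipper.Enabled
}

// parsePolicy decodes the JSON form of a policy; unknown fields are ignored.
func parsePolicy(raw string) (Policy, error) {
	var p Policy
	err := json.Unmarshal([]byte(raw), &p)
	return p, err
}

// examplePolicy mirrors the YAML example, plus a hypothetical "legacy"
// output that has no shipper key at all.
const examplePolicy = `{
  "outputs": {
    "default": {
      "type": "elasticsearch",
      "hosts": "https://localhost:9200",
      "shipper": {"enabled": true, "queue": {}}
    },
    "legacy": {
      "type": "elasticsearch",
      "hosts": "https://localhost:9200"
    }
  }
}`

func main() {
	p, err := parsePolicy(examplePolicy)
	if err != nil {
		panic(err)
	}
	for _, name := range []string{"default", "legacy"} {
		fmt.Printf("%s: shipper enabled = %v\n", name, p.Outputs[name].UsesShipper())
	}
}
```

Using a pointer for the shipper section keeps "not configured" distinct from "configured but disabled", while both still resolve to the default of false.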

These changes are implemented in elastic/elastic-agent#1527 which adds support for the shipper to agent.

The scope of this issue is to incorporate the new shipper configuration model into the shipper, and test that it is compatible with the agent changes in elastic/elastic-agent#1527.

Key cases to test are:

Multiple components (Beat processes) can successfully connect to a single shipper instance and publish events.
Multiple outputs can be configured in an agent policy, each individually configuring a data shipper. Ensure that each individual shipper process is started and configured correctly.


cmacknz · 2022-11-08T18:39:39Z

This is the Fleet UI implementation issue for supporting the shipper in the agent policy: elastic/kibana#141508

In particular I've provided some examples of how the shipper queue and output parameters can be configured in elastic/kibana#141508 (comment)

fearful-symmetry · 2022-11-14T20:44:23Z

The shipper will be enabled on a per output basis by including a shipper: sub-object in the configuration for each agent output.

@cmacknz is the plan to support multiple outputs in that config? Right now it's built on the assumption that we'll get one "main" output config that determines the behavior of the shipper.

cmacknz · 2022-11-14T20:56:49Z

When an agent policy configures multiple outputs, the agent will start multiple independent shipper processes. Each shipper process will have a single output type configured (ES, Logstash, Kafka, etc.).

We may support multiple outputs in a single shipper process in the far future. For now we do not.
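
To make the one-shipper-process-per-output rule concrete, here is a small Go sketch that derives the list of shipper processes from a set of outputs. Everything in it is hypothetical: the Output and ShipperProcess types, the PlanShippers function, and the "shipper-<output name>" ID scheme are assumptions for illustration, not the agent's actual code.

```go
package main

import (
	"fmt"
	"sort"
)

// Output is a hypothetical, reduced view of one agent policy output.
type Output struct {
	Type           string // e.g. "elasticsearch", "logstash", "kafka"
	ShipperEnabled bool   // value of shipper.enabled, false when absent
}

// ShipperProcess describes one independent shipper process to start.
type ShipperProcess struct {
	ID         string // assumed naming scheme: "shipper-" + output name
	OutputName string
	OutputType string
}

// PlanShippers returns one shipper process per output that enables the
// shipper. Each process is bound to exactly one output, and the result is
// sorted by output name so that it is deterministic.
func PlanShippers(outputs map[string]Output) []ShipperProcess {
	names := make([]string, 0, len(outputs))
	for name, out := range outputs {
		if out.ShipperEnabled {
			names = append(names, name)
		}
	}
	sort.Strings(names)

	procs := make([]ShipperProcess, 0, len(names))
	for _, name := range names {
		procs = append(procs, ShipperProcess{
			ID:         "shipper-" + name,
			OutputName: name,
			OutputType: outputs[name].Type,
		})
	}
	return procs
}

func main() {
	procs := PlanShippers(map[string]Output{
		"default":    {Type: "elasticsearch", ShipperEnabled: true},
		"monitoring": {Type: "elasticsearch", ShipperEnabled: true},
		"direct":     {Type: "logstash", ShipperEnabled: false},
	})
	for _, p := range procs {
		fmt.Printf("%s -> output %q (%s)\n", p.ID, p.OutputName, p.OutputType)
	}
}
```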

fearful-symmetry · 2022-11-17T00:23:50Z

Alright, updating this with the current state, and issues, since it's kind of a large set of changes to be made:

Still figuring out how to handle the config coming from the input units. The shipper wasn't designed on the assumption that each input unit would get its own gRPC socket, so there will be a certain amount of retooling of the gRPC server code.
Logging config is a bit hacky as we wait for this: Capture stdout/stderr of spawned components elastic-agent#1702
My ability to test this accurately is somewhat limited, since fleet doesn't seem to generate any kind of shipper config, so I'm just adding it myself.
The output config could do with hard-coded shipper settings: Namespace config values in fleet output elastic-agent#1729
The input settings could use hard-coded fields for shipper-specific connection config: Input configs from units should hard-code shipper connection settings elastic-agent#1744
Right now, elastic-agent spins up two instances of the shipper (a normal one, and one for monitoring), and I'm a tad worried that things like network ports might step on each other.
I'm not sure how certain auxiliary things, like expvar monitoring, will be configured. The output unit? the CLI?

cmacknz · 2022-11-17T00:50:18Z

that each input unit would get its own gRPC socket, so there will be a certain amount of retooling the gRPC server code

There should be a single gRPC server in the shipper, which accepts a connection from each Beat. In agent terminology there should be a connection per component (process), not per unit (input). If this isn't what is being configured we should consider changing it.

Right now, elastic-agent spins up two instances of the shipper (a normal one, and one for monitoring), and I'm a tad worried that things like network ports might step on each other

We shouldn't be configuring a second shipper for monitoring, unless monitoring is shipping to a separate Elasticsearch cluster.

I'm not sure how certain auxiliary things, like expvar monitoring, will be configured. The output unit? the CLI?

I would follow what Beats does, although I haven't looked at it recently enough to say off the top of my head what that is. Presumably it follows what is set in the agent.monitoring section of the policy.

fearful-symmetry · 2022-11-17T03:44:23Z

@cmacknz

There should be a single gRPC server in the shipper, which accepts a connection from each Beat. In agent terminology there should be a connection per component (process), not per unit (input)

Ah, sorry, still learning the logic of how input configs work. The fake shipper used for testing elastic agent spins up a new server with every input, using the TLS and server settings specified in that input unit config, but looking at the configs I'm seeing as I develop against elastic-agent, a given instance of the shipper will get the same server endpoint in every input config, which seems to indicate that we can use any given input config to spin up the shipper's gRPC server. However, it also implies that different inputs can potentially expect different gRPC server endpoints. Might need some clarification from @blakerouse here.

We shouldn't be configuring a second shipper for monitoring, unless monitoring is shipping to a separate Elasticsearch cluster.

Right now we're starting two instances of the shipper, one with the ID shipper-default and another with shipper-monitoring:

ps aux | grep shipper
alexk    1499610  0.0  0.0 1691352 27096 pts/17  Sl+  19:36   0:00 /home/alexk/go/src/github.com/elastic/elastic-agent/build/distributions/elastic-agent-8.6.0-linux-x86_64/data/elastic-agent-8eb334/components/shipper -E logging.level=debug -E logging.files.path=./ -E logging.files.name=shipper-hack -d * -E path.data=/home/alexk/go/src/github.com/elastic/elastic-agent/build/distributions/elastic-agent-8.6.0-linux-x86_64/data/elastic-agent-8eb334/run/shipper-default
alexk    1499675  0.0  0.0 1765852 27880 pts/17  Sl+  19:36   0:00 /home/alexk/go/src/github.com/elastic/elastic-agent/build/distributions/elastic-agent-8.6.0-linux-x86_64/data/elastic-agent-8eb334/components/shipper -E logging.level=debug -E logging.files.path=./ -E logging.files.name=shipper-hack -d * -E path.data=/home/alexk/go/src/github.com/elastic/elastic-agent/build/distributions/elastic-agent-8.6.0-linux-x86_64/data/elastic-agent-8eb334/run/shipper-monitoring

They seem to get the same output config, but different input units, presumably based on what inputs are used for cluster self-monitoring.
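
The observation above, that every input unit sent to a given shipper instance carries the same server endpoint, can be turned into an explicit check before starting the single gRPC server. The Go sketch below is only an illustration: the InputUnit type, its field names, and the example socket path are hypothetical and not taken from the agent's unit config.

```go
package main

import (
	"errors"
	"fmt"
)

// InputUnit is a hypothetical, reduced view of the connection settings
// carried by one input unit.
type InputUnit struct {
	ID     string
	Server string // gRPC endpoint the component expects to dial
}

// ErrNoUnits is returned when no input unit has been received yet.
var ErrNoUnits = errors.New("no input units received")

// ListenerAddress returns the single endpoint shared by all input units.
// It fails loudly if two units disagree, instead of silently starting a
// second listener.
func ListenerAddress(units []InputUnit) (string, error) {
	if len(units) == 0 {
		return "", ErrNoUnits
	}
	addr := units[0].Server
	for _, u := range units[1:] {
		if u.Server != addr {
			return "", fmt.Errorf("input unit %q expects endpoint %q, but unit %q expects %q",
				u.ID, u.Server, units[0].ID, addr)
		}
	}
	return addr, nil
}

func main() {
	units := []InputUnit{
		{ID: "filestream-default", Server: "unix:///tmp/shipper-default.sock"},
		{ID: "system/metrics-default", Server: "unix:///tmp/shipper-default.sock"},
	}
	addr, err := ListenerAddress(units)
	fmt.Println(addr, err)
}
```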

cmacknz · 2022-11-17T06:45:49Z

Got it, thanks for clarifying.

However, it also implies that different inputs can potentially expect different gRPC server endpoints. Might need some clarification from @blakerouse here.

The way to view this is that the agent is incredibly flexible in what it can provision. In the extreme, each individual input is its own process connected to an input-specific shipper. From an architecture perspective it makes sense to have this level of flexibility.

However in practice we aren't going to do that. We are going to connect each Beat process (component) to a shipper, not each input. At least in the current iteration of V2.

They seem to get the same output config, but different input units, presumably based on what inputs are used for cluster self-monitoring.

This makes sense. It is certainly easier to always provision a monitoring shipper and a non-monitoring shipper unconditionally, but I don't think we should be creating two shipper instances (and therefore two queues and outputs to be tuned) unless the user explicitly wants to send monitoring to a different instance.

Regardless this is something to address in the agent and you can ignore it for the purpose of this issue.

blakerouse · 2022-11-23T00:11:07Z

@cmacknz

There should be a single gRPC server in the shipper, which accepts a connection from each Beat. In agent terminology there should be a connection per component (process), not per unit (input)

Ah, sorry, still learning the logic of how input configs work. The fake shipper used for testing elastic agent spins up a new server with every input, using the TLS and server settings specified in that input unit config, but looking at the configs I'm seeing as I develop against elastic-agent, a given instance of the shipper will get the same server endpoint in every input config, which seems to indicate that we can use any given input config to spin up the shipper's gRPC server. However, it also implies that different inputs can potentially expect different gRPC server endpoints. Might need some clarification from @blakerouse here.

The fake shipper is very simple, and you are correct that its implementation would start multiple gRPC listeners, one per unit, but the tests know that there will only ever be one unit, so it doesn't really need to worry about that. The real shipper should only start one listener per process. The "server" address will always be the same for each input unit; the certificates will be different. You can use the connecting address to determine the correct certificate to serve. Here is an example of how the Elastic Agent does that for its control protocol:

https://github.com/elastic/elastic-agent/blob/main/pkg/component/runtime/manager.go#L718
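
For readers who want the general shape of that approach without opening the link, here is a minimal Go sketch of one listener serving a different certificate per connecting component. It assumes the "connecting address" is available as the TLS server name in the client hello; that assumption, the CertRegistry type, the placeholderCert helper, and the component names are all hypothetical and not copied from the agent's code. Only the standard library's tls.Config.GetCertificate hook is real API.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"sync"
)

// CertRegistry maps the name a component connects with to the
// certificate the shipper should present to that component.
type CertRegistry struct {
	mu    sync.RWMutex
	certs map[string]*tls.Certificate
}

func NewCertRegistry() *CertRegistry {
	return &CertRegistry{certs: make(map[string]*tls.Certificate)}
}

// Set registers or replaces the certificate for a server name, for
// example when a new input unit arrives.
func (r *CertRegistry) Set(serverName string, cert *tls.Certificate) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.certs[serverName] = cert
}

// ForServerName looks up the certificate registered for a server name.
func (r *CertRegistry) ForServerName(name string) (*tls.Certificate, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	cert, ok := r.certs[name]
	if !ok {
		return nil, fmt.Errorf("no certificate registered for server name %q", name)
	}
	return cert, nil
}

// GetCertificate has the signature expected by tls.Config.GetCertificate.
func (r *CertRegistry) GetCertificate(chi *tls.ClientHelloInfo) (*tls.Certificate, error) {
	return r.ForServerName(chi.ServerName)
}

// TLSConfig builds a server config for a single listener that picks the
// certificate per connection and requires client certificates (mTLS).
func (r *CertRegistry) TLSConfig(clientCAs *x509.CertPool) *tls.Config {
	return &tls.Config{
		GetCertificate: r.GetCertificate,
		ClientAuth:     tls.RequireAndVerifyClientCert,
		ClientCAs:      clientCAs,
		MinVersion:     tls.VersionTLS12,
	}
}

// placeholderCert stands in for a real certificate, which would normally
// be built from the PEM data in the input unit with tls.X509KeyPair.
func placeholderCert() *tls.Certificate {
	return &tls.Certificate{}
}

func main() {
	reg := NewCertRegistry()
	reg.Set("filebeat-default", placeholderCert())
	reg.Set("metricbeat-default", placeholderCert())

	_, err := reg.GetCertificate(&tls.ClientHelloInfo{ServerName: "filebeat-default"})
	fmt.Println("known component error:", err)

	_, err = reg.GetCertificate(&tls.ClientHelloInfo{ServerName: "unknown"})
	fmt.Println("unknown component error:", err)

	cfg := reg.TLSConfig(x509.NewCertPool())
	fmt.Println("mTLS required:", cfg.ClientAuth == tls.RequireAndVerifyClientCert)
}
```

The resulting tls.Config could then be wrapped for a gRPC server, which keeps a single listener while still presenting a per-component certificate.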

blakerouse · 2022-11-23T00:12:49Z

Regardless this is something to address in the agent and you can ignore it for the purpose of this issue.

@cmacknz We will need to fix how the monitoring output configuration is added to the Elastic Agent. At the moment it is defined as a separate output, which is why the Elastic Agent is performing that behavior.

cmacknz · 2022-11-29T01:03:14Z

Draft PR for shipper support in Fleet: elastic/kibana#145755

fearful-symmetry · 2022-12-05T17:47:58Z

@blakerouse thanks for the clarification, I got somewhat confused just looking at the code.

cmacknz · 2023-01-10T19:26:04Z

@cmacknz We will need to fix how the monitoring output configuration is added to the Elastic Agent. At the moment it is defined as a separate output, which is why the Elastic Agent is performing that behavior.

I created an issue to track this change elastic/elastic-agent#2078

cmacknz added the Team:Elastic-Agent (Label for the Agent team) and v8.7.0 labels Nov 4, 2022

cmacknz assigned fearful-symmetry Nov 4, 2022

cmacknz added the estimation:Week (Task that represents a week of work) label Nov 4, 2022

This was referenced Nov 4, 2022

[Meta] Elastic Agent Shipper Project #16

Open

Add integration tests between the Beats shipper client and the shipper server elastic/beats#33205

Closed

[Fleet] Add option to enable disk queue for elastic-agent-shipper elastic/kibana#141508

Closed

fearful-symmetry mentioned this issue Nov 17, 2022

Update the shipper to work with the elastic-agent #185

Merged

cmacknz mentioned this issue Jan 10, 2023

The agent should use a single shipper for regular and monitoring data by default elastic/elastic-agent#2078

Closed

fearful-symmetry closed this as completed in #185 Feb 9, 2023
