Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Type field fixes in central management #33921

Merged
merged 8 commits into from
Dec 6, 2022

Conversation

fearful-symmetry
Copy link
Contributor

What does this PR do?

This is a fix for elastic/elastic-agent#1807 . It fixes two different but related issues:

  • The index field was generated with the top-level data_stream field, and not the stream-level field, which produced the wrong index string.
  • There's an issue where not all input configs have a stream-level type field; this adds a fix to filebeat and auditbeat to manuall add the type field to the stream based on the top-level input type.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

@fearful-symmetry fearful-symmetry self-assigned this Dec 1, 2022
@fearful-symmetry fearful-symmetry requested review from a team as code owners December 1, 2022 23:52
@fearful-symmetry fearful-symmetry requested review from belimawr and leehinman and removed request for a team December 1, 2022 23:52
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Dec 1, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

@mergify
Copy link
Contributor

mergify bot commented Dec 1, 2022

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @fearful-symmetry? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

@elasticmachine
Copy link
Collaborator

elasticmachine commented Dec 1, 2022

❕ Build Aborted

There is a new build on-going so the previous on-going builds have been aborted.

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview

Expand to view the summary

Build stats

  • Start Time: 2022-12-05T19:05:54.219+0000

  • Duration: 19 min 9 sec

Test stats 🧪

Test Results
Failed 0
Passed 7
Skipped 1
Total 8

Steps errors 1

Expand to view the steps failures

Install precommit
  • Took 0 min 3 sec . View more details here
  • Description: PIP_COMMAND=pip if command -v pip3 ; then PIP_COMMAND=pip3 fi ${PIP_COMMAND} install pre-commit

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

Copy link
Member

@cmacknz cmacknz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will fix things. Have you confirmed that starting Filebeat under agent with an input other than filestream or logs works as expected?

The winlog input reproduced this problem reliably: elastic/elastic-agent#1800

Winlog should only run on Windows though, but I think this problem would occur with any non log or filestream input type like httpjson that is quite widely used: https://github.com/elastic/elastic-agent/blob/8aae0d629e3765fb2fb42e4e111df17a895a6745/specs/filebeat.spec.yml#L94

x-pack/filebeat/cmd/agent.go Show resolved Hide resolved
@@ -23,13 +24,22 @@ var RootCmd *cmd.BeatsRootCmd
// heartbeatCfg is a callback registered via SetTransform that returns a Elastic Agent client.Unit
// configuration generated from a raw Elastic Agent config
func heartbeatCfg(rawIn *proto.UnitExpectedConfig, agentInfo *client.AgentInfo) ([]*reload.ConfigWithMeta, error) {
//grab and properly format the input streams
inputStreams, err := management.CreateInputsFromStreams(rawIn, "metrics", agentInfo)
modules, err := management.CreateInputsFromStreams(rawIn, "metrics", agentInfo)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
modules, err := management.CreateInputsFromStreams(rawIn, "metrics", agentInfo)
modules, err := management.CreateInputsFromStreams(rawIn, "synthetics", agentInfo)

I think synthetics is the right default here most of the time, although defaulting to metrics is more obviously wrong so it makes mistakes more obvious.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few other Beats where the default probably isn't right:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, alright, good catch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re, synthetics, I thought the index fields had to be one of logs or metrics ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it should be, but the type part of the datastream is definitely being set to synthetics in some cases:

ds.Type = "synthetics"

We have to keep the existing behaviour here to avoid breaking things.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The packetbeat and auditbeat default type still needs to be updated to from "metrics" to "logs" from what I can see.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Synthetics has its own type and always has, directly supported by ES's list of golden types (or whatever it's properly called).

Let us know if you have any other Qs about how synthetics handles stuff here.

x-pack/heartbeat/cmd/root.go Outdated Show resolved Hide resolved
}

configList, err := management.CreateReloadConfigFromInputs(inputStreams)
// Extract the stream-level type from the input
typeField := strings.Split(rawIn.Type, "/")[1]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spec files have an exhaustive list of the possible input types, perhaps we should have tests ensuring they are correctly translated into either an input type or metricset for each Beat?

For synthetics the list is here: https://github.com/elastic/elastic-agent/blob/d3496acb1fcd0da7e4f6a837c339c9b5cd174c82/specs/heartbeat.spec.yml#L3-L46

We can at least link to the spec file in a comment here to explain what we are parsing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good idea.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems right to me. We may add a "scripted" type in the future, but for now it's just browser http tcp icmp

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should mention however, that browser monitors output 3 different stream types browser browser_screenshot and browser_network. Does that impact your work here. So, that one input creates three separate outputs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I should mention that those are data sets, not types. Everything goes under synthetics for us. The full list of data set types is listed here: https://www.elastic.co/guide/en/observability/current/synthetics-manage-retention.html

Copy link
Member

@cmacknz cmacknz Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, heartbeat/synthetics is able to parse all of its configuration itself so at this level we just need to pass it through and let it be dealt with in Heartbeat. We actually removed looking at the type field here in Heartbeat, as it isn't the same as Filebeat for example where the type field at the input level is mandatory to start the right Filebeat input.

x-pack/libbeat/management/generate.go Outdated Show resolved Hide resolved
x-pack/libbeat/management/generate.go Show resolved Hide resolved
x-pack/libbeat/management/generate_test.go Outdated Show resolved Hide resolved
@fearful-symmetry
Copy link
Contributor Author

Going through all the different input configs I've seen while debugging this, it's making me a tad paranoid and I want to think about a more defensive way to do this, since the data_stream and type field are a lot less predictable than I anticipated.

Agreed that we should have a more formalized spec for some of this too.

@cmacknz
Copy link
Member

cmacknz commented Dec 2, 2022

type field are a lot less predictable than I anticipated.

The top level input type field controls which binary the agent starts, so it is predictible in that it always has to be there and be listed in the spec file. The thing to watch out for is that the agent policy supports aliases for several input types, for example there are three ways to specify that the Filebeat log input should be started: https://github.com/elastic/elastic-agent/blob/8aae0d629e3765fb2fb42e4e111df17a895a6745/specs/filebeat.spec.yml#L115

  - name: log
    aliases:
      - logfile
      - event/file

@fearful-symmetry
Copy link
Contributor Author

@cmacknz yah, I'm referring to the stream-level type fields. This could be a potentially be a quirk/bug in the heartbeat integrations themselves though, since it looks like some heartbeat integrations have the type field in the stream, and some don't.

@cmacknz
Copy link
Member

cmacknz commented Dec 2, 2022

I'm referring to the stream-level type fields. This could be a potentially be a quirk/bug in the heartbeat integrations themselves though, since it looks like some heartbeat integrations have the type field in the stream, and some don't.

My understanding is that we should only care about two type fields, the input level type which we've already discussed, and the data_stream.type field which is optional and can be in either input.data_stream.type or input.streams.stream[n].data_stream.type.

If I look at the sample synthetics configuration we were provided, it includes a type: icmp at the stream level that is non-standard and we shouldn't have to care about. That is just another piece of integration specific configuration that can freely be placed at the stream level, it's essentially equivalent to the name field above it.

    streams:
      - id: ccfdfd04-7c1c-4c17-93ef-f0d22ab0a1d5
        name: 8.8.8.8
        type: icmp # <--- I don't think we care about this in the agent
        enabled: true
        data_stream:
          dataset: icmp
          type: synthetics # <--- We do care about this type, since it controls the target index

Does that clarify things?

@fearful-symmetry
Copy link
Contributor Author

@cmacknz yah, I'm referring to the type field issues found here: elastic/elastic-agent#1807 (comment) while heartbeat was parsing its own config. This is the type field that I'm worried about stepping on in the individual beat callbacks like this:

	// Extract the module name from the stream-level type
	// these types are defined in the elastic-agent's specfiles
	for iter := range modules {
		if _, ok := modules[iter]["type"]; !ok {
			modules[iter]["type"] = rawIn.Type
		}
	}

Looking at the integrations, it appears that sometimes the synthetic integrations have the stream-level type field, and sometimes they don't, which is where I assume the errors are coming from in the linked issue. Considering how this kind of inconsistency is apparently possible, I just want to make sure we're not breaking anything.

@cmacknz
Copy link
Member

cmacknz commented Dec 2, 2022

Right, I am thinking this from the perspective of parsing the agent policy. There is the second problem of where to put the type field in the resulting beat configuration.

Given we have:

- id: id1
  type: synthetics/icmp # <--- This is mandatory to run under agent
  use_output: default
  data_stream:
    namespace: default
  streams:
    - id: ccfdfd04-7c1c-4c17-93ef-f0d22ab0a1d5
      name: 8.8.8.8
      type: icmp # <--- This is specific to the synthetics integration configuration, likely has the same meaning as the input.type above.
      enabled: true
      data_stream:
        dataset: icmp
        type: synthetic

Looking at the reference heartbeat configuration at https://www.elastic.co/guide/en/beats/heartbeat/current/heartbeat-reference-yml.html I believe both the input.type and the input.streams.stream[n].type field here could both end up in the heartbeat.monitor.type.

The Heartbeat configuration format is completely different than the one for Filebeat, I don't think that just inserting the type field unconditionally makes sense. It looks like it would need to go into heartbeat.monitor.type.

I'm not even sure that adding logic to extract type makes sense in the heartbeatCfg function, we should just pass through the configuration unmodified. Heartbeat already knows how to parse this I think, it wasn't relying on configuration transformations in V1.

The primary bug for heartbeat is us parsing the data_stream fields incorrectly.

@fearful-symmetry
Copy link
Contributor Author

Alright, a few last-minute changes before the weekend; I feel like we're going to be tinkering with the logic for extracting from data_streams for a while, though.

@cmacknz
Copy link
Member

cmacknz commented Dec 5, 2022

Can we always log the configurations that are created at info level? Or at least the input type (when applicable) and data streams that are extracted from the Unit configurations?

Without this problems in this code are undebuggable from agent diagnostics bundles.

@fearful-symmetry
Copy link
Contributor Author

@cmacknz there's a PR here to aid with the debugability here: #33940

I have another upcoming PR that adds some more debug-level statements about unit config at the level of the elastic-agent-client, which might also help.

Do you thing it's worth it to add more debug statements about data_stream state?

@cmacknz
Copy link
Member

cmacknz commented Dec 5, 2022

I think #33940 will get us 99% of what we need.

@fearful-symmetry fearful-symmetry requested a review from a team as a code owner December 5, 2022 20:48
Copy link
Member

@cmacknz cmacknz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks!

@cmacknz cmacknz added the backport-v8.6.0 Automated backport with mergify label Dec 5, 2022
@fearful-symmetry
Copy link
Contributor Author

So, now realizing that this only works with some inputs but not others, as the syslog integrations has a type logfile which doesn't seem to be integrated in filebeat:

Error creating runner from config: Error creating input. No such input type exist: 'logfile'

Not sure if logfile is just weird, or if I'm misunderstanding something in filebeat.

@cmacknz
Copy link
Member

cmacknz commented Dec 5, 2022

Hmm, so the agent defines the logfile input as an alias of the log input in the spec file: https://github.com/elastic/elastic-agent/blob/eaecd0d44014c592128d365f01f40f5f4ed26a29/specs/filebeat.spec.yml#L115

  - name: log
    aliases:
      - logfile
      - event/file

We have two options:

  1. My preferred option which is to have the agent translate the aliases to the expected log name the Beat understands.
  2. Handle this translation on the Beat side, but since we can add new aliases at any time via the spec it seems like the agent side should be dealing with.

@blakerouse do you have any opinion here?

@fearful-symmetry
Copy link
Contributor Author

Yah, using type "log" works.

@blakerouse
Copy link
Contributor

@cmacknz It is agents job to translate this not beats. This is a bug in agent. I will look into getting this fixed properly.

@cmacknz
Copy link
Member

cmacknz commented Dec 6, 2022

@fearful-symmetry let's get this merged then and backported into the 8.6 branch to make the build candidate today.

@andrewvc
Copy link
Contributor

andrewvc commented Dec 6, 2022

I'm a bit confused about the discussion here, it'd help to make sure we are on the same page wrt data_stream.dataset vs. data_stream.type

For synthetics it should always be:

data_stream.type: synthetics
data_stream.dataset: http|icmp|tcp|browser|browser_network|browser_screenshot

The type in the input yml for synthetics dictates which code executes the config, which for http|icmp|tcp matches the data stream type.

For browsers the input yml type of browser will result in three separate datasets being written: browser browser_screenshot
and browser_network.

@cmacknz
Copy link
Member

cmacknz commented Dec 6, 2022

I'm a bit confused about the discussion here, it'd help to make sure we are on the same page wrt data_stream.dataset vs. data_stream.type

@andrewvc thanks for following up. The only thing we explicitly set here is the default type, which was mistakenly metrics but should have been synthetics. This only applies when it isn't specified in the agent policy now.

At a fundamental level most of this PR is addressing bugs in how we parsed the agent input type and data stream fields out of the agent policy. For example see the resolved comment #33921 (comment)

We don't actually look at or modify the data stream fields, we are mostly concerned with extracting them properly to configure processors and set the target index name. What is in the integration configuration should be what gets configured at the Beat level.

We can try to set up a quick zoom chat to clarify further if needed, there is already a lot of discussion in this PR making it hard to follow.

@fearful-symmetry
Copy link
Contributor Author

Alright; I fixed the test to use log instead of logfile. After that, we should be good.

@sonarcloud
Copy link

sonarcloud bot commented Dec 6, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

@cmacknz
Copy link
Member

cmacknz commented Dec 6, 2022

All tests have passed, we are just waiting for the packaging jobs to complete. Going to merge this to unblock the next 8.6 BC.

@cmacknz cmacknz merged commit 7bfae23 into elastic:main Dec 6, 2022
mergify bot pushed a commit that referenced this pull request Dec 6, 2022
* fix use of types in filebeat, index generation

* add map checks, heartbeat

* fix up per-beat processors, add new logic for data_streams

* try to make linter happy

* still making linter happy

* change fallback types for auditbeat and packetbeat

* fix test

(cherry picked from commit 7bfae23)
cmacknz pushed a commit that referenced this pull request Dec 6, 2022
* fix use of types in filebeat, index generation

* add map checks, heartbeat

* fix up per-beat processors, add new logic for data_streams

* try to make linter happy

* still making linter happy

* change fallback types for auditbeat and packetbeat

* fix test

(cherry picked from commit 7bfae23)

Co-authored-by: Alex K <8418476+fearful-symmetry@users.noreply.github.com>
@elasticmachine
Copy link
Collaborator

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2022-12-06T18:44:41.386+0000

  • Duration: 87 min 30 sec

Test stats 🧪

Test Results
Failed 0
Passed 7344
Skipped 338
Total 7682

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

chrisberkhout pushed a commit that referenced this pull request Jun 1, 2023
* fix use of types in filebeat, index generation

* add map checks, heartbeat

* fix up per-beat processors, add new logic for data_streams

* try to make linter happy

* still making linter happy

* change fallback types for auditbeat and packetbeat

* fix test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-v8.6.0 Automated backport with mergify bug Team:Elastic-Agent Label for the Agent team v8.7.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants