[Stack Monitoring] Testing strategy for agent/integration data #119658

neptunian · 2021-11-24T19:19:38Z

Aside from manual testing we need to write functional and api integration tests that use the new metrics-* agent data for each integration. There are a few ways to go about it

@chrisronline locally modified the es_archiver to take the existing .monitoring archive data and rebuild it into metricbeat data. See Part 2. This means he was able to copy over all the existing functional and api integration tests without having to change anything except the archive that was to be loaded. We could do something similar and convert the existing metricbeat-* data to metrics-* data. Since we never actually supported metricbeat-* or ensured the UI worked with it, I don't know if this data is totally correct.
Generate new data. This would mean we'd need the es_archiver to support us saving new data, which it does not yet fully Add support for data streams in ES Archiver #69061 . It would also mean going over our tests and creating new tests and have them match this new data. It would also provide us an opportunity to make sure the data is correct and understand the app better. I am leaning towards this.
@klacabane Suggested using live data in the tests but I think this would probably add too much time and flakiness

@klacabane and I are going to investigate the best approach, see what others have done, and see how difficult it would be use/modify the es_archiver to work with data streams.

** Update **

After some discussion with @klacabane, we decided the following:

Use the package registry docker image to load actual packages so we can have the latest mappings. We can pass in a custom configuration yaml that points to a server and install the latest package with the endpoint /api/fleet/epm/packages/${pkg}/ which should install the latest version
generate agent/integration data by transform the existing _mb archived data, adding the correct datastream index value or any other values that are necessary (using something similar to this script)

The text was updated successfully, but these errors were encountered:

elasticmachine · 2021-11-24T19:19:39Z

Pinging @elastic/infra-monitoring-ui (Team:Infra Monitoring UI)

matschaffer · 2021-11-25T01:06:53Z

I like parts of both the first two bullet points. Being able to run the same test code over two data sets sounds like a nice move.

But also if the new code will be data streams, it seems like the test should be loaded as a data stream too. Of course, it might not matter much.

The UI will query metrics-* which could match indices or data streams and I wouldn't expect the query responses to change.

miltonhultgren · 2021-11-25T09:24:44Z

I know we're still early on this but @matschaffer and I are looking into how to simulate Stack Monitoring data with apm-synthtrace, which supports writing to data streams. Maybe it's an option to consider even though it would require a bit more investment in our data generation tooling but I think we're on that path anyway and this would be for a concrete problem.

klacabane · 2021-11-29T16:27:34Z

I'm wondering if we currently have validation of the mappings, or if that's even necessary with our versioning model.
In an agent world the mappings are defined by the Integrations package and we have a frontend component that expects a specific mapping definition to query and present data. What if the mapping changes, for example removing a property. Would that be guaranteed to be caught by end-to-end tests (or another mechanism), or are there scenarios where it could go unnoticed ?

matschaffer · 2021-11-30T06:39:54Z

are there scenarios where it could go unnoticed ?

I'm fairly certain the answer to that is "yes". It's sounds very close to the failure scenario I demoed during the nov 15th team meeting though granted, in your scenario the mappings come from packages rather than ES itself.

klacabane · 2021-11-30T14:58:54Z

I guess we could benefit from contract/schema tests on the mappings. It will be valuable to initially validate an integration package mappings against our expectations, but I don't know how doable it is currently - maybe we can leverage type definitions as a source of truth for monitoring expectations. Parsing our queries would be another (complex) solution.

Regarding the strategy we could start with option 1 which sounds cheap to implement and would provide an initial coverage that quickly surface bugs in the datastream usage. Once datastream support is added to esarchiver (or we have an alternative to load data in stream), we could work on the second option

klacabane · 2021-12-02T17:33:50Z

So it looks like we could use esArchiver if data streams are already created (see #68794). In our case the datastreams and relevant assets are installed by the fleet application on request. I'm thinking about hitting that endpoint during the test suite setup - while it will add coupling and complexity to our testing environment (we'll need a package registry running in a docker env?), I like the idea of a tighter integrations with the packages assets, comparing to defining static mappings that could go out-of-sync.

With datastreams available, we can use esArchiver to insert the archives already defined. Transforming metricbeat-* into metrics-* documents should be straightforward and serve as a first step. As a follow up we can replace the esArchiver with another data-generation solution.

matschaffer · 2021-12-03T01:51:59Z

we'll need a package registry running in a docker env

I don't think anything in the current yarn-powered testing requires docker yet. Might be worth finding out what it'd take to make something like https://github.com/elastic/kibana/blob/main/packages/kbn-es/README.md for the package registry. I forget who to talk to about this, but I remember @ycombinator presenting about it at the last GAH so maybe he knows.

ycombinator · 2021-12-03T02:04:42Z

I'm not sure what the state of the art is with the package registry and package tooling. But I suspect @mtojek and/or @jsoriano might be able to provide the necessary guidance here.

mtojek · 2021-12-03T10:10:52Z

FYI Package Registry/Storage will be under active development soon. We'd like to replace the Git repository with true object storage.

Regarding testing, tools we have can help with policy- and agent-oriented testing, but they don't include any Selenium or Kibana UI tests. They just depend on the available Fleet API. Keep in mind that we don't exercise Kibana, but Elastic packages. You can find more information here, especially system tests.

miltonhultgren · 2021-12-03T11:17:15Z

@dgieselaar Can you share some insight on how the APM team handles this since you also need to launch APM server in your tests? (maybe you don't do it for E2E tests?)

klacabane · 2021-12-03T12:50:34Z

@matschaffer this is already supported but I need to verify if we actually need this or not

I was able to get a subset of elasticsearch functional tests running and succeeding against datastreams with the esArchiver approach:

install elasticsearch integration assets (index templates..) through the fleet api
transform metricbeat-* archived data into metrics-* data and remove mappings.json from the archive
update esArchiver to load new datastream data with { useCreate: true } option
update test_user permissions to allow access to datastream

dgieselaar · 2021-12-03T13:00:01Z

@dgieselaar Can you share some insight on how the APM team handles this since you also need to launch APM server in your tests?

We use an empty esArchive that only contains mappings/ empty indices for APM data. in the future we'll probably install the APM integration package. We don't use APM Server with Synthtrace.

Separately we might have integration tests that spin up the full APM stack (Kibana, ES, APM Server, APM Agents), but I have not worked with those before.

matschaffer · 2021-12-06T06:55:52Z

@matschaffer this is already supported but I need to verify if we actually need this or not

Oh, wow. Nice find! Seems like it's main usage is for the package registry, so seems like you're on the right track.

miltonhultgren · 2022-01-26T15:09:49Z

This issue is related: #123345 (action point from our team meeting on data generation tooling)

neptunian · 2022-01-31T21:11:22Z

After discussing with @klacabane we decided:

Use the package registry docker image to load actual packages so we can have the latest mappings. We can pass in a custom configuration yaml that points to a server and install the latest package with the endpoint /api/fleet/epm/packages/${pkg}/ which should install the latest version
generate agent/integration data by transform the existing _mb archived data, adding the correct datastream index value or any other values that are necessary (using something similar to this script)
copy over the tests, similar to _mb, we'll have something like _agent copies

We can start with elasticsearch #119109

matschaffer · 2022-02-01T06:02:06Z

Is this maybe a good opportunity to try to work out a way to do testing without copying the tests themselves?

I'm thinking maybe something like a helper that sets up the same test for different setup methods.

jsoriano · 2022-02-01T10:40:38Z

Use the package registry docker image to load actual packages so we can have the latest mappings.

If the only reason to start the registry is to load packages, take into account that there may be additional methods in the near future that don't require a registry, such as #122297 or #70582.

neptunian · 2022-02-01T16:01:47Z

Use the package registry docker image to load actual packages so we can have the latest mappings.

If the only reason to start the registry is to load packages, take into account that there may be additional methods in the near future that don't require a registry, such as #122297 or #70582.

Thanks @jsoriano . Yes, that would be our only reason. In #122297, it looks like packages that are being shipped with Kibana are being automatically installed during Kibana setup. Is this going to change to allow packages to be shipped that aren't automatically installed? Also, are our packages considered "stack-aligned"? As in we don't need the ability to upgrade our packages out-of-band from stack releases (to fix a bug for instance) and want to make sure our packages upgrade with Kibana ("for example new features in the UI that depend on new fields that were added in the same Stack version") @sayden

klacabane · 2022-11-09T14:33:44Z

Rethinking this we can skip the functional tests and only focus on api tests: we already have functional (e2e) coverage of the .monitoring-* data, so if we can validate that API responses are similar when reading from metrics-* we have the guarantee that UI will respond similarly with both data sources. This will save significant effort and computing power.

The following steps will setup an initial coverage:

Transform .monitoring-* data into metrics-* data. Here's a script that transforms the data https://gist.github.com/klacabane/f2ef21b0c2722f312f2d983d9870dc68:
- it adds the field used by esArchiver to detect whether the target is a datastream or an indice
- it adds the data_stream object that replaces metricset construct for query filtering
Extract metrics-* mappings with esArchiver and bundle them with the output of 1.
Create a copy of the _mb tests that loads the bundle created in 2. and make the same assertions

As a follow up we can replace the static mappings created in 2. by installing the packages as a setup step to the test suite. This will require spawning a local package registry and hitting the fleet api. This step will allow easier updates to the mappings so that we have continuous integration in the SM tests since it will only require an update of the packages (we'll have to bundle the packages in the kibana repo to avoid any network reliance)

## Summary Part of #119658 Add api integration tests for kibana routes to validate behavior when reading data ingested by elastic-agent. We currently have a testing suite for legacy and another one for metricbeat. Since metricbeat and agent documents only differ in their metadata, for example agent will populate a `data_stream.*` property to identify the document types while metricbeat uses `metricset.*`, the tests assertion validating _business_ data should pass regardless of the documents source. With this in mind the metricbeat tests were updated to run the tests twice, one time with metricbeat data and a second time with package data. To generate the archives the `metrics-*` mappings were extracted with esArchiver from an elasticsearch with the package installed, and the documents were transformed from the metricbeat documents with [this script](https://gist.github.com/klacabane/654497ff86053c60af6df15fa6f6f657). Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>

## Summary Part of elastic#119658 Add api integration tests for kibana routes to validate behavior when reading data ingested by elastic-agent. We currently have a testing suite for legacy and another one for metricbeat. Since metricbeat and agent documents only differ in their metadata, for example agent will populate a `data_stream.*` property to identify the document types while metricbeat uses `metricset.*`, the tests assertion validating _business_ data should pass regardless of the documents source. With this in mind the metricbeat tests were updated to run the tests twice, one time with metricbeat data and a second time with package data. To generate the archives the `metrics-*` mappings were extracted with esArchiver from an elasticsearch with the package installed, and the documents were transformed from the metricbeat documents with [this script](https://gist.github.com/klacabane/654497ff86053c60af6df15fa6f6f657). Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com> (cherry picked from commit 90f0ae2)

# Backport This will backport the following commits from `main` to `8.6`: - [[Stack Monitoring] api tests for kibana (#145230)](#145230)  ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport)  Co-authored-by: Kevin Lacabane <kevin.lacabane@elastic.co>

### Summary Part of #119658 Add api integration tests for logstash routes to validate behavior when reading data ingested by elastic-agent. We currently have a testing suite for legacy and another one for metricbeat. Since metricbeat and agent documents only differ in their metadata, for example agent will populate a `data_stream.*` property to identify the document types while metricbeat uses `metricset.*`, the tests assertion validating _business_ data should pass regardless of the documents source. With this in mind the metricbeat tests were updated to run the tests twice, one time with metricbeat data and a second time with package data. To generate the archives the `metrics-*` mappings were extracted with esArchiver from an elasticsearch with the package installed, and the documents were transformed from the metricbeat documents with [this script](https://gist.github.com/klacabane/654497ff86053c60af6df15fa6f6f657). Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>

### Summary Part of #119658 Add api integration tests for cluster and elasticsearch routes to validate behavior when reading data ingested by elastic-agent. We currently have a testing suite for legacy and another one for metricbeat. Since metricbeat and agent documents only differ in their metadata, for example agent will populate a `data_stream.*` property to identify the document types while metricbeat uses `metricset.*`, the tests assertion validating _business_ data should pass regardless of the documents source. With this in mind the metricbeat tests were updated to run the tests twice, one time with metricbeat data and a second time with package data. To generate the archives the `metrics-*` mappings were extracted with esArchiver from an elasticsearch with the package installed, and the documents were transformed from the metricbeat documents with [this script](https://gist.github.com/klacabane/654497ff86053c60af6df15fa6f6f657). Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>

…45138) ### Summary Part of elastic#119658 Add api integration tests for cluster and elasticsearch routes to validate behavior when reading data ingested by elastic-agent. We currently have a testing suite for legacy and another one for metricbeat. Since metricbeat and agent documents only differ in their metadata, for example agent will populate a `data_stream.*` property to identify the document types while metricbeat uses `metricset.*`, the tests assertion validating _business_ data should pass regardless of the documents source. With this in mind the metricbeat tests were updated to run the tests twice, one time with metricbeat data and a second time with package data. To generate the archives the `metrics-*` mappings were extracted with esArchiver from an elasticsearch with the package installed, and the documents were transformed from the metricbeat documents with [this script](https://gist.github.com/klacabane/654497ff86053c60af6df15fa6f6f657). Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com> (cherry picked from commit 5cf0d0f)

…5138) (#145985) # Backport This will backport the following commits from `main` to `8.6`: - [[Stack Monitoring] api tests for cluster and elasticsearch (#145138)](#145138)  ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport)  Co-authored-by: Kevin Lacabane <kevin.lacabane@elastic.co>

klacabane · 2022-11-22T14:35:31Z

Closing this as initial test coverage is merged. Follow up in #146000

neptunian added the Team:Infra Monitoring UI - DEPRECATED DEPRECATED - Label for the Infra Monitoring UI team. Use Team:obs-ux-infra_services label Nov 24, 2021

neptunian assigned neptunian and klacabane Nov 24, 2021

miltonhultgren mentioned this issue Nov 25, 2021

[Infrastructure Monitoring] Better data generation #119491

Closed

1 task

klacabane mentioned this issue Aug 1, 2022

[Stack Monitoring] Add metricbeat errors to Health API response #137288

Merged

klacabane unassigned neptunian Nov 15, 2022

miltonhultgren added the Feature:Stack Monitoring label Nov 15, 2022

This was referenced Nov 16, 2022

[Stack Monitoring] api tests for logstash #145351

Merged

[Stack Monitoring] api tests for cluster and elasticsearch #145138

Merged

[Stack Monitoring] api tests for kibana #145230

Merged

klacabane mentioned this issue Nov 22, 2022

[Stack Monitoring] Enable testing of latest package versions #146000

Closed

klacabane closed this as completed Nov 22, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Stack Monitoring] Testing strategy for agent/integration data #119658

[Stack Monitoring] Testing strategy for agent/integration data #119658

neptunian commented Nov 24, 2021 •

edited

Loading

elasticmachine commented Nov 24, 2021

matschaffer commented Nov 25, 2021

miltonhultgren commented Nov 25, 2021

klacabane commented Nov 29, 2021

matschaffer commented Nov 30, 2021

klacabane commented Nov 30, 2021

klacabane commented Dec 2, 2021 •

edited

Loading

matschaffer commented Dec 3, 2021

ycombinator commented Dec 3, 2021

mtojek commented Dec 3, 2021

miltonhultgren commented Dec 3, 2021

klacabane commented Dec 3, 2021 •

edited

Loading

dgieselaar commented Dec 3, 2021

matschaffer commented Dec 6, 2021

miltonhultgren commented Jan 26, 2022

neptunian commented Jan 31, 2022 •

edited

Loading

matschaffer commented Feb 1, 2022

jsoriano commented Feb 1, 2022

neptunian commented Feb 1, 2022 •

edited

Loading

klacabane commented Nov 9, 2022

klacabane commented Nov 22, 2022

[Stack Monitoring] Testing strategy for agent/integration data #119658

[Stack Monitoring] Testing strategy for agent/integration data #119658

Comments

neptunian commented Nov 24, 2021 • edited Loading

elasticmachine commented Nov 24, 2021

matschaffer commented Nov 25, 2021

miltonhultgren commented Nov 25, 2021

klacabane commented Nov 29, 2021

matschaffer commented Nov 30, 2021

klacabane commented Nov 30, 2021

klacabane commented Dec 2, 2021 • edited Loading

matschaffer commented Dec 3, 2021

ycombinator commented Dec 3, 2021

mtojek commented Dec 3, 2021

miltonhultgren commented Dec 3, 2021

klacabane commented Dec 3, 2021 • edited Loading

dgieselaar commented Dec 3, 2021

matschaffer commented Dec 6, 2021

miltonhultgren commented Jan 26, 2022

neptunian commented Jan 31, 2022 • edited Loading

matschaffer commented Feb 1, 2022

jsoriano commented Feb 1, 2022

neptunian commented Feb 1, 2022 • edited Loading

klacabane commented Nov 9, 2022

klacabane commented Nov 22, 2022

neptunian commented Nov 24, 2021 •

edited

Loading

klacabane commented Dec 2, 2021 •

edited

Loading

klacabane commented Dec 3, 2021 •

edited

Loading

neptunian commented Jan 31, 2022 •

edited

Loading

neptunian commented Feb 1, 2022 •

edited

Loading