Create an Elastic Integrations track #280

Open
DJRickyB opened this issue Jul 11, 2022 · 0 comments
Labels
new workload: Any work related to adding a new track or functionality within a track

Comments

@DJRickyB
Contributor

Overview

A new track should leverage our existing Elastic Solutions benchmarking code to take arbitrary packages and benchmark them.

Things that need to be supported

  • Specifying versions of an integration
  • Specifying ratios of integration data (already supported in elastic/)
  • Specifying data volumes and intervals to be benchmarked (already supported in elastic/)

Considerations

  • Evolution of data format over time - We can make use of a package's sample_data.json file and field definitions, and use https://github.com/elastic/elastic-integration-corpus-generator-tool or similar tooling to handle cardinality in seed data.
  • Tracking changes to Pipelines and index settings - this is the core concern of this issue, and speaks as much to process as it does to the track itself. We can use elastic-package dump as the basis for a workflow to extract Elasticsearch artifacts for an installed Integration.
    • Where do we store the dumped artifacts? What automation should we use? What structure do we use?
  • Onboarding - Should a package developer have to create seed data in order to use the track? Should we support onboarding packages at differing levels of maturity? We should plan sufficient time to document the requirements for using this benchmarking tool with a given integration. Given that we would need to dump the objects for a package ahead of time, there will likely be some external component to onboarding a package.
  • Automation - We should be able to execute this in our nightly benchmarks to track changes in Elasticsearch as well as the packages.
  • Handling - Integration artifact resolution would likely need to be implemented in a Track Processor, a mechanism that other solutions-oriented benchmarks already use but that is not well documented.
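
To make the Track Processor idea concrete, here is a minimal sketch of what artifact resolution might look like. It assumes the dumped artifacts are stored as JSON files grouped into one directory per artifact type (for example ingest_pipelines/); that layout, the class name IntegrationArtifactProcessor, and the helper load_dumped_artifacts are all hypothetical, since where and how the artifacts are stored is still an open question above. The two hook names are intended to mirror Rally's track processor interface as used by existing tracks, and should be checked against the Rally source before relying on them.

```python
import json
from pathlib import Path


def load_dumped_artifacts(artifact_dir):
    """Read every *.json file under artifact_dir, grouped by parent directory name.

    Assumed (hypothetical) layout: <artifact_dir>/<artifact_type>/<name>.json
    """
    artifacts = {}
    for path in sorted(Path(artifact_dir).rglob("*.json")):
        kind = path.parent.name  # e.g. "ingest_pipelines" under the assumed layout
        with path.open(encoding="utf-8") as f:
            artifacts.setdefault(kind, {})[path.stem] = json.load(f)
    return artifacts


class IntegrationArtifactProcessor:
    """Hypothetical track processor that resolves dumped integration artifacts."""

    def __init__(self, artifact_dir):
        self.artifact_dir = artifact_dir
        self.artifacts = {}

    def on_after_load_track(self, track):
        # Resolve artifacts once the track is loaded; the track itself is
        # left untouched in this sketch.
        self.artifacts = load_dumped_artifacts(self.artifact_dir)

    def on_prepare_track(self, track, data_root_dir):
        # No additional preparation work is scheduled in this sketch.
        return []
```

In a track's track.py, an instance would presumably be registered from the register(registry) function, in the same way the existing solutions-oriented tracks register their processors.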

Initial thinking

The track would live in elastic/integrations/track.json and would take full advantage of the code in elastic/shared.

How best would we handle the interaction between the target stack version, the package version, and the integration ratios?

  • We could completely ignore the relationship, configuring --distribution-version (if applicable) separately from the existing track parameter integration_ratios and a prospective integrations parameter, which could allow explicit package versions to be declared. This introduces fragility wherever there is a hard compatibility concern (breaking change, new feature) that is not supported, and it would be up to the automation or the engineer to ensure compatibility.
  • Otherwise we could use the target stack version to retrieve the relevant package version (either not allowing it to be declared explicitly, or having automagic default behavior). This can be accomplished by an (online) API request to the Elastic Package Repository, in the form of GET https://epr.elastic.co/search?package=<package-name>&kibana.version=<stack-version>. This would allow us to define only the integration_ratios parameter, which may be desirable.
  • The elastic/integrations track could support only one package at a time, and encourage a solution-focused benchmark to instead be developed for mixes of integrations as necessary. There would still need to be matching of package-to-stack-version but this leaves behind the verbose integration_ratios parameter, which could be desirable (especially for package developers who always care only about their own code).
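
As a rough illustration of the second option, the lookup against the Elastic Package Repository could be sketched as follows. The URL form is the one quoted above; the function names are hypothetical, and the sketch assumes the response is a JSON array of objects carrying name and version fields, which should be verified against the live API.

```python
import json
import urllib.parse
import urllib.request

EPR_SEARCH_URL = "https://epr.elastic.co/search"


def build_search_url(package_name, stack_version):
    """Build the search URL for one package and one target stack version."""
    query = urllib.parse.urlencode(
        {"package": package_name, "kibana.version": stack_version}
    )
    return f"{EPR_SEARCH_URL}?{query}"


def pick_version(search_results, package_name):
    """Return the version of the first entry matching package_name, or None.

    Assumes each entry is a dict with "name" and "version" keys.
    """
    for entry in search_results:
        if entry.get("name") == package_name:
            return entry.get("version")
    return None


def resolve_package_version(package_name, stack_version, timeout=10):
    """Query the registry online and return the matching package version."""
    url = build_search_url(package_name, stack_version)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return pick_version(json.load(response), package_name)
```

Keeping URL construction and response parsing separate from the network call means both can be exercised offline, which matters if the nightly benchmarks are expected to run without depending on the registry being reachable.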

@DJRickyB added the new workload label on Jul 11, 2022
@gizas mentioned this issue on Jul 18, 2022