Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Ingest Manager] Install uploaded package #77986

Merged
merged 29 commits into from
Oct 5, 2020

Conversation

skh
Copy link
Contributor

@skh skh commented Sep 21, 2020

Summary

Partially implements #70582

This adds the functionality to actually install a package from an uploaded zip or tar.gz archive. A package uploaded this way can be deleted by the existing DELETE /api/ingest_manager/epm/packages/PKGKEY endpoint.

IMPORTANT: The installed package cannot yet be added to a policy and used by enrolled agents. This functionality will be added in a follow-up PR. This PR is opened for review now to keep the amount of code to be reviewed manageable (it is on the large side already).

The API endpoint to upload packages is undocumented and should not be used. Therefore, this PR does not contain documentation or changes to the OpenAPI spec.

In detail, the changes include:

  • Refactoring EPM to allow installation from different sources (upload and registry). This includes types, method names, and control flow.

  • Adding error classes for specific errors related to package upload, add tests for those which don't return 400

  • Changing the saved object epm-packages to include the installation source of a package (registry or upload).

  • Performing some basic verification on the uploaded archive, and use the top-level manifest.yml to extract the information we need for package installation.

  • Picking all information about config templates and datasets from the manifest.xml files in the package.

  • Refactoring the in-memory cache for package data: Add a cache package key->package file list and remove the cache package key->archive location. The existing file cache file path->file buffer is unchanged. This is necessary because the cache now contains package contents that do not come from the registry, and thus don't have an archive location.

How to test this

Try to break it

This change contains a fair amount of refactoring of existing functionality. Ideally, the behavior when installing and using packages from the registry should not have changed. Please review with a focus on the changed code paths, and try to break them in manual testing. The existing integration tests were a great help, but we don't have full test coverage over all areas of the code.

Upload and uninstall packages

Integration tests have been added. The package archives used in these tests can also be used for manual testing. First, export a variable with the path to your kibana project directory:

  • export KIBANA_HOME=./path/to/srcdir/of/kibana

Then, copy & paste the following curl commands to see the behavior of the API endpoint on the command line. (Assuming a locally running dev environment, --no-base-path and port 5601).

Happy path:

  • zip: curl -X POST -u elastic:changeme http://localhost:5601/api/ingest_manager/epm/packages --data-binary @$KIBANA_HOME/x-pack/test/ingest_manager_api_integration/apis/fixtures/direct_upload_packages/apache_0.1.4.zip -H 'kbn-xsrf: xyz' -H 'Content-Type: application/zip'
  • tar.gz: curl -X POST -u elastic:changeme http://localhost:5601/api/ingest_manager/epm/packages --data-binary @$KIBANA_HOME/x-pack/test/ingest_manager_api_integration/apis/fixtures/direct_upload_packages/apache_0.1.4.tar.gz -H 'kbn-xsrf: xyz' -H 'Content-Type: application/gzip'

Uploaded packages can be deleted as expected:

  • curl -X DELETE -u elastic:changeme http://localhost:5601/api/ingest_manager/epm/packages/apache-0.1.4 -H 'kbn-xsrf: xxx'

Wrong content type:

  • tar.gz with application/zip: curl -X POST -u elastic:changeme http://localhost:5601/api/ingest_manager/epm/packages --data-binary @$KIBANA_HOME/x-pack/test/ingest_manager_api_integration/apis/fixtures/direct_upload_packages/apache_0.1.4.tar.gz -H 'kbn-xsrf: xyz' -H 'Content-Type: application/zip'
  • zip with application/gzip: curl -X POST -u elastic:changeme http://localhost:5601/api/ingest_manager/epm/packages --data-binary @$KIBANA_HOME/x-pack/test/ingest_manager_api_integration/apis/fixtures/direct_upload_packages/apache_0.1.4.zip -H 'kbn-xsrf: xyz' -H 'Content-Type: application/gzip'

Invalid packages

  • Two toplevel directories: curl -X POST -u elastic:changeme http://localhost:5601/api/ingest_manager/epm/packages --data-binary @$KIBANA_HOME/x-pack/test/ingest_manager_api_integration/apis/fixtures/direct_upload_packages/apache_invalid_two_toplevels_0.1.4.zip -H 'kbn-xsrf: xyz' -H 'Content-Type: application/zip'
  • No manifest: curl -X POST -u elastic:changeme http://localhost:5601/api/ingest_manager/epm/packages --data-binary @$KIBANA_HOME/x-pack/test/ingest_manager_api_integration/apis/fixtures/direct_upload_packages/apache_invalid_no_manifest_0.1.4.zip -H 'kbn-xsrf: xyz' -H 'Content-Type: application/zip'
  • Invalid YAML in manifest: curl -X POST -u elastic:changeme http://localhost:5601/api/ingest_manager/epm/packages --data-binary @$KIBANA_HOME/x-pack/test/ingest_manager_api_integration/apis/fixtures/direct_upload_packages/apache_invalid_manifest_invalid_yaml_0.1.4.zip -H 'kbn-xsrf: xyz' -H 'Content-Type: application/zip'
  • Mandatory field missing in manifest: curl -X POST -u elastic:changeme http://localhost:5601/api/ingest_manager/epm/packages --data-binary @$KIBANA_HOME/x-pack/test/ingest_manager_api_integration/apis/fixtures/direct_upload_packages/apache_invalid_manifest_missing_field_0.1.4.zip -H 'kbn-xsrf: xyz' -H 'Content-Type: application/zip'
  • Toplevel directory doesn't match name and version from manifest: curl -X POST -u elastic:changeme http://localhost:5601/api/ingest_manager/epm/packages --data-binary @$KIBANA_HOME/x-pack/test/ingest_manager_api_integration/apis/fixtures/direct_upload_packages/apache_invalid_toplevel_mismatch_0.1.4.zip -H 'kbn-xsrf: xyz' -H 'Content-Type: application/zip'

NOTE: When not copy-pasting, be sure to use --data-binary, not --data or the upload is corrupted.

@skh skh self-assigned this Sep 21, 2020
@skh skh added the Team:Fleet Team label for Observability Data Collection Fleet team label Sep 21, 2020
@skh skh force-pushed the 70582-direct-package-upload-part-2 branch 2 times, most recently from 9cbeddf to 0557804 Compare September 23, 2020 15:41
@skh skh force-pushed the 70582-direct-package-upload-part-2 branch 3 times, most recently from e0baec6 to f841ea8 Compare September 28, 2020 08:21
Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a look at the PR and left a few comments. Overall I think this is heading in the right direction. It would be nice in some cases to directly build the struct for the manifest so it can be reused in multiple places.

I also want to take this as an opportunity to share a bit on how I think the end result could look like.

image

There are different sources from where to install a package. In the end, in all cases it is a .zip file. The fetcher for the specific sources knows how to get this zip file from the source. From there, installation is always the same. It goes through the validator and caches / stores the package locally. As soon as a package is installed, all data should be always fetched from this cache / store.

The only source where packages can be browsed without being installed is the package registry. This is our special case. The package list package + package details page for uninstalled packages comes directly from the registry. All the installed package details are served locally.

My hope is that the only place where we need to use the json from the registry is in the uninstalled package list and the details page for uninstalled package.

@skh skh force-pushed the 70582-direct-package-upload-part-2 branch 3 times, most recently from 1771d65 to ff01e6e Compare September 30, 2020 10:13
@skh skh marked this pull request as ready for review October 1, 2020 14:41
@skh skh requested a review from a team as a code owner October 1, 2020 14:41
@skh skh requested a review from a team October 1, 2020 14:41
@skh skh requested a review from a team as a code owner October 1, 2020 14:41
@elasticmachine
Copy link
Contributor

Pinging @elastic/ingest-management (Team:Ingest Management)

@skh skh added release_note:skip Skip the PR/issue when compiling release notes v7.10.0 v8.0.0 Feature:EPM Fleet team's Elastic Package Manager (aka Integrations) project labels Oct 1, 2020
@skh skh requested a review from jen-huang October 1, 2020 14:43
@skh skh force-pushed the 70582-direct-package-upload-part-2 branch from 7746bd2 to d182c7d Compare October 2, 2020 13:29
Copy link
Contributor

@jfsiii jfsiii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏 Great stuff! I've read this through a few times but haven't had a chance to run it.

I am still processing a few of the flow changes (basically around new cache/stores) but everything reads well and seems isolated enough (and with tests) that it'd be easy to change later if we needed to.

@skh skh force-pushed the 70582-direct-package-upload-part-2 branch from d182c7d to 523d8dc Compare October 5, 2020 10:40
@skh
Copy link
Contributor Author

skh commented Oct 5, 2020

@jen-huang @jonathan-buttner @jfsiii thanks for your comments! I have either answered, or made amends in the code, or both.

This is ready for another look.

@skh skh force-pushed the 70582-direct-package-upload-part-2 branch from 0881314 to df23d82 Compare October 5, 2020 13:42
@@ -0,0 +1,332 @@
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I like all the verify methods in this file

const supertest = getService('supertest');
const dockerServers = getService('dockerServers');
const log = getService('log');

const testPkgArchiveTgz = path.join(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine this is already done if the tests pass, but do these new fixtures have the new package changes? (I recall having to adjust apache_0.1.4.tar.gz during my PR)

Copy link
Contributor Author

@skh skh Oct 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to update all package fixtures. Right now I adjusted the tests to the fact that the datasets (and assets related to them) are not found in the uploaded packages.

Copy link
Contributor

@jen-huang jen-huang Oct 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it is worth tracking the unpacked version of the fixtures. binary diffs are not super useful 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The downside of unpackaged packages is that they may contain a lot of files, which clutter PRs -- I accidentally opened one with so many files that the github review interface became unusable. I consider these fixtures like the packages contained in the dockerized registry -- unfortunately a bit difficult to handle, but they shouldn't change that often.

Also, I think I can reduce the number of these packages and move some of the validation tests to unit tests.

@jen-huang
Copy link
Contributor

Oops, meant to leave a comment with that review ^: A few minor comments but LGTM!!

@skh
Copy link
Contributor Author

skh commented Oct 5, 2020

@elasticmachine merge upstream

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

distributable file count

id before after diff
default 47107 47108 +1

Saved Objects .kibana field count

id before after diff
epm-packages 14 15 +1

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@skh skh merged commit ce4641a into elastic:master Oct 5, 2020
@skh skh deleted the 70582-direct-package-upload-part-2 branch October 5, 2020 16:51
skh added a commit to skh/kibana that referenced this pull request Oct 5, 2020
* Refactor: installPackage -> installPackageFromRegistry

* Refactor: factor out source-agnostic installation steps

* Unpack and cache uploaded zip and tgz files.

* Add basic archive verification and parse manifest.

* Catch error when zip archive is uploaded as gzip.

* Add API integration tests.

* Remove unnecessary use of "package key" concept.

* Add 'install_source' property to saved object epm-packages.

* Adjust tests.

* Add API integration test for manifest missing fields.

* Refactor loadArchive -> loadArchivePackage.

* Refactor caching of package archive content

* Get datasets and config templates from manifest files.

* Use file paths from archive instead of asset paths from registry.

* Correctly load registry packages into cache

* Use InstallablePackage instead of RegistryPackage where possible.

* Actually install uploaded package.

* Add missing field to saved objects in tests.

* Adjust unit test to pick buffer extractor.

* Adjust unit test.

* Fix and re-enable getAsset() test.

* Adjust integration tests.

* Make error message match test.

* Pick data_stream.dataset from manifest if set.

* dataset -> data_stream also in comments

* Remove unused variable.

* Use pkgToPkgKey() where appropriate.

* More dataset -> data stream renaming.

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
skh added a commit that referenced this pull request Oct 5, 2020
* Refactor: installPackage -> installPackageFromRegistry

* Refactor: factor out source-agnostic installation steps

* Unpack and cache uploaded zip and tgz files.

* Add basic archive verification and parse manifest.

* Catch error when zip archive is uploaded as gzip.

* Add API integration tests.

* Remove unnecessary use of "package key" concept.

* Add 'install_source' property to saved object epm-packages.

* Adjust tests.

* Add API integration test for manifest missing fields.

* Refactor loadArchive -> loadArchivePackage.

* Refactor caching of package archive content

* Get datasets and config templates from manifest files.

* Use file paths from archive instead of asset paths from registry.

* Correctly load registry packages into cache

* Use InstallablePackage instead of RegistryPackage where possible.

* Actually install uploaded package.

* Add missing field to saved objects in tests.

* Adjust unit test to pick buffer extractor.

* Adjust unit test.

* Fix and re-enable getAsset() test.

* Adjust integration tests.

* Make error message match test.

* Pick data_stream.dataset from manifest if set.

* dataset -> data_stream also in comments

* Remove unused variable.

* Use pkgToPkgKey() where appropriate.

* More dataset -> data stream renaming.

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature:EPM Fleet team's Elastic Package Manager (aka Integrations) project release_note:skip Skip the PR/issue when compiling release notes Team:Fleet Team label for Observability Data Collection Fleet team v7.10.0 v8.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants