Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serve packages archived (as zip) on storage #703

Merged
merged 46 commits into from
Oct 7, 2021

Conversation

jsoriano
Copy link
Member

@jsoriano jsoriano commented Jul 30, 2021

Add experimental support to serve packages stored directly as zip files, so users receive the same packages as stored in the backend.

An "Indexer" interface is introduced to allow to provide different implementations for sources of packages, it abstracts how packages are indexed and queried.
List of packages is not global anymore, allowing better isolation of test cases.
All handlers use the indexers to find packages and files.

This is done in context of #670, to decouple package storage from the registry.

Checklist:

  • Refactor the artifacts handler so it uses the paths from the indexer.
  • Some conditional code is added to decide how to handle packages depending on how they are stored, make it more explicit and safer in general.
  • In the filesystem indexer, collect the Package objects from the zip files without extracting them to disk.
  • When should the index.json files be generated and served?

Postponed:

@jsoriano jsoriano self-assigned this Jul 30, 2021
@elasticmachine
Copy link

elasticmachine commented Jul 30, 2021

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2021-10-06T18:10:08.443+0000

  • Duration: 5 min 46 sec

  • Commit: e2e2894

Test stats 🧪

Test Results
Failed 0
Passed 109
Skipped 0
Total 109

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

Copy link
Contributor

@mtojek mtojek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of indexer seems to be the right direction, but it also shows the spaghetti code we have here.

I understand that this will change the way we store packages in Package Storage - zip archives instead of raw files? Please keep in mind that we'll have to adjust our tooling to support this distribution form and also inform other parties about the initiative (teams contributing directly to Package Storage like endpoint or APM).

There are also some features we use thank to raw files:

  • we can see differences between next package revisions (PR changes)
  • "grepping" is easy

I don't think it's impossible to plan replacements for them.

util/fs.go Outdated
case info.IsDir():
return NewExtractedPackageFileSystem(path)
case strings.HasSuffix(path, ".zip"):
return NewZipPackageFileSystem(path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I was hoping to find something like this 🙂 With the new interfaces added in 1.16 these things should be more common.

main.go Outdated Show resolved Hide resolved
@jsoriano jsoriano marked this pull request as ready for review September 20, 2021 17:32
@jsoriano
Copy link
Member Author

jsoriano commented Sep 20, 2021

@mtojek I am opening this for review.

I have updated the branch with recent changes related to categories and migrated the new code to follow the refactor, API tests pass now as they are in master.

I have also hidden the new feature of serving zipped packages behind an experimental flag, so we don't have surprises by trying to serve additional files. Still with this refactor we would have API and storage less coupled, and we would have the package filtering (aka search) in a single place.

We can continue later with further optimizations or refactors as the one to use io.FS implementations, I have added them as postponed tasks in the description.

@mtojek mtojek self-requested a review September 21, 2021 07:49
Copy link
Contributor

@mtojek mtojek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments, but nothing really breaking the idea. Good job!

I'm not sure if you have done it, but maybe you can try to build a Docker image with real content from package storage to check the feature (including compressed packages).

util/packages.go Outdated Show resolved Hide resolved
util/packages.go Outdated Show resolved Hide resolved
util/packages.go Outdated
@@ -5,6 +5,7 @@
package util
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there an option to place it outside of util :) ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, this util package contains too many things, I will leave this for the end, or for a follow up :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have started refactoring this and I had opened a couple of Pandora's boxes with naming and dependency cycles, I think I am going to leave it for a follow up to don't polute this already big PR.

util/packages.go Outdated Show resolved Hide resolved
util/packages.go Outdated Show resolved Hide resolved
static.go Outdated Show resolved Hide resolved
search.go Outdated Show resolved Hide resolved
util/fs.go Outdated Show resolved Hide resolved
util/packages.go Outdated Show resolved Hide resolved
util/packages.go Outdated Show resolved Hide resolved
@ruflin
Copy link
Member

ruflin commented Sep 23, 2021

I really like the idea of introducing the Indexer especially as it also gets rid of all the passing around of paths. It also means assuming at one point we need to have an Indexer as a service (ES?)) it can be added to the list.

I only skimmed through the changes and it contains a lot of refactoring too which is great. As you know, I'm not too big of a fan of large PRs and if possible it would be nice to take out some parts of the refactoring / cleanup into separate PRs as it will make quicker to review and later in case of issues figuring out when it happened. But as I'm not the main reviewer leave it to @mtojek and you to decide if you want to get it all in at once.

@jsoriano
Copy link
Member Author

jsoriano commented Oct 6, 2021

I really like the idea of introducing the Indexer especially as it also gets rid of all the passing around of paths. It also means assuming at one point we need to have an Indexer as a service (ES?)) it can be added to the list.

Exactly, the idea is to open the gate to external indexers, as could be ES 🙂. The "virtual" package file system used now only on tests could also help in such an indexer that wouldn't have local access to the packages.

I only skimmed through the changes and it contains a lot of refactoring too which is great. As you know, I'm not too big of a fan of large PRs and if possible it would be nice to take out some parts of the refactoring / cleanup into separate PRs as it will make quicker to review and later in case of issues figuring out when it happened. But as I'm not the main reviewer leave it to @mtojek and you to decide if you want to get it all in at once.

Yep, I am also not a fan of large PRs. This PR would be a bit over the limit. I could add first the indexer and then the implementation for zips, and then do the changes in each API handler, but this could be splitting things too much, and it can be cumbersome to adapt the tests for each PR. I will leave the decision to @mtojek 🙂

@mtojek mtojek self-requested a review October 7, 2021 09:20
Copy link
Contributor

@mtojek mtojek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get it in and iterate then :)

@jsoriano jsoriano merged commit 0a3c71b into elastic:master Oct 7, 2021
@jsoriano jsoriano deleted the serve-zipped-packages branch October 7, 2021 09:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants