-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Serve packages archived (as zip) on storage #703
Conversation
💚 Build Succeeded
Expand to view the summary
Build stats
Test stats 🧪
🤖 GitHub commentsTo re-run your PR in the CI, just comment with:
|
95a9e40
to
80950a2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea of indexer seems to be the right direction, but it also shows the spaghetti code we have here.
I understand that this will change the way we store packages in Package Storage - zip archives instead of raw files? Please keep in mind that we'll have to adjust our tooling to support this distribution form and also inform other parties about the initiative (teams contributing directly to Package Storage like endpoint or APM).
There are also some features we use thank to raw files:
- we can see differences between next package revisions (PR changes)
- "grepping" is easy
I don't think it's impossible to plan replacements for them.
util/fs.go
Outdated
case info.IsDir(): | ||
return NewExtractedPackageFileSystem(path) | ||
case strings.HasSuffix(path, ".zip"): | ||
return NewZipPackageFileSystem(path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I was hoping to find something like this 🙂 With the new interfaces added in 1.16 these things should be more common.
@mtojek I am opening this for review. I have updated the branch with recent changes related to categories and migrated the new code to follow the refactor, API tests pass now as they are in master. I have also hidden the new feature of serving zipped packages behind an experimental flag, so we don't have surprises by trying to serve additional files. Still with this refactor we would have API and storage less coupled, and we would have the package filtering (aka search) in a single place. We can continue later with further optimizations or refactors as the one to use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments, but nothing really breaking the idea. Good job!
I'm not sure if you have done it, but maybe you can try to build a Docker image with real content from package storage to check the feature (including compressed packages).
util/packages.go
Outdated
@@ -5,6 +5,7 @@ | |||
package util |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there an option to place it outside of util :) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, this util
package contains too many things, I will leave this for the end, or for a follow up :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have started refactoring this and I had opened a couple of Pandora's boxes with naming and dependency cycles, I think I am going to leave it for a follow up to don't polute this already big PR.
This reverts commit c4bb12a.
I really like the idea of introducing the Indexer especially as it also gets rid of all the passing around of paths. It also means assuming at one point we need to have an Indexer as a service (ES?)) it can be added to the list. I only skimmed through the changes and it contains a lot of refactoring too which is great. As you know, I'm not too big of a fan of large PRs and if possible it would be nice to take out some parts of the refactoring / cleanup into separate PRs as it will make quicker to review and later in case of issues figuring out when it happened. But as I'm not the main reviewer leave it to @mtojek and you to decide if you want to get it all in at once. |
Exactly, the idea is to open the gate to external indexers, as could be ES 🙂. The "virtual" package file system used now only on tests could also help in such an indexer that wouldn't have local access to the packages.
Yep, I am also not a fan of large PRs. This PR would be a bit over the limit. I could add first the indexer and then the implementation for zips, and then do the changes in each API handler, but this could be splitting things too much, and it can be cumbersome to adapt the tests for each PR. I will leave the decision to @mtojek 🙂 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's get it in and iterate then :)
Add experimental support to serve packages stored directly as zip files, so users receive the same packages as stored in the backend.
An "
Indexer
" interface is introduced to allow to provide different implementations for sources of packages, it abstracts how packages are indexed and queried.List of packages is not global anymore, allowing better isolation of test cases.
All handlers use the indexers to find packages and files.
This is done in context of #670, to decouple package storage from the registry.
Checklist:
Package
objects from the zip files without extracting them to disk.index.json
files be generated and served?Postponed:
util
package, it has too many responsibilities.