Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement remote resolver for storage indexer #823

Merged
merged 35 commits into from
Jun 27, 2022

Conversation

mtojek
Copy link
Contributor

@mtojek mtojek commented Jun 9, 2022

This PR implements hijack handlers for the storage indexer to redirect package zip, artifacts, static requests to https://package-storage.elastic.co/.

It uses HTTP 303 to pass the request. It was the easiest way to implement because I don't have to write content streaming at all. No need for Accept-Range tests, download tests, etc.

Breaking changes:

  • Package Registry with storage indexer passes all requests to the Package Storage, it means that some resources won't be available to download except for statics (docs, imgs, etc.)

Spotted bugs:

Local testing:

go run . -feature-storage-indexer \
  -storage-indexer-bucket-internal gs://mtojek-temp/package-storage-v2 \
  -log-level debug

How to prepare the development bucket (cursor.json and search-index-all.json):

$ gsutil ls -r gs://mtojek-temp/package-storage-v2/v2
gs://mtojek-temp/package-storage-v2/v2/:
gs://mtojek-temp/package-storage-v2/v2/

gs://mtojek-temp/package-storage-v2/v2/metadata/:
gs://mtojek-temp/package-storage-v2/v2/metadata/
gs://mtojek-temp/package-storage-v2/v2/metadata/cursor.json

gs://mtojek-temp/package-storage-v2/v2/metadata/1/:
gs://mtojek-temp/package-storage-v2/v2/metadata/1/
gs://mtojek-temp/package-storage-v2/v2/metadata/1/search-index-all.json

Questions:

  • Should we move indices to the public bucket and just keep the queue-publishing with package candidates in the private bucket? I don't remember why have we decided to put indices in the internal one...

Fixes #767

@mtojek mtojek self-assigned this Jun 9, 2022
@mtojek mtojek changed the title Add integration tests for storage indexer TODO Add integration tests for storage indexer Jun 9, 2022
@elasticmachine
Copy link

elasticmachine commented Jun 9, 2022

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2022-06-27T11:12:24.727+0000

  • Duration: 6 min 24 sec

Test stats 🧪

Test Results
Failed 0
Passed 207
Skipped 0
Total 207

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@mtojek mtojek changed the title TODO Add integration tests for storage indexer TODO Adjust handlers for compatibility with storage indexer Jun 21, 2022
@mtojek
Copy link
Contributor Author

mtojek commented Jun 23, 2022

Status update:

I tried to use Google Storage client to reach out the public bucket, but failed as http.ServeContent needs objectHandle to implement io.ReadSeekCloser. I found the community implementation, but I can't guarantee that it's stable, but worked. I decided to revert.

I tried with a generic http.Client, but failed to implement streaming as it's mandatory to serve Range requests. I wasn't able to figure out how to implement it nicely. Reverted.

Now, I'm working on enabling the L7 proxy mode when the response can be streamed directly from Package Storage. I don't know what I do with streaming resources from the zip file. In the worst case, we will have to unpack and store them in the public bucket.

@mtojek mtojek changed the title TODO Adjust handlers for compatibility with storage indexer Implement hijack handlers for storage indexer Jun 23, 2022
@mtojek mtojek marked this pull request as ready for review June 23, 2022 17:59
@mtojek mtojek requested a review from a team June 23, 2022 17:59
Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a suggestion about explicitly configuring the package to indicate that it requires redirects, instead of relying on the indexer, let me know what you think. I think this may require less refactors.

artifacts.go Outdated
@@ -58,7 +60,13 @@ func artifactsHandler(indexer Indexer, cacheTime time.Duration) func(w http.Resp
return
}

aPackage := packageList[0]
if storageIndexer, assert := aPackage.Indexer().(*storage.Indexer); assert {
storageIndexer.HijackArtifactsHandler(w, r, aPackage)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't call it "Hijack", here we are doing a legit redirect 🙂

Some suggestions:

Suggested change
storageIndexer.HijackArtifactsHandler(w, r, aPackage)
storageIndexer.RedirectArtifactsHandler(w, r, aPackage)
Suggested change
storageIndexer.HijackArtifactsHandler(w, r, aPackage)
storageIndexer.RemoteArtifactsHandler(w, r, aPackage)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can apply those renamings, not a problem :)

I found an analogy with http.Hijacker, which lets a handler take over the TCP connection. In this case, we let Storage Indexer handlers process requests, which include a redirection.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let's keep this terminology if already present in the standard library :)

artifacts.go Outdated Show resolved Hide resolved
@mtojek mtojek requested a review from jsoriano June 24, 2022 12:35
@mtojek mtojek changed the title Implement hijack handlers for storage indexer Implement remote resolver for storage indexer Jun 24, 2022
packages/indexer.go Outdated Show resolved Hide resolved
packages/http.go Show resolved Hide resolved
artifacts.go Outdated Show resolved Hide resolved
packages/http.go Outdated Show resolved Hide resolved
packages/resolver.go Show resolved Hide resolved
storage/resolver.go Outdated Show resolved Hide resolved
@mtojek mtojek requested a review from jsoriano June 27, 2022 08:18
@mtojek mtojek added the enhancement New feature or request label Jun 27, 2022
packages/http.go Outdated Show resolved Hide resolved
Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing all the comments! 🚀

@mtojek mtojek merged commit dbffb08 into elastic:main Jun 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Detach Package Registry from Package Storage
3 participants