-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Return single package when multiple indexers are used #849
Return single package when multiple indexers are used #849
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking into this.
}, | ||
}, | ||
{ | ||
title: "single package unsorted versions", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about adding a case where both indexers contain the same package version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the case where both indexers contain the same package version, what would the behavior be for a query with ?all=true
?
I would think even in this case the response should only contain one item for that package version. The reason being that the other APIs don't provide a way to differentiate between the two items (e.g. the only specifiers are name and version in the download request /epr/example/example-1.0.1.zip
). Assuming the solution is to only return one in the search/?all=true
response then it would be good to specify somewhere what the priority order is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that in that case we have to decide that some indexers have precendence over the others, I will take a look to this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added a change so indexes defined later have precedence over the ones defined earlier. I have made that because currently we are defining the zip indexer, with support for signatures, after the plain file indexer, so now if a package is in both indexers, the zip with signature will have precedence.
return packages, nil | ||
} | ||
|
||
func latestPackagesVersion(source packages.Packages) (result packages.Packages) { | ||
sort.Sort(byNameVersion(source)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you call sort.Sort
on the source, it won't alter the original packages
slice? I'm double-checking if this is concurrency-safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will modify the packages
slice defined in the CombinerIndexer.Get
, that is built on every call to this method and shouldn't be used concurrently in any other place.
CHANGELOG.md
Outdated
@@ -10,6 +10,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 | |||
|
|||
### Bugfixes | |||
|
|||
* Return only the latest version of each package when a combined index is used. [#849](https://github.com/elastic/package-registry/pull/849) | |||
* Return only last appearance of the same version of a package when it is available in multiple indexes. [#849](https://github.com/elastic/package-registry/pull/849) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's discuss following scenario:
Developer is working on an integration locally. The package manifest defines the same version as for the embedded package already available in the Docker image, so there are 2 different package revisions published under the same version. Will EPR serve the local mounted package or the Docker embedded one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends, if both paths are defined in the same file system indexer, it will return the first one it finds, the second one is discarded as duplicated.
If they are defined in different indexers (plain file system vs zip) in a composable indexer, it will return the last one it finds. This is what is being implemented in this PR.
I can make the composable indexer to return the first package found, to be consistent with a file system indexer with multiple paths. But then I think that I would also reorder the initialization of the zip and plain file system indexers so the zip one takes precedence.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason why I'm asking is that users will struggle to figure out which package is served when. Maybe it's just better to change the order of indexers to ensure priorities:
- Local ZIP indexer.
- Local "unpacked" indexer.
- Google Storage Indexer.
In the nearest future, I'd like to introduce another indexer, a Proxy indexer?, which will connect to https://epr.elastic.co to pull results from them. This way it won't be necessary to ship EPR as heavy Docker images if the user has access to the internet and can stream relevant images.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 0116753.
Combined indexer just merges the results of each individual indexer.
Select the latest version of each package when not all versions are expected.
In case the same version of a package is available in multiple indexers, the first one found is used.