Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOLR-17406: Introduce Version Catalogs #2706

Draft
wants to merge 30 commits into
base: main
Choose a base branch
from

Conversation

malliaridis
Copy link
Contributor

https://issues.apache.org/jira/browse/SOLR-17406

Description

With the version catalogs all versions and references to them are managed in a single file and updated from there.

Solution

This PR introduces version catalogs and removes the palantir consistent versions plugin, since it is no longer needed. Additionally, the build module(s) are restructured and further changes from apache/lucene#13484 were applied to reflect the same behavior.

For syncing versions a new gradle file (dependencies.gradle) is introduced, where the version of a dependency (either transitive or non-transitive) is set via constraints.

This PR contains the changes from #2646 without the dependency updates made, that will be handled in separate PR(s).

Closes #2646.

Tests

Minor adjustments have been made to some test files that fix some references (due to restructuring of build files), and besides that spotless and tidy are added to run on build files too.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide

Copy link
Contributor

@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huge effort on your part! Even considering a ton of this was lifted from Lucene.

I have some dislike for having to come up with lib names in the TOML and referencing them everywhere instead of well-known "group:artifact" strings. I'm not sure if we are forced into this?

To validate this, I'd want to look at the tar.gz contents before & after for ground-truth on what we ship.

build.gradle Outdated Show resolved Hide resolved
thetaphi-forbiddenapis = { id = "de.thetaphi.forbiddenapis", version.ref = "thetaphi-forbiddenapis" }
udnercouch-download = { id = "de.undercouch.download", version.ref = "undercouch-download" }

[libraries]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah well, we've got to follow what Gradle wants us to do now but IMO coming up with a unique library string for each library is annoying. Maintaining this is annoying.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have to come up with your own name. Intellij will help you with suggestion prompts, if you're using it but otherwise you're down to using those custom names.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this. Lack of namespacing will get us into trouble. Would it be better if we agree on some simple convention to use the original coordinate with dots replaced by space? E.g.

[versions]
org-apache-commons_commons-compress = "1.26.1"
org-apache-lucene = "9.12.0"

[libraries]
org-apache-commons_commons-compress = { module = "org.apache.commons:commons-compress", version.ref = "org-apache-commons_commons-compress" }
org-apache-lucene_lucene-foo = { module = "org.apache.lucene:lucene-foo", version.ref = "org-apache-lucene" }
org-apache-lucene_lucene-bar = { module = "org.apache.lucene:lucene-bar", version.ref = "org-apache-lucene" }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think before we run into namespace limits we would have other issues first. 😅 However, even if a collection of libraries use the same group and eventually similar artifact namings, we always can add more words in the alias, like prefixes and suffixes. This is already the case for some libraries.

Using the fully qualified name would require renaming the alias and all places where it is used if a library changes group or artifact name (like when splitting into multiple modules). So I would still prefer a simplified alias, rather than the fully qualified name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can do that, although there is a meaning to prefixes like this in gradle catalogs: they form artifact groups [1]. When I started using them, I followed a convention similar to yours... but over time I resigned from doing that. It wasn't really that helpful (or intuitive, with those dashes everywhere).

[1] https://docs.gradle.org/current/userguide/platforms.html#sub:mapping-aliases-to-accessors

apply plugin: "com.github.node-gradle.node"
apply plugin: libs.plugins.nodegradle.node.get().pluginId
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I liked it as it was -- a well known string instead of some dotted namespace from the toml that we invent for each dependency.

Comment on lines +42 to +53
[libraries]
...
foo-bar-baz = { module = "foo.bar:baz", version.ref = "foo-bar-baz" }

Note that the used names separated by dashes are later referenced with dots
instead of dashes, but more on that later.

The chosen name for the module should more or less reflect the module's
group name and module id in a way that it groups related dependencies under
the same "prefix" (see below). There is no specific convention here and
group prefixes for domain names like "com" and "io" are avoided, as they
do not add any value and increase the size of the reference / alias name.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to use libs.versions.toml; are we actually forced to add such invented names here and use them as such as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point and I agree with you. It is clearer if we use the "group:artifact" notation in most cases and just not provide the version information in the string. This would then pick the version from the version catalog as far as I understand it. I tried it with the strings before and it worked just fine.

There is however one risk with this approach that may also be a feature we want: if we do not force the developers to use the aliases, they may add at some point version information to dependencies without adding them to version catalogs. This could potentially lead to different versions of the same library (if the current plugins allow it, not sure), which is the case at the moment for some plugins for example (not checked by palantir's version plugin).

I am open to use strings instead, just want a few more opinions / confirmations for that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would then pick the version from the version catalog as far as I understand it.

It would surprise me if it did it automatically... You can have a version catalog with multiple versions of the same artifact (with identical coordinates) - how would it choose the right one?

The rare upside of using those aliases is that if the coordinates of a library change (which infrequently happens), you only need to change it in one place. ;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can have a version catalog with multiple versions of the same artifact (with identical coordinates) - how would it choose the right one?

That's interesting, I didn't know that. 🤔 I guess it would follow the version resolution strategy like usual in that case, but I think even if that is the case, this is not what we want probably.

I guess going with the aliases is the best option we have so far.

return metrics::writeMap;
return metrics;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed in this PR, I assume

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was causing linting errors in the CI/CD, but not locally, which is why I had to fix it. Same case for p::writeMap in another commit.

@dsmiley
Copy link
Contributor

dsmiley commented Sep 25, 2024

spotless and tidy are added to run on build files too

Awesome!

@dsmiley
Copy link
Contributor

dsmiley commented Sep 28, 2024

I checked out this branch and did gw assemble then renamed solr/packaging/build/solr-10.0.0-SNAPSHOT/ to have an "-after" suffix. Then I checked out a recent main (before cross-DC addition) and did gw assemble.
Then compared: diff -qr solr-10.0.0-SNAPSHOT/ solr-10.0.0-SNAPSHOT-after/

There were many differences. Just one example: the opentelemetry module didn't have netty libs but now it does.
Surely some differences are because main has moved along since when you started.

Can you please re-sync with main and repeat this exercise so we can hopefully get to no differences, or at least those differences that exist needs to be spelled out specifically and why. Then lets merge this; I'll approve.

@malliaridis
Copy link
Contributor Author

@dsmiley thanks for the instructions, I didn't know before how to check the builds for consistency.

I am using now IntelliJ for a better overview and I see that the configurations that were included in gradle/validation/dependencies.gradle are not sufficient. I have now extended the list locally to the following:

def consolidatedConfigurations = project.configurations.matching {
    it.name in [
            "annotationProcessor",
            "api",
            "apiElements",
            "compileClasspath",
            "compileOnly",
            "libExt",
            "implementation",
            "runtimeClasspath",
            "runtimeElements",
            "runtimeOnly",
            "runtimeLibs",
            "serverLib",
            "testCompileClasspath",
            "testCompileOnly",
            "testImplementation",
            "testRuntimeClasspath",
            "testRuntimeOnly"
    ]
}

@dweiss any recommendations how to constellate this list of configurations?

Now, besides a few new libraries included in some modules that I need to figure out where they are coming from, the main differences are in netty libraries (consistently now 2.0.65.Final), and google-api-grpc-proto libraries (consistently now 2.42.0).

The affected modules by these upgrades are:

  • modules/gcs-modules
  • modules/opentelemetry
  • modules/s3-repository
  • server/solr-webapp

I will pick up work again and try to solve this before the next review.

@henrik242
Copy link

spotless and tidy are added to run on build files too

Awesome!

It's cool, but I really would have preferred to have those in a separate PR. It clutters this PR quite a lot.

@malliaridis malliaridis force-pushed the SOLR-17406/rebase-version-catalogs branch from 71d9511 to 699480d Compare October 1, 2024 16:55
@malliaridis
Copy link
Contributor Author

It's cool, but I really would have preferred to have those in a separate PR. It clutters this PR quite a lot.

I totally agree with you. I should have tried to separate these two changes. The reason I ended up doing it in this PR is the reference I used, which was apache/lucene#13484. I couldn't separate one from the other, so I decided to do it in a similar way.

About the configurations and dependencies: I am still not very sure about the configurations that we should sync the dependencies / include in the consolidated configurations list, so I ended up with the below state.

Updated dependencies

  • hamcrest:hamcrest-core is replaced with hamcrest:hamcrest - module was renamed and included twice with different names before
  • netty-tcnative-broingssl-static was updated through version syncing from 2.0.61.Final to 2.0.65.Final
  • netty-tcnative-classes was updated through version syncing from 2.0.61.Final to 2.0.65.Final
  • proto-google-common-protos was updated through version syncing from 2.41.0 to 2.42.0

module: gcs-repository

  • uses the updated proto-google-common-protos library (version 2.42.0)

module: opentelemetry

  • uses the updated proto-google-common-protos library (version 2.42.0)

server/solr-webapp

  • netty-tcnative-broingssl-static was updated through version syncing from 2.0.61.Final to 2.0.65.Final
  • netty-tcnative-classes was updated through version syncing from 2.0.61.Final to 2.0.65.Final

@dsmiley the diff output should now show only the above changes if I have done everything correct.

@dsmiley
Copy link
Contributor

dsmiley commented Oct 1, 2024

Thanks Christos.

I suggest a separate PR for the version bumping listed. It'd also be nice to backport that to 9x.

May I suggest that we have a separate PR that merely formats our gradle files? Could be just a quick automated thing. It'll make this PR much easier to read and moreover, over time, we'll see what this change here did, not obscured by formatting. A sweeping formatting change like this would not go to 9x.

Finally, this PR here is then just version catalogs refactoring, and only happens in v10.

@dsmiley
Copy link
Contributor

dsmiley commented Oct 1, 2024

BTW if you are worried about how to update this PR if those other things merge; I could figure that out. The end-state will be no different, so it should be possible to do a bit of git magic on the merge from main into this to say that this side (the PR) "wins" for all differing files.

@malliaridis
Copy link
Contributor Author

Thanks for the input and considering handling non-relevant changes in separate PRs. I feel more at ease if we do so, so I will continue with that and create these PRs as suggested before continuing work on this.

Changing this to a draft PR for now.

@malliaridis malliaridis marked this pull request as draft October 2, 2024 11:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants