Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parallelize source set retrieval. #168

Merged
merged 2 commits into from
Aug 12, 2024

Conversation

Arthurm1
Copy link
Contributor

@Arthurm1 Arthurm1 commented Jul 9, 2024

This PR got away from me slightly so has a number of changes.

The idea here was to retrieve sourcesets concurrently by creating an action for each project and then submitting all those actions at once. If Gradle > 6.7 then it will attempt to execute these actions in parallel.
Scala in particular can be slow to populate some of its information so this is faster. I haven't tested to see if it speeds up any large public Gradle projects like Gradle or Kafka.
This also simplifies the SourceSetsModelBuilder as the data retrieved is purely for 1 project.

Changes...

  1. Changed SourceSetsModelBuilder to only populate 1 project to allow parallel retrieval in GradleSourceSetsAction.
  2. Changed GetSourceSetsAction to get the model for all projects/subprojects/included builds.
  3. Allow multiple output dirs per sourceSet - the single output dir wasn't working when projects had Scala & Java
  4. Shift output dir population to specific language model builders - output dir is language dependent
  5. Map archives to multiple output dirs for correct project dependency population - wasn't working out sourceset dependencies correctly.
  6. Use GradleBuild#getEditableBuilds instead of GradleBuild#getIncludedBuilds if available. This will include the buildSrc project. See here
  7. Substitute project jars on the classpath with their original classes dirs. In Scala and in included builds, the classpath includes the project jars rather than the classes directories. So for BSP clients, they now realize that the class files are from project sources and can navigate to the source rather than decompiling the classfile from the jar.
  8. Removed main source set only limitation on inter-project dependencies. Any project with custom configs (or one using TestFixtures) needs this.
  9. Upgraded Spock dependency in tests so tests run on JDK 21.
  10. Remove GradleSourceSet#ProjectName - it's not used
  11. Added a test with a missing repository to check for exceptions thrown by getCompileClasspath. Although this really means that the user's build file is wrong but at least they don't get an exception from the plugin.

@Arthurm1 Arthurm1 force-pushed the parallel_retrieval branch from d8c08fb to ff89166 Compare July 10, 2024 14:15
@Arthurm1
Copy link
Contributor Author

@jdneo I've pushed all the language specific stuff to be stored in the LanguageExtension (e.g. source dirs, generated dirs) instead of queried from the LanguageModelBuilder.
This means that there are fewer lookups of the compilation task and so an increase in performance.

I've also moved SupportedLanguages out of GradleBuildServerPlugin. It was a static but it was populated in the class constructor and so kept getting overwritten when running in parallel causing exceptions.

I've tested anecdotally using Kafka. I'm getting speed ups on sourceset retrieval of around 5x when org.gradle.parallel=true compared to before this PR.

@Arthurm1
Copy link
Contributor Author

I've experienced ConcurrentModificationException on project.getTasks().withType(XX) so added synchronized to guard against them.

I've tested against Gradle and it's 12x faster to retrieve source sets on my machine.

@Arthurm1 Arthurm1 force-pushed the parallel_retrieval branch from ff89166 to e21ddf6 Compare July 11, 2024 13:32
@Arthurm1
Copy link
Contributor Author

Fixed test. The supported languages weren't being supplied in all GradleBuildServerPluginTest tests.

GradleBuildServerPluginTest now uses GetSourceSetsAction instead of retrieving the model directly, so all Gradle versions are now tested.

@Arthurm1 Arthurm1 force-pushed the parallel_retrieval branch from e21ddf6 to 5cb1205 Compare July 11, 2024 14:44
@jdneo
Copy link
Member

jdneo commented Jul 16, 2024

Hi @Arthurm1, I'm working on some other stuff recently. I'll make sure to review this PR once I've finished the stuff.

@jdneo jdneo added this to the 0.4.0 milestone Jul 19, 2024
@jdneo jdneo added the enhancement New feature or request label Jul 19, 2024
@Arthurm1 Arthurm1 force-pushed the parallel_retrieval branch from 5cb1205 to 49aab29 Compare July 22, 2024 18:11
@jdneo
Copy link
Member

jdneo commented Aug 7, 2024

Hi @Arthurm1, should I add org.gradle.parallel in gradle.properties when verifying this PR? (I guess the answer is No, right?)

@jdneo
Copy link
Member

jdneo commented Aug 7, 2024

I can observe the perf boost for this change. Opening the repo Gradle, the request for workspace/buildTarget is 3 times faster than before on my machine.

Meanwhile, I encountered an error with this change: microsoft/vscode-gradle#1566. This error cannot be observed in the develop branch. Looks like we need to add the dedup logic in the project dependency resolution?

@Arthurm1
Copy link
Contributor Author

Arthurm1 commented Aug 7, 2024

Hi @Arthurm1, should I add org.gradle.parallel in gradle.properties when verifying this PR? (I guess the answer is No, right?)

I think using org.gradle.parallel or enabling the configuration cache should cause the model to be built in parallel. But it also depends on which gradle version you're using. I've been trying both.

@Arthurm1
Copy link
Contributor Author

Arthurm1 commented Aug 7, 2024

Meanwhile, I encountered an error with this change: microsoft/vscode-gradle#1566. This error cannot be observed in the develop branch. Looks like we need to add the dedup logic in the project dependency resolution?

I can't see why this would happen.

The classpaths for daemon-main [crossVersionTest] and daemon-main [testFixtures] both contain...

file:/C:/repodir/gradle/platforms/core-execution/snapshots/build/classes/java/main/
file:/C:/repodir/gradle/platforms/core-execution/snapshots/build/classes/groovy/main
file:/C:/repodir/gradle/platforms/core-execution/snapshots/build/resources/main
file:/C:/repodir/gradle/platforms/core-execution/snapshots/build/generated-resources/gradle-snapshots-classpath/
file:/C:/repodir/gradle/platforms/core-execution/snapshots/build/classes/java/testFixtures
file:/C:/repodir/gradle/platforms/core-execution/snapshots/build/classes/groovy/testFixtures/
file:/C:/repodir/gradle/platforms/core-execution/snapshots/build/resources/testFixtures

and daemon-main [crossVersionTest] contains dependencies...

file:/C:/repodir/gradle/platforms/core-execution/snapshots/?sourceset\u003dtestFixtures
file:/C:/repodir/gradle/platforms/core-execution/snapshots/?sourceset\u003dmain

So it doesn't look like there are duplicates unless the testFixtures and crossVersionTest source sets are being merged into a single daemon-main project?

It doesn't appear to be duplicating display names either.

@jdneo
Copy link
Member

jdneo commented Aug 7, 2024

So it doesn't look like there are duplicates unless the testFixtures and crossVersionTest source sets are being merged into a single daemon-main project

Ah I see. This might be the reason. Because the JDT project model does not have equivalent concept for source set in Gradle. In JDT, one project can only have one classpath. So they are merged together.

Sorry for the confusion.

@jdneo
Copy link
Member

jdneo commented Aug 7, 2024

I think using org.gradle.parallel or enabling the configuration cache should cause the model to be built in parallel. But it also depends on which gradle version you're using. I've been trying both.

Question about this: I tried this change without setting org.gradle.parallel, and observe the perf boost as well. Is this required?

@Arthurm1
Copy link
Contributor Author

Arthurm1 commented Aug 7, 2024

Ah I see. This might be the reason. Because the JDT project model does not have equivalent concept for source set in Gradle. In JDT, one project can only have one classpath. So they are merged together.

Do you special case the test source set then? If not, I can't see how it wouldn't always fail.

@Arthurm1
Copy link
Contributor Author

Arthurm1 commented Aug 7, 2024

Question about this: I tried this change without setting org.gradle.parallel, and observe the perf boost as well. Is this required?

I don't know for sure. I think you either have to set org.gradle.parallel=true or org.gradle.configuration-cache=true

@jdneo
Copy link
Member

jdneo commented Aug 7, 2024

Do you special case the test source set then? If not, I can't see how it wouldn't always fail.

Yes, in JDT, each classpath entry can have an attribute named test, but that's all.

@Arthurm1
Copy link
Contributor Author

Arthurm1 commented Aug 7, 2024

Yes, in JDT, each classpath entry can have an attribute named test, but that's all.

Ah - does the gradle project fail without this PR then?

@jdneo
Copy link
Member

jdneo commented Aug 8, 2024

Ah - does the gradle project fail without this PR then?

No, it doesn't

Copy link
Member

@jdneo jdneo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you @Arthurm1 for your contribution

@jdneo jdneo merged commit cd93aa3 into microsoft:develop Aug 12, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants