@snazy commented on Nov 17, 2025

Adds a new DeduplicatingResourceTransformer that works differently from PreserveFirstFoundResourceTransformer.

PreserveFirstFoundResourceTransformer preserves the first resource that matches the configured paths and ignores all others.

DeduplicatingResourceTransformer keeps a single copy of resources that share both path and content, and fails when resources with the same path have different content, unless they are explicitly allowed (excluded). It intentionally applies to all resources.
The new transformer is intended to guard against a few unexpected situations:

  • A (transitive) dependency brings in a non-relocated version of a dependency that is also included elsewhere in a different version. This could lead to unexpected exceptions at runtime.
  • Unintended inclusion or removal of legally important license information, see also MergeLicenseResourceTransformer (Add MergeLicenseResourceTransformer to merge licenses #1858).
  • Unintended removal or (false) inclusion of shaded dependency information via META-INF/x/y/pom.xml/.properties files, which can be important for dependency/license analysis tools.

Adding the functionality of DeduplicatingResourceTransformer to PreserveFirstFoundResourceTransformer proved too difficult without breaking the existing behavior of the latter.
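
The rule described above (same path and same content is fine, same path with different content is an error unless the path is explicitly allowed) can be sketched in plain Kotlin. This is only an illustration of the idea, not the transformer's actual implementation: `Resource`, `sha256Hex`, `deduplicate`, and `allowedConflicts` are hypothetical names, and the choice of SHA-256 for comparing content is an assumption.

```kotlin
import java.security.MessageDigest

// Hypothetical model of one entry found in an input jar; not a Shadow plugin type.
data class Resource(val path: String, val content: ByteArray)

// Hex-encoded SHA-256 of the content, used here to compare contents (an assumption).
fun sha256Hex(bytes: ByteArray): String =
    MessageDigest.getInstance("SHA-256").digest(bytes).joinToString("") { "%02x".format(it) }

/**
 * Keeps the first occurrence of every path and drops later entries with
 * identical content. A later entry with different content is recorded as a
 * conflict unless its path is in [allowedConflicts], in which case the first
 * entry wins. Any recorded conflict makes the call fail.
 */
fun deduplicate(
    resources: List<Resource>,
    allowedConflicts: Set<String> = emptySet(),
): List<Resource> {
    val hashByPath = HashMap<String, String>()
    val kept = ArrayList<Resource>()
    val conflicts = sortedSetOf<String>()
    for (resource in resources) {
        val hash = sha256Hex(resource.content)
        val previous = hashByPath.putIfAbsent(resource.path, hash)
        when {
            previous == null -> kept += resource          // first time this path is seen
            previous == hash -> Unit                      // identical duplicate, drop it
            resource.path in allowedConflicts -> Unit     // differing but allowed, first wins
            else -> conflicts += resource.path            // differing and not allowed
        }
    }
    check(conflicts.isEmpty()) { "Duplicate resources with different content: $conflicts" }
    return kept
}
```

In this sketch, two `META-INF/LICENSE` entries with identical bytes collapse into one, while two entries with different bytes throw an `IllegalStateException` unless `"META-INF/LICENSE"` is passed in `allowedConflicts`. Collecting all conflicting paths before failing, rather than stopping at the first one, is a design choice of the sketch that makes the error message more useful.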

Refs #1848.


  • CHANGELOG's "Unreleased" section has been updated, if applicable.

@snazy force-pushed the dedup-content-transformer branch from a26f539 to 4ed3bf3 on November 17, 2025 17:37
@snazy changed the title from "Add transformer to deduplicate identical files based content" to "Add transformer to deduplicate identical files based on path + content" on Nov 17, 2025
@snazy force-pushed the dedup-content-transformer branch 3 times, most recently from c23e4ab to 5353511, on November 18, 2025 08:52
@Goooler changed the title from "Add transformer to deduplicate identical files based on path + content" to "Add DeduplicatingResourceTransformer to deduplicate on path and content" on Nov 19, 2025
@snazy force-pushed the dedup-content-transformer branch from 5353511 to 7ee24d4 on November 19, 2025 07:38
@snazy force-pushed the dedup-content-transformer branch from 7ee24d4 to cec54aa on November 19, 2025 11:23
@Goooler requested a review from Copilot on November 19, 2025 13:13
Copilot finished reviewing on behalf of Goooler on November 19, 2025 13:19

@snazy force-pushed the dedup-content-transformer branch from d94d362 to 2c7c252 on November 19, 2025 14:32
snazy and others added 13 commits on November 20, 2025 09:35
Adds a new `DeduplicatingResourceTransformer` that works differently from `PreserveFirstFoundResourceTransformer`.

`PreserveFirstFoundResourceTransformer` preserves the first resource that matches the configured paths and ignores all others.

`DeduplicatingResourceTransformer` keeps a single copy of resources that share both path _and_ content, and fails when resources with the same path have different content, unless they are explicitly allowed (excluded).
It intentionally applies to all resources.
The new transformer is intended to guard against a few unexpected situations:
* A (transitive) dependency brings in a non-relocated version of a dependency that is also included elsewhere in a different version. This could lead to unexpected exceptions at runtime.
* Unintended inclusion or removal of legally important license information, see also `MergeLicenseResourceTransformer` (GradleUp#1858).
* Unintended removal or (false) inclusion of shaded dependency information via `META-INF/x/y/pom.xml`/`.properties` files, which can be important for dependency/license analysis tools.

Adding the functionality of `DeduplicatingResourceTransformer` to `PreserveFirstFoundResourceTransformer` proved too difficult without breaking the existing behavior of the latter.
…nsformers/DeduplicatingResourceTransformer.kt

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…shadow/transformers/TransformersTest.kt

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…testkit/JarPath.kt

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…testkit/JarPath.kt

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@snazy force-pushed the dedup-content-transformer branch from 2c7c252 to 58c927a on November 20, 2025 08:39
* Multiple files with the same path but different content lead to an error.
*
* Some scenarios for duplicate resources in a shadow jar:
* * Duplicate `.class` files

Suggested change
* * Duplicate `.class` files
*
* - Duplicate `.class` files

* Having duplicate `.class` files with different content is a situation indicating that the resulting jar is
* built with _incompatible_ classes, likely leading to issues during runtime.
* This situation can happen when one dependency is (also) included in an uber jar.
* * Duplicate `META-INF/<group-id>/<artifact-id>/pom.properties`/`xml` files.

Suggested change
* * Duplicate `META-INF/<group-id>/<artifact-id>/pom.properties`/`xml` files.
*
* - Duplicate `META-INF/<group-id>/<artifact-id>/pom.properties`/`xml` files.

Comment on lines +39 to +40
* ```kotlin
* tasks.named<ShadowJar>("shadowJar").configure {

Suggested change
* ```kotlin
* tasks.named<ShadowJar>("shadowJar").configure {
*
* ```kotlin
* tasks.shadowJar {
