Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disallow usage of .collapseKeys() in EntryStream(s) #2291

Merged
merged 4 commits into from
Mar 8, 2023

Conversation

a-k-g
Copy link
Contributor

@a-k-g a-k-g commented Jun 6, 2022

Before this PR

The collapseKeys() API of EntryStream is dangerous and should be avoided. In internal products the API is frequently used as a grouping operation but its not suitable for that use case. The contract requires duplicate keys to be adjacent to each other in the stream, which is rarely the case in production code paths. When this constraint is violated, it leads to a duplicate key error at runtime.

A work around for the issue is to sort the keys prior to running the collapse operation. Since the sort operation is surprising, a comment is often added to explain. Overall the usage of collapseKeys() leads to code that is error prone or surprising.

Hence in place of collapseKeys() we should just use grouping operations which may require creation of intermediate maps but should avoid surprising code.

Notional example of buggy code:

List<Pair<Integer, Integer>> nodesAndPriorities = List.of
        Pair.of(1, 5),
        Pair.of(1, 3),
        Pair.of(4, 2),
        Pair.of(1, 9));

Map<Integer, Set<Integer>> nodesByPriority = EntryStream.of(nodesAndPriorities)
        .mapValues(Pair::getRight)
        .collapseKeys(Collectors.toSet())
        .toMap();

Notional example of applying sort work around:

Map<Integer, Set<Integer>> nodesByPriority = EntryStream.of(nodesAndPriorities)
        .mapValues(Pair::getRight)
        // collapseKeys() requires duplicate keys to be adjacent
        .sorted()
        .collapseKeys(Collectors.toSet())
        .toMap();

Notional example with proposed alternative of grouping:

Map<Integer, Set<Integer>> nodesByPriority = EntryStream.of(nodesAndPriorities)
        .mapValues(Pair::getRight)
        .grouping(Collectors.toSet());

After this PR

==COMMIT_MSG==
Added DangerousCollapseKeysUsage error prone check to disallow usage of collapseKeys() API of EntryStream.
==COMMIT_MSG==

Possible downsides?

Will need manual action for existing usages.

@changelog-app
Copy link

changelog-app bot commented Jun 6, 2022

Generate changelog in changelog/@unreleased

Type

  • Feature
  • Improvement
  • Fix
  • Break
  • Deprecation
  • Manual task
  • Migration

Description

Added DangerousCollapseKeysUsage error prone check to disallow usage of collapseKeys() API of EntryStream.

Check the box to generate changelog(s)

  • Generate changelog entry

summary = "Disallow usage of .collapseKeys() in EntryStream(s).")
public final class DangerousCollapseKeysUsage extends BugChecker implements BugChecker.MethodInvocationTreeMatcher {
private static final long serialVersionUID = 1L;
private static final String ERROR_MESSAGE = "The collapseKeys API of EntryStream should be avoided. The "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"should" implies info/warning, not SeverityLevel.ERROR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you suggest changing the error level, or the language? I'm inclined to change this to The collapseKeys API of EntryStream must be avoided.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer setting this to WARNING to reduce friction of rollout. I might suggest adding a check for a sort called prior to this method, to avoid flagging cases that are safe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed most of the internal usage so rollout should be friction free and we can keep it at ERROR. I've changed the should to must.

@a-k-g a-k-g requested a review from carterkozak July 7, 2022 10:34
return Description.NO_MATCH;
}

// Fail on any 'parallel(...)' implementation, regardless of how many parameters it takes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

collapseKeys, not parallel

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, fixing in #2510.

@bulldozer-bot bulldozer-bot bot merged commit 8a5be78 into develop Mar 8, 2023
@bulldozer-bot bulldozer-bot bot deleted the ag/disallow-collapse-keys branch March 8, 2023 14:29
@svc-autorelease
Copy link
Collaborator

Released 4.190.0

bulldozer-bot bot pushed a commit to palantir/witchcraft-api that referenced this pull request Mar 12, 2023
###### _excavator_ is a bot for automating changes across repositories.

Changes produced by the roomba/latest-baseline-oss check.

# Release Notes
## 4.189.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Upgrade error_prone to 2.18.0 (from 2.16) | palantir/gradle-baseline#2472 |


## 4.190.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Feature | Added `DangerousCollapseKeysUsage` error prone check to disallow usage of `collapseKeys()` API of `EntryStream`. | palantir/gradle-baseline#2291 |
| Feature | Prefer common versions of annotations over other copies | palantir/gradle-baseline#2505 |



To enable or disable this check, please contact the maintainers of Excavator.
This was referenced Mar 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants