-
Notifications
You must be signed in to change notification settings - Fork 134
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add StreamFlatMapOptional error-prone check #2946
Conversation
If one has a `Stream<Optional<T>> stream` of size `N` and does `stream.flatMap(Optional::stream)`, you’ll end up allocating `N` extra streams — one for each `Optional` input element. When `N` is large or on latency, throughput, and allocation sensitive code paths those allocations can cause extra GC cycles or pauses if allocation rate is high enough. `Stream.filter(Optional::isPresent).map(Optional::get)` is more efficient than `Stream.flatMap(Optional::stream)` as it does not allocate a new `Stream` for every element in the stream.
Generate changelog in
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fantastic, thanks for putting it together!
I added a few comments which aren't blockers, goal is to give you as much context as possible since you mentioned that you'd like to write more of this style of check :-)
String replacement = state.getSourceForNode(ASTHelpers.getReceiver(tree.getMethodSelect())) | ||
+ ".filter(Optional::isPresent).map(Optional::get)"; | ||
SuggestedFix fix = SuggestedFix.builder() | ||
.addImport("java.util.Optional") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be a bit safer to use qualifyType
for these, in case something like the old guava Optional type is also imported in the same class.
e.g.
Line 64 in 4ae1e05
qualifiedType = SuggestedFixes.qualifyType(state, fix, "com.google.common.collect.Lists"); |
This replaces addImport
and potentially uses a fully qualified reference if another type with the same simple name is already imported.
+ ".filter(Optional::isPresent).map(Optional::get)"; | ||
SuggestedFix fix = SuggestedFix.builder() | ||
.addImport("java.util.Optional") | ||
.replace(tree, replacement) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case there are multiple Stream.flatMap(Optional::stream)
occurrences in a single call chain, it may be helpful to make a more specific replacement, allowing a single round of applying the fixes to produce a passing build.
Instead of replacing the full tree
, we can do something along these lines:
.replace(
state.getEndPosition(ASTHelpers.getReceiver(tree.getMethodSelect())),
state.getEndPosition(tree),
".filter(Optional::isPresent).map(Optional::get)")
I imagine a test could produce this case using something along these lines:
Stream<String> f(Stream<Optional<Optional<String>>> in) {
return in.flatMap(Optional::stream).flatMap(Optional::stream);
}
if (STREAM_FLAT_MAP.matches(tree, state)) { | ||
ExpressionTree stream = ASTHelpers.getReceiver(tree); | ||
if (stream != null) { | ||
List<? extends ExpressionTree> arguments = tree.getArguments(); | ||
if (arguments != null && arguments.size() == 1) { | ||
if (OPTIONAL_STREAM.matches(arguments.get(0), state)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
error-prone provides a utility function which may be helpful here. Sometimes I opt not to use it in order to get access to parameters required to build the suggested fix, but I figured I'd call it out in case you find it helpful:
private static final Matcher<ExpressionTree>
STREAM_FLATMAP_OPTIONAL_STREAM = Matchers.methodInvocation(
STREAM_FLAT_MAP,
// Any of the three MatchTypes are reasonable in this case, given a single arg
MatchType.LAST,
OPTIONAL_STREAM);
Thanks @carterkozak , that looks much better. Added a few more test cases as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Released 6.1.0 |
Hey I don't think this is a good as a blanket rule. It makes the code harder to read, less resilient to refactors, and encourages potentially error-prone patterns. When devs see things like For example, this check changed roughly 65 instances of this in our internal authentication service in https://g.p.b/foundry/multipass/pull/21715. @carterkozak @schlosna can we consider reverting this? |
@pkoenig10 are there any places you have found that are incorrect? I just looked through that PR and all of the replacements look correct to me, and given the service this seems like a net improvement to avoid unnecessary GC overhead.
I'm not convinced folks will incorrectly use See also "Effective Java" 3rd edition, Item 55: Return optionals judiciously and Stuart Marks' https://stuartmarks.wordpress.com/wp-content/uploads/2017/10/optionalmotherofallbikesheds-devoxxus2017.pdf for external bike shed opinions. |
I'm not suggesting that the automated refactors are incorrect - I agree that they are correct. I can see the arguments both ways here. I certainly agree we should be doing this in performance sensitive places. But these allocations are inconsequential for 99% of the code we write.
I see devs makes these kinds of mistakes all the time - pattern-matching from existing code without thinking more deeply about what they are actually writing. IntelliJ lints and code reviews are tools to mitigate this, but they aren't foolproof in the way that compile-time checks (ie. typechecking) or tests are. I won't belabor the point. I understand the arguments here but just don't personally agree that it is desirable to declare that all uses of |
I think the key piece is that sometimes this optimization is incredibly meaningful, but sometimes code doesn’t need to be optimal. so, do we force folks to make this distinction based on their knowledge of the scope they’re working within, or do we eliminate the option so that folks don’t have to think about it. If we give them the option, are we confident they’ll get it right most of the time? Likely not, since copy/paste is common and tends to occur without much thought about the perf requirements of the original vs new code (and often times code which wasn’t meant to be hot becomes hot in a refactor or new call path). I think the additional verbosity is a reasonable trade off, and likely to lead to fewer problems than the alternative, but it’s not a hill that I’d die on! |
Makes sense, I don't disagree with reasons here, thanks for engaging! |
#2946 is a draft PR to catch cases where one attempts to |
if (receiver != null) { | ||
SuggestedFix.Builder fix = SuggestedFix.builder(); | ||
String optionalType = SuggestedFixes.qualifyType(state, fix, Optional.class.getCanonicalName()); | ||
String replacement = ".filter(" + optionalType + "::isPresent).map(" + optionalType + "::get)"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use orElseThrow
instead of get
? The Java docs state that orElseThrow
should be preferred and I've come to agree with this recommendation.
https://docs.oracle.com/javase/10/docs/api/java/util/Optional.html#get()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call #2950
Current JDKs make us choose between what's "readable":
and what's efficient:
Is there a possible API addition to the JDK that would be both readable and efficient? |
noting that for projects that use StreamEx, |
Before this PR
JFR profiles indicated some uses of
Stream<Optional<T>>.flatMap(Optional::stream)
were causing unnecessary allocations & GC pressure.After this PR
==COMMIT_MSG==
If one has a
Stream<Optional<T>> stream
of sizeN
and doesstream.flatMap(Optional::stream)
, you’ll end up allocatingN
extra streams — one for eachOptional
input element. WhenN
is large, those allocations can cause extra GC cycles and pauses if allocation rate is high enough leading to issues with latency, throughput, and allocation sensitive code paths.Stream.filter(Optional::isPresent).map(Optional::get)
is more efficient thanStream.flatMap(Optional::stream)
as it does not allocate a newStream
for every element in the stream.==COMMIT_MSG==
Possible downsides?