Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce StreamMapFilter Refaster rule #1467

Merged
merged 3 commits into from
Dec 23, 2024
Merged

Conversation

zphilipp
Copy link
Contributor

@zphilipp zphilipp commented Dec 18, 2024

Prefer

stream.map(map::get).filter(Objects::nonNull)

over

stream.filter(map::containsKey).map(map::get)

Suggested commit message

Introduce `StreamMapFilter` Refaster rule (#1467)

Copy link

Looks good. No mutations were possible for these changes.
Mutation testing report by Pitest. Review any surviving mutants by inspecting the line comments under Files changed.

@zphilipp zphilipp force-pushed the pzeipert/stream-map-to-map-get branch from 3b002d9 to cb2e17d Compare December 18, 2024 08:32
Copy link

Looks good. No mutations were possible for these changes.
Mutation testing report by Pitest. Review any surviving mutants by inspecting the line comments under Files changed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Unrelated to this code)
The same pattern may also be applicable for Flux where we should call Flux#mapNotNull. Maybe good follow up? 👀

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I can create another PR. As @mohamedsamehsalah tells me, this will get me closer to my sticker. 😁

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You will definitely receive one for this contribution already!

@zphilipp zphilipp force-pushed the pzeipert/stream-map-to-map-get branch from cb2e17d to 436aded Compare December 18, 2024 10:04
Copy link

Looks good. No mutations were possible for these changes.
Mutation testing report by Pitest. Review any surviving mutants by inspecting the line comments under Files changed.

@zphilipp zphilipp requested a review from werli December 18, 2024 10:07
@Stephan202 Stephan202 added this to the 0.20.0 milestone Dec 18, 2024
@zphilipp zphilipp changed the title Introduce StreamMapToValuesFromMap Refaster rule Introduce StreamMapFilter Refaster rule Dec 20, 2024
Copy link
Member

@rickie rickie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one! Some small points (didn't apply them yet 😬)

static final class StreamMapFilter<K, V> {
@BeforeTemplate
Stream<V> before(Stream<K> stream, Map<K, V> map) {
return stream.filter(map::containsKey).map(map::get);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious about the performance. As usually we try to moving filtering to the "earliest" spot, right? I guess preventing the double lookup makes up for that 🤔 ?

Again a reason why having a "benchmarking-framework" in place in EPS would be useful.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed we should find time to progress on such a framework. For now I wrote a custom benchmark:

import static com.google.common.collect.ImmutableMap.toImmutableMap;
import static java.util.function.Function.identity;

import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableMap;
import java.util.Objects;
import java.util.concurrent.TimeUnit;
import java.util.regex.Pattern;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.profile.GCProfiler;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.OptionsBuilder;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Fork(
    jvmArgs = {"-Xms256M", "-Xmx256M"},
    value = 1)
@Warmup(iterations = 5)
@Measurement(iterations = 10) 
public class Benchmark {
  private ImmutableList<String> elements;
  private ImmutableMap<String, String> map;

  public static void main(String[] args) throws RunnerException {
    String testRegex = Pattern.quote(Benchmark.class.getCanonicalName());
    new Runner(new OptionsBuilder().include(testRegex).addProfiler(GCProfiler.class).build()).run();
  }

  @Setup
  public void setUp() {
    elements = ImmutableList.of("a", "b", "c", "d", "e", "f", "g", "h", "i", "j");
    map =
        elements.stream()
            .filter(e -> e.charAt(0) % 2 == 0)
            .collect(toImmutableMap(identity(), e -> e + e));
  }

  @Benchmark
  public long filterFirst() {
    return elements.stream().filter(map::containsKey).map(map::get).count();
  }

  @Benchmark
  public long mapFirst() {
    return elements.stream().map(map::get).filter(Objects::nonNull).count();
  }
}

This yielded the following results:

Benchmark                                             Mode  Cnt     Score    Error   Units
Benchmark.filterFirst                     avgt   10     0.116 ±  0.001   us/op
Benchmark.filterFirst:gc.alloc.rate       avgt   10  2559.065 ± 21.986  MB/sec
Benchmark.filterFirst:gc.alloc.rate.norm  avgt   10   312.000 ±  0.001    B/op
Benchmark.filterFirst:gc.count            avgt   10  1687.000           counts
Benchmark.filterFirst:gc.time             avgt   10  1060.000               ms
Benchmark.mapFirst                        avgt   10     0.095 ±  0.001   us/op
Benchmark.mapFirst:gc.alloc.rate          avgt   10  2978.998 ± 14.750  MB/sec
Benchmark.mapFirst:gc.alloc.rate.norm     avgt   10   296.000 ±  0.001    B/op
Benchmark.mapFirst:gc.count               avgt   10  1964.000           counts
Benchmark.mapFirst:gc.time                avgt   10  1416.000               ms

So this looks like a good change, performance-wise 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks you two for challenging this and showing that it actually results in an improvement. 💪

@@ -141,6 +141,11 @@ boolean testStreamFindAnyIsPresent() {
return Stream.of(1).findFirst().isPresent();
}

Stream<String> testStreamMapFilter() {
ImmutableMap<String, String> map = ImmutableMap.of("foo", "bar");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can inline these values :)

@@ -297,6 +297,23 @@ boolean after(Stream<T> stream) {
}
}

/**
* When using {@link Map#get(Object)} in a {@link Stream#map(Function)} operation, prefer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rephrase this to Prefer {@link ..} over {@link ..} to avoid double look-up or something :).

@Stephan202 Stephan202 force-pushed the pzeipert/stream-map-to-map-get branch from 436aded to 0821705 Compare December 20, 2024 22:54
Copy link
Member

@Stephan202 Stephan202 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebased and added a commit. Congrats on your first contribution @zphilipp! 🎉

Comment on lines 305 to 314
static final class StreamMapFilter<K, V> {
@BeforeTemplate
Stream<V> before(Stream<K> stream, Map<K, V> map) {
return stream.filter(map::containsKey).map(map::get);
}

@AfterTemplate
Stream<V> after(Stream<K> stream, Map<K, V> map) {
return stream.map(map::get).filter(Objects::nonNull);
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Map#get and Map#containsKey accept an Object, so we can generalize the rule by not requiring that the Stream has elements of type K.

static final class StreamMapFilter<K, V> {
@BeforeTemplate
Stream<V> before(Stream<K> stream, Map<K, V> map) {
return stream.filter(map::containsKey).map(map::get);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed we should find time to progress on such a framework. For now I wrote a custom benchmark:

import static com.google.common.collect.ImmutableMap.toImmutableMap;
import static java.util.function.Function.identity;

import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableMap;
import java.util.Objects;
import java.util.concurrent.TimeUnit;
import java.util.regex.Pattern;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.profile.GCProfiler;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.OptionsBuilder;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Fork(
    jvmArgs = {"-Xms256M", "-Xmx256M"},
    value = 1)
@Warmup(iterations = 5)
@Measurement(iterations = 10) 
public class Benchmark {
  private ImmutableList<String> elements;
  private ImmutableMap<String, String> map;

  public static void main(String[] args) throws RunnerException {
    String testRegex = Pattern.quote(Benchmark.class.getCanonicalName());
    new Runner(new OptionsBuilder().include(testRegex).addProfiler(GCProfiler.class).build()).run();
  }

  @Setup
  public void setUp() {
    elements = ImmutableList.of("a", "b", "c", "d", "e", "f", "g", "h", "i", "j");
    map =
        elements.stream()
            .filter(e -> e.charAt(0) % 2 == 0)
            .collect(toImmutableMap(identity(), e -> e + e));
  }

  @Benchmark
  public long filterFirst() {
    return elements.stream().filter(map::containsKey).map(map::get).count();
  }

  @Benchmark
  public long mapFirst() {
    return elements.stream().map(map::get).filter(Objects::nonNull).count();
  }
}

This yielded the following results:

Benchmark                                             Mode  Cnt     Score    Error   Units
Benchmark.filterFirst                     avgt   10     0.116 ±  0.001   us/op
Benchmark.filterFirst:gc.alloc.rate       avgt   10  2559.065 ± 21.986  MB/sec
Benchmark.filterFirst:gc.alloc.rate.norm  avgt   10   312.000 ±  0.001    B/op
Benchmark.filterFirst:gc.count            avgt   10  1687.000           counts
Benchmark.filterFirst:gc.time             avgt   10  1060.000               ms
Benchmark.mapFirst                        avgt   10     0.095 ±  0.001   us/op
Benchmark.mapFirst:gc.alloc.rate          avgt   10  2978.998 ± 14.750  MB/sec
Benchmark.mapFirst:gc.alloc.rate.norm     avgt   10   296.000 ±  0.001    B/op
Benchmark.mapFirst:gc.count               avgt   10  1964.000           counts
Benchmark.mapFirst:gc.time                avgt   10  1416.000               ms

So this looks like a good change, performance-wise 👍

Copy link

Looks good. No mutations were possible for these changes.
Mutation testing report by Pitest. Review any surviving mutants by inspecting the line comments under Files changed.

Copy link
Member

@rickie rickie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Congrats on your first PR @zphilipp !

Copy link
Member

@rickie rickie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Congrats on your first PR @zphilipp !

@rickie
Copy link
Member

rickie commented Dec 22, 2024

😓 idk why it's doing it twice now.

Anyway, anything from your side @werli :)?

static final class StreamMapFilter<K, V> {
@BeforeTemplate
Stream<V> before(Stream<K> stream, Map<K, V> map) {
return stream.filter(map::containsKey).map(map::get);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks you two for challenging this and showing that it actually results in an improvement. 💪

@Stephan202 Stephan202 force-pushed the pzeipert/stream-map-to-map-get branch from 0821705 to dadd141 Compare December 23, 2024 08:16
Copy link

Looks good. No mutations were possible for these changes.
Mutation testing report by Pitest. Review any surviving mutants by inspecting the line comments under Files changed.

@Stephan202 Stephan202 force-pushed the pzeipert/stream-map-to-map-get branch from dadd141 to b30af54 Compare December 23, 2024 08:26
Copy link

Looks good. No mutations were possible for these changes.
Mutation testing report by Pitest. Review any surviving mutants by inspecting the line comments under Files changed.

@rickie rickie merged commit 0fa59a5 into master Dec 23, 2024
16 checks passed
@rickie rickie deleted the pzeipert/stream-map-to-map-get branch December 23, 2024 08:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

4 participants