Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve implicit search times for Filter, FilterNot, Union, Intersection #682

Merged
merged 3 commits into from
Mar 15, 2020

Conversation

aryairani
Copy link
Contributor

@aryairani aryairani commented Jan 22, 2017

Hi,

This patch provides a much faster implementation of Filter and FilterNot, and somewhat faster implementations of Union and Intersection. (Without the patch, my compile times have been prohibitively slow, such that I can only use the static checks for very small operations.)

screen shot 2017-01-22 at 11 19 11 pm

(Note the log-lin scale.)

Codecov check fails only because newly-deprecated code (retained only for bincompat) is no longer used by tests. The non-covered code can be be deleted at the next minor release.

@codecov-io
Copy link

Current coverage is 81.35% (diff: 71.42%)

Merging #682 into master will decrease coverage by 0.20%

@@             master       #682   diff @@
==========================================
  Files            69         69          
  Lines          2641       2655    +14   
  Methods        2375       2392    +17   
  Messages          0          0          
  Branches         92         89     -3   
==========================================
+ Hits           2154       2160     +6   
- Misses          487        495     +8   
  Partials          0          0          

Powered by Codecov. Last update b94bbf2...0006ac9

@milessabin
Copy link
Owner

Can you add tests which explicitly invoke the now redundant methods?

@aryairani
Copy link
Contributor Author

Sure, I will take a look at that.

@aryairani
Copy link
Contributor Author

aryairani commented Jan 26, 2017

Hi — I could use guidance on two issues:

  1. In scalacOptions we have -deprecation and -Xfatal-warnings; with the two together, I can't include tests for deprecated methods. I could not deprecate them, but I think they should be deprecated.

  2. I noticed that the hlist syntax methods .filter and .filterNot directly summon a Partition[L,U] instead of Filter[L,U] / FilterNot[L,U], which seems like a mistake. I'm not sure how to fix these in a bincompatible way. I could:

    a. leave them as-is (linked to the wrong typeclass) or
    b. move them to a HListOpsDeprecated trait (this builds, but not sure how it affects bc yet; I havent been able to run mima tests locally; don't know why.)

@milessabin
Copy link
Owner

The old methods won't appear explicitly in any normal user code because they're intended to be invoked implicitly. That being so I think it's right to leave them undeprecated.

A couple of thoughts. Can you get the same improvements for Partition? In which case it might be better to leave Filter and FilterNot as they are, which would also solve the .filter and .filterNot problem.

Also could you benchmark this change against leaving things as they are but compiling with Typelevel Scala 2.12.1 with the -Yinduction-heuristics flag enabled?

@aryairani
Copy link
Contributor Author

aryairani commented Feb 11, 2017

  1. With respect to @deprecation, I'm happy to replace the annotation with a comment; but remember, the issue with deprecation only came up because codecov wants us to continue referencing the unused methods from tests. You want them to go away eventually, right?

  2. This particular improvement won't help Partition, because Partition by its contract needs to compute both the Filter and FilterNot results. The current implementation of Filter and FilterNot rely on Partition to compute both sets, and then discard one of them. But computing both seems to be way more expensive than just computing the one that's needed.

We could probably flip things around so that Partition relies on the new Filter and FilterNot, instead of the other way around, and end up with a linear slowdown rather exponential. This would "help" the .filter and .filterNot issues somewhat, although it still seems screwy that those methods take a different type class than the ones they're named after. Maybe the methods predate the corresponding type classes, otherwise I dunno how it came to be the way it is. They should really use their corresponding type classes though.

  1. I will benchmark against -Yinduction-heuristics when I am home next week. If it's a huge help, then one option would be to forget this, and leave Lightbend folks to suffer; what do you think?

@milessabin
Copy link
Owner

On (1) yes, agreed.

On (2), my intuitions are that if your rework of Filter/FilterNot can improve the compile time from quadratic to linear, then something similar should be possible for Partition. I agree that Partition will still be more expensive to compile than either of the two individually, but the margin might be small enough that the benefit of having a single implementation rather than three might outweigh it.

On (3) I think it makes sense to do this whether or not -Yinduction-heurisitics improves things, but it would be very useful to see if it helps directly in this case.

@aryairani
Copy link
Contributor Author

So on (1), would you prefer to (1a) not include the deprecated methods in tests, and ignore the codecov result, (1b) allow deprecated methods in tests by removing -Xfatal-warnings in tests, or (1c) not use the @deprecation annotation, but just leave a comment that the methods are deprecated?

On (2), I can get it to two implementations, rather than three, but not down to one. Partition is inferring two HLists simultaneously; the rework for Filter/FilterNot was merely to cut them down to each only infer one HList. I can't apply the same rework to Partition because it needs to produce both. However, I can invert the dependence and have Partition use Filter/FilterNot to infer the lists sequentially instead of simultaneously. Edit: I tried this and surprisingly it didn't help at all (green line).

With respect to -Yinduction-heuristics, it unfortunately seems to make things worse; roughly equivalent to +1 test size. (purple line)

screen shot 2017-02-14 at 9 36 53 am

Finally, I think I will move the simpler Union/Intersection part of this PR to a separate PR, because that patch won't cause bincompat issues, so can probably be merged sooner. Still requires choosing among (1a), (1b), (1c).

Thanks for reading this far ;-)

@milessabin
Copy link
Owner

(1c) ... thanks :-)

@milessabin
Copy link
Owner

@refried could you and @joroKr21 work on combining this PR with #795.

@milessabin milessabin added this to the shapeless-2.4.0 milestone Dec 18, 2017
@joroKr21
Copy link
Collaborator

I don't think #795 is any faster. I hadn't realized FilterNot was implemented in such a weird way. This is a simpler and more general solution 👍

@joroKr21
Copy link
Collaborator

@aryairani it's been long time. Sorry about that. We are finally looking at a 2.4.0 release soon and I think this would be an important part of it. But the PR needs a rebase to resolve conflicts.

@joroKr21
Copy link
Collaborator

actually might make sense to check scala 2.12.10 / 2.13.1 first

@aryairani
Copy link
Contributor Author

aryairani commented Dec 30, 2019

I"m assuming we're no longer worried about bincompat for the purposes of this PR (due to the major version update)? I see some of the surrounding methods have already seen their signatures changed.

@joroKr21
Copy link
Collaborator

Yes, that's right we are breaking bincompat for 2.4.0

@aryairani aryairani force-pushed the faster-union-intersection branch from 0006ac9 to 6985c2d Compare December 30, 2019 07:31
@aryairani
Copy link
Contributor Author

aryairani commented Dec 30, 2019

Okay, I cleaned it up a bit.
sbt validate passed locally for me; let's see what CI says.

@joroKr21
Copy link
Collaborator

@aryairani thanks for coming back to this. Do you have the original code you used to test the performance? I want to try with newer versions of Scala to see if the performance gains are as dramatic as before.

@aryairani
Copy link
Contributor Author

@joroKr21 Sorry for the delay —

That's a bit tough 3 years later. 😅 I don't seem to have code immediately accessible, i.e. haven't found it in Dropbox. The computer I authored it and ran it on is currently in storage, so I can't easily check it. I may have a backup of said machine in the house, on an external backup appliance that's currently boxed up in the basement.

How badly do we want that original code? Inventing something from scratch might be easier.

WDYT?

@joroKr21
Copy link
Collaborator

Ah, don't go through so much trouble. will try something from scratch.

@joroKr21
Copy link
Collaborator

@aryairani I finally found time to try out the performance. Tried with an HList with 100 elements:

  type B = Boolean
  type I = Int
  type L =
    B :: I :: B :: I :: B :: I :: B :: I :: B :: I ::
    B :: I :: B :: I :: B :: I :: B :: I :: B :: I ::
    B :: I :: B :: I :: B :: I :: B :: I :: B :: I ::
    B :: I :: B :: I :: B :: I :: B :: I :: B :: I ::
    B :: I :: B :: I :: B :: I :: B :: I :: B :: I ::
    B :: I :: B :: I :: B :: I :: B :: I :: B :: I ::
    B :: I :: B :: I :: B :: I :: B :: I :: B :: I ::
    B :: I :: B :: I :: B :: I :: B :: I :: B :: I ::
    B :: I :: B :: I :: B :: I :: B :: I :: B :: I ::
    B :: I :: B :: I :: B :: I :: B :: I :: B :: I ::
    HNil
  Partition[L, I]

I think we can optimise Partition and leave Filter and FilterNot as they are.
The key observation is the difference between:

  • 2 seconds:
    implicit def hlistPartition2[H, L <: HList, U, LPrefix <: HList, LSuffix <: HList](
      implicit e: U =:!= H, p: Aux[L, U, LPrefix, LSuffix]
    ): Aux[H :: L, U, LPrefix, H :: LSuffix]
  • Too long, didn't wait:
    implicit def hlistPartition2[H, L <: HList, U, LPrefix <: HList, LSuffix <: HList](
      implicit p: Aux[L, U, LPrefix, LSuffix], e: U =:!= H
    ): Aux[H :: L, U, LPrefix, H :: LSuffix]

@joroKr21
Copy link
Collaborator

Do you have time to make that change?

@joroKr21 joroKr21 force-pushed the faster-union-intersection branch from cffd7f4 to 683b1ee Compare March 15, 2020 17:01
Copy link
Collaborator

@joroKr21 joroKr21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a commit to optimise Partition instead.

@aryairani
Copy link
Contributor Author

So wait, then is there no performance difference between the new Partition and the proposed Filter/FilterNot?

I missed a comparison in #682 (comment).

@joroKr21
Copy link
Collaborator

Maybe there is a small difference but nothing noticeable (i.e. all are linear now).

@aryairani
Copy link
Contributor Author

Okay, great.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants