Revive or replace @specialized? #15532
Comments
After discussion with @sjrd, @nicolasstucki and @mbovel, the sentiment was that we want to do the bare minimum to keep things simple and understandable. In particular, maybe instead of
  def apply[@specialized(Int, Float, Double) T](...): MatrixLib[T] = new MatrixLib[T](...) {}
just write
  object IntMatrices extends MatrixLib[Int]
  object FloatMatrices extends MatrixLib[Float]
  object DoubleMatrices extends MatrixLib[Double]
Sure, it's code duplication, but it makes it easier to see where the code is generated.
On specialized traits, I thought of a simplification that might be worthwhile to look into. Consider a specialized trait like
  inline trait Ref[@specialized A]:
    def set(x: A): Unit
    def get(): A
That specialized trait generates 10 subtraits
  trait Ref_Byte extends Ref[Byte]
  trait Ref_Short extends Ref[Short]
  ...
  trait Ref_Double extends Ref[Double]
  trait Ref_Object extends Ref[Object]
So far, so good. But each of these subtraits as well as the original trait We can reduce this code bloat by requiring that a specialized inline trait is only instantiated at the supported subtypes. So an extension of inline trait MatrixLib[@specialized(Int, Float, Double) A] then the only legal extensions of This means that every extension of a specialized inline trait This is a big win since the new specialized traits can be defined very straightforwardly. For instance, here is
  trait Ref_Int extends Ref[Int]:
    override def set(x: Int): Unit
    override def get(): Int
There is no 10-fold explosion of generated methods. Erasure as it is will already do the rest: rebind method calls to the more efficient unboxed version and insert bridge methods as needed. So with this new scheme, we get three benefits:
|
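As a rough illustration of the Ref_Int idea in the comment above, here is a minimal sketch that compiles with today's Scala 3, using a plain trait in place of the proposed inline trait. The names RefInt, IntCell and roundTrip are hypothetical and written by hand; nothing here is generated by the compiler, and it is not the proposed implementation.

```scala
// Plain generic interface; after erasure its methods take and return Object.
trait Ref[A]:
  def set(x: A): Unit
  def get(): A

// Hand-written stand-in for a specialized subtrait (hypothetical name):
// it restates the same methods at the primitive type Int.
trait RefInt extends Ref[Int]:
  override def set(x: Int): Unit
  override def get(): Int

// A concrete extension at the supported type Int.
final class IntCell extends RefInt:
  private var value: Int = 0
  override def set(x: Int): Unit =
    value = x
  override def get(): Int = value

// A generic client only knows Ref[A], so it uses the erased interface.
def roundTrip[A](r: Ref[A], x: A): A =
  r.set(x)
  r.get()
```

Code that knows it holds a RefInt or an IntCell can call the Int-typed methods directly, while generic code such as roundTrip still works through Ref[A].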
I'm not sure I follow, why would each trait get 10 versions of each method if A is instantiated to one specific type? |
@smarter You are right. The base version |
Here's a refinement on specialization. Instead of an annotation
  trait Specialized[T] extends ClassTag[T]
Inline traits with An object or class can then extend a specializable trait only with type arguments that are
  ($ev.tag match
    case 0 => new T[Object] {}
    case 1 => new T[Byte] {}
    ...
    case 9 => new T[Unit] {}
  ).asInstanceOf[T[X]]
Here's a little implementation of a specialized array buffer using that technique. Note that the same functionality cannot be achieved with Scala 2 specialization, because we cannot specialize on Object there.
  inline trait Buffer[T: Specialized] private ():
    private var elems: Array[T] = new Array[T](16)
    private var len = 0
    def += (x: T): this.type =
      if len == elems.size then
        val newElems = new Array[T](len * 2)
        elems.copyToArray(newElems)
        elems = newElems
      elems(len) = x
      len += 1
      this
    def ++= (xs: T*): this.type =
      xs.foldLeft(this)(_ += _)
    def head: T =
      assert(len > 0)
      elems(0)
    def iterator: Iterator[T] =
      elems.iterator.take(len)
    def foreach(op: T => Unit): Unit =
      for i <- 0 until len do op(elems(i))
    def size: Int = len
  object Buffer:
    def apply[T: Specialized](xs: T*): Buffer[T] =
      (new Buffer[T] {}).++=(xs*)
|
A refinement to optionally allow only a subset of specialization types is left for future work. |
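For comparison, the closest thing that runs on current Scala 3 is probably a ClassTag-based buffer, sketched below. This is an assumption-laden stand-in, not the proposal: ClassTag replaces the proposed Specialized type class and a final class replaces the inline trait. It only gives unboxed storage (a Buffer[Int] is backed by a primitive array); the methods themselves are still generic, which is the gap the Specialized design is meant to close.

```scala
import scala.reflect.ClassTag

// Stand-in for the proposed design: ClassTag instead of Specialized,
// a plain class instead of an inline trait.
final class Buffer[T: ClassTag]:
  // The ClassTag picks the runtime array class, e.g. int[] for T = Int.
  private var elems: Array[T] = new Array[T](16)
  private var len = 0

  def +=(x: T): this.type =
    if len == elems.length then
      // Grow by doubling and copy the live prefix.
      val newElems = new Array[T](len * 2)
      Array.copy(elems, 0, newElems, 0, len)
      elems = newElems
    elems(len) = x
    len += 1
    this

  def ++=(xs: Seq[T]): this.type =
    for x <- xs do this += x
    this

  def iterator: Iterator[T] = elems.iterator.take(len)
  def size: Int = len

object Buffer:
  def apply[T: ClassTag](xs: T*): Buffer[T] =
    val b = new Buffer[T]
    b ++= xs
```

Usage is the same as in the proposal's builder, for example Buffer(1, 2, 3) or Buffer("a", "b").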
Alternative to generalize the "split-by-specialized-tag" technique: Have an
  object Specialized:
    def apply[T: Specialized, F[_]](x: => F[T]): F[T]
with
  Specialized[T, F](body)($ev)
    --->
  ($ev.tag match
    case 0 => [T := Object](body)
    case 1 => [T := Byte](body)
    ...
    case 9 => [T := Unit](body)
  ).asInstanceOf[F[T]]
In that case the builder method of
  def apply[T: Specialized](xs: T*): Buffer[T] =
    Specialized(new Buffer[T] {}).++=(xs*)
|
To summarize: The new design has three elements
inline traits and |
Brainstorming session result of how to inline Numeric ops:
  inline trait Numeric[T]:
    inline def plus(x: T, y: T): T
  object NumericInt extends Numeric_Int:
    inline def plus(x: Int, y: Int): Int = x + y
  inline trait A[T]:
    transparent inline given n: Numeric[T] = summonInline
    def f(x: T): T = n.plus(x, x)
  class B extends A[Int]:
    type T = Int // generated
    transparent inline given n: Numeric[T] = NumericInt // generated
    override def f(x: T): T = n.plus(x, x) // generated
|
More detailed version
  inline trait Numeric[T: Specialized]:
    inline def plus(x: T, y: T): T
    def times(x: T, y: T): T
  // generated
  trait Numeric_Int extends Numeric[Int]/*removed*/:
    inline def plus(x: Int, y: Int): Int
    def times(x: Int, y: Int): Int
  object NumericInt extends Numeric[Int]/*erases to Numeric_Int*/:
    inline def plus(x: Int, y: Int): Int = x + y
    def times(x: Int, y: Int): Int = x * y
  inline trait A[T]:
    transparent inline given n: Numeric[T] = summonInline
    def f(x: T): T = n.times(n.plus(x, x), x)
  class B extends A[Int]/*removed*/:
    type T = Int // generated
    transparent inline given n: Numeric[T] = NumericInt // generated
    override def f(x: T): T =
      n.times(n.plus(x, x), x) // generated
      // then inlined to
      // NumericInt.times(x + x, x)
|
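The part of this sketch that already exists in Scala 3 is summonInline, which delays the search for a given until after inlining, when T is concrete. Below is a minimal runnable version using an inline method instead of an inline trait. The names Num and NumInt are hypothetical (chosen to avoid clashing with scala.math.Numeric), and this sketch does not claim that the calls are inlined down to x + x as in the brainstormed expansion.

```scala
import scala.compiletime.summonInline

// Hypothetical type class standing in for the Numeric trait above.
trait Num[T]:
  def plus(x: T, y: T): T
  def times(x: T, y: T): T

object NumInt extends Num[Int]:
  def plus(x: Int, y: Int): Int = x + y
  def times(x: Int, y: Int): Int = x * y

given Num[Int] = NumInt

// T is abstract here, so no instance can be found at the definition.
// summonInline defers the lookup to each call site, after inlining.
inline def f[T](x: T): T =
  val n = summonInline[Num[T]]
  n.times(n.plus(x, x), x)

// At this call site T = Int, so the given Num[Int] is resolved here.
val result: Int = f(3) // (3 + 3) * 3 = 18
```

A call such as f("a") would fail to compile at the call site, because no Num[String] instance is in scope.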
An alternative, more expressive and convenient way to do specialization with inline traits is described in |
I'm porting the other modules to the new design in `kyo-prelude` but there's a performance regression that I can't see how to avoid with the new encoding in the current version of the Scala 3 compiler:
- Given the lack of specialization, the generated bytecode contains a significant amount of unexpected `<` boxing. Fortunately, the JIT is able to avoid most of these allocations, but profiling sessions show `<` allocations in several benchmarks. It isn't a bottleneck in the benchmarks I analyzed but introduces some overhead.
- The compiler doesn't allow implementing interfaces with `AnyVal` generic type parameters. As a workaround, I was planning to use a wrapper class like in `MonadLawsTest`, but the approach is too expensive in integrations like `kyo-sttp`. The wrapper class essentially introduces a new allocation to all methods in [KyoSttpMonad](https://github.com/getkyo/kyo/blob/0279f30163759d834b1793cbba5bcf7d08f9f0c3/kyo-sttp/shared/src/main/scala/kyo/internal/KyoSttpMonad.scala#L9), which are heavily used by sttp's transformations. Given how critical the integrations with sttp and tapir are, this regression doesn't seem ideal.
As described in #531, the purpose of the new `AnyVal` encoding was to allow arbitrary nesting of Kyo computations. Although it's a nice usability improvement, the current limitations of the Scala 3 compiler don't seem to make it worth it, and finalizing the migration to the new design seems a much higher priority. I can create a ticket so we can try again later.
Related compiler tickets: scala/scala3#11264 scala/scala3#15532
Specialization was rather messy in Scala 2. It had a complicated implementation and, for the user, hard-to-understand restrictions and inefficiencies. Nevertheless, when used right, specialization can yield significant performance improvements, which are still unavailable in Scala 3. I am opening this issue to start a discussion on what one could do to achieve the essential benefits of specialization in Scala 3, while hopefully avoiding some of its shortcomings.
Pain Points of Scala 2 Specialization
Here are some pain points of Scala 2 specialization as far as I can remember them. It might be slightly off or incomplete. Please fill in details that are missing.
A[T] contains a field x: T, then A_Int needs to contain a field x_Int: Int. Since A_Int is a subclass of A, we end up with two sets of fields, one of them unused.
@specialized annotation.
There's also the experimental miniboxing work where instead of 9 specialization targets we only have 3 (Long, Double, and Object). The main problem with miniboxing is that without further complications it gets inefficient for defining arrays.
Elements of a Redesign
Given the problems with 1. and 2. above I think we should avoid specialized classes. We could instead experiment with some combination of the following ideas.
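The duplicate-field pain point mentioned earlier can be pictured with a hand-written analogue. This is only a sketch: the class names follow the A / A_Int naming used in the description, but the code is written manually and is not what the Scala 2 specializer emits. In particular, the inherited generic field is initialized here with a boxed copy, whereas the description says one of the two fields simply goes unused.

```scala
// Generic class with one field of type T (erased to Object at runtime).
class A[T](val x: T)

// Hand-written "specialized" subclass: it adds a primitive Int field,
// yet still inherits the generic field x from A. Every instance
// therefore carries two slots for the same logical value.
class A_Int(val x_Int: Int) extends A[Int](x_Int)
```

Both a.x and a.x_Int are available on an A_Int instance, which is exactly the redundancy that specializing traits instead of classes avoids, since traits here carry no fields of their own.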
Inline Traits
An inline trait is defined like a normal trait with an inline modifier.
An inline trait is expanded when it is inherited by an object or class (not by another trait). Example:
An inline trait itself translates to a pure interface. All its contents are inlined each time the trait is extended by a class or object. Inline traits avoid all of the pain points of Scala 2 specialization. They can do more than primitive specialization since they also specialize on value parameters and reference types. This helps avoid megamorphic dispatch. Inline traits also profit from all the optimizations available for inline methods, including inline matches, summonFrom, embedded splices. Indeed the analogy of inline traits and inline methods is strong: inline calls correspond to supercalls in extending classes and objects, inline parameters are inline parameters of the trait, inline traits can have nested inline matches, etc. Inline traits should not be too hard to implement, since they can probably draw on much of the implementation for method inlining.
Inline trait expansions are only generated on demand when a class or object extends an inline trait. This avoids the up-front cost of creating specialized copies which might never be needed. On the other hand, one needs to be more careful about de-duplication. If an inline trait is expanded with the same arguments twice, we get two expansions.
I see inline traits primarily as building blocks for large generic, specializable libraries. Typically, an application would instantiate such a library only once or a small number of times.
Compared to specialization, inline traits have one shortcoming, namely that interfaces are not specialized. If some code is parameterized by MatrixLib[T] for type parameter T, that code will use unspecialized interfaces (which are implemented via bridge methods using standard erasure). Unspecialized interfaces might impose a boxing/unboxing overhead. This leads us to consider also the following extensions.
Specialized Traits
We could do Scala 2 style specialization for traits only. This avoids pain points (1) and (2) and could simplify the implementation, in particular since we could probably make use of the inlining improvements. We can also combine specialized and inline for traits. An inline trait with specialized parameters translates to the specialization of a pure interface. This means a group of interfaces with bridge methods, which is relatively lightweight. So making specialized traits inline can reduce the up-front code generation overhead.
Specialized Methods
Say we have an inline trait like MatrixLib. Consider a factory method for MatrixLib like this:
Unfortunately, that would instantiate MatrixLib at generic T, without any gain in customization (and without any additional code overhead either). We can get better customization by making the apply an inline method:
Now, we get a new, specialized copy of MatrixLib at each call of apply. Whether this is good or bad depends on how many call sites there are. But if we had specialized methods, we could also do this:
This will produce three copies of apply for matrix libraries over Ints, Floats or Doubles.
Summary
Inline traits cover some of the use cases of Scala 2 specialization with completely different tradeoffs, and potentially more opportunities for optimization and customization. They can be complemented with specialized traits and methods which each addresses one issue of inline traits: