
Conversation

@lihaoyi (Contributor) commented Aug 24, 2025

No description provided.


| Date | Version |
|-------------|--------------------|
| 23 Aug 2024 | Initial Draft |
Contributor

Suggested change
| 23 Aug 2024 | Initial Draft |
| 23 Aug 2025 | Initial Draft |

@bracevac bracevac changed the title SIP-XX - Unpack Case Classes into Parameter Lists and Argument Lists SIP-76 - Unpack Case Classes into Parameter Lists and Argument Lists Sep 26, 2025
Contributor

@odersky odersky left a comment

I think the use case addressed by the proposal is important.

The proposal is very long and it seems there are lots of interactions with other features. This makes me a bit hesitant whether an implementation would be feasible. I believe we can reduce complexity and feature interactions dramatically if we pass unpack arguments as a single object instead of passing each field separately.

That's also what Python does and what Kotlin seems to propose. The next-to-last comment on the Kotlin KEEP seems to indicate this:

It is still planned to be experimental in 2.2 ?

Unfortunately, no. We have a design and a prototype, but to make the feature both expressive and performant, we need scalarization to avoid extra boxing on calling a method. We'd like to achieve an experience where developers can move from a bunch of parameters in their function signatures to a class-centric approach without sacrificing performance.

So we need more time to experiment. However, we’re cautiously confident that the semantics of datargs, combined with Valhalla’s value classes, can help with runtime performance, but it needs some time.

If fields were passed individually there would be no need for value classes, so this seems to indicate that fields are in fact passed in bulk.

So my recommendation would be to rewrite the proposal so that

  • An unpack parameter is passed in just like a regular parameter.
  • At the call site, we transparently construct an unpack class object from individual parameters.
  • Also at the call site, if we have an argument c* where c's type is a case class, we expand to named parameters as described in the proposal.
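To make the recommendation concrete, here is one possible reading of the boxed design as a sketch. The names (`RequestConfig`, `download`) are hypothetical, `unpack` is the proposed syntax, and the desugaring shown is only my interpretation of the bullets above, not a committed design:

```scala
// Hypothetical case class grouping the shared parameters
case class RequestConfig(url: String, connectTimeout: Int = 10000)

// Declaration site: under the boxed design,
//   def download(unpack config: RequestConfig): Unit
// would compile to roughly:
def download(config: RequestConfig): Unit = println(config)

@main def demo(): Unit =
  // A call written with individual named arguments,
  //   download(url = "https://example.com", connectTimeout = 5000)
  // would transparently construct the single object:
  download(RequestConfig(url = "https://example.com", connectTimeout = 5000))
```

The key property is that only one parameter crosses the method boundary, which is why the Kotlin quote above worries about boxing costs and Valhalla value classes.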

@lihaoyi (Contributor, Author) commented Oct 25, 2025

So my recommendation would be to rewrite the proposal so that

  • An unpack parameter is passed in just like a regular parameter.
  • At the call site, we transparently construct an unpack class object from individual parameters.
  • Also at the call site, if we have an argument c* where c's type is a case class, we expand to named parameters as described in the proposal.

@odersky The challenge with this is that we want to support the ability to migrate a non-unpacked def to an unpacked def without breaking binary compatibility. I expect that would be a very common use case, as people often do not know what common sets of parameters are up-front and only realize that after the library has already grown several methods with similar parameters that need to be consolidated

One option we discussed before is that rather than constructing the case class at every call site, we lower the def containing unpack into two overloads:

  1. A private, unchanged version that takes the case class parameter whole, which contains the implementation
  2. An unpacked version that takes the case class parameters separately, which is a simple forwarder to (1) above

(1) never needs to be externally callable, and we can have all calls to the method go through (2). This will be functionally equivalent to constructing the case class at every call site, but with the added advantage of preserving binary compatibility when loose parameters are grouped into a single unpack parameter, or even if the unpack parameter is split into multiple separate unpack parameters.

This seems like it should give us the best of both worlds, with both the implementation simplicity and the user-facing compatibility?
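The two-overload lowering described above might look like this as a sketch. The method and class names are hypothetical, and the private/forwarder split is the scheme discussed in this thread rather than a specified lowering:

```scala
case class RequestConfig(url: String, connectTimeout: Int = 10000)

// Written by the user (proposed syntax, does not compile today):
//   def download(unpack config: RequestConfig): String = ...

// (1) Private overload keeping the case class whole; holds the
//     implementation and is never called from outside.
private def download(config: RequestConfig): String =
  s"GET ${config.url} (timeout=${config.connectTimeout})"

// (2) Public unpacked overload: a simple forwarder that re-boxes the
//     fields and delegates to (1). Its flat signature matches a
//     hand-written pre-unpack method, which is what preserves binary
//     compatibility for existing callers.
def download(url: String, connectTimeout: Int = 10000): String =
  download(RequestConfig(url, connectTimeout))
```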

@sjrd (Member) commented Oct 25, 2025

The challenge with this is that we want to support the ability to migrate a non-unpacked def to an unpacked def without breaking binary compatibility. I expect that would be a very common use case, as people often do not know what common sets of parameters are up-front and only realize that after the library has already grown several methods with similar parameters that need to be consolidated

In that scenario, I expect the library developer to add an appropriate @binaryAPI private def to preserve compatibility.

If we generate the forwarders automatically, we will ease that specific migration scenario. However, we will make another migration scenario much harder: that of adding one field in the unpack class. I expect the latter to be a much more common scenario than the former. So I don't think the added forwarder is a benefit.

Unpacking should work for generic methods and `case class`es:

```scala
case class Vector[T](x: T, y: T)
```

Contributor

My guess is `Vector[T]` needs to be `Point[T]`.

@soronpo (Contributor) commented Oct 25, 2025

I'm missing what should happen in:

  • Unpack of a generic value of type T (my guess is error), even if T is a product:

    ```scala
    def foo[T <: Product](unpack t: T): Unit = {}
    ```

  • Unpack in using clauses (I suggest prohibiting it):

    ```scala
    def foo(using unpack config: RequestConfig): Unit = {}
    ```

@lihaoyi (Contributor, Author) commented Oct 25, 2025

If we generate the forwarders automatically, we will ease that specific migration scenario. However, we will make another migration scenario much harder: that of adding one field in the unpack class. I expect the latter to be a much more common scenario than the former. So I don't think the added forwarder is a benefit.

This case would already be handled by usage of @unroll on the case class being unpacked, as mentioned in the SIP:

@unroll interaction

@unroll annotations on the parameters of a case class should be preserved when unpacking those parameters into a method def:

```scala
case class RequestConfig(url: String,
                         @unroll connectTimeout: Int = 10000,
                         @unroll readTimeout: Int = 10000)

// These two method definitions should be equivalent
def downloadSimple(unpack config: RequestConfig)
def downloadSimple(url: String,
                   @unroll connectTimeout: Int = 10000,
                   @unroll readTimeout: Int = 10000)
```

We expect that both unpack and unroll would be used together frequently: unpack to preserve consistency between different methods in the same version, unroll to preserve binary and tasty compatibility of the same method across different versions. The two goals are orthogonal and a library author can be expected to want both at the same time, so unpack needs to preserve the semantics of @unroll on each individual unpacked parameter.

So combined, this would allow all three migrations without breaking binary compatibility: non-unpack -> unpack, adding extra default parameters to a non-unpack method, and adding extra default parameters to an unpack method.

And the user would need to add an @unroll to the case class to preserve binary compatibility anyway, as the case class being unpacked is part of the public API. So there's no additional cost to the user in needing to do so
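To make the added-field scenario concrete, here is a rough sketch of how @unroll-style forwarders keep old callers linking when a new defaulted parameter is appended. This is my paraphrase of the unroll scheme with hypothetical method names and a simplified lowering, not the SIP's exact output:

```scala
// Version 1 of a library method:
//   def downloadSimple(url: String, @unroll connectTimeout: Int = 10000): String

// Version 2 appends a parameter:
//   def downloadSimple(url: String,
//                      @unroll connectTimeout: Int = 10000,
//                      @unroll readTimeout: Int = 10000): String

// which, roughly, lowers to the new-arity method plus an old-arity
// forwarder, so callers compiled against version 1 keep linking:
def downloadSimple(url: String, connectTimeout: Int, readTimeout: Int): String =
  s"GET $url ($connectTimeout/$readTimeout)"

def downloadSimple(url: String, connectTimeout: Int): String =
  downloadSimple(url, connectTimeout, 10000) // old signature fills in the new default
```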

@sjrd (Member) commented Oct 26, 2025

Ouch. I had not seen that. That solves this particular issue, but poses even bigger problems. It means you can't use an unpacked case class anywhere @unroll would not be allowed. For example, we would have to forbid it on non-final methods. IMO we have to remove that clause.

With the design Martin and I are advocating, it's unnecessary anyway. You get that compatibility for free, without the features having to interact at all.

@lihaoyi (Contributor, Author) commented Oct 26, 2025

It means you can't use an unpacked case class anywhere @unroll would not be allowed. For example, we would have to forbid it on non final methods. IMO we have to remove that clause.

That isn't quite accurate; the flat design means you can't use an unpacked case class and evolve the method in a binary compatible manner anywhere @unroll would not be allowed. In other words, if you couldn't evolve the method in a binary compatible manner before, you wouldn't be able to evolve it in a binary compatible manner using unpack. That is exactly the status quo, neither better nor worse.

The boxed design provides additional bincompat evolution opportunities over what is allowed today, in that you can add @unroll parameters to the case class even in scenarios where @unroll is not normally allowed on that method. That certainly is a nice upside, but then the downside is we lose the bincompat evolution opportunity of converting a non-unpack method to an unpack method without breaking compatibility. The latter is a huge downside, because as a first approximation we can expect ~every unpack method out there to evolve from a non-unpack method.

So while the boxed design would add some opportunity for evolution of already-unpack-ed methods that is not present today and may be hypothetically useful, the penalty is we lose the opportunity for evolving non-unpacked methods to unpack methods that will present a concrete stumbling block for every usage of unpack in the wild.

The flat design would not improve upon the evolution of unpack methods - they will face the same limitations around @unroll as any other method today - but in exchange users would be able to compatibly evolve flat methods into unpack methods. As a library author who would like to use this feature, I expect the latter workflow will be core to the usage of unpack

@sjrd (Member) commented Oct 26, 2025

I really believe you're overestimating the proportion of scenarios of a) evolve from pre-unpack to post-unpack versus b) evolve post-unpack to extended post-unpack.

Scala does not have a history of designing new replacement features so that they are binary compatible with what we had before (see for example enums and extensions). It's not a major goal. All else being equal, it might tip the scale. But if it weakens the new design when standing on its own, such an argument won't hold up.

@lihaoyi (Contributor, Author) commented Oct 26, 2025

It's not just "what we had before" for Scala as a language, but "what they had before" for every individual library and callsite. Even after this feature is released and in the wild for a decade, everything I said still applies.

People using large flat argument sets rarely know what the "meaningful" groupings of those parameters are until after they have grown a long list of them that has been out there for a while. That means they will not be able to define the unpack case classes up-front, and will need to add them later, causing breakage.

You've said yourself you never use large lists of default parameters, so maybe this use case never arises for you. But it has arisen for me many times in the libraries I maintain, which is why I opened this SIP. Every example described in the proposal falls under the category above. I maintain 5-6 libraries which match this exactly. If you don't believe me and don't believe the examples in the proposal that show that this use case is common, there isn't really anything else I can say

@lihaoyi (Contributor, Author) commented Oct 26, 2025

To take a concrete example from the proposal, consider the potential usage of unpack in a library like uPickle.

In March 2018, uPickle started off with a single def write method. Initially it is alone and takes one parameter, so even if unpack existed back in 2018, using it would be useless:

```scala
def write[T: Writer](t: T)
```

Later that month, it grew an indent: Int = -1 parameter. Even then, putting this into an unpack case class would be silly. What would you put into the case class: (t: T), or (indent: Int = -1), or (t: T, indent: Int = -1)? You don't know, because you don't know how the signature will evolve, or what other signatures will appear later:

```scala
def write[T: Writer](t: T, indent: Int = -1): String
```

Later that month, it grew a def writeTo method that looks very similar, but with an out: java.io.Writer parameter between the two existing params:

```scala
def write[T: Writer](t: T, indent: Int = -1): String

def writeTo[T: Writer](t: T, out: java.io.Writer, indent: Int = -1): Unit
```

Should def write and def writeTo share the (indent: Int = -1) via an unpack case class? It's possible, but (a) doing so would be a breaking change, (b) an unpack case class for a single field is kind of silly, and (c) we don't know if the methods will continue to co-evolve or whether it's a one-off similarity and they will diverge afterwards. But at least we now know that unpacking (t: T) or (t: T, indent: Int = -1) would have been a mistake, which is something we didn't know before!

7 months later, in October 2018, both methods grew an escapeUnicode: Boolean = true parameter:

```scala
def write[T: Writer](t: T, indent: Int = -1, escapeUnicode: Boolean = true): String

def writeTo[T: Writer](t: T, out: java.io.Writer, indent: Int = -1, escapeUnicode: Boolean = true): Unit
```

Now it seems more clear we should move the shared (indent: Int = -1, escapeUnicode: Boolean = true) parameters into an unpack case class. But again, we do not want to do so because it is a breaking change.

In November 2018, the API grew an additional def writeToByteArray method with similar parameters:

```scala
def write[T: Writer](t: T, indent: Int = -1, escapeUnicode: Boolean = false): String

def writeTo[T: Writer](t: T, out: java.io.Writer, indent: Int = -1, escapeUnicode: Boolean = false): Unit

def writeToByteArray[T: Writer](t: T, indent: Int = -1, escapeUnicode: Boolean = false): Array[Byte]
```

A year later, in December 2019, it grew a def stream method:

```scala
def write[T: Writer](t: T, indent: Int = -1, escapeUnicode: Boolean = true): String

def writeTo[T: Writer](t: T, out: java.io.Writer, indent: Int = -1, escapeUnicode: Boolean = true): Unit

def writeToByteArray[T: Writer](t: T, indent: Int = -1, escapeUnicode: Boolean = false): Array[Byte]

def stream[T: Writer](t: T,
                      indent: Int = -1,
                      escapeUnicode: Boolean = false): geny.Writable
```

2 years later, in March 2021, the API grew def writeToOutputStream with similar parameters:

```scala
def write[T: Writer](t: T, indent: Int = -1, escapeUnicode: Boolean = false): String

def writeTo[T: Writer](t: T, out: java.io.Writer, indent: Int = -1, escapeUnicode: Boolean = false): Unit

def writeToByteArray[T: Writer](t: T, indent: Int = -1, escapeUnicode: Boolean = false): Array[Byte]

def stream[T: Writer](t: T,
                      indent: Int = -1,
                      escapeUnicode: Boolean = false): geny.Writable

def writeToOutputStream[T: Writer](t: T, out: java.io.OutputStream, indent: Int = -1, escapeUnicode: Boolean = false): Unit
```

3 years later, in September 2024, all APIs grew a sortKeys: Boolean = false parameter:

```scala
def write[T: Writer](t: T, indent: Int = -1, escapeUnicode: Boolean = false, sortKeys: Boolean = false): String

def writeTo[T: Writer](t: T, out: java.io.Writer, indent: Int = -1, escapeUnicode: Boolean = false, sortKeys: Boolean = false): Unit

def writeToByteArray[T: Writer](t: T, indent: Int = -1, escapeUnicode: Boolean = false, sortKeys: Boolean = false): Array[Byte]

def stream[T: Writer](t: T,
                      indent: Int = -1,
                      escapeUnicode: Boolean = false,
                      sortKeys: Boolean = false): geny.Writable

def writeToOutputStream[T: Writer](t: T, out: java.io.OutputStream, indent: Int = -1, escapeUnicode: Boolean = false, sortKeys: Boolean = false): Unit
```
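For illustration, here is what this final API could look like under the flat unpack design. The case class name and the exact grouping are hypothetical, and `unpack` is the proposed syntax, so this does not compile today:

```scala
// Hypothetical grouping of the shared trailing parameters
case class WriteConfig(indent: Int = -1,
                       escapeUnicode: Boolean = false,
                       sortKeys: Boolean = false)

def write[T: Writer](t: T, unpack config: WriteConfig): String

def writeTo[T: Writer](t: T, out: java.io.Writer, unpack config: WriteConfig): Unit

def writeToByteArray[T: Writer](t: T, unpack config: WriteConfig): Array[Byte]

def stream[T: Writer](t: T, unpack config: WriteConfig): geny.Writable

def writeToOutputStream[T: Writer](t: T, out: java.io.OutputStream, unpack config: WriteConfig): Unit
```

Under the flat design each of these would keep the same erased signature as the hand-written versions above, which is why the migration could be binary compatible.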

The moral of this story is that with the boxed design, there is no point in the evolution of this uPickle API at which using unpack would be feasible!

  • The library API doesn't spawn fully-formed out of some brilliant insight, but rather grows and accretes over time. In this case over a period of 6 years for this particular method (and over 10 years if you consider the earlier versions of uPickle dating back to 2014)

  • Even if unpack had existed back in 2018, it would not have made sense to unpack a single parameter used in a single method. But with the boxed design, that is the only point at which we could have adopted unpack without causing breakage

  • As the library grows over the years and the need for unpack becomes clearer, with the boxed design there is no way to apply unpack without causing breakage

If you never use default parameters then obviously this won't apply to you, but for anyone who does use default parameters this is a story that is told again and again. In ~every project in the scala-toolkit or com-lihaoyi platform (uPickle, os-lib, requests-scala, pprint, mill, cask) we can see the same pattern playing out.

With the status quo, or even with the boxed unpack design @sjrd and @odersky have put forward, there is no solution. These projects are just stuck and have to deal with copy-pasting parameter lists all over the place. But with the flat unpack design, we can cut the Gordian knot: as the library API grows and we notice the similarity and duplication appearing, we can retroactively apply unpack to clean it up and reflect the underlying similarity in the parameter lists without causing binary compatibility breakage.

And that is, after all, the motivation for this proposal in the first place

@sjrd (Member) commented Oct 26, 2025

With the status quo, or even with the boxed unpack design @sjrd and @odersky have put forward, there is no solution.

I already mentioned the solution: a single, one-time @binaryAPI private def. It's that simple. We already have the universal "let me change the API but preserve bin compat" feature. There is no need to burden every other feature with that requirement.

@lihaoyi (Contributor, Author) commented Oct 26, 2025

Does @binaryAPI private def allow us to avoid duplicating parameter lists between the private def and the case class? The point of this feature is to avoid the need to duplicate parameter lists, so if it just changes the duplication between multiple defs to duplication between the @binaryAPI private def and the case class, that helps but doesn't fully solve the problem

@sjrd (Member) commented Oct 27, 2025

No, you'll need to keep the full signature exactly as it was before you decided to migrate to unpack, then never touch it again. So it's a one-time thing at the commit where you migrate, similar to other such migrations.

It doesn't solve the duplication that you already had in your code before you migrated, but it won't accumulate more duplication after you migrate. Again, that's quite similar to other migration scenarios, for example when you deprecate methods in favor of new ones that are better designed: you have to keep the deprecated signature exactly as it was before your refactoring, but then you can forget about it.
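A sketch of that one-time migration pattern, with hypothetical method and class names, using the @binaryAPI name from this thread (the mechanism this refers to shipped experimentally as scala.annotation.publicInBinary in Scala 3.4):

```scala
case class RequestConfig(connectTimeout: Int = 10000, readTimeout: Int = 10000)

// New public API after the migration (proposed syntax, not valid today):
//   def download(url: String, unpack config: RequestConfig): String

// Kept once at the migration commit and never touched again: the
// pre-migration flat signature, private but retained in the binary so
// that previously-compiled callers keep linking.
@binaryAPI private def download(url: String,
                                connectTimeout: Int,
                                readTimeout: Int): String =
  download(url, RequestConfig(connectTimeout, readTimeout)) // forwards to the new unpack method
```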


8 participants