SIP-62 - For comprehension improvements #79
about allowing |
Just to be clear, is this 3 orthogonal changes or do they build on each other? Just curious. |
perhaps some code out there has side effecting map that could change semantics if dropped? |
Theses can be 3 independent changes. |
Yes, that is possible. But if that's the case, then using |
content/better-fors.md (Outdated)
This change is binary and TASTY compatible since for-comprehensions are desugared in the Typer. Thus, both class and TASTY files only ever use the desugared versions of programs.
While this change is forward source compatible, it is not backward compatible, as it accepts more syntax.
I think it's pretty clear that many of the changes in this proposal can potentially change the behavior of user code. We could argue that such user code is badly written, but nevertheless it is a concern, and we should be rigorous about listing out what kinds of user code would have their behavior changed by this proposal.
Given that we are already choosing to potentially change the behavior of user code via these changes, I wonder if we should take it further. e.g. rather than generating
Some(1).map { a =>
  val b = a
  (a, b)
}.withFilter { case (a, b) =>
  b > 1
}.map { case (a, b) =>
  a + b
}
we could generate something like
Some(1).flatMap { a =>
  val b = a
  if (!(b > 1)) Option.empty
  else Some(a + b)
}
The latter is what people would write by hand, and doesn't have the same surprising behavior around order of evaluation and laziness that the former does (surprising behavior that I have myself been bitten by several times)
If we're going to be changing the desugaring and potentially breaking user code, I feel like it makes sense to change it directly to a desugaring that makes sense, rather than changing it half-way and paying the cost in breakage without reaping the full benefits of a simpler desugaring
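To make the order-of-evaluation point concrete, here is a minimal sketch that writes both shapes out by hand on a `List` and records when each step runs. The helper names `currentDesugaring` and `handWritten` and the log messages are assumptions made for this illustration only; they are not part of the proposal.

```scala
import scala.collection.mutable.ListBuffer

// Shape of the map / withFilter / map desugaring, with logging added.
def currentDesugaring(xs: List[Int]): (List[Int], List[String]) =
  val log = ListBuffer.empty[String]
  val result = xs.map { a =>
    log += s"alias $a"
    val b = a
    (a, b)
  }.withFilter { case (a, b) =>
    log += s"filter $a"
    b > 1
  }.map { case (a, b) =>
    log += s"yield $a"
    a + b
  }
  (result, log.toList)

// Shape of the single flatMap that one would write by hand.
def handWritten(xs: List[Int]): (List[Int], List[String]) =
  val log = ListBuffer.empty[String]
  val result = xs.flatMap { a =>
    log += s"alias $a"
    val b = a
    log += s"filter $a"
    if (!(b > 1)) Nil
    else {
      log += s"yield $a"
      List(a + b)
    }
  }
  (result, log.toList)
```

Both functions return the same values, but the first evaluates every alias before any filter runs, while the second interleaves alias, filter and yield per element.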
In that particular example, how would the compiler know that it should desugar to `Option.empty` instead of `Some.empty`? 🤔
I might be wrong, but AFAIK this would require some refactoring around desugaring of `for`s. Right now, the desugaring is purely syntactic and it deals with untyped trees. So to implement the desugaring based on `empty` in an elegant way, `for`s would have to become fully fledged (typed) trees.
Maybe we could provide `implicitly[Empty[Some[Int]]]` or something?
Hmmm, this would still require the type info. Implementing `empty` as a method on the class instead of the module might work, but that seems the opposite of elegant.
I think that this change might be way less disruptive than it may seem. Maybe we could run the open community build with vs without the feature flag and check the results. (It's far from good testing, but better than speculation)
Perhaps the old desugaring could remain supported under a flag.
> Hmmm, this would still require the type info. Implementing empty as a method on the class instead of the module might work, but that seems the opposite of elegant.

In fact we already have it. There is an `empty` method on most (all?) stdlib collections. But we also need a singleton method and we don't have that (the `Some` in the last line of the example)
Some(1).flatMap { a =>
  val b = a
  if (!(b > 1)) this.empty
  else this.singleton(a + b)
}
Otherwise we'd need the types, and we definitely do not want that! The untyped translation of for expression is one of the major distinguishing features of Scala.
If we wanted to go the implicit route, we might be able to do so via an untyped desugaring as well. Something like:
trait Empty[T]{
  def apply(): T
}
object Empty{
  def of[T](v: => T)(implicit e: Empty[T]) = e
}
trait Single[T]{
  def apply(t: T): T
}
object Single{
  def of[T](v: => T)(implicit e: Single[T]) = e
}
Some(1).flatMap { a =>
  val b = a
  if (!(b > 1)) Empty.of(this).apply()
  else Single.of(this).apply(a + b)
}
This could be useful if e.g. we wanted to provide the `Single.of` value without needing to add a `def singleton` on every type, allowing us to move forward without modifications to existing standard library classes (e.g. if the std lib was still frozen)
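For readers who want to experiment with the idea, here is one possible compilable variation of the sketch, written in Scala 3. It is an assumption-laden illustration, not the proposed design: the type classes are indexed by the container constructor `F[_]` instead of the full type so that `Single` can accept an element, the instances for `Option` and the helper `addIfLarge` are hypothetical, and the lookup uses an explicitly `Option`-typed value rather than `this`.

```scala
// Hypothetical type classes, indexed by the container constructor F.
trait Empty[F[_]]:
  def apply[A](): F[A]
trait Single[F[_]]:
  def apply[A](a: A): F[A]

object Empty:
  // `v` is by-name and never evaluated; it only drives type inference.
  def of[F[_], A](v: => F[A])(using e: Empty[F]): Empty[F] = e
object Single:
  def of[F[_], A](v: => F[A])(using s: Single[F]): Single[F] = s

// Illustrative instances for Option (not part of any library).
given optionEmpty: Empty[Option] = new Empty[Option]:
  def apply[A](): Option[A] = None
given optionSingle: Single[Option] = new Single[Option]:
  def apply[A](a: A): Option[A] = Some(a)

// The flatMap-based shape from the comment, with the source passed in explicitly.
def addIfLarge(source: Option[Int]): Option[Int] =
  source.flatMap { a =>
    val b = a
    if (!(b > 1)) Empty.of(source).apply()
    else Single.of(source).apply(a + b)
  }
```

Typing `source` as `Option[Int]` sidesteps the question of what should be inferred for a value whose static type is `Some[Int]`, which is not addressed by this sketch.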
Being able to make this change independent from unfreezing the stdlib would be great. But I'm not sure if it's that straightforward to create those typeclasses. For the sake of inference, we would like the typeclass to be contravariant (we want `Empty[Option[?]] <: Empty[Some[?]]`). But then IMHO we can't implement it, since it's not the case that `∀(Opt <: Option[?]). None <: Opt`. On the other hand, if we make the typeclass invariant, then it will just infer the more specific type. (But maybe I'm missing something obvious)
One additional concern I have here is that `this` isn't a correct lookup. In this case, it's an owner class/package reference. To get the "monad" lookup, we would have to lift the first `GenFrom` expression and use it as a lookup afterwards, like so:
val a$lifted = Some(1)
a$lifted.flatMap { a =>
  val b = a
  if (b > 1) a$lifted.singleton(a + b)
  else a$lifted.empty
}
If done correctly, the lifted `val` should always be immediately before the reference, but might add a `lazy` to be safe.
To avoid the by-name parameter: as long as the value is bound to a variable, you could have something like `Empty.of[b.type]`
Related about for-comprehensions but with focus on performance of for-do: https://august.nagro.us/scala-for-loop.html A quote from the above blog post by @AugustNagro : If we revisit de-sugaring of for-comprehensions perhaps we should also include possible performance optimizations to for-do in the same go? |
@bjornregnell If we can unfreeze the standard library, performance for |
I believe that is still not quite correct:
val a$lifted = Some(1)
a$lifted.flatMap { a =>
  val b = a
  if (b > 1) a$lifted.singleton(a + b)
  else a$lifted.empty
}
This only works if the result of the function passed to the flatMap is the same as the carrier type of the flatMap. But in general it need not be.
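One way to read this objection, shown with the standard library: a `Map` is a carrier whose `flatMap` accepts a function returning a `List[Int]`, and the result is a plain `Iterable[Int]`. A `singleton` taken from the lifted `Map` could not hold such an element. The name `keysAboveOne` is made up for this illustration.

```scala
// The carrier is a Map, but the function passed to flatMap returns a List[Int],
// so the overall result is an Iterable[Int] rather than another Map.
def keysAboveOne(carrier: Map[Int, String]): Iterable[Int] =
  carrier.flatMap { case (k, _) =>
    if (k > 1) List(k) else Nil
  }
```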
Aha! If it can be solved in the std library rather than by the compiler this is yet another cool evidence about the power of the language :) (Again so glad that I'm doing Scala). |
I think changes (1) and (2) are definitely worthwhile. (3) (eliding map) is tricky since it changes behavior if the I wonder whether we should wait with (3) until we have a better grip on effects. There are a number of places where this would be beneficial. But it's clearly a long-term goal only. Or, try a cooperative solution: If the final expression is Or, maybe we can declare that side-effecting |
I wonder if this behavior would make more sense in |
About the general problem of translating The criteria being:
The new scheme does not need
Example
The following for expression mixes generators, val definitions, and conditionals:
for
  x <- List(1, 2, 3)
  y = x + x
  if x >= 2
  i <- List.range(0, y)
  z = i * i
  if z % 2 == 0
yield
  i * x
The expression would be translated to the following code:
val xs = List(1, 2, 3)
xs.flatMapDefined: x =>
  val y = x + x
  xs.applyFilter(x >= 2):
    val is = List.range(0, y)
    is.mapDefined: i =>
      val z = i * i
      is.applyFilter(z % 2 == 0):
        i * x
I verified that the two expressions give the same results if the three methods are defined like this:
extension [A](as: List[A])
  def applyFilter[B](p: => Boolean)(b: => B): Option[B] =
    if p then Some(b) else None
  def flatMapDefined[B](f: A => Option[IterableOnce[B]]): List[B] =
    as.flatMap: x =>
      f(x).getOrElse(Nil)
  def mapDefined[B](f: A => Option[B]): List[B] =
    as.flatMap(f)
Here we use an
object UNDEFINED
extension [A](as: Vector[A])
  def applyFilter[B](p: => Boolean)(b: => B): B | UNDEFINED.type =
    if p then b else UNDEFINED
  def flatMapDefined[B](f: A => IterableOnce[B] | UNDEFINED.type): Vector[B] =
    as.flatMap: x =>
      f(x) match
        case UNDEFINED => Nil
        case y: IterableOnce[B] => y
  def mapDefined[B](f: A => B | UNDEFINED.type): Vector[B] =
    as.flatMap: x =>
      f(x) match
        case UNDEFINED => Nil
        case y: B => y :: Nil
Or, as a third possibility,
In the Scala collections library, we build most collections using buffers. In this case
Discussion
We can pick old or new scheme depending on the availability of
The upside of this scheme is that we get nice untupled vals even in the presence of conditionals. The downside is that instead of
About efficiency: it depends on the implementation of the collection methods. If there are no vals it will be hard to beat a good implementation of
Formal translation rules for the new scheme.
The syntax of for expressions is described as follows:
where
The translation scheme is defined with semantic brackets
The translation is defined as follows:
|
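The `Option`-based `List` variant from the example above can be checked with a small self-contained program. This is only a sketch: the wrapper names `viaForComprehension` and `viaNewScheme` are assumptions added here, braces are used in place of the colon lambda syntax so that it compiles on older Scala 3 releases, and an explicit type argument is added to one `flatMap` call to keep inference simple.

```scala
// The three methods of the scheme, using Option to signal "filtered out".
extension [A](as: List[A])
  def applyFilter[B](p: => Boolean)(b: => B): Option[B] =
    if p then Some(b) else None
  def flatMapDefined[B](f: A => Option[IterableOnce[B]]): List[B] =
    as.flatMap[B](x => f(x).getOrElse(Nil))
  def mapDefined[B](f: A => Option[B]): List[B] =
    as.flatMap(f)

// The original for expression.
def viaForComprehension: List[Int] =
  for
    x <- List(1, 2, 3)
    y = x + x
    if x >= 2
    i <- List.range(0, y)
    z = i * i
    if z % 2 == 0
  yield i * x

// The same computation written in the translated form.
def viaNewScheme: List[Int] =
  val xs = List(1, 2, 3)
  xs.flatMapDefined { x =>
    val y = x + x
    xs.applyFilter(x >= 2) {
      val is = List.range(0, y)
      is.mapDefined { i =>
        val z = i * i
        is.applyFilter(z % 2 == 0) {
          i * x
        }
      }
    }
  }
```

In the translated form `y` and `z` are ordinary local vals inside the closures, so no tuples are built to carry them past the conditions.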
I like the As far as I can tell, it basically is the "minimal" encoding necessary to represent the for comprehension. The various Some/None/Empty/Singleton/etc. things we discussed earlier are collapsed relatively neatly into those three operations. And as a result of being the minimal encoding of the for comprehension, it seems to be the most direct as well: the for comprehension becomes this set of nested function calls. No tuple packing/unpacking weirdness like the status quo, no ad-hoc if-else-empty conditionals like some of the alternate proposals. I don't think it's any more complex than what we have now. Rather, it takes a bunch of complexity in the desugaring today and moves it into the implementation of three core methods, allowing the desugared code to be far simpler and allowing freedom to implement those three core methods in a wider variety of ways |
x <- combineM(a, b)
yield x
Elsewhere, we discussed how making the final `map` optional would be a breaking change. But what if we went the haskell route, and only removed the final `map` if the final `yield` is also removed?
val a = largeExpr(b)
for
  b <- doSth(a)
  combineM(a, b)
- This is currently invalid syntax, so no backwards compat concerns
- We get to remove the useless `map` from the generated code
- We also get to remove the useless `yield` from the source code!
- The correspondence between the `map` and `yield` is direct: `yield` means `map`, no-`yield` means no-`map`. No "magic" map-removal depending on the shape of the final `yield` expression
- The final source code is shorter and more concise
- The final source code looks more like other languages with similar syntax, e.g. Haskell's `do` or F#'s `return!` in computation expressions
Seems like a win-win-win-win-win-win overall
It's an interesting possibility. Parsing will be a problem since we don't know anymore whether the next token sequence is a pattern or an expression. I am sure it can be overcome, but don't know how much complexity it would add to the parser.
I did consider this change, but I didn't manage to get it to work in the parser. (I tried going in the reflect-like direction to get better syntax for this case)
But if we were to be able to distinguish a pattern from an expression, then yet another problem is automatically solved -- we can get rid of the unnecessary `_ <-` with monadic 'unit' expressions.
One workaround from a pre-SIP is to have a `yield <- monadicExpr` syntax.
`yield <-` looks super ugly syntactically IMO, but semantically it's exactly the same as `yield from` in python and `return!` in F#. Maybe we can tweak the syntax to make it prettier, but the idea of using a keyword to work around parser ambiguity seems reasonable if a 'direct' syntax proves difficult to implement

> we can get rid of the unnecessary _ <- with monadic 'unit' expressions.

One issue here is we have two things we may want to simplify
_ <- foo
_ = foo
For non-trailing raw expressions in a `for` block, we can only assign one meaning, and the other meaning will still have to be explicitly spelled out as above. Maybe that's ok, but it's something worth considering
Would it be controversial to reuse `return`, as in
for x <- List(1,2,3) return List(x, x)
I would be against reviving `return` for several reasons, including: 1) yet another syntax to learn besides `do` and `yield`, but worse: 2) my experience is that many learners have all sorts of connotations on the `return` keyword with all sorts of misconceptions about how it works in Scala so I think it is best to say "don't use return" and for some it takes a while not to end everything with return if they are used to that...
About the translation of for expressions with ifs. I omitted before patterns in generators and in particular
For symmetry could also add case binders
|
About final |
I'm not sure side effecting There are also the various Future/Streaming libraries where the final |
for
  x <- xs
  if p(x)
yield
  x
But in that case we would not drop the
I think that the desugaring proposed by @odersky will result in a cleaner scheme than my suggested improvements. The only downside is that it is more disruptive, since it requires implementing new functions. So what should be the steps now? |
I am open to merging the proposals into one, but we could also have two different ones. As far as I can see, the original SIP scheme is a subset of my proposed extensions, so it could go in separately. On the other hand, maybe it's more economical to discuss everything together. I believe even the extended SIP can in principle be adopted without waiting for stdlib to be unfrozen. It's just that its improvements would be more useful if they could be used with stdlib collections. We can discuss what to do at the next SIP meeting. |
I saw that the 3.4.0 release notes contain:
That is a breaking change similar to the ones discussed in this proposal, as we have no guarantees that I suppose that sets the precedent that it's OK to do similar potentially-breaking changes going forward? That category would include the final |
So the first generator would be typed to figure out which desugaring to apply? |
The consensus in the SIP committee was that we would prefer the more aggressive change, if it is possible in the short-medium term, rather than making an incremental change now and then possibly making another change a short time later. But that is all contingent on whether or not the more aggressive change, i.e. the desugaring proposed by @odersky, is viable.
@odersky this point is worth discussing. AFAIK, one of the goals of many of these desugarings is to make them purely syntactic. The alternate proposal you made would require typechecking the first generator in order to decide how to perform the desugaring.
|
@lihaoyi Yes, I think something like that will be needed. Using inline extension methods for that is an interesting idea! |
I believe we had a similar situation in Scala 2.8 where |
Yes indeed. At the time we did not have inline extension methods, so we had to base it on membership. With inline extension methods, it's still a bit tricky. Here's a possible scheme:
With extension methods we can retro-fit the new scheme to existing libraries such as stdlib. |
@kyouko-taiga I also updated the stage to implementation to make it a valid state |
FYI I updated the implementation of the original (first) schema with some more tests and rebased it onto What are the next steps here? Is there an official summary of the SIP committee decision? |
@KacperFKorban apologies, I forgot to update this. This proposal was accepted for implementation and experimentation during the April SIP committee meeting, where it will be available behind a flag for a period before a second vote to confirm it |
@lihaoyi no worries. |
Implementation for SIP-62.

### Summary of the changes

For more details read the committed markdown file here: scala/improvement-proposals#79

This introduces improvements to `for` comprehensions in Scala to improve usability and simplify desugaring. The changes are hidden behind a language import `scala.language.experimental.betterFors`. The main changes are:

1. **Starting `for` comprehensions with aliases**:
   - **Current Syntax**:
     ```scala
     val a = 1
     for {
       b <- Some(2)
       c <- doSth(a)
     } yield b + c
     ```
   - **New Syntax**:
     ```scala
     for {
       a = 1
       b <- Some(2)
       c <- doSth(a)
     } yield b + c
     ```
2. **Simpler Desugaring for Pure Aliases**:
   - **Current Desugaring**:
     ```scala
     for {
       a <- doSth(arg)
       b = a
     } yield a + b
     ```
     Desugars to:
     ```scala
     doSth(arg).map { a =>
       val b = a
       (a, b)
     }.map { case (a, b) =>
       a + b
     }
     ```
   - **New Desugaring**: (where possible)
     ```scala
     doSth(arg).map { a =>
       val b = a
       a + b
     }
     ```
3. **Avoiding Redundant `map` Calls**:
   - **Current Desugaring**:
     ```scala
     for {
       a <- List(1, 2, 3)
     } yield a
     ```
     Desugars to:
     ```scala
     List(1, 2, 3).map(a => a)
     ```
   - **New Desugaring**:
     ```scala
     List(1, 2, 3)
     ```
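The effect of changes 2 and 3 can be explored without the experimental import by writing the desugared shapes out by hand. The following is only a sketch under stated assumptions: `doSth` is a hypothetical stand-in for the function in the summary, and `Logged` is a made-up container whose `map` has an observable side effect, used to show why eliding the final `map` can change behavior for such types.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for the `doSth` used in the summary.
def doSth(arg: Int): Option[Int] = Some(arg * 2)

// Change 2: the tuple-based shape and the simpler shape compute the same value.
def viaTuples(arg: Int): Option[Int] =
  doSth(arg).map { a =>
    val b = a
    (a, b)
  }.map { case (a, b) =>
    a + b
  }

def viaSimplerDesugaring(arg: Int): Option[Int] =
  doSth(arg).map { a =>
    val b = a
    a + b
  }

// Change 3: a made-up container whose `map` records every call.
final class Logged[A](val value: A, val log: ListBuffer[String]):
  def map[B](f: A => B): Logged[B] =
    log += "map called"
    Logged(f(value), log)

// Returns the final value and the number of recorded `map` calls.
def mapCalls(elideMap: Boolean): (Int, Int) =
  val log = ListBuffer.empty[String]
  val start = Logged(1, log)
  // With the map kept, the shape is `start.map(a => a)`; with the map elided it is just `start`.
  val result = if elideMap then start else start.map(a => a)
  (result.value, log.size)
```

The value is the same in both cases of `mapCalls`, and only the recorded side effect differs, which is the kind of user code the compatibility discussion in this thread is concerned with.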
Hi, I have another proposition for improving for-comprehensions. I want to fix the problem that arises from the use of the following code:
//> using scala 3.3.3
//> using lib "dev.zio::zio:2.1.5"
import zio.*
def loop: Task[Unit] =
  for
    _ <- Console.print("loop")
    _ <- loop
  yield ()
@main
def run =
  val runtime = Runtime.default
  Unsafe.unsafe { implicit unsafe =>
    runtime.unsafe.run(loop).getOrThrowFiberFailure()
  }
This kind of effect loop is pretty commonly used in Scala FP programs and often ends in
The problem with the desugaring of this for-comprehensions is that it leaks memory because the result of
A possible solution that I want to suggest and possibly add to the
A possible approach could be to add sticky keys to the
A naive PoC: KacperFKorban/dotty@31cbd47 |
Interesting idea, thanks! Since the design of this SIP has been accepted, additional changes should either be part of a revision or proposed as a separate SIP (possibly after a Pre-SIP discussion). Given the current status of this SIP, I think that the latter option would be the most appropriate. |
@kyouko-taiga Sure, in that case, I'll create a pre-SIP and try to gather some feedback. |
SIP-62 - For comprehension improvements
Author: Kacper Korban (VirtusLab)
Summary of the proposed changes
For more details read the committed markdown file.
This proposal introduces improvements to `for` comprehensions in Scala to improve usability and simplify desugaring. The main changes are:
1. Starting `for` comprehensions with aliases
2. Simpler Desugaring for Pure Aliases
3. Avoiding Redundant `map` Calls