Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Identifier shadowing in for-comprehensions is only supported in flatMaps #4525

Closed
propensive opened this issue May 13, 2018 · 19 comments
Closed
Labels
area:desugar Desugaring happens after parsing but before typing, see desugar.scala help wanted itype:enhancement stat:needs spec

Comments

@propensive
Copy link
Contributor

In dotc 0.7.0, this compiles,

for {
  x <- List(1, 2, 3)
  x <- List(x + 1)
} yield x

but this does not,

for {
  x <- List(1, 2, 3)
  x = x + 1
} yield x

It gives the errors in the REPL,

<console>:14: error: recursive value x needs type
         x = x + 1
               ^
<console>:14: error: x is already defined as value x
         x = x + 1
         ^

It's probably an accident of implementation, but it's very useful for getting safe mutable-like syntax in long chained for-comprehensions, without needing to resort to introducing many different identifier names, provided every generator in the for-comprehension is a flatMap, not a map.

This is a quirk of Scala 2, and it would be nice to remove it in Scala 3.

@smarter smarter added area:desugar Desugaring happens after parsing but before typing, see desugar.scala itype:enhancement help wanted labels May 13, 2018
@Blaisorblade
Copy link
Contributor

I understand your reasoning, and (some?) MLs support this well, but the error suggest that x = x + 1 is being interpreted as a recursive definition, like outside of for comprehensions; the MLs that support this require you to request explicitly recursive definitions.

If we could (and we probably can't), I suspect forbidding x <- List(x + 1) would be more consistent for readers on the one hand (while on the other it's inconsistent with desugared comprehensions). But since nobody felt the need to forbid this pattern in existing code via WartRemover or Scalafix, it might not be worth the bother.

OTOH, if I test 0.8.0-RC1 (which you likely should), the errors got worse:

scala> for {x <- List(1, 2, 3); x: Int = x + 1 } yield x
1 |for {x <- List(1, 2, 3); x: Int = x + 1 } yield x
  |                         ^^^^^^
  |                         duplicate pattern variable: `x`
scala> for {x <- List(1, 2, 3); x = x + 1 } yield x
1 |for {x <- List(1, 2, 3); x = x + 1 } yield x
  |                             ^
  |                 overloaded or recursive method $anonfun needs return type

@joroKr21
Copy link
Member

But is the recursive definition of a local val even possible? I think not unless it's lazy.

@Blaisorblade
Copy link
Contributor

Blaisorblade commented May 13, 2018

@joroKr21 val x: Int = x + 1 is clearly interpreted as a recursive definition (though not a useful one). You need some suspension* to make this work, but you don't strictly need a lazy val:

scala> val f: List[Int] => List[Int] = { case Nil => Nil; case x :: xs => x + 1 :: f(xs) }
val f: List[Int] => List[Int] = Lambda$1715/318787032@4fc6e776
scala> f(List(1, 2, 3))
val res2: List[Int] = List(2, 3, 4)

and we can't very well change the scoping of val depending on the type being defined.

For more compelling examples of this, I'd maybe look at function types hidden behind aliases, e.g. in parser combinators, where IIRC there's better reasons to not write f as a method.

EDIT: had posted before being done.
EDIT: *you need to be defining productive codata, either functions or objects can do.

@joroKr21
Copy link
Member

joroKr21 commented May 13, 2018

Emphasis on local:

scala> val cd: Int => String = t => if (t <= 0) "lift-off" else cd(t - 1) 
val cd: Int => String = Lambda$1492/2086673744@79b2852b

scala> def foo = { val cd: Int => String = t => if (t <= 0) "lift-off" else cd(t - 1); cd } 
1 |def foo = { val cd: Int => String = t => if (t <= 0) "lift-off" else cd(t - 1); cd }
  |                                                                     ^^
  |         `cd` is a forward reference extending over the definition of `cd`

But I agree it might be too inconsistent for a val being local to suddenly change scoping rules.

Edit: on the other hand, the scoping rules in for-comprehensions are also inconsistent unless you know the desugaring 😃

@Blaisorblade
Copy link
Contributor

Oh, I stand corrected. And it's clearer why object fields need be allowed to refer to object fields than why that would work for locals. Still, not sure how to keep things consistent overall — or how to not make them less consistent.

Edit: on the other hand, the scoping rules in for-comprehensions are also inconsistent unless you know the desugaring 😃

It seems Scala's answer is "know the desugaring" right now, tho if somebody wants to push to get x <- List(x + 1) forbidden (at least in linters) things might be more consistent. How many people find "x loops over x" a good idea? What @propensive proposes (and doesn't work) seems much better, in a world where that's a known convention.

I'd also first try to resolve #2573 — some of the options there would actually desugar x = x + 1 into val x = x + 1, in which case it'd be even more natural to close this ticket.

But overall, I don't have a strong preference either way.

@LPTK
Copy link
Contributor

LPTK commented May 17, 2018

I just wanted to add that Scala lacking shadowing of local variable is a huge usability problem IMHO. In functional languages like OCaml, where values are usually immutable, you can do:

let name = getName() in
let name = name |> capitalize in
print_string name

In Scala, you have to come up with silly names that pollute the namespace:

val name = getName()
val nameCap = name.capitalize
println(nameCap)

...and in my experience, this is actually a huge source of mistakes. It is all too easy to write println(name), forgetting about the more recent version of the value, nameCap. This is especially terrible if you introduce the intermediate transformation in the middle of existing code, and you have to find and change all usages of the old version in the rest of the function. Shadowing actually helps users by preventing later code from seeing an obsolete binding.

Now, it's probably too late to change the scoping rules of Scala, but maybe something can be done to alleviate this usability hazard in a backward-compatible way.
How about a syntax like this?

val name = getName()
new val name = name.capitalize
println(name)

Then we can then say that for-comprehension bindings have the semantics of new val shadowing bindings instead of the semantics of simple val bindings (and we can make the desugaring itself actually use new val).

@Sciss
Copy link
Contributor

Sciss commented May 18, 2018

I also don't like name0, name1, etc.; but we have var name for that.

@LPTK
Copy link
Contributor

LPTK commented May 18, 2018

we have var name for that

It's true that this can help, but it does not work in general, as the type of a transformed value can change after a transformation (especially, it can be refined). For example, we may have a newtype CapitalizedString <: String to encode more knowledge in the type system – in that case name.capitalize cannot be assigned to a var name: String. It's also a shame to resort to mutability because of quite trivial limitations in the language design.

@Sciss
Copy link
Contributor

Sciss commented May 18, 2018

But if the type changes, probably you shouldn't be using the same variable name...

@LPTK
Copy link
Contributor

LPTK commented May 18, 2018

@Sciss the argument is that inventing new names is hard, so at some point you're going to use dummy names like name0, name1 etc. which is a big potential for mistakes (it's easy to leave some reference to name2 instead of name3 somewhere). Moreover, in the motivating example the new variable still conceptually represents the name, even though the type is refined, so there is arguably no reason for picking a different name and polluting the local namespace.

Anyway, at this point I guess it depends on individual preferences. Though I think it's valuable to remember that ML programmers have been doing this for a long time without problems (AFAIK).

@acjay
Copy link

acjay commented Jun 12, 2018

@LPTK I've thought about this more, and I'm very in favor. It is somewhat remniscent of linear types, in terms of ability to model tiny pipelines without the risk of making mistakes of using the wrong intermediate value. I think people would quickly realize how handy this could be for streamlining certain idioms. Two follow up ideas:

  • Even better is if if guards were allowed -- but only on redeclarations. The intermediate name problem only gets worse when those names have to capture "maybe" concepts.
  • Part of me really likes new val. That would make the intent to shadow explicit. This could even be optionally enabled with a compiler flag to generate a warning if new isn't used.

@propensive
Copy link
Contributor Author

@LPTK I only just saw your proposal, and I have no idea if it would be a net benefit or not, but full marks for audacity!

@LPTK
Copy link
Contributor

LPTK commented Jun 13, 2018

@acjay interesting, so you would have guards such as this?

new val name = name.capitalize if name != "BOB" // 'name' in guard refers to new value?

I'm not sure I see the value with that though, as it's easy to write with an if, or even with some helper functions like optionIf as in:

new val name = name.capitalize optionIf (_ != "BOB") getOrElse name

@acjay
Copy link

acjay commented Jun 13, 2018

@LPTK, yeah, you could use a full if-else, but I think it would enable a more direct modeling of business logic when you want fall-through behavior. These guards are a common idiom in Ruby, except there, there are nasty gotchas with mutability and nullability. Restricting usage of guards to new val fixes this.

For me this feature is particularly useful when folding over a data structure, propagating a bookkeeping case class asking along the way, using copy to update that memo through the process. It's like the immutable version of the pattern of a for loop that updates some data structure in-place. new val prevents the easy mistake of accidentally modifying the wrong intermediate version of a variable.

@decodejacques
Copy link

There's actually a case where you can't use var even when all the values are of the same type.

// What I want to do
val (state, x) = modifyState(state)
val (state, y) = modifyState(state)

// What I have to do in Scala
val (s1, x) = modifyState(s0)
val (s2, y) = modifyState(s1)
...
In other words, if a function returns a tuple, and you want to use one component of that tuple as a value for a variable that will shadow an existing variable... you're stuck

@OlegYch
Copy link
Contributor

OlegYch commented Dec 26, 2018

this has long seemed like a bit of inadequacy to me
if we can shadow in one context why can't we in other very much related contexts (both map in for comprehension and val) ?
i'm all for permitting it in both cases
and no, var doesn't work for it because it allows more than shadowing

@propensive
Copy link
Contributor Author

I think the shadowing is acceptable within the for-comprehension only because its scope is quite limited. Shadowing of vals has much more potential to be accidental, so I would prefer this not to be possible in general.

Regarding the possibility of changing the type during shadowing, this would be very much desirable. As a simple example of how this could be useful, we could be constructing an HList element-by-element, so its type would be growing each time it's shadowed.

@acjay
Copy link

acjay commented Dec 27, 2018

@propensive what if you were only allowed to shadow with a value that shares a non-trivial super type? The specific type of the new value would be retained though. In any case, it seems like warning settings could give people some control over the strictness they'd like. Or really whether they'd want to allow val shadowing at all.

@propensive
Copy link
Contributor Author

propensive commented Dec 28, 2018

@acjay I think it's worthy of consideration, though I don't like that it would require the introduction of another concept ("nontrivial supertype") even though I have a pretty good notion of what that would be. That same concept, if it existed, could be useful and even desirable elsewhere (e.g. if a LUB were inferred to be a nontrivial supertype of the parameters to a List constructor), but it would be a much grander language change than my original proposal, and I'm not sure we would be able to anticipate all the consequences...

Anyway, a much simpler argument is that I don't think it would be useful. If we are trying to avoid accidental shadowing of unrelated values, the typechecker would discover all the problematic cases anyway, because "nontrivial supertypes" are exactly the ones without useful interfaces.

There might also be performance considerations. I don't believe the compiler currently is aware when shadowing happens, other than to put the symbols into a data structure which makes the shadowed identifiers inaccessible. It might be expensive (even 2% overhead would undo a lot of good work that's already gone into improving compiler performance) to do these checks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:desugar Desugaring happens after parsing but before typing, see desugar.scala help wanted itype:enhancement stat:needs spec
Projects
None yet
Development

No branches or pull requests

9 participants