Skip to content

[SUGGESTION] Allow variable shadowing in function scopes #285

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
zaucy opened this issue Mar 21, 2023 · 18 comments
Closed

[SUGGESTION] Allow variable shadowing in function scopes #285

zaucy opened this issue Mar 21, 2023 · 18 comments

Comments

@zaucy
Copy link

zaucy commented Mar 21, 2023

Allow for variables to be shadowed in C++2. In C++1 variable shadowing is already sort of allowed with nested blocks. For example this is allowed:

int v = 10;
{
    std::printf("v=%i\n", i); // v=10
    auto v = "hello";
    std::printf("v=%s\n", v); // v=hello
}

Instead I would like the following to be allowed in C++2:

v := 10
std::printf("v=%i\n", v); // v=10
v := "hello"
std::printf("v=%s\n", v); // v=hello

Adding variable shadowing can enable developers to selectively disallow access to a variable past a certain point preventing mistakes and simplifying code.

g: (v: std::optional<MyType>) -> void {
  v := v.value_or(MyType{...});
  // 1) We no longer need the optional anymore since we have the value!
  // 2) No need for an extra variable name
  // 3) Prevent calling `.value_or()` multiple times
}

Other C++ standard library types have "unwrapping" functionality such as std::reference_wrapper, std::expected, std::future, std::unique_ptr, std::shared_ptr and more.

Variable shadowing could also be useful for some form of inspect where the inspected value gets moved:

x := inspect x -> std::string {
  is int = std::to_string(x);
  is _   = "not an int";
};

Unfortunately the above snippets where the variable is initialized with a variable of the same name would require some hackery like the following to be valid C++:

auto g(std::optional<MyType> v) -> void {
  {
    auto v__init = [&, v = std::move(v)]() mutable { return v.value_or(MyType{...}); }
    auto v = v__init();
  }
}

Regardless I believe variable shadowing in function scopes should be seriously considered.

Some related discussions in this repository

Const by default for local variables

https://github.com/hsutter/cppfront/wiki/Design-note:-const-objects-by-default & #25 (#25 (comment))

But what about local variables? Get the arrows ready: I'm skeptical of the value of const by default for locals... maybe I don't get out enough, but the majority of local variables are, well, variables.

Having local variable shadowing can improve the "const by default" approach for local variables since you can just declare a new variable with the same name.

i := 10
i = i * i // illegal, const by default
i := i * i // OK if shadowing is allowed

Move on last use

Move on last use could be extended to include a shadowed variable.

Require this. qualification

#274 #274 (comment)

Sometimes shadowing is bad. The implicit this context can hide member functions and variables. This hidding isn't adding additional functionality and is instead making code more difficult to navigate. I think allowing local variable shadowing while disallowing class member shadowing is the best of both worlds. In my opinion instead of making rules to prevent shadowing class members, this. qualification should be required.

Resources

Will your feature suggestion eliminate X% of security vulnerabilities of a given kind in current C++ code?

Potentially in some cases where an incorrect variable name is used. Take this example:

handle: (input: std::string) -> void = {
	sanitized_input := sanitize(input);
	record_input(sanitized_input);

	scary_work(input);
}

The wrong variable was passed to scary_work.

handle: (input: std::string) -> void = {
	input := sanitize(input);
	record_input(input);

	// old 'input' no longer accessible, cannot be used by accident
	// there is only one 'input' here, and it's the sanitized one
	scary_work(input);
}

Will your feature suggestion automate or eliminate X% of current C++ guidance literature?

For the C++ Core Guidelines variable shadowing would probably be added as part of the ES.12 ban, however I think that's a mistake. Variable shadowing can open opportunities to control accessibility and lifetime in a scope. Rust is an example of a language that embraces variable shadowing. It is even encouraged it in the official "book" - https://doc.rust-lang.org/1.30.0/book/2018-edition/ch02-00-guessing-game-tutorial.html / https://doc.rust-lang.org/1.30.0/book/2018-edition/ch03-01-variables-and-mutability.html#shadowing

An alternative perspective - if shadowing is encouraged or considered a good thing it wouldn't need to be written about.

Describe alternatives you've considered.

The shadowing draft by Markus Grech mentions adding a __shadow keyword however I believe that is overkill for C++2.

@jcanizales
Copy link

I mean, you had to know this was going to be very controversial 🙂 I think you should address the obvious counterpoint that disallowing this reuse of names within a scope lets the compiler catch bugs from typos.

@zaucy
Copy link
Author

zaucy commented Mar 21, 2023

I mean, you had to know this was going to be very controversial 🙂

Haha yea I did, but I felt someone had to suggest it 😅

I think you should address the obvious counterpoint that disallowing this reuse of names within a scope lets the compiler catch bugs from typos.

I think the most obvious one is hiding class members. I did mention my preference for the this. qualification if variable shadowing were allowed. Otherwise I think the counterpoint is less obvious. Shadowing a variable in a local scope from what I can tell has very little consequence. Is there maybe a more obvious consequence that I'm missing?

@vladimir-kraus
Copy link

Shadowing variables is incredibly dangerous and can lead to easy to make and hard to find errors (actually happened to me several times). Especially when the behaviour of a program can totally change with just one extra colon. If cpp2 aims at being safer and less error prone than cpp1, then I think disallowing shadowing of variables should be absolutely imperative.

@AbhinavK00
Copy link

Variable shadowing could be a trouble-some feature at times, cpp2 is probably better-off without it. Explicit shadowing can be something I think could be considered (like suggested with __shadow).
I would probably go the other way and suggest banning of shadowing of variables inside nested scopes (to enforce ES.12) or atleast making it explicit.

Require this. qualification
Const by default for local variables

Please make these happen Herb

@JohelEGP
Copy link
Contributor

The first handle: example is solved with strong types. scary_work should take a sanitized_input, rather than a std::string.

@zaucy
Copy link
Author

zaucy commented Mar 21, 2023

The first handle: example is solved with strong types. scary_work should take a sanitized_input, rather than a std::string.

Very true that does help! Even with strong types I would still prefer the variable shadowing so that some other piece of code doesn't use the string when it shouldn't. In a perfect world that wouldn't happen.

Shadowing variables is incredibly dangerous and can lead to easy to make and hard to find errors (actually happened to me several times). Especially when the behaviour of a program can totally change with just one extra colon. If cpp2 aims at being safer and less error prone than cpp1, then I think disallowing shadowing of variables should be absolutely imperative.

I think this is where some might disagree. I haven't found variable shadowing to be a cause for any bugs for me personally or any of my co-workers, but that's only anecdotal. I would prefer if you had the ability to disallow variable shadowing in your projects via linter while also allowing me to write code like this:

v: std::optional<int> = /* ... */;
v := v.value_or(0);
// I'm not using std::optional anymore and it's hidden! Horray!

Instead of this:

opt_v: std::optional<int> = /* ... */;
v := v.value_or(0);
// I'm no longer using opt_v - I'd rather it not be available just incase
// myself or another developer uses it accidentally in this function.

Often the variable shadowing I'd like involves the new variable being a different type anyways. Making any bugs involving shadowing even fewer IMO.

@vladimir-kraus
Copy link

vladimir-kraus commented Mar 21, 2023

I admit that

v: std::optional<int> = /* ... */;
v := v.value_or(0);

is quite an elegant code. But that is all to it. It is gravely dangerous.

Allowing shadowed variables, shadowed not only in nested scope but also in the same scope, will turn your code to a dangerous minefield. Every time you introduce a variable (or copy paste code from somewhere) you will have to look up and down the scope whether the variable (or variables if you copy paste a block) is not already used because you are effectively redefining them. If you are lucky, it will not compile because of incompatible types. If you are unlucky, you will happily compile and then it blow your leg off and your company will go bankrupt.

In other words, shadowing helps lazy programmers because they can simply re-use variable names over and over. But it creates a massive danger whenever you introduce any variable to your code. Without checking whether this variable already exists in the scope, you can never be sure if you haven't broken your code. And believe me, such bugs are so hard to find.

@gregmarr
Copy link
Contributor

You can also do this, which results in the optional not sticking around:

v := std::optional<int>(/* ... */).value_or(0);
// I'm not using std::optional anymore and it's hidden! Horray!

@zaucy
Copy link
Author

zaucy commented Mar 21, 2023

Yes when something is temporary it's fine. Variable shadowing allows the same for non-temporaries:

g: (v: std::optional<int>) -> void {
  v := v.value_or(0);
}

@jcanizales
Copy link

Angel in my shoulder is telling me not to argue specific examples. Demon in my shoulder won though, and I have to point out that a function that takes an optional<int> only to unwrap the int with a default and really really doesn't want its body to touch the optional argument, instead should just input an int. A for range loop might be a better example.

Coming back to a more productive thread: Given that it's controversial whether allowing shadowing prevents bugs, whether disallowing it prevents bugs, and which of those bugs are worse, the rules of the game are to go fish for actual numbers to support the proposal. (I don't expect they're easy to find in one direction or the other, though).

@vladimir-kraus
Copy link

vladimir-kraus commented Mar 21, 2023

Consider this code:

i := 1;
// ... lots of code here
do_something_with(i);

Now you realize you want to do some calculation somewhere inside // ... lots of code here and you need some index variable for it, which you decide to conveniently name i, so the code becomes:

i := 1;
// ... lots of code here - part 1
i := 2;
// ... some code using the new i
// ... lots of code here - part 2
do_something_with(i); // oops... I have just called it with wrong i !!!

Have you spotted the error? Can you make 100 percent sure that this will never happen to you?
In other words, whenever you introduce a new variable, you will need to manually check whether the variable is not already defined in the same scope because otherwise you will break the existing code and compiler will not warn you. This is the minefield that I am talking about.

AFAIK, cpp2 aims to be safer than cpp1. Shadowing variables in the same scope will make it significantly less safe than cpp1.

@zaucy
Copy link
Author

zaucy commented Mar 21, 2023

Yes agreed C++2 AFICT aims to be safer and I want that! But as herb mentions in the README:

I want to keep writing code in C++... just "nicer"

To me variable shadowing is "nicer". I know that might be controversial and I'm not speaking for herb and his meaning of nicer, but to me it is nicer.

the rules of the game are to go fish for actual numbers to support the proposal

I totally agree and I struggled to compile a meaningful set of numbers. Most of what I found online was mixed. An outliar was the rust community seems to encourage variable shadowing.

[...] whenever you introduce a new variable, you will need to manually check whether the variable is not already defined in the same scope because otherwise you will break the existing code and compiler will not warn you. This is the minefield that I am talking about.

I don't think it's fair to call it a minefield. Even in your example if you had different variable names (not shadowing) I would expect a developer to check what variables they're throwing into a function. The difference between shadowing and not would be the direction you start looking (or just use your tooling such as an LSP to go to declaration). If you're shadowing you'd look from the callsite up. If you're not shadowing you'd you could start from the top of the function. The difference seems small to me.

Another example where shadowing can prevent a mistake is std::future or similar classes. For std::future calling std::future<>::get more than once throws an exception. If you use variable shadowing you can prevent accidental use.

f: std::future<int> = /* ... from some other source or function parameter ... */;
// ... lots of code
f := f.get();
// ... lots of code

f.get(); // Compile time error! `f` is not a future it's an int! (GOOD!)
         // If this wasn't shadowed then this would result in a runtime error (BAD!)

@SebastianTroy
Copy link

SebastianTroy commented Mar 21, 2023 via email

@switch-blade-stuff
Copy link

An optional value and a certain value have different use cases, they're intended for different purposes and should be named accordingly to help the user understand what the code is trying to do.

Agreed, an optional clearly serves a different purpose than the certain value, and in order to follow the descriptive naming best practice, the variable names should reflect that.

The only place I would consider shadowing as an acceptable practice is in constructor and function arguments, that might shadow member variables of the parent type (especially for static functions, for example).

@SebastianTroy
Copy link

SebastianTroy commented Mar 22, 2023 via email

@msadeqhe
Copy link

Variables can shadow outer scope variables, for example a local variable can shadow a global variable.

On the other hand, local variables in a function have the same scope of the function arguments, and they should not shadow the function arguments. That is similar to if(int x = 0; x < 3) { ... } in C++1, which doesn't allow to shadow variable x inside the scope of if statement.

@msadeqhe
Copy link

msadeqhe commented Mar 22, 2023

If the decision is to allow variable shadowing (function scopes, inside if scopes, multiple shadowing inside any scope), this proposal seems good enough, because being explicit about variable shadowing is better than being implicit:

Describe alternatives you've considered.

The shadowing draft by Markus Grech mentions adding a __shadow keyword however I believe that is overkill for C++2.

@hsutter
Copy link
Owner

hsutter commented Mar 22, 2023

Thanks for the suggestion and the detailed writeup!

I think this part is the key, and this core idea is repeated several times in the comment thread:

Adding variable shadowing can enable developers to selectively disallow access to a variable past a certain point preventing mistakes and simplifying code.

To me there are two key points here:

(1) Yes, I agree it would be interesting to selectively disallow access to a variable name past a certain point in the code. Just yesterday I had a similar case in cppfront where I wrote this code to add a scope just to prevent accidental use of the moved-from variable later on in the function. (Honest, I wrote that code and the comment before seeing this thread.)

(2) But adding shadowing is not the only way to do it -- making up a comparison, it feels like removing an unwanted rodent (variable) by replacing it with a new animal. If the goal is to get rid of the rodent, we won't always have a new animal to replace it with. Why not focus on ways to just remove the rodent?

Worse, shadowing is a known bug farm (sorry! please don't shoot the messenger). In fact I'm actively working to reduce shadowing:

  • Just 10 days ago I pushed a commit just to ban shadowing of type scope names.
  • When I implement inheritance I'm going to see if I can switch that to overloading semantics instead of hiding (shadowing) semantics, which also are a recurring source of confusion. For example, in today's C++ it's a well known problem that a derived function that's a worse match can accidentally hide (shadow) a base function with the same name that is a better or even perfect match... a quick googling finds this SO example which is already a duplicate of this other one.

I think shadowing is dangerous, so perhaps a better analogy is that disabling an existing name by encouraging shadowing (to easily create a new variable with the same name) feels like getting rid of a rat by arming everyone in town with hand grenades. While it's true that it solves the rat problem if the next person to see the rat gets it with a grenade, encouraging grenade-carrying creates many other new problems (including property damage to gardens that now have craters where rats used to be, hospital costs as we'll likely lose some hands even in cases where there were no rats, etc.). And since a grenade isn't the only way to kill the rat, why not focus just on getting rid of the rat?

So: Yes, (2) shadowing would solve (1) accessing a local variable name too long, but it's a huge hammer and not the only solution.

I might be interested in solving (1), since I just hit it again myself yesterday... but as my above solution shows, scopes are a reasonable answer for many such cases today, so I'm not sure it is a problem worth adding a language feature to solve (again, sorry!). If I pursued anything in Cpp2 to make this more convenient than declaring a scope to bound the name's lifetime, it might be along the lines of the "undeclare" in the post linked under Alternatives Considered. I'd be more likely to consider solving this if there was more data showing that (a) this is a frequently encountered problem (data measurements, CVE root cause reports) and (b) where adding an extra scope to contain the unwanted variable's lifetime isn't feasible.

@hsutter hsutter closed this as not planned Won't fix, can't repro, duplicate, stale Mar 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants