This issue was moved to a discussion.
[SUGGESTION] - Move(not copy) semantic for by default #417
Comments
As other people already said in these closed suggestions, move semantics is not welcomed (for no actual reason other than personal taste), but it isn't just a simple change of defaults. Move semantics is the key to safer C++. cpp2 is good, but if I were asked to choose between keeping all the problems of cpp1 with only move semantics fixed, or having all the nice features of cpp2 but without move semantics, there is simply no choice on the platter. Nobody will need cpp2 if we could fix move in cpp1. Rust and Val are too extreme for the C++ community, but refusing even to think hard and deeply about move semantics just because you got used to copying and never tried the alternative isn't really good reasoning.
Again, move semantics expressed as default move parameters is not just an (in)convenience compared to in, which helps decide automatically between copy and reference. This is a fundamental shift in how people design APIs and in what tools can help us.
Any change in the internal memory representation is in fact an ownership change operation. You've added an item to a vector -> the vector's memory representation may change, so as an API designer you have to demand full ownership to do so. On the other hand, just writing (out param) to a valid int item of a vector never causes any memory issues. And this is a very simple case where vector is a value type. This can help much more with classes that are non-value types, where all the business logic lives and where almost all methods change the memory representation somewhere in the object tree. You know, all these controllers/models/uncopyable stuff. I really want/need help from the compiler/tools to prevent me from doing wrong things.
Anyway, I can understand an answer like "we really tried hard to implement this and it causes <list_of_issues>". I can work with this list and might help with fixing the issues. But not answers like "nah, we are C++, we don't care what other smart people in Rust, Val, and other communities think about safety, we know better just because we know better / don't even want to try / really think this is a stupid idea". In other words, what exactly is the issue with having move as the default? Trivially copyable types? Zero issues, they continue to copy. Value types? That is what we want: an explicit copy of heavy value types. Non-value types? This is where move semantics makes the difference between safe and unsafe code. If I didn't need to deal with non-value types I wouldn't bother anybody by asking this; that is where the majority of my headaches are, and move helps with them.
If this suggestion makes people think about the move-semantics-by-default idea for more than 10 minutes, that is already a success for me. Just give it a try in your head; it is really more than you think it is. |
Apologies if I created a bad impression; I'm a newcomer here and just tried to find relevant reading while waiting for an answer from the author or contributors. As for the topic, I perceive move and ownership as different things. For example, in gamedev I want to use one object in several functions, like
Player p {};
checkInput(p);
draw(p);
In Rust, as I understand it, ownership will be "borrowed" but then returned, so I can use the object later. This is not the same as just blindly moving, because that would mean I can't pass the object to the second function without somehow getting it back, via return or out parameters, from the first function. |
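A minimal C++ sketch of the distinction drawn here, under stated assumptions: the `Player` fields, the function bodies, and the names `checkInputOwned` and `runFrame` are hypothetical and only illustrate the idea. Passing by reference leaves the caller's object usable across several calls, whereas handing the object over by value means the caller only keeps using it if the callee returns it.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical type; the fields are assumptions for illustration only.
struct Player {
    std::string name;
    int frames_drawn = 0;
    bool has_input = false;
};

// "Borrowing" in C++ terms: the callee works on the caller's object in place.
void checkInput(Player& p) { p.has_input = true; }
void draw(Player& p) { ++p.frames_drawn; }

// Ownership-transferring variant: the callee takes the object by value and
// has to hand it back explicitly for the caller to keep using it.
Player checkInputOwned(Player p) {
    p.has_input = true;
    return p;  // implicitly moved out to the caller
}

// Runs the same two steps both ways and returns the resulting player.
Player runFrame() {
    Player p{"hero"};
    checkInput(p);                      // p is still usable afterwards
    draw(p);
    p = checkInputOwned(std::move(p));  // moved in, then moved back
    assert(p.has_input);
    draw(p);
    return p;
}
```

With move as the only passing mode, every call would need the second shape; the reference-taking shape is roughly what a borrow expresses.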
cpp2 argument passing type is
v : std::vector<int> = (1,2,3);
fun(v); // no change to v
gun(v); // no change to v
print(v); // definite last use - it is safe to move
When the function is going to change the variable, it needs to be stated on the call side:
v : std::vector<int> = (1,2,3);
fun(move v); // will be moved (explicit)
gun(out v); // will be reinitialized (explicit)
print(v); // definite last use - will be moved (implicit)
(probably the above list of keywords will be extended: #231 (comment))
I think your suggestion to be able to distinguish methods that invalidate class invariants from methods that change the details of the class is good. We can have that by marking |
I'm having trouble understanding exactly what it is you want, or want to change, could you perhaps give a couple of small examples?
On 9 May 2023 04:11:06 Anton Dyachenko ***@***.***> wrote: [quoted copy of the opening post above]
|
I think I understand what you're saying, but I believe it hinges on a misunderstanding.
Ownership in C++ simply means who is responsible for the lifetime of the memory. If the function is not responsible for managing the lifetime of the memory then the item should be referenced not moved.
I'm no expert in rust, but I'm under the impression that the borrow checker prevents concurrent usage of a variable, and that ownership also implies that you have exclusive read/write access to the memory?
So if I'm right, and correct me if I'm not, your suggestion is that Cpp2 should also track memory access as well as memory ownership?
On 9 May 2023 10:01:27 realgdman ***@***.***> wrote:
Apologies if I created a bad impression; I'm a newcomer here and just tried to find relevant reading while waiting for an answer from the author or contributors.
As for the topic, I perceive move and ownership as different things. For example, in gamedev I want to use one object in several functions, like
Player p {};
checkInput(p);
draw(p);
In Rust, as I understand it, ownership will be "borrowed" but then returned, so I can use the object later. This is not the same as just blindly moving, because that would mean I can't pass the object to the second function without somehow getting it back, via return or out parameters, from the first function.
|
What I wanted to say is that I see move as a destructive operation. Compared to references, we are trading a potentially dangling reference (if the callee saved it) for a potential use-after-move, like in this (intentionally bad) example: https://cpp.godbolt.org/z/oKqbe1h35 Edit: see also this discussion: #246 |
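The trade-off can be sketched in standard C++ with `std::unique_ptr`, whose moved-from state is specified (it becomes null), so a use after move is at least detectable at run time. The `Texture` type and the functions `consume` and `emptyAfterMove` are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical resource type used only for illustration.
struct Texture { int id; };

// Takes ownership: the resource is destroyed when this function returns.
int consume(std::unique_ptr<Texture> t) { return t ? t->id : -1; }

// Returns true if the caller's handle is empty after being moved from.
bool emptyAfterMove() {
    auto tex = std::make_unique<Texture>(Texture{7});
    int id = consume(std::move(tex));
    assert(id == 7);
    // tex is now null. Dereferencing it would be a use-after-move bug,
    // but unlike a dangling reference it can be checked for.
    return tex == nullptr;
}
```

A saved reference to a destroyed object offers no such check, which is the asymmetry between the two failure modes.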
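Sorry for the long silence, anyway.
out is by nature different from move semantics. out indicates mutation within the current memory representation. Examples:
* an unbounded dynamic array (vector) that has a current allocation. An out operation can mutate any item but not the underlying collection (no adds, no removes). out is always memory safe (assuming the item's index is valid).
* an unbounded dynamic array is actually implemented via a series (in time) of bounded dynamic arrays. So whenever a new item is added and the current capacity is not enough, there is a reallocation: a new bounded array into which all the items are moved from the old array. To do this, the function signature has to claim "full/exclusive" access, that is, move.
* an object tree is slightly easier: it is safe to add a new node to the tree, but any remove/replace is the same matter and requires move semantics.
I am not fully used to cpp2 syntax yet, but I will try to express this in cpp2:
vector : type<T> = {
    try_push_back : (out this, move item: T) -> T* = {
        if capacity == size { return nullptr; }
        // ... usual stuff; note that because this is out, this function shall not reallocate
    }
    push_back : (move this, move item: T) -> (This, T&) = {
        if capacity == size { reallocate }
        return (move this, *try_push_back(item));
    }
    operator []: (out this, index: size_t) -> T& = {
        // this doesn't need to be move
    }
    pop_back : (move this) -> This = {
        // this has to be move, because this function invalidates iterators/pointers to the last item
        return (move this);
    }
}
v : vector<int> = { 0, 1, 2 };
x := v[0];
v.push_back(3); // <- the compiler can emit a true warning about x: it refers to a part of an object that was moved
A question about destructive move: there is no such thing in cpp1. So code like this doesn't have to move anything, but can if it wants:
decltype(auto) func(std::vector<int>&& v)
{
    v.push_back(3);
    return std::move(v);
}
auto v = std::vector{ 0, 1, 2 };
/* v = */ func(std::move(v));
Even if something is passed by rvalue reference, it isn't invalidated from the cpp1 compiler's point of view. I don't know if today's compilers are able to optimize (remove the commented-out code) if I uncomment the assignment. In both cases v is valid and in a specified state, so you can safely use it. I don't think a destructive move is possible to implement at the language level only (without library support): how would you destructively move an item from the middle of a vector? You would need a bitmap of all the items that were moved to make sure that ~vector does not call the dtor for such items. It is hard to marry this bitmap with the zero-cost abstractions and don't-pay-for-what-you-don't-use philosophy. Yes, sure, there are techniques that can amortize these costs, but it won't be zero; calling the dtor for a moved value is not zero cost either, but it is easier to implement. Anyway, destructive move is a good thing but not what this suggestion is about.
Back to cpp2: the syntax v.push_back(0), where push_back accepts this as move, can (and should)
* not require an explicit move, like (move v).push_back(0);
* be equal to (v, item) : (_, int&) = (move v).push_back(0);
* guarantee optimization as if this were passed as an out parameter and only the reference were returned (so it behaves like today)
The semantics are as if (v1, item) := push_back(v, 0), where this is the last use of v and v1 is newly initialized; but we don't want to multiply names and want to reuse names whose lifetime has ended, so we reuse the variable name v. From the compiler's point of view the old v is an independent variable and has nothing in common with the new v, which is why it can easily provide a warning.
The key difference between out and move is semantic: via the type system we transfer knowledge to the compiler about a possible change in the internal in-memory representation. Any references to parts of an object can be invalidated after the move operation.
With move semantics, most mutable methods naturally accept this as move (the safest default option). The developer/API designer should prove/guarantee that out semantics is OK in some cases. Move semantics is safe by nature, and the compiler will warn you if you keep pointers to an object after a move, but out semantics is not safe. The implementation has to make sure that the memory representation is not changed and nothing is invalidated.
I don't think we can afford an error here, due to the absence of a mathematical proof that it is always an error (such proof is the goal of the Val language), but a true (not false positive) warning is possible without deep analysis. Companies with a strict policy that treats warnings as errors will be in a safer position, but ignoring warnings might be OK as well. I have written tons of code where a vector reserves capacity and then push_back never exceeds the reserved capacity. For such cases, the warning can be safely suppressed when code is migrated from cpp1 to cpp2, and then it could be replaced by try_push_back if needed.
I've read about the new keyword discard in this example, inout_func( discard x ), and I feel like default move semantics covers this automatically, with no need for the new keyword. (discard returning_func()) is a conceptually different problem IMHO; discard here is a synonym for the absence of [[nodiscard]], which is nice but different from the inout_func case.
What about free functions that require the first argument by move and return it? They are covered by UFCS; otherwise explicit hiding of the old name is needed, or the declaration of a new name as in the examples above. What if the developer doesn't want to pass an argument by move? Then he clearly wants a copy, but a copy is usually expensive for nontrivial types, so make this copy explicit.
move vs in
move can do everything that in can do, but contrary to the ordinary C++ developer's experience, in isn't as safe as move (similarly to out). What is the difference? in is by nature shared access (in the case when it resolves to a const reference), but C++ (and cpp2 too) doesn't even warn about mutable and const access at the same time. It is implied that if something is passed by const ref it is immutable, but no support is provided to make sure this is the case. The name for this problem is well known: aliasing. You can pass the same value as two parameters, the first time as in and the second as out. In a simple case the compiler can emit a warning, but in a complex one it can't. On the other hand, move means exclusive access, so if a function accepts two parameters, one by in and the other by move (instead of out), you can't pass the same value twice: the compiler can easily warn about using references to the moved value. Actually, if you have default move semantics you rarely need in semantics. You don't care about constness at all: you have exclusive access to the value, so why would you care whether it is const or mut? This is very much in line with cpp2's non-const locals.
OK, you really need to provide const access to a value, fine; what we can have is something like
vector : type = {
    operator = : (move this, move that) -> (This, That) = {
        ...
        return (move this, move that);
    }
};
v : vector<> = { 0, 1, 2 };
v2 := v; // move construction; could be optimized to a no-op, basically a renaming: the new name is declared and the old one is invalidated (the usual "valid but unspecified" mantra in cpp1; in cpp2 this could be a mandatory last use without an explicit in)
v2 := in v; // copy construction; a shortcut for (v2, v) := v; where the new v and the old v are the same value
v4 : vector<> = { 0, 1, 2 };
v4 = v2; // move assignment
v4 = in v2; // copy assignment
The syntax can be better of course; in Rust this is the .clone() method from the Clone trait.
Basically, I am reinventing the bicycle using the wrong words, sorry for this. I just found a much clearer wording of what this suggestion is about: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2676r0.pdf My only point is that we don't need the full power of Val, but changing the semantics towards the Val model is what we really need. |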
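The try_push_back / push_back split described in this comment can be approximated in today's C++ as free functions over `std::vector`. This is only a sketch under stated assumptions: `try_push_back`, `push_back_owned`, and `demo` are hypothetical names, not part of the standard library or cpp2, and the check happens at run time rather than in the type system as the proposal wants.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: appends only if no reallocation is required, so
// existing pointers and references into v stay valid. Returns nullptr when
// the vector is already at capacity.
template <class T>
T* try_push_back(std::vector<T>& v, T item) {
    if (v.size() == v.capacity()) return nullptr;
    v.push_back(std::move(item));
    return &v.back();
}

// Hypothetical "move"-flavoured push_back: consumes the vector and returns
// it, so the caller has to rebind the name after the call.
template <class T>
std::vector<T> push_back_owned(std::vector<T> v, T item) {
    v.push_back(std::move(item));  // may reallocate
    return v;
}

// Usage: fill to capacity, observe the refusal, then use the owning form.
std::vector<int> demo() {
    std::vector<int> v;
    v.reserve(4);
    while (try_push_back(v, 1) != nullptr) {}  // stops exactly at capacity
    assert(v.size() == v.capacity());
    return push_back_owned(std::move(v), 2);   // this call reallocates
}
```

The non-reallocating form corresponds to the out signature and the consuming form to the move signature; nothing in C++ stops a caller from keeping a stale pointer across the second one, which is the gap the proposed warning is meant to close.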
In cpp2,
'member_func(out this, ...)'
denotes a constructor: the object instance doesn't exist until the function is called. Your suggestion would require breaking that, I believe?
decltype(auto) func(std::vector<int>&& v)
{
    v.push_back(3);
    return std::move(v);
}
You say you want the compiler to elide these moves, but this is a known issue in cpp1: return value optimisation can be prevented by explicitly returning a moved value. Also, just because you've moved the vector doesn't mean that the underlying memory isn't also being read or written elsewhere in the program (not great/safe design, but not impossible).
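One detail of the quoted snippet is worth checking: with `decltype(auto)`, `return std::move(v);` makes the function return an rvalue reference rather than a new vector, so no move construction happens inside `func` at all. The sketch below fixes the element type to `int` as an assumption and verifies this; `returnsSameObject` is a hypothetical helper.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// Same shape as the quoted snippet, with the element type assumed to be int.
decltype(auto) func(std::vector<int>&& v) {
    v.push_back(3);
    return std::move(v);  // xvalue, so decltype(auto) deduces vector<int>&&
}

// The return type is a reference: calling func moves nothing by itself.
static_assert(
    std::is_same<decltype(func(std::declval<std::vector<int>>())),
                 std::vector<int>&&>::value,
    "func returns an rvalue reference, not a new vector");

// Returns true if func handed back a reference to the caller's own object.
bool returnsSameObject() {
    std::vector<int> v{0, 1, 2};
    std::vector<int>&& r = func(std::move(v));
    assert(v.size() == 4);  // v was modified in place and is still usable
    return &r == &v;
}
```

Whether any move takes place then depends on what the caller does with the returned reference, for example binding it to a new `std::vector<int>` object.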
Also, it seems like I was correct: you want these parameter passing words to have additional meanings, i.e. to imply the exclusive ability to physically access memory, rather than just describe who can modify the values and who owns them. I'd argue that you've already made a good case for using the existing ones as they already are to safeguard against aliasing in specific scenarios, though I'm not sure if it would just be faster to check beforehand that the addresses of the input values don't match. Either way, the programmer needs to know about aliasing upfront and guard against it manually, which is not an ideal situation for bugs. Someone more knowledgeable than I would know if it represents a security issue, something I know cpp2 aims to eliminate.
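To make the aliasing concern and the address check concrete, here is a small C++ sketch. All three function names are hypothetical. `addToAll` reads its const-reference parameter on every iteration, so the result changes when that parameter refers to an element of the vector being modified; the guarded variant performs the manual address check mentioned above.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// "in" and "out" parameters that may refer to the same storage.
// delta is re-read on every iteration, so it changes if it aliases an element.
void addToAll(std::vector<int>& out, const int& delta) {
    for (int& x : out) x += delta;
}

// True if p points at one of v's elements.
bool pointsInto(const std::vector<int>& v, const int* p) {
    assert(p != nullptr);
    if (v.empty()) return false;
    std::less<const int*> before;  // total order even for unrelated pointers
    return !before(p, v.data()) && before(p, v.data() + v.size());
}

// Manual guard: break the alias with a private copy before mutating.
void addToAllGuarded(std::vector<int>& out, const int& delta) {
    if (pointsInto(out, &delta)) {
        const int copy = delta;
        addToAll(out, copy);
    } else {
        addToAll(out, delta);
    }
}
```

Calling `addToAll(v, v[0])` on `{1, 2, 3}` yields `{2, 4, 5}` because `delta` becomes 2 after the first element is updated, while the guarded version yields `{2, 3, 4}`. The guard works, but only because the author anticipated the alias.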
I believe that what you want requires a ground-up fundamental shift in the memory model of C++ and requires changes far beyond the scope of function calling. This may be desirable but is out of scope for this experiment; ultimately that is up to @herb to decide.
On 12 May 2023 07:00:56 Anton Dyachenko ***@***.***> wrote:
Sorry for the long silence, anyway.
out is by nature different from move semantics. out indicates mutation within the current memory representation. Examples:
* an unbounded dynamic array (vector) that has a current allocation. An out operation can mutate any item but not the underlying collection (no adds, no removes). out is always memory safe (assuming the item's index is valid).
* an unbounded dynamic array is actually implemented via a series (in time) of bounded dynamic arrays. So whenever a new item is added and the current capacity is not enough, there is a reallocation: a new bounded array into which all the items are moved from the old array. To do this, the function signature has to claim "full/exclusive" access, that is, move.
* an object tree is slightly easier: it is safe to add a new node to the tree, but any remove/replace is the same matter and requires move semantics.
I am not fully used to cpp2 syntax yet, but I will try to express this in cpp2:
vector : type<T> = {
    try_push_back : (out this, move item: T) -> T* = {
        if capacity == size { return nullptr; }
        // ... usual stuff; note that because this is out, this function shall not reallocate
    }
    push_back : (move this, move item: T) -> (This, T&) = {
        if capacity == size { reallocate }
        return (move this, *try_push_back(item));
    }
    operator []: (out this, index: size_t) -> T& = {
        // this doesn't need to be move
    }
    pop_back : (move this) -> This = {
        // this has to be move, because this function invalidates iterators/pointers to the last item
        return (move this);
    }
}
v : vector<int> = { 0, 1, 2 };
x := v[0];
v.push_back(3); // <- the compiler can emit a true warning about x: it refers to a part of an object that was moved
A question about destructive move: there is no such thing in cpp1. So code like this doesn't have to move anything, but can if it wants:
decltype(auto) func(std::vector<int>&& v)
{
    v.push_back(3);
    return std::move(v);
}
auto v = std::vector{ 0, 1, 2 };
/* v = */ func(std::move(v));
Even if something is passed by rvalue reference, it isn't invalidated from the cpp1 compiler's point of view. I don't know if today's compilers are able to optimize (remove the commented-out code) if I uncomment the assignment. In both cases v is valid and in a specified state, so you can safely use it. I don't think a destructive move is possible to implement at the language level only (without library support): how would you destructively move an item from the middle of a vector? You would need a bitmap of all the items that were moved to make sure that ~vector does not call the dtor for such items. It is hard to marry this bitmap with the zero-cost abstractions and don't-pay-for-what-you-don't-use philosophy. Yes, sure, there are techniques that can amortize these costs, but it won't be zero; calling the dtor for a moved value is not zero cost either, but it is easier to implement. Anyway, destructive move is a good thing but not what this suggestion is about.
Back to cpp2: the syntax v.push_back(0), where push_back accepts this as move, can (and should)
* not require an explicit move, like (move v).push_back(0);
* be equal to (v, item) : (_, int&) = (move v).push_back(0);
* guarantee optimization as if this were passed as an out parameter and only the reference were returned (so it behaves like today)
The semantics are as if (v1, item) := push_back(v, 0), where this is the last use of v and v1 is newly initialized; but we don't want to multiply names and want to reuse names whose lifetime has ended, so we reuse the variable name v. From the compiler's point of view the old v is an independent variable and has nothing in common with the new v, which is why it can easily provide a warning.
The key difference between out and move is semantic: via the type system we transfer knowledge to the compiler about a possible change in the internal in-memory representation. Any references to parts of an object can be invalidated after the move operation.
With move semantics, most mutable methods naturally accept this as move (the safest default option). The developer/API designer should prove/guarantee that out semantics is OK in some cases. Move semantics is safe by nature, and the compiler will warn you if you keep pointers to an object after a move, but out semantics is not safe. The implementation has to make sure that the memory representation is not changed and nothing is invalidated.
I don't think we can afford an error here, due to the absence of a mathematical proof that it is always an error (such proof is the goal of the Val language), but a true (not false positive) warning is possible without deep analysis. Companies with a strict policy that treats warnings as errors will be in a safer position, but ignoring warnings might be OK as well. I have written tons of code where a vector reserves capacity and then push_back never exceeds the reserved capacity. For such cases, the warning can be safely suppressed when code is migrated from cpp1 to cpp2, and then it could be replaced by try_push_back if needed.
I've read about the new keyword discard in this example, inout_func( discard x ), and I feel like default move semantics covers this automatically, with no need for the new keyword. (discard returning_func()) is a conceptually different problem IMHO; discard here is a synonym for the absence of [[nodiscard]], which is nice but different from the inout_func case.
What about free functions that require the first argument by move and return it? They are covered by UFCS; otherwise explicit hiding of the old name is needed, or the declaration of a new name as in the examples above. What if the developer doesn't want to pass an argument by move? Then he clearly wants a copy, but a copy is usually expensive for nontrivial types, so make this copy explicit.
move vs in
move can do everything that in can do, but contrary to the ordinary C++ developer's experience, in isn't as safe as move (similarly to out). What is the difference? in is by nature shared access (in the case when it resolves to a const reference), but C++ (and cpp2 too) doesn't even warn about mutable and const access at the same time. It is implied that if something is passed by const ref it is immutable, but no support is provided to make sure this is the case. The name for this problem is well known: aliasing. You can pass the same value as two parameters, the first time as in and the second as out. In a simple case the compiler can emit a warning, but in a complex one it can't. On the other hand, move means exclusive access, so if a function accepts two parameters, one by in and the other by move (instead of out), you can't pass the same value twice: the compiler can easily warn about using references to the moved value. Actually, if you have default move semantics you rarely need in semantics. You don't care about constness at all: you have exclusive access to the value, so why would you care whether it is const or mut? This is very much in line with cpp2's non-const locals.
OK, you really need to provide const access to a value, fine; what we can have is something like
vector : type = {
    operator = : (move this, move that) -> (This, That) = {
        ...
        return (move this, move that);
    }
};
v : vector<> = { 0, 1, 2 };
v2 := v; // move construction; could be optimized to a no-op, basically a renaming: the new name is declared and the old one is invalidated (the usual "valid but unspecified" mantra in cpp1; in cpp2 this could be a mandatory last use without an explicit in)
v2 := in v; // copy construction; a shortcut for (v2, v) := v; where the new v and the old v are the same value
v4 : vector<> = { 0, 1, 2 };
v4 = v2; // move assignment
v4 = in v2; // copy assignment
The syntax can be better of course; in Rust this is the .clone() method from the Clone trait.
Basically, I am reinventing the bicycle using the wrong words, sorry for this. I just found a much clearer wording of what this suggestion is about: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2676r0.pdf My only point is that we don't need the full power of Val, but changing the semantics towards the Val model is what we really need.
|
Will your feature suggestion eliminate X% of security vulnerabilities of a given kind in current C++ code?
I am not a security expert, but one thing that comes to mind is that copying sensitive data like passwords/keys requires explicit removal/zeroing of the data at the caller site after it is passed to the callee. How might this help? The callee usually knows that the data passed is sensitive and erases/zeroes it, so if this is the only instance of such data then mission accomplished: the callee makes sure it is safe. But if it is not the only copy, then the same safe memory handling is needed along the whole path of this data from the IO code to the callee.
Will your feature suggestion automate or eliminate X% of current C++ guidance literature?
I think so. There is plenty of guidance to use std::move when performance matters, and this is C++, so it always matters. Actually, this is the part of the std namespace I use the most.
I have used C++ for 20+ years, and I still regularly get caught by code like this (even code written by me):
std::vector v = {...};
auto& x = v[0];
// I can't always predict from the function name that v doesn't reallocate due to an insert somewhere in the whole call tree
mutateVector(v); // v.push_back(0) somewhere deep in the call tree
x = 10; // <- SOMETIMES x is invalid, because vector doesn't preserve iterators on insert
Today's tools can't catch it, because there are significantly more cases where mutateVector() simply changes some items' values rather than rearranging them. If these tools warned, it would almost always be a false alarm.
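When the mutation really is a single `push_back`, the hazard in the snippet above can be detected at run time in C++, because that call reallocates exactly when the vector is at capacity. The sketch below is an illustration under assumptions: `nextPushReallocates` and `writeFirstAfterMutation` are hypothetical helpers, `mutateVector` is a stand-in with an invented body, and none of this is the compile-time warning the suggestion asks for.

```cpp
#include <cassert>
#include <vector>

// True if the next push_back on v has to reallocate, which invalidates
// every pointer, reference and iterator into v.
template <class T>
bool nextPushReallocates(const std::vector<T>& v) {
    return v.size() == v.capacity();
}

// Hypothetical stand-in for mutateVector from the snippet above.
void mutateVector(std::vector<int>& v) { v.push_back(0); }

// Writes through a reference taken before the mutation, but only when that
// reference is known to have survived. Returns whether the write happened.
bool writeFirstAfterMutation(std::vector<int>& v, int value) {
    assert(!v.empty());
    int& x = v[0];
    const bool invalidated = nextPushReallocates(v);
    mutateVector(v);
    if (invalidated) return false;  // x dangles; do not touch it
    x = value;
    return true;
}
```

The check only works because the caller knows what `mutateVector` does internally, which is exactly the knowledge the post says is unavailable from a function name alone.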
std2
vector: type = {
    push_back : (this, val: T) = ... // both this and val are passed by move; passing this by move clearly indicates that all iterators are invalidated
    operator [] : (inout this, index: size_t) -> T& = ... // this signature clearly indicates that NO iterators are invalidated
};
v: std2::vector = { ... };
x :&auto = v[0];
v.push_back(0); // <- warning: x is a dangling pointer
If v is passed from outside as an out parameter, then push_back simply doesn't compile.
Describe alternatives you've considered.
I don't think I need to. We all know it, and it is the current default: copy semantics.
Basically, I want Rust/Val move semantics to be the default semantics in cpp2. This means that instead of making function parameters in by default, they should be move by default. This matches very well with mutable local variables; all copies become explicit. Having default move semantics allows us to separate inout (mutating operations) from move (ownership transfer operations). push_back/insert/remove are ownership operations, whereas operator [] is not: it is always memory safe to use its result until ownership is transferred. No borrow checker is needed (Rust), no mathematical guarantee is needed (Val); just a warning is enough, and it is not a false-positive warning.
Yes, this adds some extra "noise" compared to implicit in, but we have a lot of this in C++ (const &) and it is OK.
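For comparison, "all copies become explicit" can already be opted into per type in C++ by deleting the copy operations and offering a named copy. A minimal sketch, assuming a hypothetical `Buffer` class with a `clone` member; it illustrates the calling style, not how cpp2 would implement the default.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical move-only wrapper: copying has to be requested by name.
class Buffer {
public:
    explicit Buffer(std::vector<int> data) : data_(std::move(data)) {}
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;
    Buffer(Buffer&&) = default;
    Buffer& operator=(Buffer&&) = default;

    // The only way to duplicate the contents.
    Buffer clone() const { return Buffer(data_); }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<int> data_;
};

static_assert(!std::is_copy_constructible<Buffer>::value,
              "implicit copies are rejected at compile time");

// Takes ownership of its argument, as a move-by-default parameter would.
std::size_t consume(Buffer b) { return b.size(); }

// Usage: one explicit copy, then the last use hands over ownership.
std::size_t demo() {
    Buffer a(std::vector<int>{1, 2, 3});
    std::size_t n = consume(a.clone());  // explicit copy; a stays usable
    assert(n == a.size());
    return n + consume(std::move(a));    // ownership handed over
}
```

The difference from the proposal is that here the type author opts in and the call site still needs `std::move`, whereas the suggestion makes the move implicit and the copy the marked case.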