From edbe0da900573acc7426b74dc67fbdd1c6e60ddf Mon Sep 17 00:00:00 2001 From: Manu Evans Date: Sun, 25 Mar 2018 14:17:19 -0700 Subject: [PATCH 1/9] Added first draft --- DIPs/DIP1xxx-rval_to_ref.md | 283 ++++++++++++++++++++++++++++++++++++ 1 file changed, 283 insertions(+) create mode 100644 DIPs/DIP1xxx-rval_to_ref.md diff --git a/DIPs/DIP1xxx-rval_to_ref.md b/DIPs/DIP1xxx-rval_to_ref.md new file mode 100644 index 000000000..26b067550 --- /dev/null +++ b/DIPs/DIP1xxx-rval_to_ref.md @@ -0,0 +1,283 @@ +# `ref const(T)` should receive r-values + +| Field | Value | +|-----------------|-----------------------------------------------------------------| +| DIP: | (number/id -- assigned by DIP Manager) | +| Review Count: | 0 (edited by DIP Manager) | +| Author: | Manu Evans (turkeyman@gmail.com) | +| Status: | Will be set by the DIP manager (e.g. "Approved" or "Rejected") | + +## Abstract + +A recurring complaint from users when interacting with functions that receive arguments by `ref` is that given an rvalue as argument, the compiler is unable to create an implicit temporary to perform the function call, and presents the user with a compile error instead. +This situation leads to a workaround where function parameters must be manually assigned to temporaries prior to the function call, which many users find frustrating. + +Another further issue is that because they require special-case handling, this may introduce semantic edge-cases and necessitate undesirable compile-time logic invading the users code, particularly into generic code. + +`ref` args are not as common in conventional idiomatic D as they are in some other languages, but they exist and appear frequently in niche circumstances. As such, this issue is likely to disproportionately affect subsets of users who find themselves using ref arguments more than average. 
+ +The choice to receive an argument by value or by reference is a detail that the API *author* selects with respect to criteria relevant to their project or domain, however, the semantic impact is not worn by the API author, but rather by the API user, who may be required to jump through hurdles to interact the API with their local code. +It would be ideal if the decision to receive arguments by value or by reference were a detail for the API, and not increase the complexity of the users code. + +Here is proposed a strategy to emit implicit temporaries to conveniently interact with APIs that use ref arguments. + +### Reference + +Nothing here yet... + +## Contents +* [Rationale](#rationale) +* [Proposal](#proposal) +* [Temporary destruction]() +* [`@safe`ty implications](#safety_implications) +* [Why not `auto ref`?](#auto_ref) +* [Key use cases](#use_cases) +* [Reviews](#reviews) + +## Rationale + +Many functions receive arguments by reference. This may be for a variety of reasons. +One reason is that the function may want to mutate the caller's data directly or return data via `out` parameters due to ABI limitations regarding multiple return values, another common case is that the cost of copying large structs via the parameter list is expensive, so struct parameters may be received by reference to mitigate this cost. +In that case, it is conventional to mark the argument `const`, enforcing that the argument is not to be modified by the function and received purely as input. + +When calling functions that receive ref args, D prohibits supplying rvalues as arguments because an rvalue theoretically doesn't have an address, and it doesn't make much sense for a function to mutate a temporary whose life will not extend beyond the function in question. +While these are sensible defense mechanisms for functions that receive arguments by mutable or `out` ref, it can be very inconvenient where functions receive arguments by `const ref` as pure inputs. 
+ +Consider the example: +```d +void fun(int x); + +fun(10); // <-- this is how simple calling a function should be +``` +But when a const-ref is involved: +```d +void fun(ref const(int) x); + +fun(10); // <-- compile error; not an lvalue!! +``` +Necessitating the workaround: +```d +int temp = 10; +fun(temp); +``` +In practise, the argument would likely be some larger struct type rather than 'int', but the inconvenience applies generally. + +This inconvenience also extends more broadly to cases including: +```d +fun(10); // literals +fun(gun()); // return values from functions +fun(x.prop); // properties +fun(x + y); // expressions +fun(my_short); // implicit type conversions (ie, short->int promotion) +// etc... (basically, most things you pass to functions) +``` +The work-around can bloat the number of lines around the call-site significantly, and the user needs to declare names for all the temporaries, polluting the local namespace, and often for moments in calculations (expressions) where no meaningful name exists. + +This work-around damages readability and brevity, and it's frustrating to implement repeatedly. + +## Proposal + +Calls with `ref const(T)` arguments supplied with rvalues are effectively rewritten to emit a temporary automatically, for example: +```d +fun(10); +``` +Is rewritten: +```d +{ + T __temp0 = void; + fun(__temp0 := 10); +} +``` +Where `T` is the function argument type. + +To mitigate confusion, I have used `:=` in this example to express the initial construction, and not a copy operation as would be expected if this code were written with an `=` expression. + +In the edge case where a function initialises an output variable: +```d +R result = fun(10); +``` +Becomes: +```d +R result = void; +{ + T __temp0 = void; + result := fun(__temp0 := 10); +} +``` +Again, where initial construction of `result` should be performed at the moment of assignment, as usual and expected. 
+ +It is important that `T` be defined as the argument type, and not `auto`, because it will allow for implicit conversions to occur naturally as if the argument was not a ref. +The user should not experience edge cases, or differences in functionality when calling `fun(const(int) x)` vs `fun(ref const(int)x)`. + +## Temporary destruction + +Destruction of any temporaries occurs naturally at the end of the scope, as usual. + +## Function calls as arguments + +It is important to note that a single scope is introduced to enclose the entire statement. The pattern should not cascade when nested calls exist in the parameter list within a single statement. +For calls that contain cascading function calls, ie: +```d +void fun(ref const(int) x, ref const(int) y); +int gun(ref const(int) x); + +fun(10, gun(20)); +``` +This correct expansion is: +```d +{ + int __fun_temp0 = void; + int __fun_temp1 = void; + int __gun_temp0 = void; + fun(__fun_temp0 := 10, __fun_temp1 := gun(__gun_temp0 := 20)); +} +``` + +## Interaction with `return ref` + +Given the expansion shown above for cascading function calls, `return ref` works naturally, exactly as the user expects. The key is that the scope encloses the entire statement, and all temporaries live for the length of the entire statement. + +For example: +```d +void fun(ref const(int) x); +ref const(int) gun(return ref const(int) y); + +fun(gun(10)); +``` +This correct expansion is: +```d +{ + int __gun_temp0 = void; + fun(gun(__gun_temp0 := 10)); +} +``` +The lifetime of `__gun_temp0` is satisfactory for any conceivable calling construction. + +## Interaction with other attributes + +Interactions with other attributes should follow all existing rules. +Any code that wouldn't compile in the event the user were to perform the rewrite manually will fail the same way, emitting the same error messages the user would expect. 
+ +## Overload resolution + +In the interest of preserving optimal calling efficiency, existing language rules continue to apply; lvalues should prefer by-ref functions, and rvalues should prefer by-value functions. +Consider the following overload set: +```d +void fun(int); // A +void fun(const(int)); // B +void fun(ref int); // C +void fun(ref const(int)); // D + +int t = 10; +const(int) u = 10; +fun(10); // choose A +fun(const int(10)); // choose B +fun(t); // choose C +fun(u); // choose D +``` +This follows existing language rules. No change is proposed here. + +Overloading with `auto ref` equally preserves current rules, which is to emit an ambiguous call when it collides with an explicit overload: +```d +void fun(const(int)); // A +void fun(ref const(int)); // B +void fun()(auto ref const(int)); // C + +int t = 10; +fun(10); // error: ambiguous call between A and C +fun(t); // error: ambiguous call between B and C +``` + +## Default arguments + +In satisfying the statement above "The user should not experience edge cases, or differences in functionality...", it should be that default args are applicable to ref args as with non-ref args. + +If the user does not supply an argument and a default arg is specified, the default arg is selected as usual and populates a temporary, just as if the user supplied a literal manually. + +In this case, an interesting circumstantial opportunity appears where the compiler may discern that construction is expensive, and construct a single static instance intended for reuse. +This shall not be specified functionality, but it may be a nice opportunity nonetheless. + +## `@safe`ty implications + +There are no implications on `@safe`ty. There are no additions or changes to allocation or parameter passing schemes. +D already states that arguments received by ref shall not escape, so passing temporaries is not dangerous from an escaping/dangling-reference point of view. 
+ +The user is able to produce the implicit temporary described in this proposal manually, and pass it with identical semantics; any potential safety implications are already applicable to normal stack args. This proposal adds nothing new. + +## Why `const`? + +Due to the nature of D's restrictive `const`, this proposal has been criticised as being so restrictive to inhibit some potentially useful programs. + +I suggest this proposal only applies to `const ref` arguments, because it's a guarantee that the parameter is used strictly as an input argument, rather than some form of output. +In the case where the parameter is used as an output argument, this proposal doesn't make sense because the output would be immediately discarded; such a function call given an rvalue as argument likely represents an accidental mistake on the users part, and we can catch that invalid code. + +That said, D has the `out` attribute, which is a semantic statement of this intent. It could be that this proposal is amended to include non-const ref arguments, expecting that `out` shall be used exclusively to mark this intent. +If we assume that world, and `out` is deployed appropriately, there are 2 cases where mutable-ref may be used: + 1. When the function *modifies* the input; not a strict output parameter, but still outputs new information + 2. Still used as input, but a user is trying to subvert the restrictiveness of D's `const` + +The proposal could be amended to accept mutable ref's depending on the value-judgement balancing these 2 use cases. +Sticking with `const` requires no such value judgement to be made at this time, and it's much easier to relax the spec in the future with emergence of evidence to do so. + +## Why not `auto ref`? + +A frequently proposed solution to this situation is to receive the arg via `auto ref`. 
+ +`auto ref` solves a different set of problems; those may include "pass this argument in the most efficient way", or "forward this argument exactly how I received it". The implementation of `auto ref` requires that every function also be a template. + +There are many reasons why every function can't or shouldn't be a template. +1. API is not your code, and it's not already `auto ref` +2. Is distributed as a binary lib +3. Is exported from DLL +4. Is virtual +5. Is extern(C++) +6. Intent to capture function pointers or delegates +7. Has many args; unreasonable combinatorial explosion +8. Is larger-than-inline scale; engineer assesses that reducing instantiation bloat has greater priority than maximising parameter passing efficiency in some cases + +Any (or many) of these reasons may apply, eliminating `auto ref` from the solution space. + +## Key use cases + +By comparison, C++ has a very high prevalence of `const&` args and classes with virtual functions, and when interfacing with C++, those functions are mirrored to D. The issue addressed in this DIP becomes magnified significantly to this set of users. + +The D community has invested significant resources in improving interaction with C++; either co-existing or simplifying a migration, and thereby make D attractive to the C++ audience. +The importance of this initiative is widely agreed; it has featured prominently in the bi-annual game-plans documents, and comprehensive interaction with even the C++ standard library has attracted funding from the D foundation. +This DIP offers a lot for interaction with C++ APIs. + +This issue is also likely to appear more frequently for vendors with tight ABI requirements. +Users of closed-source libraries distributed as binary libs, or libraries distributes as DLLs are more likely to encounter these challenges interacting with those APIs as well. + +Another high-probability occurrence is OOP, where virtual function APIs inhibit the use of templates. 
+ +## Anecdotes + +As a user with numerous counts of attempted C++ interactions and migrations in the workplace, and in my own projects, I can add some anecdotal observations. +My attempts to introduce D to the workplace are interesting, because they involve building interest and selling D's merits to my colleagues in order to be successful. Expansion of D in my workplaces depends on this target audience assessing that D is a superior choice compared to the de-facto establishment of C++. There are some major factors that will motivate this opinion, but mostly it is an aggregate of minor improvements, coupled with satisfaction that existing comforts and workflow will be left mostly unchanged. + +With respect to this issue, in all attempts, I have quickly demonstrated that the work-around presented above severely impact the quality of users experience with D when interacting with C++. +A large bulk of any migration task tends to involve responding to ongoing compile errors by copy-pasting function arguments to lines above the call, and assigning them temporary names. The resulting code is unappealing, bloated, and the experience is unsatisfying. +The take-away from this experience to a C++ programmer who is investigating D, is that the equivalent D code is objectively worse than the C++ code, and strongly undermines our ability to make a positive impression on that audience during the critical 'first-5-minutes'. +In my experience, this issue is almost enough on its own to call for immediate dismissal. Often expressed with vibrantly colourful language. + +## Copyright & License + +Copyright (c) 2017 by the D Language Foundation + +Licensed under [Creative Commons Zero 1.0](https://creativecommons.org/publicdomain/zero/1.0/legalcode.txt) + +## Review + +The DIP Manager will supplement this section with a summary of each review stage +of the DIP process beyond the Draft Review. 
+ +## Appendix + +A few examples of typical C++ APIs that exhibit this issue: + - PhysX (Nvidia): http://docs.nvidia.com/gameworks/content/gameworkslibrary/physx/apireference/files/classPxSceneQueryExt.html + - NaCl (Google): https://developer.chrome.com/native-client/pepper_stable/cpp/classpp_1_1_instance + - DirectX (Microsoft): https://github.com/Microsoft/DirectXMath/blob/master/Inc/DirectXCollision.h + - Bullet (Physics Lib): http://bulletphysics.org/Bullet/BulletFull/classbtBroadphaseInterface.html + +In these examples of very typical C++ code, you can see a large number of functions receive arguments by reference. +Complex objects are likely to be fetched via getters/properties. Simple objects like math vectors/matrices are likely to be called with literals, or properties. \ No newline at end of file From 37986cb48b601b1513ed523331bcb30167da942c Mon Sep 17 00:00:00 2001 From: Michael Parker Date: Mon, 2 Apr 2018 14:00:25 +0900 Subject: [PATCH 2/9] Set to draft status --- DIPs/DIP1xxx-rval_to_ref.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/DIPs/DIP1xxx-rval_to_ref.md b/DIPs/DIP1xxx-rval_to_ref.md index 26b067550..c7b358d66 100644 --- a/DIPs/DIP1xxx-rval_to_ref.md +++ b/DIPs/DIP1xxx-rval_to_ref.md @@ -5,7 +5,7 @@ | DIP: | (number/id -- assigned by DIP Manager) | | Review Count: | 0 (edited by DIP Manager) | | Author: | Manu Evans (turkeyman@gmail.com) | -| Status: | Will be set by the DIP manager (e.g. 
"Approved" or "Rejected") | +| Status: | Draft | ## Abstract @@ -262,7 +262,7 @@ In my experience, this issue is almost enough on its own to call for immediate d ## Copyright & License -Copyright (c) 2017 by the D Language Foundation +Copyright (c) 2018 by the D Language Foundation Licensed under [Creative Commons Zero 1.0](https://creativecommons.org/publicdomain/zero/1.0/legalcode.txt) From cf20105036c45d4068a0691f5aff91dcaaf4ea0f Mon Sep 17 00:00:00 2001 From: Manu Evans Date: Sat, 19 May 2018 15:26:32 -0700 Subject: [PATCH 3/9] Edits from feedback Removed `const` from proposal. --- DIPs/DIP1xxx-rval_to_ref.md | 168 ++++++++++++++++++++++-------------- 1 file changed, 104 insertions(+), 64 deletions(-) diff --git a/DIPs/DIP1xxx-rval_to_ref.md b/DIPs/DIP1xxx-rval_to_ref.md index c7b358d66..df9b09c5d 100644 --- a/DIPs/DIP1xxx-rval_to_ref.md +++ b/DIPs/DIP1xxx-rval_to_ref.md @@ -1,4 +1,4 @@ -# `ref const(T)` should receive r-values +# `ref T` accepts r-values | Field | Value | |-----------------|-----------------------------------------------------------------| @@ -12,7 +12,7 @@ A recurring complaint from users when interacting with functions that receive arguments by `ref` is that given an rvalue as argument, the compiler is unable to create an implicit temporary to perform the function call, and presents the user with a compile error instead. This situation leads to a workaround where function parameters must be manually assigned to temporaries prior to the function call, which many users find frustrating. -Another further issue is that because they require special-case handling, this may introduce semantic edge-cases and necessitate undesirable compile-time logic invading the users code, particularly into generic code. +A further issue is that because this situation require special-case handling, this may introduce semantic edge-cases and necessitate undesirable compile-time logic invading the users code, particularly into generic code. 
`ref` args are not as common in conventional idiomatic D as they are in some other languages, but they exist and appear frequently in niche circumstances. As such, this issue is likely to disproportionately affect subsets of users who find themselves using ref arguments more than average. @@ -23,7 +23,10 @@ Here is proposed a strategy to emit implicit temporaries to conveniently interac ### Reference -Nothing here yet... +Forum threads: + +Issues: + ## Contents * [Rationale](#rationale) @@ -36,12 +39,16 @@ Nothing here yet... ## Rationale -Many functions receive arguments by reference. This may be for a variety of reasons. -One reason is that the function may want to mutate the caller's data directly or return data via `out` parameters due to ABI limitations regarding multiple return values, another common case is that the cost of copying large structs via the parameter list is expensive, so struct parameters may be received by reference to mitigate this cost. -In that case, it is conventional to mark the argument `const`, enforcing that the argument is not to be modified by the function and received purely as input. +When calling functions that receive ref args, D prohibits supplying rvalues. It is suggested that this is to assist the author identifying likely logic errors where an rvalue will expire at the end of the statement, and it doesn't make much sense for a function to mutate a temporary whose life will not extend beyond the function call in question. + +However, many functions receive arguments by reference, and this may be for a variety of reasons. +One common reason is that the cost of copying large structs via the parameter list is expensive, so struct parameters may be received by reference to mitigate this cost. +Another common case is that the function may want to mutate the caller's data directly or return data via `out` parameters due to ABI limitations regarding multiple return values. 
This is the potential error case that the existing design attempts to mitigate, but in D, pipeline programming is very popular, and contrary to conventional wisdom where the statement is likely to end at the end of the function call, pipeline expressions may result in single statements performing a lot of work, mutating state as it passes down the pipeline. -When calling functions that receive ref args, D prohibits supplying rvalues as arguments because an rvalue theoretically doesn't have an address, and it doesn't make much sense for a function to mutate a temporary whose life will not extend beyond the function in question. -While these are sensible defense mechanisms for functions that receive arguments by mutable or `out` ref, it can be very inconvenient where functions receive arguments by `const ref` as pure inputs. +A related issue concerns generic code which reflects on or receives a function by alias. Such generic code may want to call that function, but it is often the case that details about the ref-ness of arguments lead to incorrect semantic expressions in the generic code depending on the arguments, necessitating additional compile-time logic to identify the ref-ness of function arguments and implement appropriate workarounds on these conditions. This leads to longer, more brittle, and less-maintainable generic code. It is also much harder to write correctly the first time, and such issues may only emerge in niche use cases at a later time. + +With these cases in mind, the existing rule feels out-dated or inappropriate, and the presence of the rule may often lead to aggravation while trying to write simple, readable code. +Calling a function should be simple and orthogonal, generic code should not have to concern itself with details about ref-ness of function parameters, and users should not be required to jump through hoops when ref appears in APIs they encounter.
Consider the example: ```d @@ -49,9 +56,9 @@ void fun(int x); fun(10); // <-- this is how simple calling a function should be ``` -But when a const-ref is involved: +But when a ref is involved: ```d -void fun(ref const(int) x); +void fun(ref int x); fun(10); // <-- compile error; not an lvalue!! ``` @@ -71,14 +78,50 @@ fun(x + y); // expressions fun(my_short); // implicit type conversions (ie, short->int promotion) // etc... (basically, most things you pass to functions) ``` -The work-around can bloat the number of lines around the call-site significantly, and the user needs to declare names for all the temporaries, polluting the local namespace, and often for moments in calculations (expressions) where no meaningful name exists. +The work-around can bloat the number of lines around the call-site significantly, and the user needs to declare names for all the temporaries, polluting the local namespace, and often for expressions where no meaningful name exists. + +The generic case may appear in a form like this: +```d +void someMeta(alias userFun)() +{ + userFun(getValue()); +} + +void fun(int x); +void gun(ref const(int) x); + +unittest +{ + someMeta!(fun)(); // no problem + someMeta!(gun)(); // oh no, can't receive rvalue! +} +``` +Necessitating a workaround that may look like: +```d +void someMeta(alias userFun)() +{ + import std.algorithm : canFind; + static if(canFind(__traits(getParameterStorageClasses, userFun, 0), "ref")) + { + auto x = getValue(); + userFun(x); + } + else + { + userFun(getValue()); + } +} +``` +This example situation is simplified, but such issues often appear in complex aggregate meta, which may be difficult to understand, or the issue is caused indirectly at some layer the user did not author. -This work-around damages readability and brevity, and it's frustrating to implement repeatedly.
+These work-arounds damage readability and brevity, they make authoring correct code more difficult, increase the probability of brittle meta, and are frustrating to implement repeatedly. ## Proposal -Calls with `ref const(T)` arguments supplied with rvalues are effectively rewritten to emit a temporary automatically, for example: +Calls with `ref T` arguments supplied with rvalues are effectively rewritten to emit a temporary automatically, for example: ```d +void fun(ref int x); + fun(10); ``` Is rewritten: @@ -88,11 +131,11 @@ Is rewritten: fun(__temp0 := 10); } ``` -Where `T` is the function argument type. +Where `T` is the *function argument type*. To mitigate confusion, I have used `:=` in this example to express the initial construction, and not a copy operation as would be expected if this code were written with an `=` expression. -In the edge case where a function initialises an output variable: +In the case where a function output initialises a variable: ```d R result = fun(10); ``` @@ -106,12 +149,11 @@ R result = void; ``` Again, where initial construction of `result` should be performed at the moment of assignment, as usual and expected. -It is important that `T` be defined as the argument type, and not `auto`, because it will allow for implicit conversions to occur naturally as if the argument was not a ref. -The user should not experience edge cases, or differences in functionality when calling `fun(const(int) x)` vs `fun(ref const(int)x)`. +It is important that `T` be defined as the argument type, and not `auto`, because it will allow implicit conversions to occur naturally, with behaviour identical to when the argument is not a ref. The user should not experience edge cases, or differences in functionality when calling `fun(int x)` vs `fun(ref int x)`. ## Temporary destruction -Destruction of any temporaries occurs naturally at the end of the scope, as usual. +Destruction of any temporaries occurs naturally at the end of the introduced scope.
## Function calls as arguments @@ -139,8 +181,8 @@ Given the expansion shown above for cascading function calls, `return ref` works For example: ```d -void fun(ref const(int) x); -ref const(int) gun(return ref const(int) y); +void fun(ref int x); +ref int gun(return ref int y); fun(gun(10)); ``` @@ -153,10 +195,33 @@ This correct expansion is: ``` The lifetime of `__gun_temp0` is satisfactory for any conceivable calling construction. +This is particularly useful in pipeline programming. +It is common that functions are invoked which create and return a range which is then used in a pipeline operation: +```d +MyRange makeRange(); +MyRange transform(MyRange r, int x); + +auto results = makeRange().transform(10).array; +``` +But if the transform receives a range by ref, the pipeline syntax breaks down: +```d +MyRange makeRange(); +ref MyRange mutatingTransform(return ref MyRange r, int x); + +auto results = makeRange().mutatingTransform(10).array; // error, not an lvalue! + +// necessitates a workaround: +auto tempRange = makeRange(); // satisfy the compiler +auto results = tempRange.mutatingTransform(10).array; +``` +There are classes of range where the source range should be mutated through the pipeline. It is also possible that this pattern may be implemented for efficiency, since copying ranges at each step may be expensive. + +It is unfortunate that `ref` adds friction to one of D's greatest programming paradigms this way. + ## Interaction with other attributes Interactions with other attributes should follow all existing rules. -Any code that wouldn't compile in the event the user were to perform the rewrite manually will fail the same way, emitting the same error messages the user would expect. +Any code that wouldn't compile in the event the user were to perform the proposed rewrites manually will fail in the same way, emitting the same error messages the user would expect.
## Overload resolution @@ -170,23 +235,23 @@ void fun(ref const(int)); // D int t = 10; const(int) u = 10; -fun(10); // choose A -fun(const int(10)); // choose B -fun(t); // choose C -fun(u); // choose D +fun(10); // rvalue; choose A +fun(const int(10)); // rvalue; choose B +fun(t); // lvalue; choose C +fun(u); // lvalue; choose D ``` This follows existing language rules. No change is proposed here. -Overloading with `auto ref` equally preserves current rules, which is to emit an ambiguous call when it collides with an explicit overload: +Overloading with `auto ref` preserves existing rules, which is to emit an ambiguous call when it collides with an explicit overload: ```d -void fun(const(int)); // A -void fun(ref const(int)); // B -void fun()(auto ref const(int)); // C +void fun(ref int); // A +void fun()(auto ref int); // B int t = 10; -fun(10); // error: ambiguous call between A and C -fun(t); // error: ambiguous call between B and C +fun(10); // chooses B: auto ref resolves by-value given an rvalue, prefer exact match as above +fun(t); // error: ambiguous call between A and B ``` +No change to existing behaviour is proposed. ## Default arguments @@ -194,7 +259,7 @@ In satisfying the statement above "The user should not experience edge cases, or If the user does not supply an argument and a default arg is specified, the default arg is selected as usual and populates a temporary, just as if the user supplied a literal manually. -In this case, an interesting circumstantial opportunity appears where the compiler may discern that construction is expensive, and construct a single static instance intended for reuse. +In this case, an interesting circumstantial opportunity appears where the compiler may discern that construction is expensive, and construct a single immutable instance for reuse. This shall not be specified functionality, but it may be a nice opportunity nonetheless. 
## `@safe`ty implications @@ -204,21 +269,6 @@ D already states that arguments received by ref shall not escape, so passing tem The user is able to produce the implicit temporary described in this proposal manually, and pass it with identical semantics; any potential safety implications are already applicable to normal stack args. This proposal adds nothing new. -## Why `const`? - -Due to the nature of D's restrictive `const`, this proposal has been criticised as being so restrictive to inhibit some potentially useful programs. - -I suggest this proposal only applies to `const ref` arguments, because it's a guarantee that the parameter is used strictly as an input argument, rather than some form of output. -In the case where the parameter is used as an output argument, this proposal doesn't make sense because the output would be immediately discarded; such a function call given an rvalue as argument likely represents an accidental mistake on the users part, and we can catch that invalid code. - -That said, D has the `out` attribute, which is a semantic statement of this intent. It could be that this proposal is amended to include non-const ref arguments, expecting that `out` shall be used exclusively to mark this intent. -If we assume that world, and `out` is deployed appropriately, there are 2 cases where mutable-ref may be used: - 1. When the function *modifies* the input; not a strict output parameter, but still outputs new information - 2. Still used as input, but a user is trying to subvert the restrictiveness of D's `const` - -The proposal could be amended to accept mutable ref's depending on the value-judgement balancing these 2 use cases. -Sticking with `const` requires no such value judgement to be made at this time, and it's much easier to relax the spec in the future with emergence of evidence to do so. - ## Why not `auto ref`? A frequently proposed solution to this situation is to receive the arg via `auto ref`. 
@@ -239,26 +289,16 @@ Any (or many) of these reasons may apply, eliminating `auto ref` from the soluti ## Key use cases -By comparison, C++ has a very high prevalence of `const&` args and classes with virtual functions, and when interfacing with C++, those functions are mirrored to D. The issue addressed in this DIP becomes magnified significantly to this set of users. +Pipeline programming expressions often begin with a range returned from a function. There are constructs where transform functions need to take the range by reference. Such cases currently break the pipeline and introduce a temporary. This proposal improves the pipeline-programming experience. -The D community has invested significant resources in improving interaction with C++; either co-existing or simplifying a migration, and thereby make D attractive to the C++ audience. -The importance of this initiative is widely agreed; it has featured prominently in the bi-annual game-plans documents, and comprehensive interaction with even the C++ standard library has attracted funding from the D foundation. -This DIP offers a lot for interaction with C++ APIs. +Generic programming is one of D's biggest success stories, and tends to work best when interfaces and expressions are orthogonal and with as few as possible edge-cases. Certain forms of meta find that `ref`-ness is a key edge-case which requires special case handling and may often lead to brittle generic code to be discovered by an unhappy niche user at some future time. -This issue is also likely to appear more frequently for vendors with tight ABI requirements. -Users of closed-source libraries distributed as binary libs, or libraries distributes as DLLs are more likely to encounter these challenges interacting with those APIs as well. +Another high-impact case is OOP, where virtual function APIs inhibit the use of templates (ie, auto ref). -Another high-probability occurrence is OOP, where virtual function APIs inhibit the use of templates. 
+By comparison, C++ has a very high prevalence of `const&` args and classes with virtual functions, and when interfacing with C++, those functions are mirrored to D. The issue addressed in this DIP becomes magnified significantly to this set of users. This DIP reduces inconvenience when interacting with C++ API's. -## Anecdotes - -As a user with numerous counts of attempted C++ interactions and migrations in the workplace, and in my own projects, I can add some anecdotal observations. -My attempts to introduce D to the workplace are interesting, because they involve building interest and selling D's merits to my colleagues in order to be successful. Expansion of D in my workplaces depends on this target audience assessing that D is a superior choice compared to the de-facto establishment of C++. There are some major factors that will motivate this opinion, but mostly it is an aggregate of minor improvements, coupled with satisfaction that existing comforts and workflow will be left mostly unchanged. - -With respect to this issue, in all attempts, I have quickly demonstrated that the work-around presented above severely impact the quality of users experience with D when interacting with C++. -A large bulk of any migration task tends to involve responding to ongoing compile errors by copy-pasting function arguments to lines above the call, and assigning them temporary names. The resulting code is unappealing, bloated, and the experience is unsatisfying. -The take-away from this experience to a C++ programmer who is investigating D, is that the equivalent D code is objectively worse than the C++ code, and strongly undermines our ability to make a positive impression on that audience during the critical 'first-5-minutes'. -In my experience, this issue is almost enough on its own to call for immediate dismissal. Often expressed with vibrantly colourful language. +This issue is also likely to appear more frequently for vendors with tight ABI requirements. 
+Users of closed-source libraries distributed as binary libs, or libraries distributes as DLLs are more likely to encounter these challenges interacting with those APIs as well. ## Copyright & License @@ -280,4 +320,4 @@ A few examples of typical C++ APIs that exhibit this issue: - Bullet (Physics Lib): http://bulletphysics.org/Bullet/BulletFull/classbtBroadphaseInterface.html In these examples of very typical C++ code, you can see a large number of functions receive arguments by reference. -Complex objects are likely to be fetched via getters/properties. Simple objects like math vectors/matrices are likely to be called with literals, or properties. \ No newline at end of file +Complex objects are likely to be fetched via getters/properties. Simple objects like math vectors/matrices are likely to be called with literals, properties, or expressions. \ No newline at end of file From 37eebfcbbc62c168eb3d6bc4a0ae4c0d5cd65b51 Mon Sep 17 00:00:00 2001 From: Manu Evans Date: Sat, 19 May 2018 17:06:02 -0700 Subject: [PATCH 4/9] Added references. --- DIPs/DIP1xxx-rval_to_ref.md | 80 +++++++++++++++++++++++-------------- 1 file changed, 51 insertions(+), 29 deletions(-) diff --git a/DIPs/DIP1xxx-rval_to_ref.md b/DIPs/DIP1xxx-rval_to_ref.md index df9b09c5d..cc672374d 100644 --- a/DIPs/DIP1xxx-rval_to_ref.md +++ b/DIPs/DIP1xxx-rval_to_ref.md @@ -9,24 +9,46 @@ ## Abstract -A recurring complaint from users when interacting with functions that receive arguments by `ref` is that given an rvalue as argument, the compiler is unable to create an implicit temporary to perform the function call, and presents the user with a compile error instead. +A recurring complaint from users when interacting with functions that receive arguments by `ref` is that given an rvalue as argument, the compiler is unable to create an implicit temporary to perform the function call, and presents the user with a compile error instead. 
This situation leads to a workaround where function parameters must be manually assigned to temporaries prior to the function call, which many users find frustrating. A further issue is that because this situation require special-case handling, this may introduce semantic edge-cases and necessitate undesirable compile-time logic invading the users code, particularly into generic code. -`ref` args are not as common in conventional idiomatic D as they are in some other languages, but they exist and appear frequently in niche circumstances. As such, this issue is likely to disproportionately affect subsets of users who find themselves using ref arguments more than average. +`ref` args are not as common in conventional idiomatic D as they are in some other languages, but they exist and appear frequently in niche circumstances. As such, this issue is likely to disproportionately affect subsets of users who find themselves using `ref` arguments more than average. -The choice to receive an argument by value or by reference is a detail that the API *author* selects with respect to criteria relevant to their project or domain, however, the semantic impact is not worn by the API author, but rather by the API user, who may be required to jump through hurdles to interact the API with their local code. +The choice to receive an argument by value or by reference is a detail that the API *author* selects with respect to criteria relevant to their project or domain, however, the semantic impact is not worn by the API author, but rather by the API user, who may be required to jump through hurdles to interact the API with their local code. It would be ideal if the decision to receive arguments by value or by reference were a detail for the API, and not increase the complexity of the users code. -Here is proposed a strategy to emit implicit temporaries to conveniently interact with APIs that use ref arguments. 
- -### Reference - -Forum threads: - -Issues: - +Here is proposed a strategy to emit implicit temporaries to conveniently interact with APIs that use `ref` arguments. + +## Reference + +There is a lot of prior discussion on this topic. Much is out of date now due to recent language evolution. +Prior discussion involving `scope`, and `auto ref` as solutions are out of date; We have implemented `scope`, `auto ref`, and we also have `return ref` now, which affects the conversation. + +Forum threads: +https://forum.dlang.org/thread/mailman.3720.1453131378.22025.digitalmars-d@puremagic.com +https://forum.dlang.org/thread/rehsmhmeexpusjwkfnoy@forum.dlang.org +https://forum.dlang.org/thread/mailman.577.1410180586.5783.digitalmars-d@puremagic.com +https://forum.dlang.org/thread/km4rtm$239e$1@digitalmars.com +https://forum.dlang.org/thread/kl4v8r$tkc$1@digitalmars.com +https://forum.dlang.org/thread/ylebrhjnrrcajnvtthtt@forum.dlang.org +https://forum.dlang.org/thread/ntsyfhesnywfxvzbemwc@forum.dlang.org +https://forum.dlang.org/thread/uswucstsooghescofycp@forum.dlang.org +https://forum.dlang.org/thread/zteryxwxyngvyqvukqkm@forum.dlang.org +https://forum.dlang.org/thread/yhnbcocwxnbutylfeoxi@forum.dlang.org +https://forum.dlang.org/thread/tkdloxqhtptpifkhvxjh@forum.dlang.org +https://forum.dlang.org/thread/mailman.1478.1521842510.3374.digitalmars-d@puremagic.com +https://forum.dlang.org/thread/gsdkqnbljuwssslxuglf@forum.dlang.org + +Issues: +https://issues.dlang.org/show_bug.cgi?id=9238 +https://issues.dlang.org/show_bug.cgi?id=8845 +https://issues.dlang.org/show_bug.cgi?id=6221 +https://issues.dlang.org/show_bug.cgi?id=6442 + +PRs: +https://github.com/dlang/dmd/pull/4717 ## Contents * [Rationale](#rationale) @@ -39,16 +61,16 @@ Issues: ## Rationale -When calling functions that receive ref args, D prohibits supplying rvalues. 
It is suggested that this is to assist the author identifying likely logic errors where an rvalue will expire at the end of the statement, and it doesn't make much sense for a function to mutate a temporary whose life will not extend beyond the function call in question. +When calling functions that receive `ref` args, D prohibits supplying rvalues. It is suggested that this is to assist the author identifying likely logic errors where an rvalue will expire at the end of the statement, and it doesn't make much sense for a function to mutate a temporary whose life will not extend beyond the function call in question. -However, many functions receive arguments by reference, and this may be for a variety of reasons. -One common reason is that the cost of copying large structs via the parameter list is expensive, so struct parameters may be received by reference to mitigate this cost. +However, many functions receive arguments by reference, and this may be for a variety of reasons. +One common reason is that the cost of copying large structs via the parameter list is expensive, so struct parameters may be received by reference to mitigate this cost. Another common case is that the function may want to mutate the caller's data directly or return data via `out` parameters due to ABI limitations regarding multiple return values. This is the potential error case that the existing design attempts to mitigate, but in D, pipeline programming is vary popular, and contrary to conventional wisdom where the statement is likely to end at the end of the function call, pipeline expressions may result in single statements performing a lot of work, mutating state as it passes down the pipeline. A related issue is with relation to generic code which reflects or received a function by alias. 
Such generic code may want to call that function, but it is often the case that details about the ref-ness of arguments lead to incorrect semantic expressions in the generic code depending on the arguments, necessitating additional compile-time logic to identify the ref-ness of function arguments and implement appropriate workarounds on these conditions. This leads to longer, more brittle, and less-maintainable generic code. It is also much harder to write correctly the first time, and such issues may only emerge in niche use cases at a later time. -With these cases in mind, the existing rule feels out-dated or inappropriate, and the presence of the rule may often lead to aggravation while trying to write simple, readable code. -Calling a function should be simple and orthogonal, generic code should not have to concern itself with details about ref-ness of function parameters, and users should not be required to jump through hoops when ref appears in API's they encounter. +With these cases in mind, the existing rule feels out-dated or inappropriate, and the presence of the rule may often lead to aggravation while trying to write simple, readable code. +Calling a function should be simple and orthogonal, generic code should not have to concern itself with details about ref-ness of function parameters, and users should not be required to jump through hoops when `ref` appears in API's they encounter. Consider the example: ```d @@ -56,7 +78,7 @@ void fun(int x); fun(10); // <-- this is how simple calling a function should be ``` -But when a ref is involved: +But when `ref` is involved: ```d void fun(ref int x); @@ -149,7 +171,7 @@ R result = void; ``` Again, where initial construction of `result` should be performed at the moment of assignment, as usual and expected. -It is important that `T` be defined as the argument type, and not `auto`, because it will allow implicit conversions to occur naturally, with identical behaviour as when argument was not a ref. 
The user should not experience edge cases, or differences in functionality when calling `fun(int x)` vs `fun(ref int x)`. +It is important that `T` be defined as the argument type, and not `auto`, because it will allow implicit conversions to occur naturally, with identical behaviour as when argument was not `ref`. The user should not experience edge cases, or differences in functionality when calling `fun(int x)` vs `fun(ref int x)`. ## Temporary destruction @@ -157,7 +179,7 @@ Destruction of any temporaries occurs naturally at the end of the introduced sco ## Function calls as arguments -It is important to note that a single scope is introduced to enclose the entire statement. The pattern should not cascade when nested calls exist in the parameter list within a single statement. +It is important to note that a single scope is introduced to enclose the entire statement. The pattern should not cascade when nested calls exist in the parameter list within a single statement. For calls that contain cascading function calls, ie: ```d void fun(ref const(int) x, ref const(int) y); @@ -203,7 +225,7 @@ MyRange transform(MyRange r, int x); auto results = makeRange().transform(10).array; ``` -But if the transform receives a range by ref, the pipeline syntax breaks down: +But if the transform receives a range by `ref`, the pipeline syntax breaks down: ```d MyRange makeRange(); ref MyRange mutatingTransform(return ref MyRange r, int x); @@ -220,12 +242,12 @@ It is unfortunate that `ref` adds friction to one of D's greatest programming pa ## Interaction with other attributes -Interactions with other attributes should follow all existing rules. +Interactions with other attributes should follow all existing rules. Any code that wouldn't compile in the event the user were to perform the proposed rewrites manually will fail in the same way, emitting the same error messages the user would expect. 
## Overload resolution -In the interest of preserving optimal calling efficiency, existing language rules continue to apply; lvalues should prefer by-ref functions, and rvalues should prefer by-value functions. +In the interest of preserving optimal calling efficiency, existing language rules continue to apply; lvalues should prefer by-ref functions, and rvalues should prefer by-value functions. Consider the following overload set: ```d void fun(int); // A @@ -255,17 +277,17 @@ No change to existing behaviour is proposed. ## Default arguments -In satisfying the statement above "The user should not experience edge cases, or differences in functionality...", it should be that default args are applicable to ref args as with non-ref args. +In satisfying the statement above "The user should not experience edge cases, or differences in functionality...", it should be that default args are applicable to `ref` args as with non-`ref` args. If the user does not supply an argument and a default arg is specified, the default arg is selected as usual and populates a temporary, just as if the user supplied a literal manually. -In this case, an interesting circumstantial opportunity appears where the compiler may discern that construction is expensive, and construct a single immutable instance for reuse. +In this case, an interesting circumstantial opportunity appears where the compiler may discern that construction is expensive, and construct a single immutable instance for reuse. This shall not be specified functionality, but it may be a nice opportunity nonetheless. ## `@safe`ty implications -There are no implications on `@safe`ty. There are no additions or changes to allocation or parameter passing schemes. -D already states that arguments received by ref shall not escape, so passing temporaries is not dangerous from an escaping/dangling-reference point of view. +There are no implications on `@safe`ty. 
There are no additions or changes to allocation or parameter passing schemes. +D already states that arguments received by `ref` shall not escape, so passing temporaries is not dangerous from an escaping/dangling-reference point of view. The user is able to produce the implicit temporary described in this proposal manually, and pass it with identical semantics; any potential safety implications are already applicable to normal stack args. This proposal adds nothing new. @@ -293,11 +315,11 @@ Pipeline programming expressions often begin with a range returned from a functi Generic programming is one of D's biggest success stories, and tends to work best when interfaces and expressions are orthogonal and with as few as possible edge-cases. Certain forms of meta find that `ref`-ness is a key edge-case which requires special case handling and may often lead to brittle generic code to be discovered by an unhappy niche user at some future time. -Another high-impact case is OOP, where virtual function APIs inhibit the use of templates (ie, auto ref). +Another high-impact case is OOP, where virtual function APIs inhibit the use of templates (ie, `auto ref`). By comparison, C++ has a very high prevalence of `const&` args and classes with virtual functions, and when interfacing with C++, those functions are mirrored to D. The issue addressed in this DIP becomes magnified significantly to this set of users. This DIP reduces inconvenience when interacting with C++ API's. -This issue is also likely to appear more frequently for vendors with tight ABI requirements. +This issue is also likely to appear more frequently for vendors with tight ABI requirements. Users of closed-source libraries distributed as binary libs, or libraries distributes as DLLs are more likely to encounter these challenges interacting with those APIs as well. 
## Copyright & License @@ -319,5 +341,5 @@ A few examples of typical C++ APIs that exhibit this issue: - DirectX (Microsoft): https://github.com/Microsoft/DirectXMath/blob/master/Inc/DirectXCollision.h - Bullet (Physics Lib): http://bulletphysics.org/Bullet/BulletFull/classbtBroadphaseInterface.html -In these examples of very typical C++ code, you can see a large number of functions receive arguments by reference. +In these examples of very typical C++ code, you can see a large number of functions receive arguments by reference. Complex objects are likely to be fetched via getters/properties. Simple objects like math vectors/matrices are likely to be called with literals, properties, or expressions. \ No newline at end of file From 7b1974f641b6cc891df62f27e111283edd968a8d Mon Sep 17 00:00:00 2001 From: Manu Evans Date: Mon, 21 May 2018 20:38:42 -0700 Subject: [PATCH 5/9] Added lvalue-restrictive concept. --- DIPs/DIP1xxx-rval_to_ref.md | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/DIPs/DIP1xxx-rval_to_ref.md b/DIPs/DIP1xxx-rval_to_ref.md index cc672374d..f867a5732 100644 --- a/DIPs/DIP1xxx-rval_to_ref.md +++ b/DIPs/DIP1xxx-rval_to_ref.md @@ -262,7 +262,19 @@ fun(const int(10)); // rvalue; choose B fun(t); // lvalue; choose C fun(u); // lvalue; choose D ``` -This follows existing language rules. No change is proposed here. +This follows existing language rules, with one notable change; the text "rvalues should *prefer* by-value functions" allows that rvalues *may* now choose a by-ref function if no by-val overload is present, where it was previously a compile error. + +It has been noted that it is possible to perceive the current usage of `ref` in lieu of a by-val overload as an 'lvalues-only restriction', which may be useful in some constructions.
That functionality can be preserved using a `@disable` mechanic: +```d +void lval_only(int x) @disable; +void lval_only(ref int x); + +int x = 10; +lval_only(x); // ok: choose by-ref +lval_only(10); // error: literal matches by-val, which is @disabled +``` +It may be considered an advantage that using this construction, the intent to restrict the argument in this way is made explicit. +The symmetrical 'rvalue-only restriction' is also possible to express in the same way. Overloading with `auto ref` preserves existing rules, which is to emit an ambiguous call when it collides with an explicit overload: ```d From 9b9a3554cdfada79a3f700ee0c402e62b6995948 Mon Sep 17 00:00:00 2001 From: Manu Evans Date: Sat, 16 Jun 2018 01:14:17 -0700 Subject: [PATCH 6/9] Dun a fair refactor addressing most criticism. --- DIPs/DIP1xxx-rval_to_ref.md | 104 ++++++++++++++++++++---------------- 1 file changed, 57 insertions(+), 47 deletions(-) diff --git a/DIPs/DIP1xxx-rval_to_ref.md b/DIPs/DIP1xxx-rval_to_ref.md index f867a5732..b40f00435 100644 --- a/DIPs/DIP1xxx-rval_to_ref.md +++ b/DIPs/DIP1xxx-rval_to_ref.md @@ -9,22 +9,16 @@ ## Abstract -A recurring complaint from users when interacting with functions that receive arguments by `ref` is that given an rvalue as argument, the compiler is unable to create an implicit temporary to perform the function call, and presents the user with a compile error instead. -This situation leads to a workaround where function parameters must be manually assigned to temporaries prior to the function call, which many users find frustrating. +Functions that receive arguments by `ref` do not accept rvalues. -A further issue is that because this situation require special-case handling, this may introduce semantic edge-cases and necessitate undesirable compile-time logic invading the users code, particularly into generic code. 
+This leads to edge-cases in calling code with respect to parameter passing semantics, requiring an assortment of workarounds and user-intervention which may be frustrating, and pollute _client-side_ code clarity. -`ref` args are not as common in conventional idiomatic D as they are in some other languages, but they exist and appear frequently in niche circumstances. As such, this issue is likely to disproportionately affect subsets of users who find themselves using `ref` arguments more than average. - -The choice to receive an argument by value or by reference is a detail that the API *author* selects with respect to criteria relevant to their project or domain, however, the semantic impact is not worn by the API author, but rather by the API user, who may be required to jump through hurdles to interact the API with their local code. -It would be ideal if the decision to receive arguments by value or by reference were a detail for the API, and not increase the complexity of the users code. - -Here is proposed a strategy to emit implicit temporaries to conveniently interact with APIs that use `ref` arguments. +Here is proposed a strategy to emit implicit temporaries to conveniently and uniformly interact with APIs that use `ref` arguments. ## Reference There is a lot of prior discussion on this topic. Much is out of date now due to recent language evolution. -Prior discussion involving `scope`, and `auto ref` as solutions are out of date; We have implemented `scope`, `auto ref`, and we also have `return ref` now, which affects the conversation. +Prior discussion involving `scope`, and `auto ref` as solutions are out of date; We have implemented `scope`, `auto ref`, and we also have `return ref` now, which affects prior conversation. 
Forum threads: https://forum.dlang.org/thread/mailman.3720.1453131378.22025.digitalmars-d@puremagic.com @@ -52,7 +46,7 @@ https://github.com/dlang/dmd/pull/4717 ## Contents * [Rationale](#rationale) -* [Proposal](#proposal) +* [Description](#description) * [Temporary destruction]() * [`@safe`ty implications](#safety_implications) * [Why not `auto ref`?](#auto_ref) @@ -61,24 +55,19 @@ https://github.com/dlang/dmd/pull/4717 ## Rationale -When calling functions that receive `ref` args, D prohibits supplying rvalues. It is suggested that this is to assist the author identifying likely logic errors where an rvalue will expire at the end of the statement, and it doesn't make much sense for a function to mutate a temporary whose life will not extend beyond the function call in question. - -However, many functions receive arguments by reference, and this may be for a variety of reasons. -One common reason is that the cost of copying large structs via the parameter list is expensive, so struct parameters may be received by reference to mitigate this cost. -Another common case is that the function may want to mutate the caller's data directly or return data via `out` parameters due to ABI limitations regarding multiple return values. This is the potential error case that the existing design attempts to mitigate, but in D, pipeline programming is vary popular, and contrary to conventional wisdom where the statement is likely to end at the end of the function call, pipeline expressions may result in single statements performing a lot of work, mutating state as it passes down the pipeline. +Functions may receive arguments by reference, and this may be for a variety of reasons. +One common reason is that the cost of copying or moving large structs via the parameter list is expensive, so struct parameters may be received by reference to mitigate this cost. 
+Another is that the function may want to mutate the caller's data directly, or return data via `out` parameters due to ABI limitations regarding multiple return values. -A related issue is with relation to generic code which reflects or received a function by alias. Such generic code may want to call that function, but it is often the case that details about the ref-ness of arguments lead to incorrect semantic expressions in the generic code depending on the arguments, necessitating additional compile-time logic to identify the ref-ness of function arguments and implement appropriate workarounds on these conditions. This leads to longer, more brittle, and less-maintainable generic code. It is also much harder to write correctly the first time, and such issues may only emerge in niche use cases at a later time. - -With these cases in mind, the existing rule feels out-dated or inappropriate, and the presence of the rule may often lead to aggravation while trying to write simple, readable code. -Calling a function should be simple and orthogonal, generic code should not have to concern itself with details about ref-ness of function parameters, and users should not be required to jump through hoops when `ref` appears in API's they encounter. +A recurring complaint from users when interacting with functions that receive arguments by `ref` is that given an rvalue as argument, the compiler is unable to create an implicit temporary to perform the function call, and presents the user with a compile error instead, invoking the necessity for manual workarounds. 
Consider the example: ```d void fun(int x); -fun(10); // <-- this is how simple calling a function should be +fun(10); // <-- this is how users expect to call a function ``` -But when `ref` is involved: +But when `ref` is present: ```d void fun(ref int x); @@ -89,20 +78,24 @@ Necessitating the workaround: int temp = 10; fun(temp); ``` -In practise, the argument would likely be some larger struct type rather than 'int', but the inconvenience applies generally. +In practise, the argument is likely a struct type rather than 'int', but the inconvenience applies generally. -This inconvenience also extends more broadly to cases including: +This inconvenience extends broadly to every manner of thing you pass to functions with the exception of lvalue instances, including: ```d fun(10); // literals fun(gun()); // return values from functions fun(x.prop); // properties fun(x + y); // expressions fun(my_short); // implicit type conversions (ie, short->int promotion) -// etc... (basically, most things you pass to functions) +// etc. ``` -The work-around can bloat the number of lines around the call-site significantly, and the user needs to declare names for all the temporaries, polluting the local namespace, and often for expressions where no meaningful name exists, leading to. +The work-around bloats the number of lines around the call-site, and the user needs to declare names for all the temporaries, polluting the local namespace, and often for expressions where no meaningful name exists. + +A further issue is that because these situations require special-case handling, they necessitate undesirable and potentially complex compile-time logic being added _prospectively_ to generic code. -The generic case may appear in a form like this: +An example may be some meta that reflects or receives a function by alias. 
Such code may want to call that function, but it is often the case that details about the ref-ness of arguments change the way arguments must be supplied, requiring additional compile-time logic to identify the ref-ness of function arguments and implement appropriate action for each case. + +The generic case may appear in a form such as: ```d void someMeta(alias userFun)() { @@ -115,7 +108,7 @@ void gun(ref const(int) x); unittest { someMeta!(fun)(); // no problem - someMeta!(gun)(); // oh no, can't receive rvalue! + someMeta!(gun)(); // error: not an lvalue! } ``` Necessitating a workaround that may look like: @@ -134,11 +127,28 @@ void someMeta(alias userFun)() } } ``` -This example situation is simplified, but it is often that such issues appear in complex aggregate meta, which may be difficult to understand, or the issue is caused indirectly at some layer the user did not author. +This example situation is simplified. In practise, such issues are often exposed when composing functionality, where a dependent library author did not correctly support `ref` functions. In that case, the end-user will experience the problem, but it may be difficult to diagnose or understand that the problem is not their direct fault. + +In general, these work-arounds damage readability, maintainability, and brevity. They make authoring correct code more difficult, increase the probability of brittle meta, and correct code is frustrating to implement repeatedly. -These work-arounds damage readability and brevity, they make authoring correct code more difficult, increase the probability of brittle meta, and it's frustrating to implement repeatedly. +Importantly, it is not intuitive to library authors that they should need to handle these cases; those who don't specifically test for `ref` are at high risk of failing to implement the required machinery, leaving the library _user_ in the position of discovering, and dancing around, potential unintended edge-cases.
+ +It is worth noting that `ref` args are not so common in conventional idiomatic D, but they appear frequently in niche circumstances. As such, this issue is likely to disproportionately affect subsets of users who find themselves using `ref` arguments more than average. + +### Why are we here? + +It is suggested that the reason this limitation exists is to assist with identifying +a class of bug where a function returns state by mutating an argument, but the programmer _accidentally_ passes an expiring rvalue, the function results are discarded, and the statement has no effect. + +With the introduction of `return ref`, it is potentially possible that a supplied rvalue may be mutated and returned to propagate its effect. + +Modern D has firmly embraced pipeline programming. With this evolution, statements are often constructed by chaining function calls, so the presumption that the statement ends with the function is no longer reliable. + +This DIP proposes that we reconsider this limitation. The choice to receive an argument by value or by reference is a detail that the API *author* selects with respect to criteria relevant to their project or domain. Currently the semantic impact is not worn by the API author, but rather by the API user, who may be required to jump through hurdles to interface the API with their local code. + +It would be ideal if the decision to receive arguments by value or by reference were a detail for the API, and not increase the complexity of the user's code. -## Proposal +## Description Calls with `ref T` arguments supplied with rvalues are effectively rewritten to emit a temporary automatically, for example: ```d @@ -173,11 +183,11 @@ Again, where initial construction of `result` should be performed at the moment It is important that `T` be defined as the argument type, and not `auto`, because it will allow implicit conversions to occur naturally, with identical behaviour as when argument was not `ref`.
The user should not experience edge cases, or differences in functionality when calling `fun(int x)` vs `fun(ref int x)`. -## Temporary destruction +### Temporary destruction Destruction of any temporaries occurs naturally at the end of the introduced scope. -## Function calls as arguments +### Function calls as arguments It is important to note that a single scope is introduced to enclose the entire statement. The pattern should not cascade when nested calls exist in the parameter list within a single statement. For calls that contain cascading function calls, ie: @@ -197,7 +207,7 @@ This correct expansion is: } ``` -## Interaction with `return ref` +### Interaction with `return ref` Given the expansion shown above for cascading function calls, `return ref` works naturally, exactly as the user expects. The key is that the scope encloses the entire statement, and all temporaries live for the length of the entire statement. @@ -240,12 +250,12 @@ There are classes of range where the source range should be mutated through the It is unfortunate that `ref` adds friction to one of D's greatest programming paradigms this way. -## Interaction with other attributes +### Interaction with other attributes Interactions with other attributes should follow all existing rules. Any code that wouldn't compile in the event the user were to perform the proposed rewrites manually will fail in the same way, emitting the same error messages the user would expect. -## Overload resolution +### Overload resolution In the interest of preserving optimal calling efficiency, existing language rules continue to apply; lvalues should prefer by-ref functions, and rvalues should prefer by-value functions. Consider the following overload set: @@ -287,23 +297,23 @@ fun(t); // error: ambiguous call between A and B ``` No change to existing behaviour is proposed. 
-## Default arguments +### Default arguments -In satisfying the statement above "The user should not experience edge cases, or differences in functionality...", it should be that default args are applicable to `ref` args as with non-`ref` args. +In satisfying the goal that 'the user should not experience edge cases, or differences in functionality', it should be that default args are applicable to `ref` args as with non-`ref` args. -If the user does not supply an argument and a default arg is specified, the default arg is selected as usual and populates a temporary, just as if the user supplied a literal manually. +If the user does not supply an argument and a default arg is specified, the default arg is selected as usual and populates a temporary, just as if the user supplied the argument manually. In this case, an interesting circumstantial opportunity appears where the compiler may discern that construction is expensive, and construct a single immutable instance for reuse. This shall not be specified functionality, but it may be a nice opportunity nonetheless. -## `@safe`ty implications +### `@safe`ty implications There are no implications on `@safe`ty. There are no additions or changes to allocation or parameter passing schemes. D already states that arguments received by `ref` shall not escape, so passing temporaries is not dangerous from an escaping/dangling-reference point of view. The user is able to produce the implicit temporary described in this proposal manually, and pass it with identical semantics; any potential safety implications are already applicable to normal stack args. This proposal adds nothing new. -## Why not `auto ref`? +### Why not `auto ref`? A frequently proposed solution to this situation is to receive the arg via `auto ref`. @@ -317,22 +327,22 @@ There are many reasons why every function can't or shouldn't be a template. 5. Is extern(C++) 6. Intent to capture function pointers or delegates 7. 
Has many args; unreasonable combinatorial explosion -8. Is larger-than-inline scale; engineer assesses that reducing instantiation bloat has greater priority than maximising parameter passing efficiency in some cases +8. Is larger-than-inline scale; engineer assesses that reducing instantiation bloat has greater priority than maximising parameter passing efficiency in select cases Any (or many) of these reasons may apply, eliminating `auto ref` from the solution space. -## Key use cases +### Key use cases -Pipeline programming expressions often begin with a range returned from a function. There are constructs where transform functions need to take the range by reference. Such cases currently break the pipeline and introduce a temporary. This proposal improves the pipeline-programming experience. +Pipeline programming expressions often begin with a range returned from a function (an rvalue). Transform functions may receive their argument by reference. Such cases currently break the pipeline and introduce a manual temporary. This proposal improves the pipeline-programming experience. -Generic programming is one of D's biggest success stories, and tends to work best when interfaces and expressions are orthogonal and with as few as possible edge-cases. Certain forms of meta find that `ref`-ness is a key edge-case which requires special case handling and may often lead to brittle generic code to be discovered by an unhappy niche user at some future time. +Generic programming is one of D's biggest success stories, and tends to work best when interfaces and expressions are orthogonal and with as few as possible edge-cases. Certain forms of meta find that `ref`-ness is a key edge-case which requires special case handling and may often lead to brittle generic code to be discovered by a niche end-user at some future time. Another high-impact case is OOP, where virtual function APIs inhibit the use of templates (ie, `auto ref`). 
-By comparison, C++ has a very high prevalence of `const&` args and classes with virtual functions, and when interfacing with C++, those functions are mirrored to D. The issue addressed in this DIP becomes magnified significantly to this set of users. This DIP reduces inconvenience when interacting with C++ API's. +By comparison, C++ has a very high prevalence of `const&` args, classes with virtual functions, and default args supplied to ref. When interfacing with C++, those functions are mirrored to D. The issue addressed in this DIP becomes magnified significantly to this set of users. C++ interaction is a key initiative, this DIP reduces inconvenience when interacting with C++ API's, and improves the surface area we are able to express. This issue is also likely to appear more frequently for vendors with tight ABI requirements. -Users of closed-source libraries distributed as binary libs, or libraries distributes as DLLs are more likely to encounter these challenges interacting with those APIs as well. +Lack of templates at ABI boundary lead to users of closed-source libraries distributed as binary or DLLs being more likely to encounter challenges interacting with such APIs. ## Copyright & License From 31105c32a6eb4241fca852f2ba5f371d0f8789bf Mon Sep 17 00:00:00 2001 From: Manu Evans Date: Sat, 16 Jun 2018 01:42:05 -0700 Subject: [PATCH 7/9] Added a comment on the `const` discussion. --- DIPs/DIP1xxx-rval_to_ref.md | 32 ++++++++++++++++++-------------- 1 file changed, 18 insertions(+), 14 deletions(-) diff --git a/DIPs/DIP1xxx-rval_to_ref.md b/DIPs/DIP1xxx-rval_to_ref.md index b40f00435..686a5ecae 100644 --- a/DIPs/DIP1xxx-rval_to_ref.md +++ b/DIPs/DIP1xxx-rval_to_ref.md @@ -20,6 +20,11 @@ Here is proposed a strategy to emit implicit temporaries to conveniently and uni There is a lot of prior discussion on this topic. Much is out of date now due to recent language evolution. 
Prior discussion involving `scope`, and `auto ref` as solutions are out of date; We have implemented `scope`, `auto ref`, and we also have `return ref` now, which affects prior conversation. +Issues: +https://issues.dlang.org/show_bug.cgi?id=6221 - comparison asymmetry +https://issues.dlang.org/show_bug.cgi?id=8845 - friction calling with expressions +https://issues.dlang.org/show_bug.cgi?id=9238 - competing ideas + Forum threads: https://forum.dlang.org/thread/mailman.3720.1453131378.22025.digitalmars-d@puremagic.com https://forum.dlang.org/thread/rehsmhmeexpusjwkfnoy@forum.dlang.org @@ -35,15 +40,6 @@ https://forum.dlang.org/thread/tkdloxqhtptpifkhvxjh@forum.dlang.org https://forum.dlang.org/thread/mailman.1478.1521842510.3374.digitalmars-d@puremagic.com https://forum.dlang.org/thread/gsdkqnbljuwssslxuglf@forum.dlang.org -Issues: -https://issues.dlang.org/show_bug.cgi?id=9238 -https://issues.dlang.org/show_bug.cgi?id=8845 -https://issues.dlang.org/show_bug.cgi?id=6221 -https://issues.dlang.org/show_bug.cgi?id=6442 - -PRs: -https://github.com/dlang/dmd/pull/4717 - ## Contents * [Rationale](#rationale) * [Description](#description) @@ -140,11 +136,11 @@ It is worth noting that `ref` args are not so common in conventional idiomatic D It is suggested that the reason this limitation exists is to assist with identifying a class of bug where a function returns state by mutating an argument, but the programmer _accidentally_ passes an expiring rvalue, the function results are discarded, and statement has no effect. -With the introduction of `return ref`, it is potentially possible that a supplied rvalue may by mutated and returned to propagate its affect. +With the introduction of `return ref`, it is potentially possible that a supplied rvalue may by mutated and returned to propagate its effect. Modern D has firmly embraced pipeline programming. 
With this evolution, statements are often constructed by chaining function calls, so the presumption that the statement ends with the function is no longer reliable. -This DIP proposes that we reconsider the choice to receive an argument by value or by reference is a detail that the API *author* selects with respect to criteria relevant to their project or domain. Currently the semantic impact is not worn by the API author, but rather by the API user, who may be required to jump through hurdles to interface the API with their local code. +This DIP proposes that we reconsider the choice to receive an argument by value or by reference is a detail that the API *author* selects with respect to criteria relevant to their project or domain. Currently the semantic impact is not worn by the API author, but rather by the API user, who may be required to jump hurdles to interface the API with their local code. It would be ideal if the decision to receive arguments by value or by reference were a detail for the API, and not increase the complexity of the users code. @@ -192,8 +188,8 @@ Destruction of any temporaries occurs naturally at the end of the introduced sco It is important to note that a single scope is introduced to enclose the entire statement. The pattern should not cascade when nested calls exist in the parameter list within a single statement. For calls that contain cascading function calls, ie: ```d -void fun(ref const(int) x, ref const(int) y); -int gun(ref const(int) x); +void fun(ref int x, ref int y); +int gun(ref int x); fun(10, gun(20)); ``` @@ -240,7 +236,7 @@ But if the transform receives a range by `ref`, the pipeline syntax breaks down: MyRange makeRange(); ref MyRange mutatingTransform(return ref MyRange r, int x); -auto results = makeRange().transform(10).array; // error, not an lvalue! +auto results = makeRange().mutatingTransform(10).array; // error, not an lvalue! 
// necessitate workaround: auto tempRange = makeRange(); // satisfy the compiler @@ -331,6 +327,14 @@ There are many reasons why every function can't or shouldn't be a template. Any (or many) of these reasons may apply, eliminating `auto ref` from the solution space. +### Why not `const` like C++? + +It was debated whether this proposal should apply to `ref const` like C++, or to `ref` more generally. + +Satisfying arguments in favour of extending the pattern beyond `const` include: +* D has `return ref`, and a tendency towards pipeline programming, making mutation of `ref` rvalue arguments a valid and valuable pattern. +* D's `const` is more restrictive than C++, and limits wide application. + ### Key use cases Pipeline programming expressions often begin with a range returned from a function (an rvalue). Transform functions may receive their argument by reference. Such cases currently break the pipeline and introduce a manual temporary. This proposal improves the pipeline-programming experience. From ee8d8fb96adc8ec55cc336a1a197f83ae74747bf Mon Sep 17 00:00:00 2001 From: Michael Parker Date: Mon, 2 Jul 2018 14:04:47 +0900 Subject: [PATCH 8/9] Copy edit pass. --- DIPs/DIP1xxx-rval_to_ref.md | 153 ++++++++++++++++-------------------- 1 file changed, 68 insertions(+), 85 deletions(-) diff --git a/DIPs/DIP1xxx-rval_to_ref.md b/DIPs/DIP1xxx-rval_to_ref.md index 686a5ecae..f5968b6c9 100644 --- a/DIPs/DIP1xxx-rval_to_ref.md +++ b/DIPs/DIP1xxx-rval_to_ref.md @@ -9,53 +9,49 @@ ## Abstract -Functions that receive arguments by `ref` do not accept rvalues. +Functions that receive arguments by reference in the form of `ref` parameters do not accept rvalues. This leads to edge cases in calling code with respect to argument passing semantics, requiring an assortment of workarounds which pollute client-side code clarity, e.g. rvalues must first be manually saved in a temporary local variable before being passed as an argument to a function that expects a `ref` parameter. 
-This leads to edge-cases in calling code with respect to parameter passing semantics, requiring an assortment of workarounds and user-intervention which may be frustrating, and pollute _client-side_ code clarity. - -Here is proposed a strategy to emit implicit temporaries to conveniently and uniformly interact with APIs that use `ref` arguments. +Here is proposed a strategy to emit implicit temporaries when rvalues are bound to `ref` parameters, enabling client code to conveniently and uniformly interact with APIs that use `ref` parameters. ## Reference -There is a lot of prior discussion on this topic. Much is out of date now due to recent language evolution. -Prior discussion involving `scope`, and `auto ref` as solutions are out of date; We have implemented `scope`, `auto ref`, and we also have `return ref` now, which affects prior conversation. - -Issues: -https://issues.dlang.org/show_bug.cgi?id=6221 - comparison asymmetry -https://issues.dlang.org/show_bug.cgi?id=8845 - friction calling with expressions -https://issues.dlang.org/show_bug.cgi?id=9238 - competing ideas - -Forum threads: -https://forum.dlang.org/thread/mailman.3720.1453131378.22025.digitalmars-d@puremagic.com -https://forum.dlang.org/thread/rehsmhmeexpusjwkfnoy@forum.dlang.org -https://forum.dlang.org/thread/mailman.577.1410180586.5783.digitalmars-d@puremagic.com -https://forum.dlang.org/thread/km4rtm$239e$1@digitalmars.com -https://forum.dlang.org/thread/kl4v8r$tkc$1@digitalmars.com -https://forum.dlang.org/thread/ylebrhjnrrcajnvtthtt@forum.dlang.org -https://forum.dlang.org/thread/ntsyfhesnywfxvzbemwc@forum.dlang.org -https://forum.dlang.org/thread/uswucstsooghescofycp@forum.dlang.org -https://forum.dlang.org/thread/zteryxwxyngvyqvukqkm@forum.dlang.org -https://forum.dlang.org/thread/yhnbcocwxnbutylfeoxi@forum.dlang.org -https://forum.dlang.org/thread/tkdloxqhtptpifkhvxjh@forum.dlang.org -https://forum.dlang.org/thread/mailman.1478.1521842510.3374.digitalmars-d@puremagic.com 
-https://forum.dlang.org/thread/gsdkqnbljuwssslxuglf@forum.dlang.org +There is a significant amount of prior discussion on this topic. Much is out of date due to recent language evolution. Prior discussion involving `scope` and `auto ref` as solutions are out of date as both `scope` and `auto ref` have been implemented. `return ref`, which was raised in prior discussion, has also been implemented. + +Issues: +* [#6221](https://issues.dlang.org/show_bug.cgi?id=6221) - comparison asymmetry +* [#8845](https://issues.dlang.org/show_bug.cgi?id=8845) - friction calling with expressions +* [#9238](https://issues.dlang.org/show_bug.cgi?id=9238) - competing ideas + +Forum threads: +* [rval->ref const(T), implicit conversions](https://forum.dlang.org/thread/mailman.3720.1453131378.22025.digitalmars-d@puremagic.com) +* [rvalue references (2015)](https://forum.dlang.org/thread/rehsmhmeexpusjwkfnoy@forum.dlang.org) +* [rvalues->ref args](https://forum.dlang.org/thread/mailman.577.1410180586.5783.digitalmars-d@puremagic.com) +* [The liabilities of binding rvalues to ref](https://forum.dlang.org/thread/km4rtm$239e$1@digitalmars.com) +* [rvalue references (2013)](https://forum.dlang.org/thread/kl4v8r$tkc$1@digitalmars.com) +* [DIP 36: Rvalue References](https://forum.dlang.org/thread/ylebrhjnrrcajnvtthtt@forum.dlang.org) +* [My thoughts & tries with rvalue references](https://forum.dlang.org/thread/ntsyfhesnywfxvzbemwc@forum.dlang.org) +* [Question about auto ref](https://forum.dlang.org/thread/uswucstsooghescofycp@forum.dlang.org) +* [Settling rvalue to (const) ref parameter binding once and for all](https://forum.dlang.org/thread/zteryxwxyngvyqvukqkm@forum.dlang.org) +* [Const ref and rvalues again...](https://forum.dlang.org/thread/yhnbcocwxnbutylfeoxi@forum.dlang.org) +* [Rvalue references (2018)](https://forum.dlang.org/thread/tkdloxqhtptpifkhvxjh@forum.dlang.org) +* [rvalues -> ref (yup... 
again!)](https://forum.dlang.org/thread/mailman.1478.1521842510.3374.digitalmars-d@puremagic.com) +* [Parsing a string from stdin using formattedRead](https://forum.dlang.org/thread/gsdkqnbljuwssslxuglf@forum.dlang.org) ## Contents * [Rationale](#rationale) * [Description](#description) -* [Temporary destruction]() -* [`@safe`ty implications](#safety_implications) -* [Why not `auto ref`?](#auto_ref) -* [Key use cases](#use_cases) +* [Temporary destruction](#temporary-destruction) +* [`@safe`ty implications](#safety-implications) +* [Why not `auto ref`?](#auto-ref) +* [Key use cases](#use-cases) * [Reviews](#reviews) ## Rationale -Functions may receive arguments by reference, and this may be for a variety of reasons. -One common reason is that the cost of copying or moving large structs via the parameter list is expensive, so struct parameters may be received by reference to mitigate this cost. +Functions may receive arguments by reference for a variety of reasons. One common reason is that the cost of copying or moving large structs via the parameter list is expensive, so struct parameters may be received by reference to mitigate this cost. Another is that the function may want to mutate the caller's data directly, or return data via `out` parameters due to ABI limitations regarding multiple return values. -A recurring complaint from users when interacting with functions that receive arguments by `ref` is that given an rvalue as argument, the compiler is unable to create an implicit temporary to perform the function call, and presents the user with a compile error instead, invoking the necessity for manual workarounds. +A recurring complaint from users when interacting with functions that receive arguments by `ref` is that, given an rvalue as the argument, the compiler is unable to create an implicit temporary to perform the function call, and instead presents the user with a compile error. This behavior instigates the necessity for manual workarounds. 
Consider the example: ```d @@ -74,9 +70,7 @@ Necessitating the workaround: int temp = 10; fun(temp); ``` -In practise, the argument is likely a struct type rather than 'int', but the inconvenience applies generally. - -This inconvenience extends broadly to every manner of thing you pass to functions with the exception of lvalue instances, including: +This inconvenience extends broadly to every manner of rvalue passed to functions, including: ```d fun(10); // literals fun(gun()); // return values from functions @@ -85,11 +79,11 @@ fun(x + y); // expressions fun(my_short); // implicit type conversions (ie, short->int promotion) // etc. ``` -The work-around bloats the number of lines around the call-site, and the user needs to declare names for all the temporaries, polluting the local namespace, and often for expressions where no meaningful name exists. +The workaround increases the number of lines around the call site and the user must declare names for all temporaries, polluting the local namespace. A further issue is that because these situations require special-case handling, they necessitate undesirable and potentially complex compile-time logic being added _prospectively_ to generic code. -An example may be some meta that reflects or receives a function by alias. Such code may want to call that function, but it is often the case that details about the ref-ness of arguments change the way arguments must be supplied, requiring additional compile-time logic to identify the ref-ness of function arguments and implement appropriate action for each case. +An example may be some meta that reflects or receives a function by alias. Such code may want to call that function, but it is often the case that details about the ref-ness of parameters change the way arguments must be supplied, requiring additional compile-time logic to identify the ref-ness of function parameters and implement appropriate action for each case. 
The generic case may appear in a form such as: ```d @@ -123,30 +117,29 @@ void someMeta(alias userFun)() } } ``` -This example situation is simplified. In practise, such issues are often exposed when composing functionality, where a dependent library author did not correctly support `ref` functions. In that case, the end-user will experience the problem, but it may be difficult to diagnose or understand that the problem is not their direct fault. +This example situation is simplified. In practice, such issues are often exposed when composing functionality, where a dependent library author did not correctly support `ref` parameters. In that case, the end-user will experience the problem, but it may be difficult to diagnose or understand that the fault is not their own. -In general, these work-arounds damage readability, maintainability, and brevity. They make authoring correct code more difficult, increase the probability of brittle meta, and correct code is frustrating to implement repeatedly. +In general, these workarounds damage readability, maintainability, and brevity. They make authoring correct code more difficult, increase the probability of brittle meta, and correct code is frustrating to implement repeatedly. -Importantly, it is not intuitive to library authors that they should need to handle these cases, those who don't specifically test for `ref` are at high risk of failing to implement the required machinery, leaving the library _user_ in the a position of discovering, and dancing around potential unintended edge-cases. +It is not intuitive to library authors that they should need to handle these cases; those who don't specifically test for `ref` are at high risk of failing to implement the required machinery, leaving the library _user_ in a position of discovering, and working around, potential unintended edge cases. -It is worth noting that `ref` args are not so common in conventional idiomatic D, but they appear frequently in niche circumstances. 
As such, this issue is likely to disproportionately affect subsets of users who find themselves using `ref` arguments more than average. +It is worth noting that `ref` args are not so common in conventional idiomatic D, but they appear frequently in niche circumstances. As such, this issue is likely to disproportionately affect subsets of users who find themselves using `ref` arguments more than average, e.g. those interacting with C++ libraries where such parameters are more common. ### Why are we here? -It is suggested that the reason this limitation exists is to assist with identifying -a class of bug where a function returns state by mutating an argument, but the programmer _accidentally_ passes an expiring rvalue, the function results are discarded, and statement has no effect. +It is suggested that the reason this limitation exists is to assist with identifying a class of bug that can occur when a function returns state by mutating an argument. When the programmer _accidentally_ passes an expiring rvalue, the function results are discarded and the statement has no effect. With the introduction of `return ref`, it is potentially possible that a supplied rvalue may by mutated and returned to propagate its effect. Modern D has firmly embraced pipeline programming. With this evolution, statements are often constructed by chaining function calls, so the presumption that the statement ends with the function is no longer reliable. -This DIP proposes that we reconsider the choice to receive an argument by value or by reference is a detail that the API *author* selects with respect to criteria relevant to their project or domain. Currently the semantic impact is not worn by the API author, but rather by the API user, who may be required to jump hurdles to interface the API with their local code. 
+This DIP proposes that we reconsider the choice to receive an argument by value or by reference as a detail that the API *author* selects with respect to criteria relevant to their project or domain. Currently the semantic impact is not borne by the API author, but rather by the API user, who may be hindered in interfacing the API with their local code. -It would be ideal if the decision to receive arguments by value or by reference were a detail for the API, and not increase the complexity of the users code. +It would be ideal if the decision to receive arguments by value or by reference were simply a detail of the API and not something that increases the complexity of user code. ## Description -Calls with `ref T` arguments supplied with rvalues are effectively rewritten to emit a temporary automatically, for example: +Calls with `ref T` parameters supplied with rvalue arguments are effectively rewritten to emit a temporary automatically, for example: ```d void fun(ref int x); @@ -163,7 +156,7 @@ Where `T` is the *function argument type*. To mitigate confusion, I have used `:=` in this example to express the initial construction, and not a copy operation as would be expected if this code were written with an `=` expression. -In the case where a function output initialises an variable: +In the case where a function output initializes a variable: ```d R result = fun(10); ``` @@ -175,9 +168,9 @@ R result = void; result := fun(__temp0 := 10); } ``` -Again, where initial construction of `result` should be performed at the moment of assignment, as usual and expected. +Again where initial construction of `result` should be performed at the moment of assignment, as usual and expected. -It is important that `T` be defined as the argument type, and not `auto`, because it will allow implicit conversions to occur naturally, with identical behaviour as when argument was not `ref`. 
The user should not experience edge cases, or differences in functionality when calling `fun(int x)` vs `fun(ref int x)`. +It is important that `T` be defined as the argument type, and not `auto`, because it will allow implicit conversions to occur naturally, with identical behavior as when the parameter is not `ref`. The user should not experience edge cases, or differences in functionality when calling `fun(int x)` vs `fun(ref int x)`. ### Temporary destruction @@ -185,15 +178,14 @@ Destruction of any temporaries occurs naturally at the end of the introduced sco ### Function calls as arguments -It is important to note that a single scope is introduced to enclose the entire statement. The pattern should not cascade when nested calls exist in the parameter list within a single statement. -For calls that contain cascading function calls, ie: +It is important to note that a single scope is introduced to enclose the entire statement. The pattern should not cascade when nested calls exist in the parameter list within a single statement. For calls that contain cascading function calls, i.e.: ```d void fun(ref int x, ref int y); int gun(ref int x); fun(10, gun(20)); ``` -This correct expansion is: +The correct expansion is: ```d { int __fun_temp0 = void; @@ -214,7 +206,7 @@ ref int gun(return ref int y); fun(gun(10)); ``` -This correct expansion is: +The correct expansion is: ```d { int __gun_temp0 = void; @@ -223,15 +215,14 @@ This correct expansion is: ``` The lifetime of `__gun_temp0` is satisfactory for any conceivable calling construction. -This is particularly useful when pipeline programming. -It is common that functions are invoked which create and return a range which is then used in a pipeline operation: +This is particularly useful in pipeline programming. 
It is common that functions are invoked to create and return a range which is then used in a pipeline operation: ```d MyRange makeRange(); MyRange transform(MyRange r, int x); auto results = makeRange().transform(10).array; ``` -But if the transform receives a range by `ref`, the pipeline syntax breaks down: +If the transform receives a range by `ref`, the pipeline syntax breaks down: ```d MyRange makeRange(); ref MyRange mutatingTransform(return ref MyRange r, int x); @@ -248,12 +239,11 @@ It is unfortunate that `ref` adds friction to one of D's greatest programming pa ### Interaction with other attributes -Interactions with other attributes should follow all existing rules. -Any code that wouldn't compile in the event the user were to perform the proposed rewrites manually will fail in the same way, emitting the same error messages the user would expect. +Interactions with other attributes should follow all existing rules. Any code that wouldn't compile if the user were to perform the proposed rewrites manually will fail in the same way, emitting the same error messages the user would expect. ### Overload resolution -In the interest of preserving optimal calling efficiency, existing language rules continue to apply; lvalues should prefer by-ref functions, and rvalues should prefer by-value functions. +In the interest of preserving optimal calling efficiency, existing language rules continue to apply; lvalues should prefer by-ref functions, and rvalues should prefer by-value functions. Consider the following overload set: ```d void fun(int); // A @@ -279,8 +269,7 @@ int x = 10; lval_only(x); // ok: choose by-ref lval_only(10); // error: literal matches by-val, which is @disabled ``` -It may be considered an advantage that using this construction, the intent to restrict the argument in this way is made explicit. -The symmetrical 'rvalue-only restriction' is also possible to express in the same way. 
+It may be considered an advantage that by using this construction, the intent to restrict the argument in this way is made explicit. The symmetrical 'rvalue-only restriction' is also possible to express in the same way. Overloading with `auto ref` preserves existing rules, which is to emit an ambiguous call when it collides with an explicit overload: ```d @@ -291,27 +280,23 @@ int t = 10; fun(10); // chooses B: auto ref resolves by-value given an rvalue, prefer exact match as above fun(t); // error: ambiguous call between A and B ``` -No change to existing behaviour is proposed. +No change to existing behavior is proposed. ### Default arguments -In satisfying the goal that 'the user should not experience edge cases, or differences in functionality', it should be that default args are applicable to `ref` args as with non-`ref` args. - -If the user does not supply an argument and a default arg is specified, the default arg is selected as usual and populates a temporary, just as if the user supplied the argument manually. +In satisfying the goal that 'the user should not experience edge cases, or differences in functionality', it should be the case that default arguments are applicable to `ref` parameters as with non-`ref` parameters. If the user does not supply an argument and a default argument is specified, the default argument is selected as usual and populates a temporary, just as if the user supplied the argument manually. -In this case, an interesting circumstantial opportunity appears where the compiler may discern that construction is expensive, and construct a single immutable instance for reuse. -This shall not be specified functionality, but it may be a nice opportunity nonetheless. +In this case, an interesting circumstantial opportunity appears where the compiler may discern that construction is expensive, and construct a single immutable instance for reuse. This shall not be specified functionality, but it may be a nice opportunity nonetheless. 

 ### `@safe`ty implications

-There are no implications on `@safe`ty. There are no additions or changes to allocation or parameter passing schemes.
-D already states that arguments received by `ref` shall not escape, so passing temporaries is not dangerous from an escaping/dangling-reference point of view.
+There are no implications from this proposal for `@safe`ty. There are no additions or changes to allocation or parameter passing schemes. D already states that arguments received by `ref` shall not escape, so passing temporaries is not dangerous from an escaping/dangling-reference point of view.

-The user is able to produce the implicit temporary described in this proposal manually, and pass it with identical semantics; any potential safety implications are already applicable to normal stack args. This proposal adds nothing new.
+The user is able to produce the implicit temporary described in this proposal manually, and pass it with identical semantics; any potential safety implications are already applicable to normal stack arguments. This proposal adds nothing new.

 ### Why not `auto ref`?

-A frequently proposed solution to this situation is to receive the arg via `auto ref`.
+A frequently proposed solution to this situation is to receive the argument via `auto ref`.
 `auto ref` solves a different set of problems; those may include "pass this argument in the most efficient way", or "forward this argument exactly how I received it".

 The implementation of `auto ref` requires that every function also be a template.
@@ -322,31 +307,30 @@ There are many reasons why every function can't or shouldn't be a template.
 4. Is virtual
 5. Is extern(C++)
 6. Intent to capture function pointers or delegates
-7. Has many args; unreasonable combinatorial explosion
-8. Is larger-than-inline scale; engineer assesses that reducing instantiation bloat has greater priority than maximising parameter passing efficiency in select cases
+7. Has many arguments; unreasonable combinatorial explosion
+8. Is larger-than-inline scale; engineer assesses that reducing instantiation bloat has greater priority than maximizing parameter passing efficiency in select cases

-Any (or many) of these reasons may apply, eliminating `auto ref` from the solution space.
+One or more of these reasons may apply, thereby eliminating `auto ref` from the solution space.

 ### Why not `const` like C++?

 It was debated whether this proposal should apply to `ref const` like C++, or to `ref` more generally.

-Satisfying arguments in favour of extending the pattern beyond `const` include:
+Satisfying arguments in favor of extending the pattern beyond `const` include:
 * D has `return ref`, and a tendency towards pipeline programming, making mutation of `ref` rvalue arguments a valid and valuable pattern.
-* D's `const` is more restrictive than C++, and limits wide application.
+* D's `const` is more restrictive than that of C++ and limits wide application.

 ### Key use cases

-Pipeline programming expressions often begin with a range returned from a function (an rvalue). Transform functions may receive their argument by reference. Such cases currently break the pipeline and introduce a manual temporary. This proposal improves the pipeline-programming experience.
+Pipeline programming expressions often begin with a range returned from a function (an rvalue). Transform functions may receive their argument by reference. Such cases currently break the pipeline and introduce a manual temporary. This proposal improves the pipeline programming experience.

-Generic programming is one of D's biggest success stories, and tends to work best when interfaces and expressions are orthogonal and with as few as possible edge-cases. Certain forms of meta find that `ref`-ness is a key edge-case which requires special case handling and may often lead to brittle generic code to be discovered by a niche end-user at some future time.
+Generic programming is one of D's biggest success stories and tends to work best when interfaces and expressions are orthogonal, with as few edge cases as possible . Certain forms of meta find that `ref`-ness is a key edge case which requires special case handling and may often lead to brittle generic code to be discovered by a niche end user at some future time.

 Another high-impact case is OOP, where virtual function APIs inhibit the use of templates (ie, `auto ref`).

-By comparison, C++ has a very high prevalence of `const&` args, classes with virtual functions, and default args supplied to ref. When interfacing with C++, those functions are mirrored to D. The issue addressed in this DIP becomes magnified significantly to this set of users. C++ interaction is a key initiative, this DIP reduces inconvenience when interacting with C++ API's, and improves the surface area we are able to express.
+By comparison, C++ has a very high prevalence of `const&` arguments, classes with virtual functions, and default arguments supplied to reference parameters. When interfacing with C++, those functions are mirrored to D. The issue addressed in this DIP becomes magnified significantly to this set of users. C++ interaction is a key initiative; this DIP reduces inconvenience when interacting with C++ APIs, and improves the surface area we are able to express.

-This issue is also likely to appear more frequently for vendors with tight ABI requirements.
-Lack of templates at ABI boundary lead to users of closed-source libraries distributed as binary or DLLs being more likely to encounter challenges interacting with such APIs.
+This issue is also likely to appear more frequently for vendors with tight ABI requirements. Lack of templates at ABI boundaries lead to users of closed-source libraries distributed as binaries being more likely to encounter challenges interacting with such APIs.

 ## Copyright & License

@@ -361,11 +345,10 @@ of the DIP process beyond the Draft Review.

 ## Appendix

-A few examples of typical C++ APIs that exhibit this issue:
+A few examples of typical C++ APIs that would exhibit the issues described in this proposal if they were in D:
 - PhysX (Nvidia): http://docs.nvidia.com/gameworks/content/gameworkslibrary/physx/apireference/files/classPxSceneQueryExt.html
 - NaCl (Google): https://developer.chrome.com/native-client/pepper_stable/cpp/classpp_1_1_instance
 - DirectX (Microsoft): https://github.com/Microsoft/DirectXMath/blob/master/Inc/DirectXCollision.h
 - Bullet (Physics Lib): http://bulletphysics.org/Bullet/BulletFull/classbtBroadphaseInterface.html

-In these examples of very typical C++ code, you can see a large number of functions receive arguments by reference.
-Complex objects are likely to be fetched via getters/properties. Simple objects like math vectors/matrices are likely to be called with literals, properties, or expressions.
\ No newline at end of file
+In these examples of very typical C++ code, a large number of functions receive arguments by reference. Complex objects are likely to be fetched via getters/properties. Simple objects like math vectors/matrices are likely to be called with literals, properties, or expressions.
\ No newline at end of file

From 28c0d8baf22cc23690ff6d68ea5a318e01b61237 Mon Sep 17 00:00:00 2001
From: Manu Evans
Date: Wed, 18 Jul 2018 18:34:10 -0700
Subject: [PATCH 9/9] Tweak

---
 DIPs/DIP1xxx-rval_to_ref.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/DIPs/DIP1xxx-rval_to_ref.md b/DIPs/DIP1xxx-rval_to_ref.md
index f5968b6c9..52b1ab369 100644
--- a/DIPs/DIP1xxx-rval_to_ref.md
+++ b/DIPs/DIP1xxx-rval_to_ref.md
@@ -51,13 +51,13 @@ Forum threads:
 Functions may receive arguments by reference for a variety of reasons. One common reason is that the cost of copying or moving large structs via the parameter list is expensive, so struct parameters may be received by reference to mitigate this cost.
 Another is that the function may want to mutate the caller's data directly, or return data via `out` parameters due to ABI limitations regarding multiple return values.

-A recurring complaint from users when interacting with functions that receive arguments by `ref` is that, given an rvalue as the argument, the compiler is unable to create an implicit temporary to perform the function call, and instead presents the user with a compile error. This behavior instigates the necessity for manual workarounds.
+A recurring complaint from users when interacting with functions that receive arguments by `ref` is that, given an rvalue as the argument, the compiler is unable to create an implicit temporary to perform the function call, and instead presents the user with a compile error. This behavior necessitates manual workarounds.

 Consider the example:
 ```d
 void fun(int x);

-fun(10); // <-- this is how users expect to call a function
+fun(10); // <-- this is how users expect to call a typical function
 ```
 But when `ref` is present:
 ```d
@@ -97,8 +97,8 @@ void gun(ref const(int) x);
 unittest
 {
-    someMeta!(fun)(); // no problem
-    someMeta!(gun)(); // error: not an lvalue!
+    someMeta!fun(); // no problem
+    someMeta!gun(); // error: not an lvalue!
 }
 ```
 Necessitating a workaround that may look like:
@@ -284,7 +284,7 @@ No change to existing behavior is proposed.

 ### Default arguments

-In satisfying the goal that 'the user should not experience edge cases, or differences in functionality', it should be the case that default arguments are applicable to `ref` parameters as with non-`ref` parameters. If the user does not supply an argument and a default argument is specified, the default argument is selected as usual and populates a temporary, just as if the user supplied the argument manually.
+In satisfying the goal that 'the user should not experience edge cases, or differences in functionality', it should be the case that default arguments are applicable to `ref` parameters as with non-`ref` parameters. If the user does not supply an argument and a default argument is specified, the default argument expression is selected as usual and populates a temporary, just as if the user supplied the argument manually.

 In this case, an interesting circumstantial opportunity appears where the compiler may discern that construction is expensive, and construct a single immutable instance for reuse. This shall not be specified functionality, but it may be a nice opportunity nonetheless.

@@ -324,7 +324,7 @@ Satisfying arguments in favor of extending the pattern beyond `const` include:

 Pipeline programming expressions often begin with a range returned from a function (an rvalue). Transform functions may receive their argument by reference. Such cases currently break the pipeline and introduce a manual temporary. This proposal improves the pipeline programming experience.

-Generic programming is one of D's biggest success stories and tends to work best when interfaces and expressions are orthogonal, with as few edge cases as possible . Certain forms of meta find that `ref`-ness is a key edge case which requires special case handling and may often lead to brittle generic code to be discovered by a niche end user at some future time.
+Generic programming is one of D's biggest success stories and tends to work best when interfaces and expressions are orthogonal, with as few edge cases as possible. Certain forms of meta find that `ref`-ness is a key edge case which requires special case handling and may often lead to brittle generic code to be discovered by a niche end user at some future time.

 Another high-impact case is OOP, where virtual function APIs inhibit the use of templates (ie, `auto ref`).
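
The OOP case named in the last context line can be illustrated with a minimal sketch. The names `Matrix`, `Renderer`, `CountingRenderer`, and `identity` are hypothetical and do not come from the DIP. The sketch assumes a compiler with default settings, where an rvalue does not bind to a `ref` parameter. Because `auto ref` is only permitted on function templates, and member templates are never virtual, an interface method cannot use it, so the caller has to write the temporary by hand:

```d
// Hypothetical types for illustration only; not part of the DIP.
struct Matrix
{
    float[16] m = 0; // 4x4 matrix, all elements zero-initialised
}

interface Renderer
{
    // Interface methods are virtual, so this cannot be a template
    // and therefore cannot take its parameter by `auto ref`.
    void setTransform(ref const(Matrix) m);
}

class CountingRenderer : Renderer
{
    int calls;

    void setTransform(ref const(Matrix) m)
    {
        ++calls; // a real implementation would read from `m`
    }
}

// Returns an rvalue, as a typical getter or factory would.
Matrix identity()
{
    Matrix r;
    foreach (i; 0 .. 4)
        r.m[i * 5] = 1; // diagonal elements 0, 5, 10, 15
    return r;
}

void main()
{
    auto impl = new CountingRenderer;
    Renderer r = impl;

    // r.setTransform(identity()); // rejected by default: not an lvalue

    auto tmp = identity();          // manual temporary, written by the caller
    r.setTransform(tmp);

    assert(impl.calls == 1);
}
```

Under the rewrite the proposal describes, the commented-out call would behave like the two lines that follow it, with the temporary introduced by the compiler rather than by the user.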