@TurkeyMan
Contributor

@TurkeyMan TurkeyMan commented Mar 25, 2018

Created a DIP for the rval->ref situation

@TurkeyMan TurkeyMan changed the title from "Added first draft" to "Passing rvalues to const ref args" Mar 25, 2018
There are many reasons why every function can't or shouldn't be a template.
1. API is not your code, and it's not alrady `auto ref`
2. Is distributed as a binary lib
3. Is expored from DLL
Contributor

typo: exported

Contributor Author

Fixed a couple of typos.

@TurkeyMan
Contributor Author

Updated to clarify that only one scope should be introduced for the entire statement (clarifying behaviour for cascading calls), and that return ref is satisfied by this implementation.

@TurkeyMan
Contributor Author

TurkeyMan commented Mar 26, 2018

I could perhaps be persuaded to remove the word 'const' from this proposal... but I'd prefer to see that as a separate follow-up debate (relaxation of the spec is always easier than tightening). I don't really feel that symptoms of the restrictiveness of const are on trial here.

@skl131313

Possibly add a bit about default parameters:

```d
extern(C++) void foo(ref const(int) = 0)
{
}
```

Part of the problem when writing interfaces to C++ is that const& parameters can have default arguments, but in D you can't do that. Even when the function is nice and doesn't have T* const parameters in its definition, you have to create two different definitions of the same function: one for the interface and another to implement the default arguments.
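The two-definition workaround described above might look like the following sketch. The name `foo` comes from the comment; the function body, the return value, and the forwarding overload are assumptions made for illustration. In a real binding the first declaration would be a bodiless `extern(C++)` declaration.

```d
// Hypothetical stand-in for the C++ function `void foo(const int& = 0)`.
// It has a D body and a return value here only so the sketch can be run.
int foo(ref const(int) value)
{
    return value * 2;
}

// Second definition that exists only to emulate the C++ default argument.
int foo()
{
    const int defaultValue = 0; // a named lvalue, so it can bind to `ref`
    return foo(defaultValue);
}

unittest
{
    int x = 21;
    assert(foo(x) == 42); // an lvalue binds to the ref overload
    assert(foo() == 0);   // the default is forwarded by hand
}
```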

What gets called when? For the following:

```d
void foo(int);
void foo(const(int));
void foo(ref int);
void foo(ref const(int));
```

Didn't see anything about this in there either, though I honestly don't know when the second definition would even be called?

@TurkeyMan
Contributor Author

TurkeyMan commented Mar 26, 2018

Nice points!

This issue is also likely to appear more frequently for vendors with tight ABI requirements.
Users of closed-source libraries distributed as binary libs, or libraries distributed as DLLs, are more likely to encounter these challenges interacting with those APIs as well.

Another high-probability occurrence is OOP, where virtual function APIs inhibit the use of templates.
Contributor

Another good use case is random generation. You should always take the generator by ref to avoid accidentally copying it (oops), but that makes the API more verbose, as the user can't declare the RandomEngine and use it in one line.
Phobos uses auto ref; it shouldn't :/
A summary of Phobos's broken std.random: https://dconf.org/2015/talks/wakeling.html
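A small sketch of the copying hazard mentioned above, using `Mt19937` and `uniform` from `std.random`. The helper names `drawByValue` and `drawByRef` are hypothetical and exist only for this illustration.

```d
import std.random : Mt19937, uniform;

// Takes the engine by value: every call works on a private copy.
int drawByValue(Mt19937 rng)
{
    return uniform(0, 1000, rng);
}

// Takes the engine by ref: the caller's engine state advances.
int drawByRef(ref Mt19937 rng)
{
    return uniform(0, 1000, rng);
}

unittest
{
    auto rng = Mt19937(42);

    const a = drawByValue(rng);
    const b = drawByValue(rng);
    assert(a == b); // the accidental copy repeats the same "random" number

    // The caller's engine was never advanced, so its first real draw
    // matches what the copies produced.
    const c = drawByRef(rng);
    assert(c == a);

    // drawByRef(Mt19937(42)); // rejected today: an rvalue cannot bind to `ref`
}
```

The commented-out last line is the one-liner the comment asks for.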

@mdparker
Member

mdparker commented Apr 2, 2018

@TurkeyMan As per the new procedures, I'm setting this DIP to Draft status. I'll also put out a call for Draft Review on the forums.

@mdparker
Member

mdparker commented Apr 2, 2018

I recommend adding a link to the ongoing forum discussion in the References section.

@schveiguy
Member

"because it's a guarantee that the parameter is used strictly as an input argument, rather than some form of output."

Consider a reference inside such a type as being a potentially sound place for an output. In other words, the struct itself is a reference to a temporary, but what its members point at may not be temporary. In C++ this is fine, because head-const does exist. But in D, requiring transitive const has a much more far-reaching effect.

In any case, there is a large missing discussion in this DIP: the fact that this is passed by reference, even for rvalues. Apparently, the D world hasn't yet ended due to that loose interpretation of the rules, even without const, so maybe it's OK for other uses!

@Bolpat
Contributor

Bolpat commented Apr 17, 2018

Your proposal breaks not only current code but also current expressiveness: someone might want to express "take it by ref, but only lvalues", which as far as I can see cannot be done anymore. If so, it must be stated in the document.
I've made a suggestion[1] in the forum to reuse the in storage class for that. This way, one can still express "take it by ref, but only lvalues". The in storage class is completely redundant, so deprecating its current meaning has a trivial migration: replace it with const scope. That's not a big deal. The aforementioned use is not affected. Most programmers will not need const scope ref anyway.

So, how about redefining in to mean "const scope ref that also binds rvalues"?

The compiler can even suggest changing const scope ref to in when you plug in an rvalue for a const ref. Generally, scope should be part of all that: You propose an efficient (l/r)value-agnostic storage class for input parameters. Those should not be leaked.

[1] https://forum.dlang.org/post/medovwjuykzpstnwbfyy@forum.dlang.org

@Bolpat
Contributor

Bolpat commented Apr 17, 2018

About the lowering, for R func(const ref T arg) and some expression expr resulting in an rvalue,
isn't (param => func(param))(expr()) the right lowering?
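For reference, the lambda form above can be written out as follows. This is only a sketch: `T`, `func`, and `expr` are hypothetical stand-ins, and the lambda's parameter type is spelled out here rather than inferred.

```d
struct T
{
    int value;
}

// Hypothetical function with the shape from the comment: R func(const ref T arg).
int func(const ref T arg)
{
    return arg.value + 1;
}

// Stands in for `expr()`: any expression that yields an rvalue.
T expr()
{
    return T(41);
}

int callViaLambda()
{
    // The lambda receives the rvalue by value. Inside the lambda, `param`
    // is a named lvalue and can therefore bind to the ref parameter.
    return ((T param) => func(param))(expr());
}

unittest
{
    assert(callViaLambda() == 42);
}
```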

@TurkeyMan
Contributor Author

Sorry for the silence guys. Work got very busy, and I haven't had any time after-hours for the last couple of weeks.
I'll integrate all the recent feedback as soon as I am able. PR's welcome too, if there are others who are motivated by this issue! :)

@TurkeyMan
Contributor Author

isn't (param => func(param))(expression()) the right lowering?

My proposal is more semantically direct. The syntax you present introduces a function, which relies on the inliner to perform as expected to produce the proper calling code equivalent to my proposal. That will not happen in debug, and as such, the code may be hard to inspect (additional call in the callstack obscuring outer locals) and debug/step.

@Bolpat
Contributor

Bolpat commented Apr 26, 2018

About the lambda: I've seen stuff like that in other DIPs. I believe it's being used to specify behavior rigorously yet simply, and to let the compiler folks decide on the implementation details.

@TurkeyMan
Contributor Author

TurkeyMan commented Apr 26, 2018

It's not right though; I made a point about introducing a single scope, not one for each call. Otherwise chains using return ref won't work as you expect. All temps need to live for the life of the statement, not the call. (Calls can be nested.)
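To illustrate the single-scope point, here is a hand-written approximation of a chained call whose temporary lives for the whole statement. The function `increment` and the variable names are hypothetical; the actual rewrite would be generated by the compiler.

```d
// Hypothetical pipeline stage: mutates its argument and hands the same
// object back by reference (`return ref`).
ref int increment(return ref int x)
{
    x += 1;
    return x;
}

int chained()
{
    int result;
    {
        // One scope for the whole statement: the temporary outlives both
        // calls, so the reference returned by the inner call is still
        // valid when the outer call uses it.
        int temp0 = 10;
        result = increment(increment(temp0));
    }
    return result;
}

unittest
{
    assert(chained() == 12);
}
```

If each call had its own scope instead, the reference returned by the inner call would refer to a temporary that had already been destroyed.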

@TurkeyMan TurkeyMan force-pushed the ref_args branch 2 times, most recently from 67ea010 to ba70f30 Compare May 20, 2018 00:17
@Bolpat
Contributor

Bolpat commented May 21, 2018

It took me some time to understand how your proposal changes how ref should be looked at with respect to one of its implications. I thought of ref mainly as an lvalues-only restriction, and secondarily as how a parameter is passed (an optimization).

You should mention that with your proposal, ref cannot be used as a restriction any more. It is still possible to implement the restriction as lvalues still prefer ref overloads and rvalues still prefer non-ref overloads:

```d
void f(T) @disable; // catch and reject rvalues ...
void f(ref T); // ... therefore lvalues only

void g(ref T) @disable; // catch and reject lvalues ...
void g(T); // ... therefore rvalues only
```

One implication (good or bad, depending on viewpoint) is that a restriction to lvalues must be stated much more explicitly than before. There may be algorithms that are not useful, or don't even work correctly, on either lvalues or rvalues, but the assumption that those algorithms are rare certainly holds.
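The overload preference and the `@disable` restriction can be checked with a short sketch. The struct `T` and the function bodies are assumptions added for illustration, and the `@disable` attribute is written in prefix position here.

```d
struct T
{
    int value;
}

// Overload pair: lvalues prefer the ref overload, rvalues the by-value one.
string f(T)     { return "by value"; }
string f(ref T) { return "by ref"; }

// Lvalues-only API: the by-value overload catches rvalues and rejects them.
@disable void g(T);
void g(ref T t) { t.value += 1; }

unittest
{
    T x;
    assert(f(x) == "by ref");
    assert(f(T(1)) == "by value");

    g(x);
    assert(x.value == 1);

    // An rvalue selects the disabled overload, so the call does not compile.
    static assert(!__traits(compiles, g(T(1))));
}
```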

So the proposal makes ref more like C#'s ref[1] and VB.NET's ByRef than C++'s &. I believe that many D folks read D's ref similar to C++'s & (which without const means lvalues only[2]) with the only difference that it can only be used on function parameters and returns (NB: not function parameter types or return types, as ref is not a type constructor).

[1] jonskeet.uk or docs.microsoft.com (apart from ref locals)
[2] For a very explicit version of that meaning, see ref-qualified non-static member functions on cppreference.com § const-, volatile-, and ref-qualified member functions

@TurkeyMan
Contributor Author

Right, I had that thought about retaining the restriction. If people want that, they can use @disable.
I'll add that detail. I did think about that approach, I just forgot.

@TurkeyMan
Contributor Author

TurkeyMan commented May 21, 2018

Right, it's moved a little farther from C++ const &, and that was due to feedback here.
I've become convinced that non-const cases are useful and that they're actually a feature, not just a way of avoiding const-related issues.
Pipeline programming in D is super-popular, and almost all ranges are created and returned from functions as rvalues. Not all ranges should be copied by value though, so things like return ref can be used along the pipeline. This broke the pipeline before, but it all works very nicely with this DIP.

Member

@mdparker mdparker left a comment

As a first pass, I've made some suggestions for structural edits. I'll give you some time to address these before drilling down to a line/copy edit pass.

| Author: | Manu Evans (turkeyman@gmail.com) |
| Status: | Draft |

## Abstract
Member

I think the abstract as written is too dense and too detailed. It can be made much more concise, likely stripped down to a single paragraph. For example, you don't need to reiterate here any of the specific issues you detail later in the Rationale. Just get across the main ideas: the current status quo is sometimes troublesome and this DIP proposes a means to alleviate the pain.

Contributor Author

Refactored


Here is proposed a strategy to emit implicit temporaries to conveniently interact with APIs that use `ref` arguments.

## Reference
Member

It's great that you've listed so many references, but it's a bit much for anyone coming to all of this for the first time without anything to guide them. A one-line summary for each reference link (forum threads, issues, and PRs all) would be extremely helpful here.


## Rationale

When calling functions that receive `ref` args, D prohibits supplying rvalues. It is suggested that this is to assist the author identifying likely logic errors where an rvalue will expire at the end of the statement, and it doesn't make much sense for a function to mutate a temporary whose life will not extend beyond the function call in question.
Member

This first sentence would work well for your opening statement in the abstract, though it could be reworded a bit, e.g. "D does not allow rvalues to be passed as function arguments where ref parameters are declared." or some such.

With these cases in mind, the existing rule feels out-dated or inappropriate, and the presence of the rule may often lead to aggravation while trying to write simple, readable code.
Calling a function should be simple and orthogonal, generic code should not have to concern itself with details about ref-ness of function parameters, and users should not be required to jump through hoops when `ref` appears in API's they encounter.

Consider the example:
Member

Your examples should be colocated above with the issues they are intended to demonstrate, e.g.

"The first issue is yadda yadda.

Example Here

A related issue is yadda yadda.

Example here

```d
}
}
```
This example situation is simplified, but it is often that such issues appear in complex aggregate meta, which may be difficult to understand, or the issue is caused indirectly at some layer the user did not author.
Member

Be careful about authoritative claims about frequency, or side effects that are hard to quantify. If you have precise claims to make, with demonstrative data to back them up, then the data need to be referenced somehow. Otherwise, consider language such as "can lead to" or "may cause", with examples to demonstrate. This applies not just here with these lines, but throughout the document.


These work-arounds damage readability and brevity, they make authoring correct code more difficult, increase the probability of brittle meta, and it's frustrating to implement repeatedly.

## Proposal
Member

For consistency with other DIPs, let's keep the 'Description' label here.


It is important that `T` be defined as the argument type, and not `auto`, because it will allow implicit conversions to occur naturally, with behaviour identical to when the argument is not `ref`. The user should not experience edge cases, or differences in functionality, when calling `fun(int x)` vs `fun(ref int x)`.

## Temporary destruction
Member

This should be a level 3 subheading under description, as should all the headings between here and the copyright.

When calling functions that receive `ref` args, D prohibits supplying rvalues. It is suggested that this is to assist the author identifying likely logic errors where an rvalue will expire at the end of the statement, and it doesn't make much sense for a function to mutate a temporary whose life will not extend beyond the function call in question.

However, many functions receive arguments by reference, and this may be for a variety of reasons.
One common reason is that the cost of copying large structs via the parameter list is expensive, so struct parameters may be received by reference to mitigate this cost.
Contributor

I'd change this to "the cost of copying or moving large structs".

Contributor Author

👍

One common reason is that the cost of copying large structs via the parameter list is expensive, so struct parameters may be received by reference to mitigate this cost.
Another common case is that the function may want to mutate the caller's data directly or return data via `out` parameters due to ABI limitations regarding multiple return values. This is the potential error case that the existing design attempts to mitigate, but in D, pipeline programming is very popular, and contrary to conventional wisdom where the statement is likely to end at the end of the function call, pipeline expressions may result in single statements performing a lot of work, mutating state as it passes down the pipeline.

A related issue concerns generic code which reflects on a function or receives it by alias. Such generic code may want to call that function, but it is often the case that details about the ref-ness of arguments lead to incorrect semantic expressions in the generic code depending on the arguments, necessitating additional compile-time logic to identify the ref-ness of function arguments and implement appropriate workarounds on these conditions. This leads to longer, more brittle, and less-maintainable generic code. It is also much harder to write correctly the first time, and such issues may only emerge in niche use cases at a later time.
Contributor

An example here would help.
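One possible shape for such an example is sketched below. It uses `ParameterStorageClassTuple` from `std.traits` to detect ref-ness; the functions `twice` and `twiceRef`, the helper `firstParamIsRef`, and the wrapper `callWithSum` are all hypothetical names invented for this illustration.

```d
import std.traits : ParameterStorageClass, ParameterStorageClassTuple;

int twice(int x)        { return x * 2; } // hypothetical by-value API
int twiceRef(ref int x) { return x * 2; } // the same API, but taking ref

// True when the first parameter of `fun` is declared `ref`.
template firstParamIsRef(alias fun)
{
    alias stc = ParameterStorageClassTuple!fun;
    enum bool firstParamIsRef =
        (stc[0] & ParameterStorageClass.ref_) != ParameterStorageClass.none;
}

// Generic wrapper that calls `fun` with a computed (rvalue) argument.
int callWithSum(alias fun)(int a, int b)
{
    static if (firstParamIsRef!fun)
    {
        int temp = a + b; // workaround: materialise a named temporary
        return fun(temp);
    }
    else
    {
        return fun(a + b);
    }
}

unittest
{
    assert(callWithSum!twice(1, 2) == 6);
    assert(callWithSum!twiceRef(1, 2) == 6);
}
```

Without the `static if` branch, the wrapper compiles for `twice` but fails for `twiceRef`, which is the kind of late-emerging breakage the paragraph describes.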

```d
fun(my_short); // implicit type conversions (ie, short->int promotion)
// etc... (basically, most things you pass to functions)
```
The work-around can bloat the number of lines around the call-site significantly, and the user needs to declare names for all the temporaries, polluting the local namespace, and often for expressions where no meaningful name exists, leading to.
Contributor

The "leading to" at the end of the sentence doesn't seem to belong there.

Is rewritten:
```d
{
T __temp0 = void;
```
Contributor

Why not T __temp0 = 10?

Contributor Author

Because that would change the order of evaluation.
The order of evaluation should match function call argument evaluation exactly. I tried to express that in the simplest terms I could.

Contributor

Another point is: T might not be constructible from an int.

Contributor Author

In the text below: "It is important that T be defined as the argument type", so in this case, T is int.
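The points in this thread can be approximated by hand in today's D. The sketch below is not the DIP's exact rewrite (only its first line is quoted above); `fun`, `callWithLiteral`, and `callWithShort` are hypothetical names, and the separate assignment stands in for evaluating the argument at the point of the call.

```d
// Hypothetical function taking its argument by ref.
int fun(ref int x)
{
    x += 1;
    return x;
}

int callWithLiteral()
{
    // fun(10); // rejected today: 10 is an rvalue
    {
        int temp0 = void; // declared first, with the parameter's type
        temp0 = 10;       // assigned where the argument would be evaluated
        return fun(temp0);
    }
}

int callWithShort(short s)
{
    // The temporary has the parameter type (int, not `auto`), so the
    // usual short -> int promotion happens just as it would for fun(int).
    {
        int temp0 = void;
        temp0 = s;
        return fun(temp0);
    }
}

unittest
{
    assert(callWithLiteral() == 11);
    assert(callWithShort(5) == 6);
}
```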

@TurkeyMan TurkeyMan force-pushed the ref_args branch 3 times, most recently from d91d6af to f310689 Compare June 16, 2018 09:06
@TurkeyMan
Contributor Author

Integrated changes and tweaked a touch.
