Skip to content

Move from last-use causes problems with RAII types, and types with member functions with side effects #1002

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
bluetarpmedia opened this issue Feb 28, 2024 · 21 comments
Labels
bug Something isn't working

Comments

@bluetarpmedia
Copy link
Contributor

Describe the bug
A combination of UFCS and std::move introduced by a variable's last-use causes a C++ compiler error.

To Reproduce
Run cppfront on this code:

main: () -> int = {
    
    m : std::mutex = ();
    cv : std::condition_variable = ();

    (copy lk: std::unique_lock = (m))
    {
        cv.wait(lk, :() -> bool = true);
    }

    _ = m.native_handle();  // Workaround for bug #968

    return 0;
}

After compiling with clang it results in an error:

error: no matching function for call to object of type '(lambda at main.cpp2:8:9)'
    8 |         CPP2_UFCS(wait)(std::move(cv), std::move(lk), []() mutable -> bool { return true;  });
      |         ^~~~~~~~~~~~~~~

One workaround is to add something after the cv.wait line that becomes the last use of lk:

_ = lk.mutex();

Repro on Godbolt

@hsutter
Copy link
Owner

hsutter commented Mar 18, 2024

Thanks! Ah, this is similar to #999, except the key thing is that lk is not the first argument (before .), it's another argument.

[Updated to add:] It's actually intentional that this is an error. Briefly, the reason is that if a definite last use of a local object modifies that object, then the code is not looking at the object again (because by definition it's a last use) and therefore trying to implicitly discard the new value. For a complete discussion, please see Design note: Explicit discard.

Like #999, this is still improved by the recent commit 7041994, but instead of the nicer error message #999 gets, currently this example gets this:

demo.cpp2(7): error C2338: static_assert failed: 'this function call syntax tries 'obj.func(...)', then 'func(obj,...);' - both failed, did you spell the function name correctly?'

And on MSVC that's followed by this, which gives the right clue never the end:

demo.cpp2(8): note: see reference to function template instantiation 'decltype(auto) main::<lambda_1>::operator ()<std::condition_variable,std::unique_lock<std::mutex>,main::<lambda_2>,false>(Obj &&,std::unique_lock<std::mutex> &&,main::<lambda_2> &&) noexcept(false) const' being compiled
        with
        [
            Obj=std::condition_variable
        ]
demo.cpp2(8): error C2660: 'std::condition_variable::wait': function does not take 2 arguments
C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\include\mutex(561): note: see declaration of 'std::condition_variable::wait'
C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\include\mutex(567): note: could be 'void std::condition_variable::wait(std::unique_lock<std::mutex> &,_Predicate)'
demo.cpp2(8): note: 'initializing': cannot convert from 'std::unique_lock<std::mutex>' to 'std::unique_lock<std::mutex> &'
demo.cpp2(8): note: A non-const reference may only be bound to an lvalue
demo.cpp2(8): note: while trying to match the argument list '(std::unique_lock<std::mutex>, main::<lambda_2>)'
demo.cpp2(8): error C3861: 'wait': identifier not found
demo.cpp2(8): note: 'wait': function was not declared in the template definition context and can be found only via argument-dependent lookup in the instantiation context

So that's better than above, but not as nice as for the first argument before .. Let me see if I can make this message better...

@hsutter
Copy link
Owner

hsutter commented Mar 18, 2024

Based on this feedback, I think the first error message should be:

demo.cpp2(8): error C2338: static_assert failed: 'this function call syntax tries 'obj.func(...)', then 'func(obj,...);', but both failed - if this function call is passing a local variable that will be modified by the function, but that variable is never used again in the function so the new value is never used, that's likely the problem - if that's what you intended, add another line '_ = obj;' afterward to explicitly discard the new value of the object'

That's at least a clearer clue to the programmer.

Thanks!

@bluetarpmedia
Copy link
Contributor Author

bluetarpmedia commented Mar 18, 2024

I may be missing something but I don't think the issue is just that the Cppfront error could be clearer?

Here's a fuller example: Godbolt

See lines 20 and 42. How should this code be changed to fix the error?

EDIT - I saw from the suggestion to add the _ = obj; statement. So the relevant snippet would become:

(copy lk: std::unique_lock = (m))
{
    cv.wait(lk, :() -> bool = processed);

    _ = lk;  // <-- New
}

Similar to my RAII example in #999, I find this awkward. IMO the code is clear that the condition_variable wait has a side-effect on the lock, so it seems surprising that I have to discard the lock's new value, because the new value was in fact used -- just not by me.

@hsutter
Copy link
Owner

hsutter commented Mar 18, 2024

After thinking about this, I think this is about "guard" objects. Those are the local objects whose values are used "by someone else than this function's remaining body code" after their definite last use because:

  • they're synchronization primitives like cvs and locks where the modified values are consumed on another thread that will do something with the value, or
  • they're RAII helpers like "finally" types or scope-guards where their values will be consumed in their destructors that will do a commit or rollback (or nothing if it has been 'dismissed') based on the value

... and so I think those are the only cases? I can't think of another case where code could consume the new value of a modified-at-last-use local object.

Next, what to do about it? At minimum we could just teach people who use local synchronization variables that this is how to use them in Cpp2. Or... we could add some way to declare guards (possibly even by recognizing the prefix guard in the variable's name as a meaningful convention) and then Cpp2 would know not to move from last use.

@DyXel
Copy link
Contributor

DyXel commented Mar 18, 2024

Its a tricky one, it seems std::move from definite last use inherently conflicts with the RAII helpers. Not a big fan of "(possibly even by recognizing the prefix guard in the variable's name as a meaningful convention)", doesn't seem general enough. I thought about it for days, and I've got some ideas:

  • Allow discarding directly in the parameter list to inhibit lowering std::move: so cv.wait(_ = lk, :() -> bool = processed); would lower as cv.wait(lk, ...) despite being a definite last use.

  • Create a helper cpp::impl::last_use(lk) that is somehow able to either move or not move when appropriate (I don't know if this is even implementable).

  • Teach people to grab pointers and deref them more for these cases:

main: () = {
    m : std::mutex = ();
    cv : std::condition_variable = ();

    (copy lk: std::unique_lock = (m&*)) // <-- grab pointer, deref pointer
    {
        cv.wait(lk&*, :() -> bool = true); // <-- grab pointer, deref pointer
    }
}

This works (please confirm if that's a coincidence 😛) because std::move(m) returns std::mutex&&, for which we then take the pointer std::mutex*, and which we then de-reference std::mutex&. I really like this because it gives attention to the fact that we are constructing some kind of guard / wrapper. The only thing I dislike about this is that it could cause dangling? (not sure).

EDIT: Silly me, you can't grab the pointer of a rvalue... This happens to work because std::move is not emitted when doing the above. Still to be confirmed if that's a coincidence, a bug, or intended behavior 😅. On the bright side, in that case dangling isn't possible.

@bluetarpmedia
Copy link
Contributor Author

Should #999 be merged with this issue? They’re not duplicates but I think both are symptoms of a bigger, parent issue.

I think there may be more cases than just guard objects to consider.

Here are a few that I’m wondering about (but haven’t tested yet):

  • member function with inout this that has side effects, e.g. it calls an operating system or graphics API
  • a type that implements the observer pattern and notifies its observers when its value changes. I guess this is a variation of the above with an inout member function with side effects
  • a type that is implemented as thread-safe by owning its own mutex and locking that mutex in every member function

@DyXel
Copy link
Contributor

DyXel commented Mar 18, 2024

Yes, I think we should try to merge them both, its not about UFCS per se. It's about not giving up one of the main strengths of RAII (I pass you this value, you do the cleanup for me, regardless of how we exit the scope).

@JohelEGP
Copy link
Contributor

Still to be confirmed if that's a coincidence, a bug, or intended behavior

Intended. See #558 (comment).

@hsutter
Copy link
Owner

hsutter commented Mar 18, 2024

Yes, I agree this isn't about UFCS -- would you like to update the issue titles to reflect that?

@gregmarr
Copy link
Contributor

gregmarr commented Mar 18, 2024

What if we had a way to say this?

"The destructor of this type is always the definite last use of an object of this type, no matter what inout this functions are called on it during its lifetime."

That would take care of one group of issues.

Another group of issues is this.

"This function has side effects so we need it to not be const, but these side effects are not observable on the object itself, so it's okay that the caller doesn't read from this object again later to observe them."

Is that an accurate statement of the group of issues? Do we need a decoration for this that indicates this class of behavior?

@bluetarpmedia bluetarpmedia changed the title [BUG] UFCS fails with move after last-use Move from last-use causes problems with RAII types, and types with member functions with side effects Mar 19, 2024
@bluetarpmedia
Copy link
Contributor Author

Is that an accurate statement of the group of issues?

Yes, I think you nicely summarised the issues. After thinking about this further I prefer the idea of opting-in to the "move from last-use" behaviour, for the reasons I outlined in #1030.

One major point, for me, is that I want to use my existing C++ types (e.g. RAII types, or code that I'm slowly migrating to Cpp2) which I can't modify with a metafunction or a new decoration for this.

@hsutter
Copy link
Owner

hsutter commented Mar 19, 2024

Thanks! See #1030, this should now be fixed in the above commit.

The original code example now works. (Unrelatedly, making the code work on MSVC also requires removing the native_handle call, because evidently the MS STL std::mutex doesn't have that member, I don't know why; but that's unrelated to this issue.)

@bluetarpmedia
Copy link
Contributor Author

Interesting, I just learned that the presence of std::mutex::native_handle is implementation-defined (not just its behaviour). Anyway, that line with native_handle was only a workaround for #968.

@hsutter
Copy link
Owner

hsutter commented Mar 20, 2024

TIL:

the presence of std::mutex::native_handle is implementation-defined (not just its behaviour)

🧑‍🎓

Thanks for sharing that info!

@hsutter
Copy link
Owner

hsutter commented Nov 1, 2024

Brief update: On reviewing these changes in a recent commit discussion, I think I'm going to switch gears and not move from last use for object whose names start with guard. I think that expresses the first intent that Greg summarized, non-intrusively at each use site in a way that's self-documenting.

@DyXel
Copy link
Contributor

DyXel commented Nov 1, 2024

I find depending on a naming convention to be inelegant, what about users that use CamelCase such as GuardLock. I also don't have a better alternative so we'll see how this one pans out.

@gregmarr
Copy link
Contributor

gregmarr commented Nov 4, 2024

I find depending on a naming convention to be inelegant,

Agreed, I had the same thought. It's okay for a short term test, but I wouldn't want to see it as a long term feature. Also, it seems like this marking should be on the type definition rather than the name of the particular object. This is a property of the type, not of the particular instance of the type.

Spitballing here, what if marking the destructor as nodiscard means that the destructor itself is considered the definite last use? It means that the destructor has important behavior beyond just cleanup at an arbitrary place, that the enclosing scope is important.

Another possibility is to use a metafunction tag such as @scoped on the type to indicate that behavior. I'm not sure if that could be implemented as a user metafunction, I suspect it would need to be a built-in capability of the compiler in order to suppress the moves.

@gregmarr
Copy link
Contributor

gregmarr commented Nov 4, 2024

Tagging @hsutter just in case you don't get a notification. I see this destructor tagging is basically the same thing I said earlier this year. Glad I was consistent.

@DyXel
Copy link
Contributor

DyXel commented Nov 4, 2024

Also, it seems like this marking should be on the type definition rather than the name of the particular object. This is a property of the type, not of the particular instance of the type.

Very good point.

The problem with any kind of annotation on Cpp2 end is that it will not pan out to C++-written types. I am starting to think that automatic move from last use is inherently incompatible with scope-guard type of objects, at least named ones.

@gregmarr
Copy link
Contributor

gregmarr commented Nov 4, 2024

The problem with any kind of annotation on Cpp2 end is that it will not pan out to C++-written types. I am starting to think that automatic move from last use is inherently incompatible with scope-guard type of objects, at least named ones.

That's unfortunately true. :(

I tried putting [[nodiscard]] on a Cpp1 destructor but got a compiler warning because it's on a thing with a void return type.

You can actually put the [[nodiscard]] attribute on a class definition in Cpp1. I don't know if this has any kind of meaning there right now.

@DyXel
Copy link
Contributor

DyXel commented Nov 4, 2024

I guess Herb went with the name annotation because is the simplest solution to implement at the moment, and an issue that only shows when dealing with very specific Cpp2 code. I think what he did is add a way to mark a specific instance of an object as "never move me implicitly" when its identifier matches the guard_.* regex, the marking which I am fine with, the problem is using identifiers to achieve that effect.

I think a more general solution can be combined with something similar to what is proposed at #1332. In fact, we still need a way to mark certain things like static (notwithstanding safety issues of globals). But for this issue an example usage would be: nomove lk: std::unique_lock = (m);. _ named variables would be implicitly nomove already.

You can actually put the [[nodiscard]] attribute on a class definition in Cpp1. I don't know if this has any kind of meaning there right now.

It means you should not discard the value you get of that type. So its a good idea to put it in types like lock guards which you want to "pin" to a scope.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants