Skip to content

is()/as(): refactor of is() and as() for variant to new design (part 2 of n) #1203

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Aug 28, 2024

Conversation

filipsajdak
Copy link
Contributor

@filipsajdak filipsajdak commented Aug 4, 2024

Currently, is() and as() implementations are done using two redundant approaches - using concepts subsumption rules and using constexpr ifs.

Thanks to the discussions with @JohelEGP, we moved some of them to the constexpr ifs. This PR is meant to move all implementations to one version, which will simplify the implementation and make modification of them more manageable.

My intention is not to introduce any functional change - only refactoring.

Unfortunately, I have already found a couple of bugs that were accidentally fixed in a new version.

Example:

v : std::variant<int, int> = (42);
if v is (in(0,100)) {
    std::cout << " in(0,100)"; // this works
}
v.emplace<1>(42);
if v is (in(0,100)) {
    std::cout << " in(0,100)"; // that will not work
}

I will minimize these changes and introduce workarounds to keep the old behavior. To ensure there is no functional change introduced. These workarounds will be removed in the upcoming PRs.

This PR introduces extensive tests that will ensure no functional change is introduced, and that will track future changes of is() or as() implementations.

@filipsajdak filipsajdak force-pushed the fsajdak-extension-of-is-part3 branch 6 times, most recently from 6077305 to 6c1d14d Compare August 10, 2024 14:30
@filipsajdak filipsajdak changed the title is(): make is() for inspecting variant generic is()/as(): refactor of is() and as() to new design Aug 13, 2024
@filipsajdak filipsajdak marked this pull request as draft August 13, 2024 19:00
@filipsajdak filipsajdak force-pushed the fsajdak-extension-of-is-part3 branch from 6c1d14d to 7924b70 Compare August 13, 2024 20:06
@filipsajdak filipsajdak marked this pull request as ready for review August 13, 2024 20:07
@filipsajdak
Copy link
Contributor Author

I have added 6eb1ce1 that is already addressed by #1224 and #1223 - I will remove it after review.

@filipsajdak filipsajdak changed the title is()/as(): refactor of is() and as() to new design is()/as(): refactor of is() and as() for variant to new design (part 2 of n) Aug 13, 2024
@filipsajdak
Copy link
Contributor Author

@hsutter please review it. I am suppressing my willingness to do refactoring and enhancements at once. I understand that this code is not as it should be - I am pushing myself to keep the existing functionalities (I am adding tests that will track the future fixes and functional changes).

I need to discuss the approach that is() and as() should follow. I will describe it in the next comments.

@filipsajdak filipsajdak force-pushed the fsajdak-extension-of-is-part3 branch 4 times, most recently from 660b4a5 to a28dbf9 Compare August 14, 2024 10:58
@hsutter
Copy link
Owner

hsutter commented Aug 14, 2024

Thanks!

Hmm, that example does not currently work the way I wanted it to. Here is a complete version I just tried:

in: (a, b) :(x) a$ <= x <= b$;

main: () = {
    v : std::variant<int, int, std::string> = "xyzzy";
    v.emplace<0>(42);
    if v is (in(0,100)) {
        std::cout << "<0>: in(0,100)"; // this is reached
    }
    v.emplace<1>(42);
    if v is (in(0,100)) {
        std::cout << "<1>: in(0,100)"; // this is not
    }
}

I intended that both of those succeed, because v holds an int in either case.

Edited to add: So please feel free to make it do that!

@filipsajdak filipsajdak force-pushed the fsajdak-extension-of-is-part3 branch from a28dbf9 to 92df7c6 Compare August 14, 2024 20:37
@filipsajdak
Copy link
Contributor Author

@hsutter OK, now this works as expected.

Should I also enable the handling of variants with more than 20 types? We can remove the limit and handle as many as variant supports :).

@hsutter
Copy link
Owner

hsutter commented Aug 14, 2024

Thanks!

Should I also enable the handling of variants with more than 20 types? We can remove the limit and handle as many as variant supports :).

Sure, if that's straightfoward to do... variadic is and as templates I assume?

@filipsajdak filipsajdak force-pushed the fsajdak-extension-of-is-part3 branch from 92df7c6 to ae252e9 Compare August 14, 2024 20:41
@filipsajdak
Copy link
Contributor Author

@hsutter OK, they are now generic (is() and as()).

I need to sync the tests.

@filipsajdak
Copy link
Contributor Author

@hsutter one another question that is important: how "smart" is() and as() should be?

The highest principle is that if x is T is true, then x as T should be valid.

We can do a lot of checks with is(), and generally, we check if something could bind to some other type. Things become more complex when we "extract" the type from a more complex type, like in the following examples:

v: std::variant<int,double,std::string> = 42;
if v is int {
  f(v as int);
}

o: std::optional<int> = 44;
if o is int {
  f(o as int);
}

a: std::any = 100;
if a is int {
  f(a as int);
}

In these examples, we are extracting the type from a more complex type - that works fine. And here is the question about how "smart" it should be: what is the expected behavior in the following cases:

v: std::variant<std::variant<std::variant<int>>> = 42;
if v is int {
  f(v as int);
}

The intention is clear, and we can unambiguously identify what we need to do to achieve it (and I have implementation that is capable of that).

A similar code can be written for std::optional, std::any, smart pointers, raw pointers, and a combination. It is possible to make the above code work.

I have some thoughts about it:

  1. the user knows what he/she is doing, so if he/she asks for int, we should try to extract it (with all the checks to prevent errors),
  2. the user might be wrong, so if he/she doesn't ask for the exact type, I will fail to let him/her know that something might be wrong,
  3. asking for base types in inspect() expressions will be more probable as asking for int will match many complex types that we will be able to extract int from.

I have also played a little bit with that approach, and it works but changes the behavior of this feature dramatically. On the positive side, I can add that it simplifies the code of is()/as() functions.

I am looking forward to your point of view.

@hsutter
Copy link
Owner

hsutter commented Aug 15, 2024

And here is the question about how "smart" it should be: what is the expected behavior in the following cases:

v: std::variant<std::variant<std::variant<int>>> = 42;
if v is int {
  f(v as int);
}

The intention is clear, and we can unambiguously identify what we need to do to achieve it (and I have implementation that is capable of that).

Excellent question. This comes up with future<future<int>> and vector<vector<int>> and similar as well (including, as you say, **int. For some of those, there is pressure to "be smart and do silent multiple unwrapping," but I'm against that.

The principle is: Cpp2 tried to avoid magic, and to make important operations explicit and visible. Therefore, when you unwrap something, always only unwrap once. Otherwise you get semantic surprises and inconsistencies.

Example of semantic surprises: Consider future<future<int>>. To get the int, you have to wait twice. The first wait gets you a future<int> (from thread or task A), and then you have to do a second wait on that to get the int (from thread or task B which can be completely unrelated to A). Waiting for both would inject blocking and invisible potential deadlocks. (Similarly, when you have **int or unique_ptr<shared_ptr<int>> you have to deference twice, and dereference (like wait) is a important operation that should be explicit, not hidden.)

Example of inconsistencies: I think it would be bad if variant<variant<int>> reported is int == true and as int worked. Because then as soon as you change it to variant<variant<int>, std::string> the behavior changes? Or what about variant<variant<int>, int>?

Does that make sense? I'm definitely always open to new information and examples, but that's my current thinking about it.

Great question, and a common one that comes up a lot!

@hsutter
Copy link
Owner

hsutter commented Aug 15, 2024

Oh, and a thought on this:

  1. the user knows what he/she is doing,

I don't think they usually do. First, they are asking is, which means they have a question. And in generic code like func: (x) = { if x is int { ... } ... } they likely have no idea whether that they're being given is int or variant<int> and the code likely wouldn't know what do with variant<variant<int>>...?

@filipsajdak
Copy link
Contributor Author

filipsajdak commented Aug 15, 2024

generic code like func: (x) = { if x is int { ... } ... } they likely have no idea whether that they're being given is int or variant<int> and the code likely wouldn't know what do with variant<variant<int>>...?

Because we have special is() and as() for types like std::optional, std::variant, or std::any, we are already in some "magical behavior" area.

Without these special cases, we could write the following code:

//This is the incorrect code - do not use it
func: (x) = {
  if x is int {
    std::cout << (x + 1) << std::endl;
  }
}

Because we treat std::optional, std::variant, or std::any, as a special case and we extract the type from them, the correct way of writing that code is:

func: (x) = {
  if x is int {
    std::cout << ((x as int) + 1) << std::endl;
  }
}

The general rule is "if you check variable with is() you should use it with an as()" - that will make the code work also for cases when is() is "deep inspection" and as() is doing "extraction".

And if we accept that as a rule and the correct way of working, then there is not much difference between extracting int from std::optional<int> and extracting it from std::optional<std::variant<*int, *double, *std::string>> - the combination of is() and as() can do the work for you. The only question is if you can write the correct code that extracts int by knowing only the type.

If you write the same boilerplate code that checks and extracts the embedded type, then the compiler, knowing the type, can write the same code for you.

Let's assume that we write a special function to work with the complex type:

func: (x : std::optional<std::variant<*int, *double, *std::string>>) = {
  if x.has_value() && x*.index() == 0 && std::get<0>(x*) != nullptr {
    std::cout << ((std::get<0>(x*)*) + 1) << std::endl;
  }
}

I wrote this function based only on its type. Is it correct (assuming I didn't make any mistake in the code)? If the answer is yes, then the generic version of the function can do the same (assuming it will do the recursive job):

func: (x) = {
  if x is int {
    std::cout << ((x as int) + 1) << std::endl;
  }
}

And it could do even more! E.g. it will handle cases when std::variant might contain multiple *int types or other types that might contain int (that will work as the variant can only contain one type at the time so inspecting the type that it currently contains is unambiguous.

The same job it will do for the std::variant<std::variant<int,...>, ...> as from the logical perspective, these types can contain only one type at a time, so it is just a matter of inspection of the type to check if it includes the type in question and later extracting the type.

My thinking is based on the assumption that information about the type is enough to write correct code that checks if the extraction is possible and then extracts the type. If it is true, I believe that recursive is() and as() can heavily simplify the generic code, ensuring all the checks are in place.

Taking your example of future<future<int>>, what is the correct code we should write? If the following code is correct:

// This is the incorrect code - do not use it
func: (x) = {
  if (x.wait_for(0s) == future_status::ready) && (x.get().wait_for(0s) == future_status::ready) {
    std::cout << ((x.get().get()) + 1) << std::endl;
  }
}

Then, I believe it will be handled correctly. If it is incorrect code (and because std::future::get() returns T by value, I believe it is incorrect), then we will have an issue with this code. On the other hand, writing is() to do that kind of extract is impossible as std::future::get() is a non-const method, which will be forbidden in is() as we work there on the const type.

I will need to write the tests for that, but as for my current understanding, the generic function

func: (x) = {
  if x is int {
    std::cout << ((x as int) + 1) << std::endl;
  }
}

will not compile for future<future<int>> as it requires a "const-inspection" of the type, which in this case is not possible.

@hsutter
Copy link
Owner

hsutter commented Aug 15, 2024

After sleeping on this a bit: I think the recursive "flattening" feature is useful for the dynamic types (variant, any, optional), and maybe multilevel pointers, and I'm open to trying the experiment of making it the default! That would let us see if anyone reports surprises. (*)

The reason that we extract int from variant<int, ...> is because variant is a dynamic type that can hold one alternative at a time. So I don't view that as magical; it's part of variant being dynamic typing. It's analogous to a *Base holding either a *Derived1 or *Derived2 etc., exactly one at a time.

For the dynamic types (variant, any, optional) I could see recursing to "flatten" those, with the principled reason that all of them only contain one value at a time, so nesting them still means the outer object holds one value at a time. Then that isn't magical either, it's principled; it still contains only one thing, and that is an int or it isn't.

That principle may hold even for multilevel pointers, because although *Base and **Base and ***Base refer to different objects, they still refer to one Base-derived object at a time if you just strip of all the indirections, so it first the "it's exactly one concrete type at a time" principle.

So yes, okay, let's try it. Thanks!

(*) Later, if there are surprises, we could then look at making it opt-in. The solution I'm familiar with in other languages is to have an explicit .flatten() call that requests the recursion take place. In Cpp2, that could be written as a nonmember flatten: (variant<....>), flatten: (any), etc. set of overloads in cpp2util.h that thanks to UFCS will work as x.flatten() for any suitable type x (and possibly have a generic default case where flatten (x) is a no-op). And then something like x.flatten() is int to express where flattening is wanted.

@filipsajdak
Copy link
Contributor Author

@hsutter OK, I will prepare a separate PR for the recursive "flattening" feature. I have been using it for a while, and the only surprise I had was that I needed to pay more attention to the order of inspect checks.

In this one, I will focus on refactoring is()/as() with variant (plus fixes and removing limits for a number of types in variant).

I need to sync the tests, and the PR will be ready.

After this PR, I will do a refactoring for is()/as() for std::optional, std::any, and pointers. That will be a good base for introducing the recursive "flattening" feature.

@filipsajdak filipsajdak force-pushed the fsajdak-extension-of-is-part3 branch from 282484b to 7c06774 Compare August 18, 2024 00:33
@filipsajdak filipsajdak force-pushed the fsajdak-extension-of-is-part3 branch from 992254c to f5c3f97 Compare August 20, 2024 19:39
@filipsajdak
Copy link
Contributor Author

Please merge #1250 before this PR - I will need to rebase after #1250 is merged, but it contains good corrections of the failing tests.

@hsutter
Copy link
Owner

hsutter commented Aug 27, 2024

Please merge #1250 before this PR - I will need to rebase after #1250 is merged, but it contains good corrections of the failing tests.

#1250 is now merged 👍

@filipsajdak
Copy link
Contributor Author

I will prepare this PR.

@filipsajdak filipsajdak force-pushed the fsajdak-extension-of-is-part3 branch from 12c8a24 to 330e225 Compare August 27, 2024 20:14
@filipsajdak filipsajdak force-pushed the fsajdak-extension-of-is-part3 branch from def893c to 6ffde34 Compare August 27, 2024 21:49
@filipsajdak
Copy link
Contributor Author

@hsutter, the PR is ready!

@filipsajdak
Copy link
Contributor Author

After this PR is merged the next PR to be reviewed and merged is #1251

@hsutter hsutter merged commit 78867f8 into hsutter:main Aug 28, 2024
29 checks passed
@hsutter
Copy link
Owner

hsutter commented Aug 28, 2024

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants