Fix removing parentheses when using deducing assignment, closes #327 #331
Thanks! I'm not sure this is the right fix yet. Two things:
I see that this change fixes that, but it doesn't fix
which still have incorrect code gen with this fix. (Thanks for pointing this out!)
I see that those differences are that code gen like this today
with this fix becomes instead
and we should be avoiding generating extra parens (they're often innocuous, but I'm concerned about ending up hitting the comma operator). |
Will work on that. |
The issue is caused by `emit(expression_list_node)`, which skips parentheses when the node is inside an initializer. That serves cases like:

```cpp
v : std::vector<int> = (1,2,3);
```

which generates:

```cpp
std::vector<int> v{1,2,3};
```

When `:=` is used in cases like the following:

```cpp
d := (1 + 2) * (3 + 4) * (5 + 6);
```

it removes the first pair of parentheses, and we end up with:

```cpp
auto d = {1 + 2 * (3 + 4) * ( 5 + 6)};
```

This change corrects this behaviour on the parsing side. After parsing an expression list, it checks whether the next lexeme is `Semicolon`. If it is, we are in an initializer of the form:

```cpp
d1 := ((2 + 1) * (4 - 1) * (8 - 3));
d3 := (move d2);
v : std::vector<int> = ();
```

and we can suppress printing of the parentheses, as there will be braces:

```cpp
auto d1 {(2 + 1) * (4 - 1) * (8 - 3)};
auto d3 {std::move(d2)};
std::vector<int> v {};
```

When the next lexeme is not `Semicolon`, we are in an initializer of the form:

```cpp
d2 := (2 + 1) * (4 - 1) * (8 - 3);
d4 : _ = (1 + 2) * (3 + 4) * (5 + 6);
d5 : int = (1 + 2) * (3 + 4) * (5 + 6);
```

Here we need to keep all the parentheses, and the generated code is:

```cpp
auto d2 {(2 + 1) * (4 - 1) * (8 - 3)};
auto d4 {(1 + 2) * (3 + 4) * (5 + 6)};
int d5 {(1 + 2) * (3 + 4) * (5 + 6)};
```
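To illustrate why dropping only the first pair of parentheses is more than a cosmetic problem, here is a minimal sketch in plain C++. The function names `intended` and `with_dropped_parens` are hypothetical and exist only for this illustration; they are not part of cppfront.

```cpp
// Value of the initializer as written in the Cpp2 source:
//   d := (1 + 2) * (3 + 4) * (5 + 6);
constexpr int intended() { return (1 + 2) * (3 + 4) * (5 + 6); }

// Value of the expression once the first pair of parentheses is dropped.
// Multiplication binds tighter than addition, so this is 1 + (2 * 7 * 11).
constexpr int with_dropped_parens() { return 1 + 2 * (3 + 4) * (5 + 6); }

// The two expressions differ, so the rewrite changes program meaning.
static_assert(intended() == 231, "3 * 7 * 11");
static_assert(with_dropped_parens() == 155, "1 + 2 * 7 * 11");
static_assert(intended() != with_dropped_parens(), "dropping parens changes the result");
```

Because the checks are `static_assert`s, the sketch verifies itself at compile time.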
@hsutter I found the solution, and the current proposed solution solves all the cases. There is only one difference in generated code, in:

    res := (42).ufcs();

After this change it generates:

    auto res {CPP2_UFCS_0(ufcs, (42))}; // previously: auto res {CPP2_UFCS_0(ufcs, 42)};

This is correct behavior, as the parentheses are not required in this context, so they should be left. I have checked the following code:

    r1 := 42.ufcs();
    r2 := (42).ufcs();

That will generate after this change:

    auto r1 {CPP2_UFCS_0(ufcs, 42)};
    auto r2 {CPP2_UFCS_0(ufcs, (42))};

I have updated the description of this PR.
There is one issue left: shall we rename |
Thanks! Trying it out now... BTW, in case I wasn't clear in my comment about hitting the comma operator, I meant that when there are multiple parameters, I want to avoid accidentally hitting a case where a list of multiple things like
OK, looks good -- including that now the one change in the code gen is actually a correction, that's always a good sign. Thanks! Merging...
What would be a better name, |
a : type = ( /* any complex expression */ ); So, it is more about the initializer being surrounded entirely by parentheses.
@hsutter I have just realized that we have anonymous objects that we can create in place, e.g. as a function argument. I need to check, but my change has probably broken the following code: fun(:std::vector=(1,2,3));
Sorry, this change broke: fun(:std::vector=(1,2,3)); It now generates: fun(std::vector{(1, 2, 3)}); Looking for a fix... and we need to add a regression test for in-place creation of objects.
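For readers wondering why `std::vector{(1, 2, 3)}` is a real regression and not only extra punctuation, the following is a small sketch in plain C++. The helper names are hypothetical, and the element type is spelled out as `std::vector<int>` here (an assumption made for portability; the generated code above relies on deduction).

```cpp
#include <cassert>
#include <vector>

// Braces only: the initializer list has three elements.
inline std::vector<int> braces_only() {
    return std::vector<int>{1, 2, 3};
}

// Parentheses inside braces: (1, 2, 3) is a single expression using the
// built-in comma operator. It evaluates to its last operand, 3, so the
// vector ends up with exactly one element. Compilers may warn about the
// unused left operands, which is itself a hint that something is off.
inline std::vector<int> parens_inside_braces() {
    return std::vector<int>{(1, 2, 3)};
}

// Small self-check that documents the difference.
inline void self_check() {
    assert(braces_only().size() == 3);
    assert(parens_inside_braces().size() == 1);
    assert(parens_inside_braces().front() == 3);
}
```

This is the comma-operator hazard mentioned earlier in the conversation: the extra parentheses silently turn a three-element list into a one-element list.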
@hsutter OK, one way to fix is to replace (in

    if (curr().type() != lexeme::Semicolon) {
        expr_list->inside_initializer = false;
    }

with

    if (
        curr().type() != lexeme::Semicolon
        && curr().type() != lexeme::RightParen
        && curr().type() != lexeme::RightBracket
        && curr().type() != lexeme::Comma
    ) {
        expr_list->inside_initializer = false;
    }

That should cover cases:

    f1(:std::vector=(1,2,3));
    f2(:std::vector=(1,2,3),:std::vector=(4,5));
    ar[:index=(1,2)];
    ar2[:index=(1,2), :index=(2,1)];

that will generate:

    f1(std::vector{1, 2, 3});
    f2(std::vector{1, 2, 3}, std::vector{4, 5});
    cpp2::assert_in_bounds(ar, index{1, 2});
    cpp2::assert_in_bounds(ar2, index{1, 2}, index{2, 1});

Are there any other cases that we should cover for objects created in place?
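The proposed condition can be restated as a small standalone predicate. This is only a sketch under stated assumptions: the `lexeme` enum below is a reduced, hypothetical stand-in for the parser's real lexeme type (only `Semicolon`, `RightParen`, `RightBracket` and `Comma` are taken from the comment above; `Multiply`, `Dot` and `Other` are placeholders), and `keeps_inside_initializer` is an illustrative name, not a cppfront function.

```cpp
// Hypothetical, reduced stand-in for the parser's lexeme kinds.
enum class lexeme { Semicolon, RightParen, RightBracket, Comma, Multiply, Dot, Other };

// True when the token following a parenthesized expression list indicates
// that the list was the entire initializer, so its parentheses can be
// replaced by braces. For any other token the real parser would set
// expr_list->inside_initializer = false and keep the parentheses.
constexpr bool keeps_inside_initializer(lexeme next) {
    return next == lexeme::Semicolon      // d1 := ( ... );
        || next == lexeme::RightParen     // f1(:std::vector=(1,2,3))
        || next == lexeme::RightBracket   // ar[:index=(1,2)]
        || next == lexeme::Comma;         // f2(:std::vector=(1,2,3), ...)
}

static_assert(keeps_inside_initializer(lexeme::Semicolon), "whole initializer");
static_assert(keeps_inside_initializer(lexeme::Comma), "in-place argument");
static_assert(!keeps_inside_initializer(lexeme::Multiply), "(1 + 2) * ... keeps parens");
static_assert(!keeps_inside_initializer(lexeme::Dot), "(42).ufcs() keeps parens");
```

Expressing the check as an allow-list of terminating tokens means any token not listed keeps the parentheses, which is the safer default for code generation.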
Closes #327. All regression tests pass.

There is only one difference in generated code, in the `pure2-ufcs-member-access-and-chaining.cpp2` test. The line `res := (42).ufcs();` generates `auto res {CPP2_UFCS_0(ufcs, (42))};` after this change (previously: `auto res {CPP2_UFCS_0(ufcs, 42)};`).

This is correct behavior, as the parentheses are not required in this context, so they should be left. I have checked that `r1 := 42.ufcs();` and `r2 := (42).ufcs();` generate `auto r1 {CPP2_UFCS_0(ufcs, 42)};` and `auto r2 {CPP2_UFCS_0(ufcs, (42))};` after this change.