Skip to content

Ruby: Add post-update nodes for compound arguments #10444

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Sep 22, 2022

Conversation

hvitved
Copy link
Contributor

@hvitved hvitved commented Sep 15, 2022

Post-update nodes

Post-update nodes are used to enable flow out through arguments:

def taint_field x
  x.field = "taint"
end

taint_field(y)
sink(y.field)

There is flow from "taint" to the field field on post-update node for y in the call to taint_field, written (post) y [field], and from (post) y [field] in the call to taint_field to y [field] in the call to sink, and finally to y.field in the call to sink.

Compound arguments

However, when the argument is a compound expression, such as

taint_field(if b then x else y)
sink(y.field)

we fail to track flow, since there is no connection between (post) if b then x else y and the access to y in sink(y.field).

This PR fixes that, by also adding post-update nodes for x and y in if b then x else y, and let those nodes have two pre-update nodes:

  1. the compound argument, if b then x else y; and
  2. the underlying expressions, x and y, respectively.

This ensures that we get flow out of the call into both leafs (1), while still maintaining the invariant that the underlying expression is a pre-update node (2). Of course, compound expression can be nested, if b then x else if c then y else z, so we also need to handle that.

Common in Ruby (?)

While compound arguments such as if b then x else y may not be that common, it is easy to write an argument that is:

taint_field(x)
# vs
taint_field (x)

In the former call, x is just a normal argument, but in the latter the argument is a parenthesized expression with just one child expression, x.

@github-actions github-actions bot added the Ruby label Sep 15, 2022
@hvitved hvitved force-pushed the ruby/stmt-sequence-post-update branch 2 times, most recently from 9d4c1be to c356173 Compare September 16, 2022 11:47
@hvitved hvitved force-pushed the ruby/stmt-sequence-post-update branch from c356173 to 24b41ef Compare September 20, 2022 06:21
@hvitved hvitved changed the title Ruby/stmt sequence post update Ruby: Add post-update nodes for compound arguments Sep 20, 2022
@hvitved hvitved added the no-change-note-required This PR does not need a change note label Sep 20, 2022
@hvitved hvitved marked this pull request as ready for review September 20, 2022 11:44
@hvitved hvitved requested review from a team as code owners September 20, 2022 11:44
@tausbn
Copy link
Contributor

tausbn commented Sep 20, 2022

While compound arguments such as if b then x else y may not be that common, it is easy to write an argument that is:

taint_field(x)
# vs
taint_field (x)

In the former call, x is just a normal argument, but in the latter the argument is a parenthesized expression with just one child expression, x.

Fascinating, and a bit horrifying.

I wonder, though, could this be handled at the level of AST desugaring?

tausbn
tausbn previously approved these changes Sep 20, 2022
Copy link
Contributor

@tausbn tausbn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python changes look, well, harmless to me. 🙂

MathiasVP
MathiasVP previously approved these changes Sep 20, 2022
Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C++ 👍. I just realized that I forgot to update identical-files.json when I added a couple of copies of the dataflow library into experimental last week. So if #10487 gets merged before this you might need to pull in main and update the identical files in C++'s experimental directory. Sorry!

@hvitved
Copy link
Contributor Author

hvitved commented Sep 20, 2022

I wonder, though, could this be handled at the level of AST desugaring?

Yes, we could in principle desugar (x) to just x, but since we would have to handle (foo; x) as well anyway, it seems a bit pointless to special case.

@tausbn
Copy link
Contributor

tausbn commented Sep 20, 2022

I wonder, though, could this be handled at the level of AST desugaring?

Yes, we could in principle desugar (x) to just x, but since we would have to handle (foo; x) as well anyway, it seems a bit pointless to special case.

For this case, definitely. I was just thinking more generally if it would be less surprising for users to have foo (x) represented in the same way as foo(x).

@hvitved
Copy link
Contributor Author

hvitved commented Sep 20, 2022

I was just thinking more generally if it would be less surprising for users to have foo (x) represented in the same way as foo(x).

So, desugar (x) to x when it is the only argument of a call? I guess we could do that.

atorralba
atorralba previously approved these changes Sep 20, 2022
Copy link
Contributor

@atorralba atorralba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Java looks harmless too 👍

michaelnebel
michaelnebel previously approved these changes Sep 21, 2022
Copy link
Contributor

@michaelnebel michaelnebel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C# LGTM

@hvitved hvitved dismissed stale reviews from michaelnebel and atorralba via 017e8de September 21, 2022 08:36
@hvitved hvitved dismissed stale reviews from MathiasVP and tausbn via 017e8de September 21, 2022 08:36
@hvitved hvitved force-pushed the ruby/stmt-sequence-post-update branch from a36c3f5 to 017e8de Compare September 21, 2022 08:36
@hvitved
Copy link
Contributor Author

hvitved commented Sep 21, 2022

Rebased to resolve merge conflicts.

michaelnebel
michaelnebel previously approved these changes Sep 21, 2022
Copy link
Contributor

@michaelnebel michaelnebel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C# still looks good to me.

MathiasVP
MathiasVP previously approved these changes Sep 21, 2022
@hvitved hvitved dismissed stale reviews from MathiasVP and michaelnebel via 07f8b35 September 21, 2022 09:02
@hvitved hvitved force-pushed the ruby/stmt-sequence-post-update branch from 017e8de to 07f8b35 Compare September 21, 2022 09:02
@hvitved
Copy link
Contributor Author

hvitved commented Sep 21, 2022

Had to fix another semantic merge conflict.

Copy link
Contributor

@aibaars aibaars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

* Gets a node that may execute last in `n`, and which, when it executes last,
* will be the value of `n`.
*/
private CfgNodes::ExprCfgNode getALastEvalNode(CfgNodes::ExprCfgNode n) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When reading the comment explaining this predicate , I had expected it to be recursive. I see you take the transitive closure later.

@hvitved hvitved merged commit f0f4fe7 into github:main Sep 22, 2022
@hvitved hvitved deleted the ruby/stmt-sequence-post-update branch September 22, 2022 11:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants