Data flow: Cache `TNodeEx` #17300

hvitved · 2024-08-26T12:03:06Z

This PR moves the TNodeEx newtype from DataFlowImpl.qll into DataFlowImplCommon.qll (and makes it cached). The purpose is to avoid creating an additional newtype for every instantiation of the data flow library. As the DCA runs show, this has a very positive impact on memory/cache usage (and, in general, also a modest impact on timing).

Before moving the newtype, the first commit removes the Boolean hasRead column from TNodeImplicitRead, in order to reduce the size of the newtype when later caching it. This commit also adjusts the generated edges relation so that it never reveals the implicit reads that happen at sinks.

Commit-by-commit review is suggested.

geoffw0

CPP, Swift changes LGTM. DCA looks fine as well, though performance-wise I can only detect wobble.

We've lost some duplicate nodes in Swift, which is probably a good thing assuming it was expected.

aschackmull · 2024-09-05T12:39:15Z

Two comments so far:

- I don't think it's necessary to cut down the size of NodeExImpl based on the configuration - simply restricting the read step relation instead ought to achieve the same and we avoid both scanning the NodeExImpl relation to create the reduced type and the risk of the compiler inserting hard-to-eliminate type-checks.
- There's a bug in relation to provenance: When we merge the last two steps before a sink where the final step is an implicit read, and both steps have non-empty provenance, then we lose the sink-provenance.

github-actions bot added the DataFlow Library label Aug 26, 2024

hvitved force-pushed the dataflow/node-ex-cached branch from 9d68c95 to 140d0e2 Compare August 26, 2024 13:17

github-actions bot added C# Ruby labels Aug 26, 2024

hvitved force-pushed the dataflow/node-ex-cached branch from 35caf3c to 1b90332 Compare August 26, 2024 19:18

github-actions bot removed C# Ruby labels Aug 26, 2024

hvitved force-pushed the dataflow/node-ex-cached branch from 1b90332 to d7bb558 Compare August 27, 2024 07:19

github-actions bot added the C++ label Aug 27, 2024

hvitved force-pushed the dataflow/node-ex-cached branch from d7bb558 to cdf136f Compare August 27, 2024 08:30

github-actions bot added Java Ruby Swift labels Aug 27, 2024

hvitved force-pushed the dataflow/node-ex-cached branch 6 times, most recently from 2ef64aa to 36c92e2 Compare September 2, 2024 09:10

hvitved added 3 commits September 2, 2024 12:46

Data flow: Remove Boolean column from TNodeImplicitRead

ffade2d

Update expected test output

e064564

Data flow: Cache TNodeEx

cb57b7f

hvitved force-pushed the dataflow/node-ex-cached branch from 36c92e2 to cb57b7f Compare September 2, 2024 10:47

hvitved added the no-change-note-required This PR does not need a change note label Sep 3, 2024

hvitved marked this pull request as ready for review September 3, 2024 08:31

hvitved requested review from a team as code owners September 3, 2024 08:31

geoffw0 reviewed Sep 3, 2024
