C++: IR generation for `new` and `new[]` #82

dave-bartolomeo · 2018-08-20T17:08:05Z

These expressions are a little trickier than most because they include an implicit call to an allocator function. The database tells us which function to call, but we have to synthesize the allocation size and alignment arguments ourselves. The alignment argument, if it exists, is always a constant, but the size argument requires multiplication by the element count for most NewArrayExprs. I introduced the new TranslatedAllocationSize class to handle this.
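As a rough illustration of the size and alignment arguments described above, here is a minimal C++ sketch of what an array `new` implicitly asks the allocator for. The helper names `allocationSize`, `allocationAlignment`, and `allocateRawArray` are hypothetical and are not part of this PR or of the QL libraries. The sketch ignores any array cookie the implementation may add and omits the overflow check a real `new[]` performs.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical helper: the size argument for `new T[count]`, i.e. the element
// size multiplied by the element count (array cookie and overflow check omitted).
template <typename T>
constexpr std::size_t allocationSize(std::size_t count) {
  return sizeof(T) * count;
}

// Hypothetical helper: the alignment argument. It is a compile-time constant
// and only matters when an aligned allocator overload is selected.
template <typename T>
constexpr std::size_t allocationAlignment() {
  return alignof(T);
}

// Hypothetical helper: spells out the implicit allocator call that
// `new T[count]` makes for a type without extended alignment.
// The caller releases the memory with `::operator delete[]`.
template <typename T>
void* allocateRawArray(std::size_t count) {
  return ::operator new[](allocationSize<T>(count));
}

// An over-aligned type, for which the alignment argument is relevant.
struct alignas(64) Wide {
  char data[64];
};
```

For a single-object `new T`, the size argument is just `sizeof(T)`, so no multiplication is needed.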

I also factored the common bits of NewExpr and NewArrayExpr into a common base class NewOrNewArrayExpr in a separate commit.

This PR is rebased on top of #72. Once that's merged, I'll rebase this one to eliminate the extra commits.

jbj · 2018-08-21T07:42:32Z

cpp/ql/src/semmle/code/cpp/exprs/Expr.qll

+      result = getAllocatorCall().getArgument(1) or
+      // Otherwise, the alignment winds up as child number 3 of the `new`
+      // itself.
+      result = getChild(3)


I assume that if getAllocatorCall() has a result, then getChild(3) has no result?

That's right.

jbj · 2018-08-21T08:52:28Z

cpp/ql/src/semmle/code/cpp/ir/internal/TranslatedExpr.qll

+ * AST as an `Expr` (`TranslatedExpr`) or was synthesized from something other
+ * than an `Expr` .
+ */
+abstract class TranslatedOrSynthesizedExpr extends TranslatedElement {


This refactoring introduces a lot of complication into a file that's already very complicated, so I'd like to understand the need. What makes a "synthesized" element different from a "translated" one? The comment suggests that it might be synthesized from something other than an Expr, but AFAICT all classes in this file still have an Expr result from getAST. What would be the problem with using the existing mechanisms for turning one AST element into multiple Instructions? The end result of all this is that the new-expression effectively turns into instructions for computing sizes, calling the allocator, and calling the constructor, right?

Thanks for making me think harder about this. Having all of the instructions generated from the TranslatedNewArrayExpr itself makes for a few really complicated predicates, since you have to get the instruction successor relation right in the presence of a call to an allocator function (with its argument list) and the initializer, plus in the future the code to call the deallocator if the initialization throws an exception. Separating the work into multiple objects worked well for the parameter list in a function definition and for initializer lists, and I think it makes things easier here as well.

That said, the whole TranslatedFromExpr/TranslatedOrSynthesizedExpr distinction was unnecessary and overcomplicated. I've pushed another commit that makes everything a bit more sane: everything is a TranslatedExpr, and each Expr has exactly one TranslatedCoreExpr, an optional TranslatedLoad, and zero or more other TranslatedExprs for complicated subcomponents like allocator calls. See the commit message and the comments I added for more details.
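To make the ordering concern in this reply concrete, the following hand-written C++ sketch approximates the steps a `new` expression performs at the language level: compute the size, call the allocator, run the initializer, and deallocate if the initializer throws. It illustrates C++ semantics only and is not the IR this PR generates (the reply notes that the deallocator call is future work). `Widget`, `steps`, and `lowerNewWidget` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical event log so the order of the steps is observable.
inline std::vector<std::string>& steps() {
  static std::vector<std::string> log;
  return log;
}

// Hypothetical class whose constructor can be made to throw.
struct Widget {
  explicit Widget(bool fail) {
    steps().push_back("construct");
    if (fail) throw std::runtime_error("constructor failed");
  }
};

// Hand-written approximation of what `new Widget(fail)` does.
inline Widget* lowerNewWidget(bool fail) {
  steps().push_back("size");
  std::size_t size = sizeof(Widget);     // 1. synthesize the size argument
  steps().push_back("allocate");
  void* memory = ::operator new(size);   // 2. call the allocator function
  try {
    return ::new (memory) Widget(fail);  // 3. run the initializer
  } catch (...) {
    steps().push_back("deallocate");
    ::operator delete(memory);           // 4. free the memory if step 3 throws
    throw;
  }
}
```

Each numbered step has its own successor edges in a control-flow representation, which is why splitting the work across several translated elements keeps the individual predicates smaller.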

jbj · 2018-08-21T08:53:58Z

cpp/ql/src/semmle/code/cpp/ir/internal/TranslatedExpr.qll

-    result = getInstruction(CallTag())
-  }
-
+abstract class TranslatedCall extends TranslatedOrSynthesizedExpr {


Is the generalisation of TranslatedCall done only to share a bit of code between the old TranslatedCall and the new TranslatedAllocatorCall? Did it end up sharing enough code to justify these additional abstractions?

There are about 10 methods in the new abstract TranslatedCall, not counting the small number of abstract methods I had to introduce. The important parts of the shared code are the ones that handle the ordering of the children, between the qualifier, the call target, and the argument list. I think this was worth it.

I introduced some unnecessary base classes in the `TranslatedExpr` hierarchy with a previous commit. This commit refactors the hierarchy a bit to align with the following high-level description: `TranslatedExpr` represents a translated piece of an `Expr`. Each `Expr` has exactly one `TranslatedCoreExpr`, which produces the result of that `Expr` ignoring any lvalue-to-rvalue conversion on its result. If an lvalue-to-rvalue conversion is present, there is an additional `TranslatedLoad` for that `Expr` to do the conversion. For higher-level `Expr`s like `NewExpr`, there can also be additional `TranslatedExpr`s to represent the sub-operations within the overall `Expr`, such as the allocator call.
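The description above can be pictured with a toy C++ model. This is only a sketch of the counting rule (one core piece, an optional load, extra pieces for sub-operations); the real hierarchy is written in QL, and every name here (`PieceKind`, `ExprInfo`, `translatedPiecesFor`) is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the kinds of translated pieces named above.
enum class PieceKind { Core, Load, AllocatorCall };

// Hypothetical summary of the facts about an expression that matter here.
struct ExprInfo {
  bool hasLvalueToRvalueConversion;
  bool isNewExpr;
};

// Returns the pieces an expression would be split into under the description:
// exactly one core piece, a load if a conversion is present, and an extra
// piece for the allocator call of a `new` expression.
inline std::vector<PieceKind> translatedPiecesFor(const ExprInfo& expr) {
  std::vector<PieceKind> pieces;
  pieces.push_back(PieceKind::Core);
  if (expr.hasLvalueToRvalueConversion) pieces.push_back(PieceKind::Load);
  if (expr.isNewExpr) pieces.push_back(PieceKind::AllocatorCall);
  return pieces;
}

// Counts how many pieces of a given kind are present.
inline std::size_t countKind(const std::vector<PieceKind>& pieces, PieceKind kind) {
  std::size_t n = 0;
  for (PieceKind p : pieces) {
    if (p == kind) ++n;
  }
  return n;
}
```

Under this model, a variable read with an lvalue-to-rvalue conversion yields a core piece plus a load, while a `new` expression yields a core piece plus an allocator-call piece.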

jbj

This is impressive work. One comment.

jbj · 2018-08-23T13:12:33Z

cpp/ql/src/semmle/code/cpp/ir/internal/TranslatedExpr.qll

+    (
+      tag = CallTargetTag() and
+      opcode instanceof Opcode::FunctionAddress and
+      resultType instanceof BoolType and //HACK


Please elaborate on what the problem is and why BoolType is the best choice here. Is it because void* is not always in a db? We do have void in every db, so maybe use that and set isGLValue = true.

Changed to glval<Unknown>. I didn't want to use void because there's no such thing as a glvalue of type void in C++, and checking for instr.getResultType() instanceof VoidType is the current way to determine if an instruction returns a result at all.

Also shared some code between `TranslatedFunctionCall` and `TranslatedAllocatorCall`, and fixed dumps of glval<Unknown> to not print the size.

dave-bartolomeo added C++ WIP This is a work-in-progress, do not merge yet! labels Aug 20, 2018

dave-bartolomeo assigned jbj Aug 20, 2018

jbj reviewed Aug 21, 2018

dave-bartolomeo added 2 commits August 21, 2018 11:10

Create common base class for NewExpr and NewArrayExpr

07c08f8

dave-bartolomeo force-pushed the dave/NewDelete2 branch from e9db15f to b9a8293 Compare August 22, 2018 17:49

dave-bartolomeo removed the WIP This is a work-in-progress, do not merge yet! label Aug 23, 2018

jbj reviewed Aug 23, 2018

C++: Use glval<Unknown> as type of call target

72e7235

jbj approved these changes Aug 23, 2018

jbj merged commit 58e993e into github:master Aug 23, 2018

dave-bartolomeo deleted the dave/NewDelete2 branch September 5, 2018 18:49

kamarcum unassigned jbj Apr 28, 2020
