Skip to content

C++: IR generation for new and new[] #82

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Aug 23, 2018

Conversation

dave-bartolomeo
Copy link
Contributor

These expressions are a little trickier than most because they include an implicit call to an allocator function. The database tells us which function to call, but we have to synthesize the allocation size and alignment arguments ourselves. The alignment argument, if it exists, is always a constant, but the size argument requires multiplication by the element count for most NewArrayExprs. I introduced the new TranslatedAllocationSize class to handle this.

I also factored the common bits of NewExpr and NewArrayExpr into a common base class NewOrNewArrayExpr in a separate commit.

This PR is rebased on top of #72. Once that's merged, I'll rebase this one to eliminate the extra commits.

@dave-bartolomeo dave-bartolomeo added C++ WIP This is a work-in-progress, do not merge yet! labels Aug 20, 2018
result = getAllocatorCall().getArgument(1) or
// Otherwise, the alignment winds up as child number 3 of the `new`
// itself.
result = getChild(3)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that if getAllocatorCall() has a result, then getChild(3) has no result?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right.

* AST as an `Expr` (`TranslatedExpr`) or was synthesized from something other
* than an `Expr` .
*/
abstract class TranslatedOrSynthesizedExpr extends TranslatedElement {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This refactoring introduces a lot of complication into a file that's already very complicated, so I'd like to understand the need. What makes a "synthesized" element different from a "translated" one? The comment suggests that it might be synthesized from something other than an Expr, but AFAICT all classes in this file still have an Expr result from getAST. What would be the problem with using the existing mechanisms for turning one AST element into multiple Instructions? The end result of all this is that the new-expression effectively turns into instructions for computing sizes, calling the allocator, and calling the constructor, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making me think harder about this. Having all of the instructions generated from the TranslatedNewArrayExpr itself makes for a few really complicated predicates, since you have to get the instruction successor relation right in the presence of a call to an allocator function (with its argument list) and the initializer, plus in the future the code to call the deallocator if the initialization throws an exception. Separating the work into multiple objects worked well for the parameter list in a function definition and for initializer lists, and I think it makes things easier here as well.

That said, the whole TranslatedFromExpr/TranslatedOrSynthesizedExpr distinction was unnecessary and overcomplicated. I've pushed another commit that makes everything a bit more sane: Everything is a TranslatedExpr, and each Expr has exactly one TranslatedCoreExpr, and optional TranslatedLoad, and zero or more other TranslatedExprs for complicated subcomponents like allocator calls. See the commit message and the comments I added for more details.

result = getInstruction(CallTag())
}

abstract class TranslatedCall extends TranslatedOrSynthesizedExpr {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the generalisation of TranslatedCall done only to share a bit of code between the old TranslatedCall and the new TranslatedAllocatorCall? Did it end up sharing enough code to justify these additional abstractions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are about 10 methods in the new abstract TranslatedCall, not counting the small number of abstract methods I had to introduce. The important parts of the shared code are the ones that handle the ordering of the children, between the qualifier, the call target, and the argument list. I think this was worth it.

These expressions are a little trickier than most because they include an implicit call to an allocator function. The database tells us which function to call, but we have to synthesize the allocation size and alignment arguments ourselves. The alignment argument, if it exists, is always a constant, but the size argument requires multiplication by the element count for most `NewArrayExpr`s. I introduced the new `TranslatedAllocationSize` class to handle this.
I introduced some unnecessary base classes in the `TranslatedExpr` hierarchy with a previous commit. This commit refactors the hierarchy a bit to align with the following high-level description:
`TranslatedExpr` represents a translated piece of an `Expr`. Each `Expr` has exactly one `TranslatedCoreExpr`, which produces the result of that `Expr` ignoring any lvalue-to-rvalue conversion on its result. If an lvalue-to-rvalue converison is present, there is an additional `TranslatedLoad` for that `Expr` to do the conversion. For higher-level `Expr`s like `NewExpr`, there can also be additional `TranslatedExpr`s to represent the sub-operations within the overall `Expr`, such as the allocator call.
@dave-bartolomeo dave-bartolomeo removed the WIP This is a work-in-progress, do not merge yet! label Aug 23, 2018
Copy link
Contributor

@jbj jbj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is impressive work. One comment.

(
tag = CallTargetTag() and
opcode instanceof Opcode::FunctionAddress and
resultType instanceof BoolType and //HACK
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please elaborate on what the problem is and why BoolType is the best choice here. Is it because void* is not always in a db? We do have void in every db, so maybe use that and set isGLValue = true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to glval<Unknown>. I didn't want to use void because there's no such thing as a glvalue of type void in C++, and checking for instr.getResultType() instanceof VoidType is the current way to determine if an instruction returns a result at all.

Also shared some code between `TranslatedFunctionCall` and `TranslatedAllocatorCall`, and fixed dumps of glval<Unknown> to not print the size.
@jbj jbj merged commit 58e993e into github:master Aug 23, 2018
@dave-bartolomeo dave-bartolomeo deleted the dave/NewDelete2 branch September 5, 2018 18:49
aibaars added a commit that referenced this pull request Oct 14, 2021
Add AST library for control expressions (conditionals and loops)
smowton pushed a commit to smowton/codeql that referenced this pull request Dec 6, 2021
erik-krogh pushed a commit to erik-krogh/ql that referenced this pull request Dec 15, 2021
erik-krogh pushed a commit to erik-krogh/ql that referenced this pull request Dec 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants