Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stage2 translate-c: function body statements #2569

Merged
merged 38 commits into from
Jul 16, 2019

Conversation

hryx
Copy link
Contributor

@hryx hryx commented May 27, 2019

See #1964

Overview

This PR adds various translations that apply to statements inside function bodies.

Support translating from these Clang classes:

  • BinaryOperator
    • Assign
    • Add
    • Sub
  • CStyleCastExpr
  • DeclStmt
    • Var
  • DeclRefExpr
  • ImplicitCastExpr
    • ArrayToPointerDecay
    • BitCast
    • FunctionToPointerDecay
    • IntegralCast
    • LValueToRValue
  • IntegerLiteral
  • ReturnStmt
  • StringLiteral
  • Type
    • Paren
    • Pointer

Other changes

  • Make stage1 always render a space after fn to match stage2 behavior. This convergence was done to make existing translate-c test cases work for both stage1 and stage2. stage2 behavior was favored because the existing meaningful tests on rendering (i.e. zig fmt) pertain to stage2.
  • Separate generated Zig source tokens by a space instead of a newline to prevent the rendered Zig code from having excessive newlines/indentation.

Regressions

  • The stage2-only tests were commented out because function prototype translation changed after implementing "translate mode" for stage2. I left them commented rather than removing them because, as discussed, the concept of translate mode will be removed, at which point this test should be reinstated as-is.

TODO

  • Fix recursive rendering of local var decls
  • Render integer literal with correct value
  • Find "wanted" Zig name from C name
  • Convert several a few stage1 test cases to stage1+stage2
  • Clean up clang.zig?
    • This PR follows the existing organization, no need to change that right now
  • Have stage2 respect "translate mode"
  • Make a follow-up issue to remove "translate mode" and converge behavior

@andrewrk
Copy link
Member

BTW did you see that test/translate_c.zig supports stage2 tests now?

@hryx
Copy link
Contributor Author

hryx commented Jun 10, 2019

@andrewrk Cool, I hadn't seen that! I'll make use of that once the new changes can pass a test.

Current status: This is getting bigger than I wanted, but after a squish 'n' rebase it shouldn't cause any problems. Something I did is causing infinitely nesting blocks to be rendered when one or more local var decl exists in the function body. After that is solved I'll fast-track this to mergeability.

@hryx hryx marked this pull request as ready for review June 28, 2019 04:59
@hryx hryx changed the title stage2 translate-c: local var decls stage2 translate-c: function body statements Jun 28, 2019
@hryx
Copy link
Contributor Author

hryx commented Jun 28, 2019

Something is wrong with this test. It passes on my Linux machine but not on mac:

Test 6/119 translate-c-2 ignore result, explicit function arguments...
========= Expected this output: ================
pub fn foo() void {
    var a: c_int = undefined;
    _ = 1;
    _ = c"hey";
    _ = (1 + 1);
    _ = (1 - 1);
    a = 1;
}
========= But found: ===========================
pub fn foo() void {
    var a: c_int = undefined;
    _ = 1;
    _ = c"hey<\xb0\x7f";
    _ = (1 + 1);
    _ = (1 - 1);
    a = 1;
}

I think there may be a legitimate bug in my string literal implementation but I haven't identified it yet.

@hryx
Copy link
Contributor Author

hryx commented Jun 29, 2019

Ok @andrewrk, this should be good to review now — I expect the tests to pass Windows CI is failing, see below.

If you wouldn't mind sanity-checking that the last commit (247e567) was actually necessary?

@Sahnvour
Copy link
Contributor

Regarding the error, I was able to get more information by running debug zig, and it looks like it may be related to result location. See https://gist.github.com/Sahnvour/16d36b44e0d313a1cd520ba4b6e0eccd.

@andrewrk
Copy link
Member

Indeed, the crash appears to be a regression from result location changes, in the return statement of this code:

fn getExprQualType(c: *Context, expr: *const ZigClangExpr) ZigClangQualType {
    blk: {
        // If this is a C `char *`, turn it into a `const char *`
        if (ZigClangExpr_getStmtClass(expr) != .ImplicitCastExprClass) break :blk;
        const cast_expr = @ptrCast(*const ZigClangImplicitCastExpr, expr);
        if (ZigClangImplicitCastExpr_getCastKind(cast_expr) != .ArrayToPointerDecay) break :blk;
        const sub_expr = ZigClangImplicitCastExpr_getSubExpr(cast_expr);
        if (ZigClangExpr_getStmtClass(sub_expr) != .StringLiteralClass) break :blk;
        const array_qt = ZigClangExpr_getType(sub_expr);
        const array_type = @ptrCast(*const ZigClangArrayType, ZigClangQualType_getTypePtr(array_qt));
        var pointee_qt = ZigClangArrayType_getElementType(array_type);
        ZigClangQualType_addConst(&pointee_qt);
        return ZigClangASTContext_getPointerType(c.clang_context, pointee_qt);
    }
    return ZigClangExpr_getType(expr);
}

I'll modify the code to work around the issue and then try to come up with a reduction for the bug to fix separately.

@andrewrk
Copy link
Member

I was able to fix the issue. Here's an explanation.

First of all, why is it failing on Windows? I don't know, but we were tripping an LLVM assert on all platforms and Windows just happened to be the one where Undefined Behavior caused a crash. It would have eventually been caught regardless, when we tested the code on a debug build of LLVM, which I always do for the branch that tests against the next LLVM version. A debug build of LLVM catches this issue on all platforms, not just Windows.

What happened is that, before, we had this code:

    } else if (handle_is_ptr(src_return_type)) {
        LLVMValueRef store_instr = LLVMBuildStore(g->builder, result, result_loc);
        LLVMSetAlignment(store_instr, LLVMGetAlignment(result_loc)); // note this line
        return result_loc;
    } else {

Except before the result location branch, result_loc was instruction->tmp_ptr. And that was always an alloca instruction. Look at LLVM's implementation of the relevant function:

unsigned LLVMGetAlignment(LLVMValueRef V) {
  Value *P = unwrap<Value>(V);
  if (GlobalValue *GV = dyn_cast<GlobalValue>(P))
    return GV->getAlignment();
  if (AllocaInst *AI = dyn_cast<AllocaInst>(P))
    return AI->getAlignment();
  if (LoadInst *LI = dyn_cast<LoadInst>(P))
    return LI->getAlignment();
  if (StoreInst *SI = dyn_cast<StoreInst>(P))
    return SI->getAlignment();

  llvm_unreachable(
      "only GlobalValue, AllocaInst, LoadInst and StoreInst have alignment");
}

So you can see that it always worked before because it always hit the AllocaInst case. However, after result location changes, the result location value is not necessarily an LLVM Alloca instruction. However, it is always a pointer, and we have a Zig type for it. So the fix is to use that type information:

    } else if (handle_is_ptr(src_return_type)) {
        LLVMValueRef store_instr = LLVMBuildStore(g->builder, result, result_loc);
-        LLVMSetAlignment(store_instr, LLVMGetAlignment(result_loc));
+        LLVMSetAlignment(store_instr, get_ptr_align(g, instruction->result_loc->value.type));
        return result_loc;
    } else {

After this fix, I can build zig against debug LLVM and all the translate-c tests pass 🎉

@andrewrk andrewrk merged commit 0e38f72 into ziglang:master Jul 16, 2019
@hryx hryx deleted the translate-c-userland branch July 18, 2019 09:33
@hryx
Copy link
Contributor Author

hryx commented Jul 18, 2019

Amazing explanation of what happened!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants