Optimize ABIRewrite system for lvalues #1528

kinke · 2016-05-28T18:21:58Z

Allow ABIRewrites to return the D parameter's LL lvalue directly.
Most rewrites store to memory anyway, so let the D parameter point directly to that memory instead of a dedicated alloca bitcopy.
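
As a rough illustration of the idea (not LDC code), the sketch below models stack slots as indices into a vector and counts bitcopies. `Memory`, `ByvalStyleRewrite`, `getOld` and `getLVal` are hypothetical names invented for this example; the real interface lives in gen/abi.h and works on `llvm::Value *`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for stack memory; each slot models one alloca.
struct Memory {
  std::vector<int> slots;
  std::size_t copies = 0; // number of bitcopies performed so far

  std::size_t alloca_(int init = 0) {
    slots.push_back(init);
    return slots.size() - 1;
  }
  void copy(std::size_t dst, std::size_t src) {
    slots[dst] = slots[src];
    ++copies;
  }
};

// Toy rewrite whose LL argument is already an address (like a byval pointer).
struct ByvalStyleRewrite {
  // Old style: reconstruct the D value into a dedicated alloca (one bitcopy).
  std::size_t getOld(Memory &mem, std::size_t llArg) const {
    std::size_t tmp = mem.alloca_();
    mem.copy(tmp, llArg);
    return tmp;
  }
  // New style: the LL argument already is the lvalue, so return it as is.
  std::size_t getLVal(Memory &, std::size_t llArg) const { return llArg; }
};
```

Calling `getOld` on a slot yields a fresh slot and bumps `copies`, whereas `getLVal` hands back the same slot without any copy.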

dnadlinger · 2016-05-28T18:27:30Z

gen/abi.h

@@ -39,16 +39,17 @@ class FunctionType;
 struct ABIRewrite {
  virtual ~ABIRewrite() = default;

-  /// get a rewritten value back to its original form
-  virtual llvm::Value *get(Type *dty, llvm::Value *v) = 0;
+  /// Transforms the D argument to a suited LL argument.


"suitable"? I'm not a native speaker either, but "suited" as an adjective conjures weird images of LLVM values wearing suits in my head.

Heh yes of course ;)

dnadlinger · 2016-05-28T18:31:21Z

gen/tocall.cpp

-      LLValue *mem = DtoAlloca(returntype);
-      irFty.getRet(returntype, retllval, mem);
-      retllval = mem;
+      retllval = irFty.getRetLVal(returntype, retllval);


With your new interface, is it still necessary to have both the getRetLVal branch and the "manual" storeReturnValueOnStack branch below?

That's still a mess to sort out. As is all the repainting going on for return values and parameters...

kinke · 2016-05-28T18:32:25Z

Passes the full debug test suite on my Win64 box.

kinke · 2016-05-28T18:38:13Z

gen/llvmhelpers.cpp

-        return new DVarValue(type, getIrValue(vd));
-      }
-      if (llvm::isa<llvm::Argument>(getIrValue(vd))) {
-        return new DImValue(type, getIrValue(vd));


This took quite some time to get my attention. A delegate rewritten by ExplicitByvalRewrite (to an lvalue, a LL pointer to the delegate struct) would be bound to the parameter as DImValue, and it wouldn't get dereferenced properly later on. That was because the pointer is now the actual argument itself instead of a private alloca (so llvm::isa<llvm::Argument>() returns true).

Yeah, as I recently discovered too (cf. all the return value instruction domination issues), there are a lot of places like this that confuse IR-level semantics with whatever confused concepts of lvalues/rvalues we have on the DValue layer.
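
The pitfall described in this exchange can be reduced to a toy model. The sketch below is only an assumption about the shape of the problem; it does not reproduce the code that replaced the removed check, and `ParamStorage`, `bindByIrKind` and `bindBySemantics` are hypothetical names. `Binding::Immediate` and `Binding::LValue` loosely stand for DImValue and DVarValue.

```cpp
#include <cassert>

enum class IrKind { Alloca, Argument };
enum class Binding { Immediate, LValue }; // loosely DImValue vs. DVarValue

struct ParamStorage {
  IrKind kind;       // what kind of IR value backs the parameter
  bool holdsAddress; // true if that value is the address of the parameter
};

// Classification keyed on the IR kind, like the removed isa<Argument> check.
inline Binding bindByIrKind(const ParamStorage &p) {
  return p.kind == IrKind::Argument ? Binding::Immediate : Binding::LValue;
}

// Classification keyed on what the value means for the D parameter.
inline Binding bindBySemantics(const ParamStorage &p) {
  return p.holdsAddress ? Binding::LValue : Binding::Immediate;
}
```

For a parameter such as `ParamStorage{IrKind::Argument, true}` (a pointer argument that is itself the parameter's memory), `bindByIrKind` answers `Immediate` and the pointer is never dereferenced, while `bindBySemantics` answers `LValue`.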

kinke · 2016-05-28T19:55:14Z

It would be nice if someone could verify that the ARM ABIs still work too (@smolt, @redstar, @joakim-noah).

smolt · 2016-05-29T04:42:59Z

I am currently useless. All my stuff is in boxes as I am moving.

joakim-noah · 2016-05-29T09:35:49Z

Applied this PR to master and cross-compiled for Android/ARM; the same standard library tests pass as before.

kinke · 2016-05-29T11:57:47Z

gen/structs.cpp

@@ -168,7 +168,7 @@ void DtoPaddedStruct(Type *dty, LLValue *v, LLValue *lval) {
      // Nested structs are the only members that can contain padding
      DtoPaddedStruct(fields[i]->type, fieldval, fieldptr);
    } else {
-      DtoStore(fieldval, fieldptr);
+      DtoStoreZextI8(fieldval, fieldptr);


This bug (unable to store i1 bits) appears to have been the reason why the return value of intrinsics was never rewritten by the ABI (intrinsics ABI in this case) in gen/tocall.cpp.
The return value of intrinsics returning a LL struct (e.g., llvm.sadd.with.overflow) therefore didn't undergo the usual ABI transformation process; in this case, the LL struct was simply dumped to memory instead of going through DtoPaddedStruct() via the RemoveStructPadding rewrite which is otherwise applied to all LL structs passed to (and now also returned by) intrinsics.
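
Going by the helper's name and the comment above, the point of the change is that an i1 (such as the overflow flag in the `{ iN, i1 }` result of llvm.sadd.with.overflow) has to be widened to a byte before it is stored. A rough model in plain C++, with no LLVM involved; `Scalar`, `zextI1ToI8` and `storeZextI8` are hypothetical names made up for this sketch and are not LDC functions:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of an LL scalar: a payload plus its bit width.
struct Scalar {
  std::uint64_t bits;
  unsigned width; // 1 for an i1 flag, 8 for an i8, ...
};

// Zero-extend a 1-bit value to 8 bits; leave other widths untouched.
inline Scalar zextI1ToI8(Scalar v) {
  if (v.width == 1)
    return Scalar{v.bits & 1u, 8};
  return v;
}

// Store a scalar into byte-addressed memory, widening i1 to i8 first.
inline void storeZextI8(Scalar v, std::uint8_t *dst) {
  Scalar widened = zextI1ToI8(v);
  assert(widened.width == 8 && "this toy only handles i1 and i8");
  *dst = static_cast<std::uint8_t>(widened.bits);
}
```

Storing `Scalar{1, 1}` through `storeZextI8` leaves a full byte with value 1 in memory, with the upper seven bits cleared.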

kinke · 2016-05-29T12:25:51Z

Thanks @joakim-noah for the ARM tests.

kinke · 2016-05-29T13:01:02Z

This is the effect on the unoptimized IR for Win64 (ExplicitByvalRewrite):

struct Struct { long a, b; }

Struct foo(Struct s) { return s; }

void main()
{
    auto s = Struct(1, 2);
    foo(s);
}

// old:
define void @_D5byval3fooFS5byval6StructZS5byval6Struct(%byval.Struct* noalias sret align 8 %.sret_arg, %byval.Struct* noalias nocapture align 16 %s_arg) #0 comdat {
  %s = alloca %byval.Struct, align 8              ; [#uses = 2, size/byte = 16]
  %1 = bitcast %byval.Struct* %s to i8*           ; [#uses = 1]
  %2 = bitcast %byval.Struct* %s_arg to i8*       ; [#uses = 1]
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %1, i8* %2, i64 16, i32 1, i1 false)
  %3 = bitcast %byval.Struct* %.sret_arg to i8*   ; [#uses = 1]
  %4 = bitcast %byval.Struct* %s to i8*           ; [#uses = 1]
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %3, i8* %4, i64 16, i32 1, i1 false)
  ret void
}

// new:
define void @_D5byval3fooFS5byval6StructZS5byval6Struct(%byval.Struct* noalias sret align 8 %.sret_arg, %byval.Struct* noalias nocapture align 16 %s) #0 comdat {
  %1 = bitcast %byval.Struct* %.sret_arg to i8*   ; [#uses = 1]
  %2 = bitcast %byval.Struct* %s to i8*           ; [#uses = 1]
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %1, i8* %2, i64 16, i32 1, i1 false)
  ret void
}

// common:
define i32 @_Dmain({ i64, { i64, i8* }* } %unnamed) #0 comdat {
  %s = alloca %byval.Struct, align 8              ; [#uses = 2, size/byte = 16]
  %.structliteral = alloca %byval.Struct, align 8 ; [#uses = 3, size/byte = 16]
  %.rettmp = alloca %byval.Struct, align 8        ; [#uses = 1, size/byte = 16]
  %.ExplicitByvalRewrite_dump = alloca %byval.Struct, align 16 ; [#uses = 2, size/byte = 16]
  // initialize .structliteral with (1,2)
  // memcpy .structliteral to .s
  // memcpy .s to .ExplicitByvalRewrite_dump
  call void @_D5byval3fooFS5byval6StructZS5byval6Struct(%byval.Struct* noalias sret align 8 %.rettmp, %byval.Struct* noalias nocapture align 16 %.ExplicitByvalRewrite_dump) #0
  ret i32 0
}

This useless additional copy by the callee has been bugging me for quite a while.

smolt · 2016-05-29T15:35:10Z

For ARM, the test suite is where the ABI is tested more thoroughly. @joakim-noah, did you run the test suite as well?

joakim-noah · 2016-05-29T16:13:17Z

No, I'll try that next.

dnadlinger · 2016-05-29T18:39:56Z

I'd rather amend/revert this change afterwards if it happens to break ARM. (Merge/CI speed is currently one of our biggest bottlenecks, unfortunately.)

joakim-noah · 2016-05-29T19:56:58Z

Built and ran the dmd test suite natively on Android/ARM with this PR; the results are identical.

dnadlinger · 2016-05-29T20:30:40Z

Nice to hear, thanks!

smolt · 2016-05-30T22:48:27Z

One more plus for ARM. Raspberry Pi 3 and master (with this merged PR) passes all tests, except lit-tests don't run correctly due to my setup. I have a space in a USB HD label and that dorks up the lit command lines.

JohanEngelen · 2016-05-31T08:00:22Z

> except lit-tests don't run correctly due to my setup. I have a space in a USB HD label and that dorks up the lit command lines.

Can you share some details? Sounds like a bug in the way I set up lit.

smolt · 2016-05-31T18:13:47Z

Sure, I'll paste in the error when I get home today.

smolt · 2016-06-01T16:34:00Z

@JohanEngelen lit-test issue #1536 (which I am sure GitHub emailed you about too)

kinke force-pushed the abi_cleanup branch from 7aa3b10 to 686f118 Compare May 28, 2016 18:25

dnadlinger reviewed May 28, 2016

kinke force-pushed the abi_cleanup branch from 686f118 to cccb057 Compare May 28, 2016 18:30

dnadlinger reviewed May 28, 2016

kinke reviewed May 28, 2016

Optimize ABIRewrite system for lvalues

fc6c340

Allow ABIRewrites to return the D parameter's LL value directly. Most rewrites store to memory anyway, so let the D parameter point directly to that memory instead of a dedicated alloca bitcopy.

kinke force-pushed the abi_cleanup branch from cccb057 to fc6c340 Compare May 28, 2016 19:07

kinke added 2 commits May 29, 2016 10:41

Remove obsolete helper ABIRewrite::storeToMemory()

769dac1

Trim signature of method ABIRewrite::type()

dd21a80

Refactor defining a function's explicit parameters

0f41c0c

kinke reviewed May 29, 2016

Simplify generation of a call's return value

7778db0

kinke force-pushed the abi_cleanup branch from 82a1959 to 7778db0 Compare May 29, 2016 12:01

dnadlinger merged commit 67c0d97 into ldc-developers:master May 29, 2016

kinke deleted the abi_cleanup branch August 24, 2017 19:46
