-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dump register variables correctly #2
base: master
Are you sure you want to change the base?
Conversation
I missed this PR somehow. Is this necessary for you now? My workflow of submitting gcc patches is cumbersome, so I'm not sure how I'll handle external contribution to those patches yet. |
Not really, just a little thing. It was more of "let's finally take a look at GCC internals, shall we?", and then I got distracted again and forgot about it.
Oh, I was planning to send more while working (for example) on rust-lang/rustc_codegen_gcc#87 . Could that be a problem? |
If you send a PR that is completely independent of the other patches that should be relatively easy for me to handle it eventually. Please read the guidelines in order to not forget anything that should be done for a gcc patch. Also, if you are interested in sending your patches yourself to gcc, note that you don't need the copyright assignment anymore. |
I sent this PR here and not to the mailing list because this patch is just a little extension to your patch. I'd be super happy to transfer these poor 5 LOC to public domain so you could just attribute them to yourself.
Yeah, please help. I see GCC uses some mix of tabs and spaces that resembles "indent with tabs (width 8), pad with spaces" scheme, but sometimes it's spaces only. I'm lost. Any place this scheme is written down to?
What's copyright assignment? |
There are two scripts that I use:
and
GCC used to require that you give the copyright to the FSF and you had to sign a document. Otherwise, I can always send your patches for you, but I'd appreciate if they pass the above tests. |
DR 2352 changed the definitions of reference-related (so that it uses "similar type" instead of "same type") and of reference-compatible (use a standard conversion sequence). That means that reference-related is now more broad, which means that we will be binding more things directly. The original patch for DR 2352 caused some problems, which were fixed in r276251 by creating a "fake" ck_qual in direct_reference_binding, so that in void f(int *); // #1 void f(const int * const &); // #2 int *x; int main() { f(x); // call #1 } we call #1. The extra ck_qual in #2 causes compare_ics to select #1, which is a better match for "int *" because then we don't have to do a qualification conversion. Let's turn to the problem in this PR. We have void f(const int * const &); // #1 void f(const int *); // #2 int *x; int main() { f(x); } We arrive in compare_ics to decide which one is better. The ICS for #1 looks like ck_ref_bind <- ck_qual <- ck_identity const int *const & const int *const int * and the ICS for #2 is ck_qual <- ck_rvalue <- ck_identity const int * int * int * We strip the reference and then comp_cv_qual_signature when comparing two ck_quals sees that "const int *" is a proper subset of "const int *const" and we return -1. But that's wrong; presumably the top-level "const" should be ignored and the call should be ambiguous. This patch adjust the type of the "fake" ck_qual so that this problem doesn't arise. PR c++/97296 gcc/cp/ChangeLog: * call.cc (direct_reference_binding): strip_top_quals when creating a ck_qual. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/ref-bind4.C: Add dg-error. * g++.dg/cpp0x/ref-bind8.C: New test.
0a94141
to
17858b5
Compare
648c851
to
16686cb
Compare
Improve stack protector patterns and peephole2s even more: a. Use unrelated register clears with integer mode size <= word mode size to clear stack protector scratch register. b. Use unrelated register initializations in front of stack protector sequence to clear stack protector scratch register. c. Use unrelated register initializations using LEA instructions to clear stack protector scratch register. These stack protector improvements reuse 6914 unrelated register initializations to substitute the clear of stack protector scratch register in 12034 instances of stack protector sequence in recent linux defconfig build. gcc/ChangeLog: * config/i386/i386.md (@stack_protect_set_1_<PTR:mode>_<W:mode>): Use W mode iterator instead of SWI48. Output MOV instead of XOR for TARGET_USE_MOV0. (stack_protect_set_1 peephole2): Use integer modes with mode size <= word mode size for operand 3. (stack_protect_set_1 peephole2 rust-lang#2): New peephole2 pattern to substitute stack protector scratch register clear with unrelated register initialization, originally in front of stack protector sequence. (*stack_protect_set_3_<PTR:mode>_<SWI48:mode>): New insn pattern. (stack_protect_set_1 peephole2): New peephole2 pattern to substitute stack protector scratch register clear with unrelated register initialization involving LEA instruction.
Use unrelated register initializations using zero/sign-extend instructions to clear stack protector scratch register. Hanlde only SI -> DImode extensions for 64-bit targets, as this is the only extension that triggers the peephole in a non-negligible number. Also use explicit check for word_mode instead of mode iterator in peephole2 patterns to avoid pattern explosion. gcc/ChangeLog: * config/i386/i386.md (stack_protect_set_1 peephole2): Explicitly check operand 2 for word_mode. (stack_protect_set_1 peephole2 rust-lang#2): Ditto. (stack_protect_set_2 peephole2): Ditto. (stack_protect_set_3 peephole2): Ditto. (*stack_protect_set_4z_<mode>_di): New insn patter. (*stack_protect_set_4s_<mode>_di): Ditto. (stack_protect_set_4 peephole2): New peephole2 pattern to substitute stack protector scratch register clear with unrelated register initialization involving zero/sign-extend instruction.
2fc8940
to
89a92e5
Compare
cdd8978
to
ad4ffde
Compare
cf95541
to
b6f163f
Compare
We evaluate constexpr functions on the original, pre-genericization bodies. That means that the function body we're evaluating will not have gone through cp_genericize_r's "Map block scope extern declarations to visible declarations with the same name and type in outer scopes if any". Here: constexpr bool bar() { return true; } // #1 constexpr bool foo() { constexpr bool bar(void); // rust-lang#2 return bar(); } it means that we: 1) register_constexpr_fundef (#1) 2) cp_genericize (#1) nothing interesting happens 3) register_constexpr_fundef (foo) does copy_fn, so we have two copies of the BIND_EXPR 4) cp_genericize (foo) this remaps rust-lang#2 to #1, but only on one copy of the BIND_EXPR 5) retrieve_constexpr_fundef (foo) we find it, no problem 6) retrieve_constexpr_fundef (rust-lang#2) and here rust-lang#2 isn't found in constexpr_fundef_table, because we're working on the BIND_EXPR copy where rust-lang#2 wasn't mapped to #1 so we fail. We've only registered #1. It should work to use DECL_LOCAL_DECL_ALIAS (which used to be extern_decl_map). We evaluate constexpr functions on pre-cp_fold bodies to avoid diagnostic problems, but the remapping I'm proposing should not interfere with diagnostics. This is not a problem for a global scope redeclaration; there we go through duplicate_decls which keeps the DECL_UID: DECL_UID (olddecl) = olddecl_uid; and DECL_UID is what constexpr_fundef_hasher::hash uses. PR c++/111132 gcc/cp/ChangeLog: * constexpr.cc (get_function_named_in_call): Use cp_get_fndecl_from_callee. * cvt.cc (cp_get_fndecl_from_callee): If there's a DECL_LOCAL_DECL_ALIAS, use it. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/constexpr-redeclaration3.C: New test. * g++.dg/cpp0x/constexpr-redeclaration4.C: New test.
e744a94
to
bb9fe7d
Compare
9cec8ab
to
c4ee893
Compare
1e817bd
to
b4002fd
Compare
2e49eb4
to
85e56c5
Compare
This is another case of load hoisting breaking UID order in the preheader, this time between two hoistings. The easiest way out is to do what we do for the main stmt - copy instead of move. PR tree-optimization/116902 PR tree-optimization/116842 * tree-vect-stmts.cc (sort_after_uid): Remove again. (hoist_defs_of_uses): Copy defs instead of hoisting them so we can zero their UID. (vectorizable_load): Separate analysis and transform call, do transform on the stmt copy. * g++.dg/torture/pr116902.C: New testcase.
Whenever C1 and C2 are integer constants, X is of a wrapping type, and cmp is a relational operator, the expression X +- C1 cmp C2 can be simplified in the following cases: (a) If cmp is <= and C2 -+ C1 == +INF(1), we can transform the initial comparison in the following way: X +- C1 <= C2 -INF <= X +- C1 <= C2 (add left hand side which holds for any X, C1) -INF -+ C1 <= X <= C2 -+ C1 (add -+C1 to all 3 expressions) -INF -+ C1 <= X <= +INF (due to (1)) -INF -+ C1 <= X (eliminate the right hand side since it holds for any X) (b) By analogy, if cmp if >= and C2 -+ C1 == -INF(1), use the following sequence of transformations: X +- C1 >= C2 +INF >= X +- C1 >= C2 (add left hand side which holds for any X, C1) +INF -+ C1 >= X >= C2 -+ C1 (add -+C1 to all 3 expressions) +INF -+ C1 >= X >= -INF (due to (1)) +INF -+ C1 >= X (eliminate the right hand side since it holds for any X) (c) The > and < cases are negations of (a) and (b), respectively. This transformation allows to occasionally save add / sub instructions, for instance the expression 3 + (uint32_t)f() < 2 compiles to cmn w0, #4 cset w0, ls instead of add w0, w0, 3 cmp w0, 2 cset w0, ls on aarch64. Testcases that go together with this patch have been split into two separate files, one containing testcases for unsigned variables and the other for wrapping signed ones (and thus compiled with -fwrapv). Additionally, one aarch64 test has been adjusted since the patch has caused the generated code to change from cmn w0, #2 csinc w0, w1, wzr, cc (x < -2) to cmn w0, #3 csinc w0, w1, wzr, cs (x <= -3) This patch has been bootstrapped and regtested on aarch64, x86_64, and i386, and additionally regtested on riscv32. gcc/ChangeLog: PR tree-optimization/116024 * match.pd: New transformation around integer comparison. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr116024-2.c: New test. * gcc.dg/tree-ssa/pr116024-2-fwrapv.c: Ditto. * gcc.target/aarch64/gtu_to_ltu_cmp_1.c: Adjust.
PR jit/117275 reports various jit test failures seen on powerpc64le-unknown-linux-gnu due to hitting this assertion in varasm.cc on the 2nd compilation in a process: #2 0x00007ffff63e67d0 in assemble_external_libcall (fun=0x7ffff2a4b1d8) at ../../src/gcc/varasm.cc:2650 2650 gcc_assert (!pending_assemble_externals_processed); (gdb) p pending_assemble_externals_processed $1 = true We're not properly resetting state in varasm.cc after a compile for libgccjit. Fixed thusly. gcc/ChangeLog: PR jit/117275 * toplev.cc (toplev::finalize): Call varasm_cc_finalize. * varasm.cc (varasm_cc_finalize): New. * varasm.h (varasm_cc_finalize): New decl. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
We currently crash upon the following invalid code (notice the "void void**" parameter) === cut here === using size_t = decltype(sizeof(int)); void *operator new(size_t, void void **p) noexcept { return p; } int x; void f() { int y; new (&y) int(x); } === cut here === The problem is that in this case, we end up with a NULL_TREE parameter list for the new operator because of the error, and (1) coerce_new_type wrongly complains about the first parameter type not being size_t, (2) std_placement_new_fn_p blindly accesses the parameter list, hence a crash. This patch does NOT address #1 since we can't easily distinguish between a new operator declaration without parameters from one with erroneous parameters (and it's not worth the risk to refactor and break things for an error recovery issue) hence a dg-bogus in new52.C, but it does address #2 and the ICE by simply checking the first parameter against NULL_TREE. It also adds a new testcase checking that we complain about new operators with no or invalid first parameters, since we did not have any. PR c++/117101 gcc/cp/ChangeLog: * init.cc (std_placement_new_fn_p): Check first_arg against NULL_TREE. gcc/testsuite/ChangeLog: * g++.dg/init/new52.C: New test. * g++.dg/init/new53.C: New test.
The second source register of insn "*extzvsi-1bit_addsubx" cannot be the same as the destination register, because that register will be overwritten with an intermediate value after insn splitting. /* example #1 */ int test1(int b, int a) { return ((a & 1024) ? 4 : 0) + b; } ;; result #1 (incorrect) test1: extui a2, a3, 10, 1 ;; overwrites A2 before used addx4 a2, a2, a2 ret.n This patch fixes that. ;; result #1 (correct) test1: extui a3, a3, 10, 1 ;; uses A3 and then overwrites addx4 a2, a3, a2 ret.n However, it should be noted that the first source register can be the same as the destination without any problems. /* example #2 */ int test2(int a, int b) { return ((a & 1024) ? 4 : 0) + b; } ;; result (correct) test2: extui a2, a2, 10, 1 ;; uses A2 and then overwrites addx4 a2, a2, a3 ret.n gcc/ChangeLog: * config/xtensa/xtensa.md (*extzvsi-1bit_addsubx): Add '&' to the destination register constraint to indicate that it is 'earlyclobber', append '0' to the first source register constraint to indicate that it can be the same as the destination register, and change the split condition from 1 to reload_completed so that the insn will be split only after RA in order to obtain allocated registers that satisfy the above constraints.
No description provided.