Dump register variables correctly #2

Open
wants to merge 1 commit into
base: master

ghost commented Sep 17, 2021

No description provided.

antoyo commented Sep 25, 2021

I missed this PR somehow.

Is this necessary for you now?

My workflow of submitting gcc patches is cumbersome, so I'm not sure how I'll handle external contribution to those patches yet.

ghost (Author) commented Sep 26, 2021

Not really, just a little thing. It was more of "let's finally take a look at GCC internals, shall we?", and then I got distracted again and forgot about it.

> My workflow of submitting gcc patches is cumbersome, so I'm not sure how I'll handle external contribution to those patches yet.

Oh, I was planning to send more while working on (for example) rust-lang/rustc_codegen_gcc#87. Could that be a problem?

antoyo commented Sep 26, 2021

If you send a PR that is completely independent of the other patches, it should be relatively easy for me to handle eventually.
I'll probably not send it for review until all the previous ones are merged if it adds a new API because I don't want to manage the conflicts in the libgccjit.map file.

Please read the guidelines in order to not forget anything that should be done for a gcc patch.
If you have any trouble for stuff like how to check the formatting is right, I can help you.

Also, if you are interested in sending your patches yourself to gcc, note that you don't need the copyright assignment anymore.

ghost (Author) commented Sep 26, 2021

I sent this PR here and not to the mailing list because this patch is just a little extension to your patch. I'd be super happy to transfer these poor 5 LOC to public domain so you could just attribute them to yourself.

> If you have any trouble for stuff like how to check the formatting is right, I can help you.

Yeah, please help. I see GCC uses some mix of tabs and spaces that resembles "indent with tabs (width 8), pad with spaces" scheme, but sometimes it's spaces only. I'm lost. Any place this scheme is written down to?

> Also, if you are interested in sending your patches yourself to gcc, note that you don't need the copyright assignment anymore.

What's copyright assignment?

antoyo commented Sep 26, 2021

> Yeah, please help. I see GCC uses some mix of tabs and spaces that resembles "indent with tabs (width 8), pad with spaces" scheme, but sometimes it's spaces only. I'm lost. Any place this scheme is written down to?

There are two scripts that I use:

./contrib/gcc-changelog/git_check_commit.py

and

./contrib/check_GNU_style.sh 0001-patch-name.patch

> What's copyright assignment?

GCC used to require that you give the copyright to the FSF and you had to sign a document.
Now, it's not required anymore, so it should be easier for you if you want to contribute directly.

Otherwise, I can always send your patches for you, but I'd appreciate it if they pass the above checks.

antoyo pushed a commit that referenced this pull request Apr 13, 2022
DR 2352 changed the definitions of reference-related (so that it uses
"similar type" instead of "same type") and of reference-compatible (use
a standard conversion sequence).  That means that reference-related is
now more broad, which means that we will be binding more things directly.

The original patch for DR 2352 caused some problems, which were fixed in
r276251 by creating a "fake" ck_qual in direct_reference_binding, so
that in

  void f(int *); // #1
  void f(const int * const &); // #2
  int *x;
  int main()
  {
    f(x); // call #1
  }

we call #1.  The extra ck_qual in #2 causes compare_ics to select #1,
which is a better match for "int *" because then we don't have to do
a qualification conversion.

Let's turn to the problem in this PR.  We have

  void f(const int * const &); // #1
  void f(const int *); // #2
  int *x;
  int main()
  {
    f(x);
  }

We arrive in compare_ics to decide which one is better. The ICS for #1
looks like

    ck_ref_bind      <-    ck_qual         <-   ck_identity
  const int *const &     const int *const         int *

and the ICS for #2 is

    ck_qual     <-  ck_rvalue   <-  ck_identity
  const int *          int *           int *

We strip the reference and then comp_cv_qual_signature when comparing two
ck_quals sees that "const int *" is a proper subset of "const int *const"
and we return -1.  But that's wrong; presumably the top-level "const"
should be ignored and the call should be ambiguous.  This patch adjusts
the type of the "fake" ck_qual so that this problem doesn't arise.

	PR c++/97296

gcc/cp/ChangeLog:

	* call.cc (direct_reference_binding): strip_top_quals when creating
	a ck_qual.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/ref-bind4.C: Add dg-error.
	* g++.dg/cpp0x/ref-bind8.C: New test.

antoyo force-pushed the master branch 3 times, most recently from 0a94141 to 17858b5, on December 8, 2022 23:45
antoyo force-pushed the master branch 9 times, most recently from 648c851 to 16686cb, on August 24, 2023 20:31
GuillaumeGomez pushed a commit to GuillaumeGomez/gcc that referenced this pull request Jan 10, 2024
Improve stack protector patterns and peephole2s even more:

a. Use unrelated register clears with integer mode size <= word
   mode size to clear stack protector scratch register.

b. Use unrelated register initializations in front of stack
   protector sequence to clear stack protector scratch register.

c. Use unrelated register initializations using LEA instructions
   to clear stack protector scratch register.

These stack protector improvements reuse 6914 unrelated register
initializations to substitute the clear of stack protector scratch
register in 12034 instances of stack protector sequence in recent linux
defconfig build.

gcc/ChangeLog:

	* config/i386/i386.md (@stack_protect_set_1_<PTR:mode>_<W:mode>):
	Use W mode iterator instead of SWI48.  Output MOV instead of XOR
	for TARGET_USE_MOV0.
	(stack_protect_set_1 peephole2): Use integer modes with
	mode size <= word mode size for operand 3.
	(stack_protect_set_1 peephole2 #2): New peephole2 pattern to
	substitute stack protector scratch register clear with unrelated
	register initialization, originally in front of stack
	protector sequence.
	(*stack_protect_set_3_<PTR:mode>_<SWI48:mode>): New insn pattern.
	(stack_protect_set_1 peephole2): New peephole2 pattern to
	substitute stack protector scratch register clear with unrelated
	register initialization involving LEA instruction.
GuillaumeGomez pushed a commit to GuillaumeGomez/gcc that referenced this pull request Jan 10, 2024
Use unrelated register initializations using zero/sign-extend instructions
to clear stack protector scratch register.

Handle only SI -> DImode extensions for 64-bit targets, as this is the
only extension that triggers the peephole in a non-negligible number.

Also use explicit check for word_mode instead of mode iterator in peephole2
patterns to avoid pattern explosion.

gcc/ChangeLog:

	* config/i386/i386.md (stack_protect_set_1 peephole2):
	Explicitly check operand 2 for word_mode.
	(stack_protect_set_1 peephole2 #2): Ditto.
	(stack_protect_set_2 peephole2): Ditto.
	(stack_protect_set_3 peephole2): Ditto.
	(*stack_protect_set_4z_<mode>_di): New insn pattern.
	(*stack_protect_set_4s_<mode>_di): Ditto.
	(stack_protect_set_4 peephole2): New peephole2 pattern to
	substitute stack protector scratch register clear with unrelated
	register initialization involving zero/sign-extend instruction.

antoyo force-pushed the master branch 2 times, most recently from 2fc8940 to 89a92e5, on February 15, 2024 22:14
antoyo force-pushed the master branch 3 times, most recently from cdd8978 to ad4ffde, on February 16, 2024 21:34
antoyo force-pushed the master branch 2 times, most recently from cf95541 to b6f163f, on March 1, 2024 15:35
GuillaumeGomez pushed a commit to GuillaumeGomez/gcc that referenced this pull request Apr 8, 2024
We evaluate constexpr functions on the original, pre-genericization bodies.
That means that the function body we're evaluating will not have gone
through cp_genericize_r's "Map block scope extern declarations to visible
declarations with the same name and type in outer scopes if any".  Here:

  constexpr bool bar() { return true; } // #1
  constexpr bool foo() {
    constexpr bool bar(void); // #2
    return bar();
  }

it means that we:
1) register_constexpr_fundef (#1)
2) cp_genericize (#1)
   nothing interesting happens
3) register_constexpr_fundef (foo)
   does copy_fn, so we have two copies of the BIND_EXPR
4) cp_genericize (foo)
   this remaps #2 to #1, but only on one copy of the BIND_EXPR
5) retrieve_constexpr_fundef (foo)
   we find it, no problem
6) retrieve_constexpr_fundef (#2)
   and here #2 isn't found in constexpr_fundef_table, because
   we're working on the BIND_EXPR copy where #2 wasn't mapped to #1
   so we fail.  We've only registered #1.

It should work to use DECL_LOCAL_DECL_ALIAS (which used to be
extern_decl_map).  We evaluate constexpr functions on pre-cp_fold
bodies to avoid diagnostic problems, but the remapping I'm proposing
should not interfere with diagnostics.

This is not a problem for a global scope redeclaration; there we go
through duplicate_decls which keeps the DECL_UID:
  DECL_UID (olddecl) = olddecl_uid;
and DECL_UID is what constexpr_fundef_hasher::hash uses.

	PR c++/111132

gcc/cp/ChangeLog:

	* constexpr.cc (get_function_named_in_call): Use
	cp_get_fndecl_from_callee.
	* cvt.cc (cp_get_fndecl_from_callee): If there's a
	DECL_LOCAL_DECL_ALIAS, use it.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/constexpr-redeclaration3.C: New test.
	* g++.dg/cpp0x/constexpr-redeclaration4.C: New test.

antoyo force-pushed the master branch 2 times, most recently from e744a94 to bb9fe7d, on September 28, 2024 13:44
antoyo force-pushed the master branch 2 times, most recently from 9cec8ab to c4ee893, on October 3, 2024 21:55
antoyo force-pushed the master branch 6 times, most recently from 1e817bd to b4002fd, on October 17, 2024 14:21
antoyo force-pushed the master branch 4 times, most recently from 2e49eb4 to 85e56c5, on November 14, 2024 18:02
antoyo pushed a commit that referenced this pull request Nov 21, 2024
This is another case of load hoisting breaking UID order in the
preheader, this time between two hoistings.  The easiest way out is
to do what we do for the main stmt - copy instead of move.

	PR tree-optimization/116902
	PR tree-optimization/116842
	* tree-vect-stmts.cc (sort_after_uid): Remove again.
	(hoist_defs_of_uses): Copy defs instead of hoisting them so
	we can zero their UID.
	(vectorizable_load): Separate analysis and transform call,
	do transform on the stmt copy.

	* g++.dg/torture/pr116902.C: New testcase.
antoyo pushed a commit that referenced this pull request Nov 21, 2024
Whenever C1 and C2 are integer constants, X is of a wrapping type, and
cmp is a relational operator, the expression X +- C1 cmp C2 can be
simplified in the following cases:

(a) If cmp is <= and C2 -+ C1 == +INF(1), we can transform the initial
comparison in the following way:
   X +- C1 <= C2
   -INF <= X +- C1 <= C2 (add left hand side which holds for any X, C1)
   -INF -+ C1 <= X <= C2 -+ C1 (add -+C1 to all 3 expressions)
   -INF -+ C1 <= X <= +INF (due to (1))
   -INF -+ C1 <= X (eliminate the right hand side since it holds for any X)

(b) By analogy, if cmp is >= and C2 -+ C1 == -INF(1), use the following
sequence of transformations:

   X +- C1 >= C2
   +INF >= X +- C1 >= C2 (add left hand side which holds for any X, C1)
   +INF -+ C1 >= X >= C2 -+ C1 (add -+C1 to all 3 expressions)
   +INF -+ C1 >= X >= -INF (due to (1))
   +INF -+ C1 >= X (eliminate the right hand side since it holds for any X)

(c) The > and < cases are negations of (a) and (b), respectively.
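
The equivalences in cases (a) and (b) can be checked by brute force on a small
wrapping type.  The sketch below is not part of the patch: it assumes an 8-bit
unsigned type (so -INF is 0 and +INF is 255), covers only the "X + C1" form,
and uses function names made up for illustration.

```cpp
#include <cstdint>

// Case (a): when C2 - C1 == +INF (255 here), "X + C1 <= C2" should be
// equivalent to "-INF - C1 <= X", i.e. X >= (uint8_t)(0 - C1).
bool case_a_holds_for_all_inputs() {
  for (unsigned c1 = 0; c1 < 256; ++c1) {
    const std::uint8_t c2 = static_cast<std::uint8_t>(c1 - 1u);    // C2 - C1 == 255
    const std::uint8_t bound = static_cast<std::uint8_t>(0u - c1); // -INF - C1
    for (unsigned x = 0; x < 256; ++x) {
      const std::uint8_t sum = static_cast<std::uint8_t>(x + c1);  // wrapping X + C1
      const bool original = sum <= c2;
      const bool simplified = static_cast<std::uint8_t>(x) >= bound;
      if (original != simplified) return false;
    }
  }
  return true;
}

// Case (b): when C2 - C1 == -INF (0 here), "X + C1 >= C2" should be
// equivalent to "+INF - C1 >= X", i.e. X <= (uint8_t)(255 - C1).
bool case_b_holds_for_all_inputs() {
  for (unsigned c1 = 0; c1 < 256; ++c1) {
    const std::uint8_t c2 = static_cast<std::uint8_t>(c1);           // C2 - C1 == 0
    const std::uint8_t bound = static_cast<std::uint8_t>(255u - c1); // +INF - C1
    for (unsigned x = 0; x < 256; ++x) {
      const std::uint8_t sum = static_cast<std::uint8_t>(x + c1);    // wrapping X + C1
      const bool original = sum >= c2;
      const bool simplified = static_cast<std::uint8_t>(x) <= bound;
      if (original != simplified) return false;
    }
  }
  return true;
}
```

Both functions loop over every (X, C1) pair, so a true result means the
rewritten comparison agrees with the original one for the whole 8-bit range.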

This transformation occasionally saves add / sub instructions;
for instance, the expression

3 + (uint32_t)f() < 2

compiles to

cmn     w0, #4
cset    w0, ls

instead of

add     w0, w0, 3
cmp     w0, 2
cset    w0, ls

on aarch64.

Testcases that go together with this patch have been split into two
separate files, one containing testcases for unsigned variables and the
other for wrapping signed ones (and thus compiled with -fwrapv).
Additionally, one aarch64 test has been adjusted since the patch has
caused the generated code to change from

cmn     w0, #2
csinc   w0, w1, wzr, cc   (x < -2)

to

cmn     w0, #3
csinc   w0, w1, wzr, cs   (x <= -3)

This patch has been bootstrapped and regtested on aarch64, x86_64, and
i386, and additionally regtested on riscv32.

gcc/ChangeLog:

	PR tree-optimization/116024
	* match.pd: New transformation around integer comparison.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/pr116024-2.c: New test.
	* gcc.dg/tree-ssa/pr116024-2-fwrapv.c: Ditto.
	* gcc.target/aarch64/gtu_to_ltu_cmp_1.c: Adjust.
antoyo pushed a commit that referenced this pull request Nov 21, 2024
PR jit/117275 reports various jit test failures seen on
powerpc64le-unknown-linux-gnu due to hitting this assertion
in varasm.cc on the 2nd compilation in a process:

#2  0x00007ffff63e67d0 in assemble_external_libcall (fun=0x7ffff2a4b1d8)
    at ../../src/gcc/varasm.cc:2650
2650          gcc_assert (!pending_assemble_externals_processed);
(gdb) p pending_assemble_externals_processed
$1 = true

We're not properly resetting state in varasm.cc after a compile
for libgccjit.

Fixed thusly.
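
The general shape of this kind of fix can be sketched as follows.  This is a
simplified illustration only, not the varasm.cc code: the namespace, the flag
and both functions are hypothetical stand-ins for file-level state that must be
reset between compilations running in one process.

```cpp
// Hypothetical model of per-process state; none of these names are the real
// GCC internals.
namespace sketch {

// Set once per compilation, in the same spirit as a "processed" flag.
bool pending_processed = false;

// Models one in-process compilation.  Returns false when it starts with stale
// state left over from a previous run (a real compiler would assert instead).
bool compile_once() {
  if (pending_processed) return false;
  pending_processed = true;
  return true;
}

// Models a finalize hook that restores the initial state so that the next
// compilation in the same process starts clean.
void finalize() { pending_processed = false; }

}  // namespace sketch
```

Calling compile_once twice in a row fails on the second call; calling finalize
in between lets the second compilation succeed.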

gcc/ChangeLog:
	PR jit/117275
	* toplev.cc (toplev::finalize): Call varasm_cc_finalize.
	* varasm.cc (varasm_cc_finalize): New.
	* varasm.h (varasm_cc_finalize): New decl.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
antoyo pushed a commit that referenced this pull request Nov 21, 2024
We currently crash upon the following invalid code (notice the "void
void**" parameter)

=== cut here ===
using size_t = decltype(sizeof(int));
void *operator new(size_t, void void **p) noexcept { return p; }
int x;
void f() {
    int y;
    new (&y) int(x);
}
=== cut here ===

The problem is that in this case, we end up with a NULL_TREE parameter
list for the new operator because of the error, and (1) coerce_new_type
wrongly complains about the first parameter type not being size_t,
(2) std_placement_new_fn_p blindly accesses the parameter list, hence a
crash.

This patch does NOT address #1 since we can't easily distinguish between
a new operator declaration without parameters from one with erroneous
parameters (and it's not worth the risk to refactor and break things for
an error recovery issue) hence a dg-bogus in new52.C, but it does
address #2 and the ICE by simply checking the first parameter against
NULL_TREE.

It also adds a new testcase checking that we complain about new
operators with no or invalid first parameters, since we did not have
any.

	PR c++/117101

gcc/cp/ChangeLog:

	* init.cc (std_placement_new_fn_p): Check first_arg against
	NULL_TREE.

gcc/testsuite/ChangeLog:

	* g++.dg/init/new52.C: New test.
	* g++.dg/init/new53.C: New test.
antoyo pushed a commit that referenced this pull request Nov 21, 2024
The second source register of insn "*extzvsi-1bit_addsubx" cannot be the
same as the destination register, because that register will be overwritten
with an intermediate value after insn splitting.

     /* example #1 */
     int test1(int b, int a) {
       return ((a & 1024) ? 4 : 0) + b;
     }

     ;; result #1 (incorrect)
     test1:
     	extui	a2, a3, 10, 1	;; overwrites A2 before used
     	addx4	a2, a2, a2
     	ret.n

This patch fixes that.

     ;; result #1 (correct)
     test1:
     	extui	a3, a3, 10, 1	;; uses A3 and then overwrites
     	addx4	a2, a3, a2
     	ret.n

However, it should be noted that the first source register can be the same
as the destination without any problems.

     /* example #2 */
     int test2(int a, int b) {
       return ((a & 1024) ? 4 : 0) + b;
     }

     ;; result (correct)
     test2:
     	extui	a2, a2, 10, 1	;; uses A2 and then overwrites
     	addx4	a2, a2, a3
     	ret.n
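
The overlap problem in example #1 can be reproduced with a small register
model.  This is an illustrative sketch rather than compiler code: it assumes
extui extracts a bit field and addx4 computes (as << 2) + at, uses unsigned
32-bit values instead of int to keep the arithmetic well defined, and the
struct and function names are hypothetical.

```cpp
#include <cstdint>

// Two-register model: a2 holds the first argument, a3 the second.
struct Regs { std::uint32_t a2, a3; };

// extui ar, at, shift, bits: ar = (at >> shift) & ((1 << bits) - 1)
inline std::uint32_t extui(std::uint32_t at, unsigned shift, unsigned bits) {
  return (at >> shift) & ((1u << bits) - 1u);
}

// addx4 ar, as, at: ar = (as << 2) + at
inline std::uint32_t addx4(std::uint32_t as, std::uint32_t at) {
  return (as << 2) + at;
}

// What the C source of test1 asks for.
std::uint32_t reference(std::uint32_t b, std::uint32_t a) {
  return ((a & 1024u) ? 4u : 0u) + b;
}

// Incorrect sequence: the destination a2 is written before b is read.
std::uint32_t test1_incorrect(std::uint32_t b, std::uint32_t a) {
  Regs r{b, a};
  r.a2 = extui(r.a3, 10, 1);  // extui a2, a3, 10, 1 (b is lost here)
  r.a2 = addx4(r.a2, r.a2);   // addx4 a2, a2, a2
  return r.a2;
}

// Corrected sequence: the intermediate value lives in a3 instead.
std::uint32_t test1_correct(std::uint32_t b, std::uint32_t a) {
  Regs r{b, a};
  r.a3 = extui(r.a3, 10, 1);  // extui a3, a3, 10, 1
  r.a2 = addx4(r.a3, r.a2);   // addx4 a2, a3, a2
  return r.a2;
}
```

With b = 7 and a = 1024 the reference result is 11, the corrected sequence also
gives 11, and the incorrect sequence gives 5 because b was overwritten.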

gcc/ChangeLog:

	* config/xtensa/xtensa.md (*extzvsi-1bit_addsubx):
	Add '&' to the destination register constraint to indicate that
	it is 'earlyclobber', append '0' to the first source register
	constraint to indicate that it can be the same as the destination
	register, and change the split condition from 1 to reload_completed
	so that the insn will be split only after RA in order to obtain
	allocated registers that satisfy the above constraints.