Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix typo in config.h.in. #10

Closed
wants to merge 1 commit into from
Closed

Conversation

AraHaan
Copy link

@AraHaan AraHaan commented Mar 15, 2017

I have noticed (when trying to manually compile libstdc++fs manually for cygwin) that there was an typo on config.h.in that would transfer over to config.h. So this PR fixes it.

@chriscz
Copy link

chriscz commented Mar 15, 2017

Please see #5 (comment) on how to contribute to this repository.

@AraHaan
Copy link
Author

AraHaan commented Mar 15, 2017

closing since it has been commited already now. Thanks.

@AraHaan AraHaan closed this Mar 15, 2017
kraj pushed a commit to kraj/gcc that referenced this pull request Oct 12, 2020
Prevents the following UBSAN error:

./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/torture/pr49770.C -O2 -c
/home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482:22: runtime error: load of value 2, which is not a valid value for type 'bool'
    #0 0x1fdb4d1 in modref_tree<int>::merge(modref_tree<int>*, vec<modref_parm_map, va_heap, vl_ptr>*) /home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482
    #1 0x1fcadaa in merge_call_side_effects(modref_summary*, gimple*, modref_summary*, bool) /home/marxin/Programming/gcc2/gcc/ipa-modref.c:511
    gcc-mirror#2 0x1fcbadd in analyze_call /home/marxin/Programming/gcc2/gcc/ipa-modref.c:642
    gcc-mirror#3 0x1fcc061 in analyze_stmt /home/marxin/Programming/gcc2/gcc/ipa-modref.c:732
    gcc-mirror#4 0x1fccf31 in analyze_function /home/marxin/Programming/gcc2/gcc/ipa-modref.c:823
    gcc-mirror#5 0x1fd17e5 in execute /home/marxin/Programming/gcc2/gcc/ipa-modref.c:1441
    gcc-mirror#6 0x25cca6e in execute_one_pass(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2509
    gcc-mirror#7 0x25cd39b in execute_pass_list_1 /home/marxin/Programming/gcc2/gcc/passes.c:2597
    gcc-mirror#8 0x25cd450 in execute_pass_list_1 /home/marxin/Programming/gcc2/gcc/passes.c:2598
    gcc-mirror#9 0x25cd4ee in execute_pass_list(function*, opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2608
    gcc-mirror#10 0x25c7a5a in do_per_function_toporder(void (*)(function*, void*), void*) /home/marxin/Programming/gcc2/gcc/passes.c:1726
    gcc-mirror#11 0x25cfa3f in execute_ipa_pass_list(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2941
    gcc-mirror#12 0x173572d in ipa_passes /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2642
    gcc-mirror#13 0x17364ee in symbol_table::compile() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2777
    gcc-mirror#14 0x17372d9 in symbol_table::finalize_compilation_unit() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:3022
    gcc-mirror#15 0x2a1f00a in compile_file /home/marxin/Programming/gcc2/gcc/toplev.c:485
    gcc-mirror#16 0x2a27dc8 in do_compile /home/marxin/Programming/gcc2/gcc/toplev.c:2321
    gcc-mirror#17 0x2a283cc in toplev::main(int, char**) /home/marxin/Programming/gcc2/gcc/toplev.c:2460
    gcc-mirror#18 0x54f21cd in main /home/marxin/Programming/gcc2/gcc/main.c:39
    gcc-mirror#19 0x7ffff6f0de09 in __libc_start_main ../csu/libc-start.c:314
    gcc-mirror#20 0x9eac09 in _start (/home/marxin/Programming/gcc2/objdir/gcc/cc1plus+0x9eac09)

gcc/ChangeLog:

	* ipa-modref.c (merge_call_side_effects): Clear modref_parm_map
	fields in the vector.
kraj pushed a commit to kraj/gcc that referenced this pull request Oct 19, 2020
It fixes:

/home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482:22: runtime error: load of value 255, which is not a valid value for type 'bool'
    #0 0x18e5df3 in modref_tree<int>::merge(modref_tree<int>*, vec<modref_parm_map, va_heap, vl_ptr>*) /home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482
    #1 0x18dc180 in ipa_merge_modref_summary_after_inlining(cgraph_edge*) /home/marxin/Programming/gcc2/gcc/ipa-modref.c:1779
    gcc-mirror#2 0x18c1c72 in inline_call(cgraph_edge*, bool, vec<cgraph_edge*, va_heap, vl_ptr>*, int*, bool, bool*) /home/marxin/Programming/gcc2/gcc/ipa-inline-transform.c:492
    gcc-mirror#3 0x4a3589c in inline_small_functions /home/marxin/Programming/gcc2/gcc/ipa-inline.c:2216
    gcc-mirror#4 0x4a3b230 in ipa_inline /home/marxin/Programming/gcc2/gcc/ipa-inline.c:2697
    gcc-mirror#5 0x4a3d902 in execute /home/marxin/Programming/gcc2/gcc/ipa-inline.c:3096
    gcc-mirror#6 0x1edf831 in execute_one_pass(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2509
    gcc-mirror#7 0x1ee26af in execute_ipa_pass_list(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2936
    gcc-mirror#8 0x103f31b in ipa_passes /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2700
    gcc-mirror#9 0x103fb40 in symbol_table::compile() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2777
    gcc-mirror#10 0x104092b in symbol_table::finalize_compilation_unit() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:3022
    gcc-mirror#11 0x235723b in compile_file /home/marxin/Programming/gcc2/gcc/toplev.c:485
    gcc-mirror#12 0x235fff9 in do_compile /home/marxin/Programming/gcc2/gcc/toplev.c:2321
    gcc-mirror#13 0x23605fc in toplev::main(int, char**) /home/marxin/Programming/gcc2/gcc/toplev.c:2460
    gcc-mirror#14 0x4e2b93b in main /home/marxin/Programming/gcc2/gcc/main.c:39
    gcc-mirror#15 0x7ffff6f0ae09 in __libc_start_main ../csu/libc-start.c:314
    gcc-mirror#16 0x9a0be9 in _start (/home/marxin/Programming/gcc2/objdir/gcc/cc1+0x9a0be9)

gcc/ChangeLog:

	* ipa-modref.c (compute_parm_map): Clear vector.
nstester pushed a commit to nstester/gcc that referenced this pull request Jun 14, 2021
The fixed error is:

==21166==ERROR: AddressSanitizer: alloc-dealloc-mismatch (operator new [] vs operator delete) on 0x60300000d900
    #0 0x7367d7 in operator delete(void*, unsigned long) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/libsanitizer/asan/asan_new_delete.cpp:172
    #1 0x3b82e6e in pointer_equiv_analyzer::~pointer_equiv_analyzer() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:161
    #2 0x3b83387 in hybrid_folder::~hybrid_folder() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:517
    #3 0x3b83387 in execute_early_vrp /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:686
    #4 0x1790611 in execute_one_pass(opt_pass*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2567
    gcc-mirror#5 0x1792003 in execute_pass_list_1 /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2656
    gcc-mirror#6 0x1792029 in execute_pass_list_1 /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2657
    gcc-mirror#7 0x179209f in execute_pass_list(function*, opt_pass*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2667
    gcc-mirror#8 0x178a5f3 in do_per_function_toporder(void (*)(function*, void*), void*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:1773
    gcc-mirror#9 0x1792fac in do_per_function_toporder(void (*)(function*, void*), void*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/plugin.h:191
    gcc-mirror#10 0x1792fac in execute_ipa_pass_list(opt_pass*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:3001
    gcc-mirror#11 0xc525fc in ipa_passes /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2154
    gcc-mirror#12 0xc525fc in symbol_table::compile() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2289
    gcc-mirror#13 0xc5a096 in symbol_table::compile() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2269
    gcc-mirror#14 0xc5a096 in symbol_table::finalize_compilation_unit() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2537
    gcc-mirror#15 0x1a7a17c in compile_file /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/toplev.c:482
    gcc-mirror#16 0x69c758 in do_compile /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/toplev.c:2210
    gcc-mirror#17 0x69c758 in toplev::main(int, char**) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/toplev.c:2349
    gcc-mirror#18 0x6a932a in main /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/main.c:39
    gcc-mirror#19 0x7ffff7820b34 in __libc_start_main ../csu/libc-start.c:332
    gcc-mirror#20 0x6aa5fd in _start (/home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/objdir/gcc/cc1+0x6aa5fd)

0x60300000d900 is located 0 bytes inside of 32-byte region [0x60300000d900,0x60300000d920)
allocated by thread T0 here:
    #0 0x735ab7 in operator new[](unsigned long) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/libsanitizer/asan/asan_new_delete.cpp:102
    #1 0x3b82dac in pointer_equiv_analyzer::pointer_equiv_analyzer(gimple_ranger*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:156

gcc/ChangeLog:

	* gimple-ssa-evrp.c (pointer_equiv_analyzer::~pointer_equiv_analyzer): Use delete[].
nstester pushed a commit to nstester/gcc that referenced this pull request Dec 30, 2021
…imize or target pragmas [PR103012]

The following testcases ICE when an optimize or target pragma
is followed by a long line (4096+ chars).
This is because on such long lines we can't use columns anymore,
but the cpp_define calls performed by c_cpp_builtins_optimize_pragma
or from the backend hooks for target pragma are done on temporary
buffers and expect to get columns from whatever line they appear on
(which happens to be the long line after optimize/target pragma),
and we run into:
 #0  fancy_abort (file=0x3abec67 "../../libcpp/line-map.c", line=502, function=0x3abecfc "linemap_add") at ../../gcc/diagnostic.c:1986
 #1  0x0000000002e7c335 in linemap_add (set=0x7ffff7fca000, reason=LC_RENAME, sysp=0, to_file=0x41287a0 "pr103012.i", to_line=3) at ../../libcpp/line-map.c:502
 #2  0x0000000002e7cc24 in linemap_line_start (set=0x7ffff7fca000, to_line=3, max_column_hint=128) at ../../libcpp/line-map.c:827
 #3  0x0000000002e7ce2b in linemap_position_for_column (set=0x7ffff7fca000, to_column=1) at ../../libcpp/line-map.c:898
 #4  0x0000000002e771f9 in _cpp_lex_direct (pfile=0x40c3b60) at ../../libcpp/lex.c:3592
 gcc-mirror#5  0x0000000002e76c3e in _cpp_lex_token (pfile=0x40c3b60) at ../../libcpp/lex.c:3394
 gcc-mirror#6  0x0000000002e610ef in lex_macro_node (pfile=0x40c3b60, is_def_or_undef=true) at ../../libcpp/directives.c:601
 gcc-mirror#7  0x0000000002e61226 in do_define (pfile=0x40c3b60) at ../../libcpp/directives.c:639
 gcc-mirror#8  0x0000000002e610b2 in run_directive (pfile=0x40c3b60, dir_no=0, buf=0x7fffffffd430 "__OPTIMIZE__ 1\n", count=14) at ../../libcpp/directives.c:589
 gcc-mirror#9  0x0000000002e650c1 in cpp_define (pfile=0x40c3b60, str=0x2f784d1 "__OPTIMIZE__") at ../../libcpp/directives.c:2513
 gcc-mirror#10 0x0000000002e65100 in cpp_define_unused (pfile=0x40c3b60, str=0x2f784d1 "__OPTIMIZE__") at ../../libcpp/directives.c:2522
 gcc-mirror#11 0x0000000000f50685 in c_cpp_builtins_optimize_pragma (pfile=0x40c3b60, prev_tree=<optimization_node 0x7fffea042000>, cur_tree=<optimization_node 0x7fffea042020>)
     at ../../gcc/c-family/c-cppbuiltin.c:600
assertion that LC_RENAME doesn't happen first.

I think the right fix is emit those predefined macros upon
optimize/target pragmas with BUILTINS_LOCATION, like we already do
for those macros at the start of the TU, they don't appear in columns
of the next line after it.  Another possibility would be to force them
at the location of the pragma.

2021-12-30  Jakub Jelinek  <jakub@redhat.com>

	PR c++/103012
gcc/
	* config/i386/i386-c.c (ix86_pragma_target_parse): Perform
	cpp_define/cpp_undef calls with forced token locations
	BUILTINS_LOCATION.
	* config/arm/arm-c.c (arm_pragma_target_parse): Likewise.
	* config/aarch64/aarch64-c.c (aarch64_pragma_target_parse): Likewise.
	* config/s390/s390-c.c (s390_pragma_target_parse): Likewise.
gcc/c-family/
	* c-cppbuiltin.c (c_cpp_builtins_optimize_pragma): Perform
	cpp_define_unused/cpp_undef calls with forced token locations
	BUILTINS_LOCATION.
gcc/testsuite/
	PR c++/103012
	* g++.dg/cpp/pr103012.C: New test.
	* g++.target/i386/pr103012.C: New test.
kraj pushed a commit to kraj/gcc that referenced this pull request Jan 24, 2022
…imize or target pragmas [PR103012]

The following testcases ICE when an optimize or target pragma
is followed by a long line (4096+ chars).
This is because on such long lines we can't use columns anymore,
but the cpp_define calls performed by c_cpp_builtins_optimize_pragma
or from the backend hooks for target pragma are done on temporary
buffers and expect to get columns from whatever line they appear on
(which happens to be the long line after optimize/target pragma),
and we run into:
 #0  fancy_abort (file=0x3abec67 "../../libcpp/line-map.c", line=502, function=0x3abecfc "linemap_add") at ../../gcc/diagnostic.c:1986
 #1  0x0000000002e7c335 in linemap_add (set=0x7ffff7fca000, reason=LC_RENAME, sysp=0, to_file=0x41287a0 "pr103012.i", to_line=3) at ../../libcpp/line-map.c:502
 gcc-mirror#2  0x0000000002e7cc24 in linemap_line_start (set=0x7ffff7fca000, to_line=3, max_column_hint=128) at ../../libcpp/line-map.c:827
 gcc-mirror#3  0x0000000002e7ce2b in linemap_position_for_column (set=0x7ffff7fca000, to_column=1) at ../../libcpp/line-map.c:898
 gcc-mirror#4  0x0000000002e771f9 in _cpp_lex_direct (pfile=0x40c3b60) at ../../libcpp/lex.c:3592
 gcc-mirror#5  0x0000000002e76c3e in _cpp_lex_token (pfile=0x40c3b60) at ../../libcpp/lex.c:3394
 gcc-mirror#6  0x0000000002e610ef in lex_macro_node (pfile=0x40c3b60, is_def_or_undef=true) at ../../libcpp/directives.c:601
 gcc-mirror#7  0x0000000002e61226 in do_define (pfile=0x40c3b60) at ../../libcpp/directives.c:639
 gcc-mirror#8  0x0000000002e610b2 in run_directive (pfile=0x40c3b60, dir_no=0, buf=0x7fffffffd430 "__OPTIMIZE__ 1\n", count=14) at ../../libcpp/directives.c:589
 gcc-mirror#9  0x0000000002e650c1 in cpp_define (pfile=0x40c3b60, str=0x2f784d1 "__OPTIMIZE__") at ../../libcpp/directives.c:2513
 gcc-mirror#10 0x0000000002e65100 in cpp_define_unused (pfile=0x40c3b60, str=0x2f784d1 "__OPTIMIZE__") at ../../libcpp/directives.c:2522
 gcc-mirror#11 0x0000000000f50685 in c_cpp_builtins_optimize_pragma (pfile=0x40c3b60, prev_tree=<optimization_node 0x7fffea042000>, cur_tree=<optimization_node 0x7fffea042020>)
     at ../../gcc/c-family/c-cppbuiltin.c:600
assertion that LC_RENAME doesn't happen first.

I think the right fix is emit those predefined macros upon
optimize/target pragmas with BUILTINS_LOCATION, like we already do
for those macros at the start of the TU, they don't appear in columns
of the next line after it.  Another possibility would be to force them
at the location of the pragma.

2021-12-30  Jakub Jelinek  <jakub@redhat.com>

	PR c++/103012
gcc/
	* config/i386/i386-c.c (ix86_pragma_target_parse): Perform
	cpp_define/cpp_undef calls with forced token locations
	BUILTINS_LOCATION.
	* config/arm/arm-c.c (arm_pragma_target_parse): Likewise.
	* config/aarch64/aarch64-c.c (aarch64_pragma_target_parse): Likewise.
	* config/s390/s390-c.c (s390_pragma_target_parse): Likewise.
gcc/c-family/
	* c-cppbuiltin.c (c_cpp_builtins_optimize_pragma): Perform
	cpp_define_unused/cpp_undef calls with forced token locations
	BUILTINS_LOCATION.
gcc/testsuite/
	PR c++/103012
	* g++.dg/cpp/pr103012.C: New test.
	* g++.target/i386/pr103012.C: New test.

(cherry picked from commit 1dbe26b)
xionghul pushed a commit to xionghul/gcc that referenced this pull request May 12, 2022
A previous change in "Make debug printouts more robust" accidentally
removed double quotes around identifiers in debug printouts. This patch
restores those. So for example, we have:

       Prev_Entity = Node gcc-mirror#10 N_Defining_Identifier "foo" (Entity_Id=795)

and not:

       Prev_Entity = Node gcc-mirror#10 N_Defining_Identifier foo (Entity_Id=795)

This affects the output of -gnatdt, and certain calls normally done from
gdb.

gcc/ada/

	* namet.ads, namet.adb (Write_Name_For_Debug): Add Quote
	parameter to allow conditional addition of quotes. Note that
	some calls to Write_Name_For_Debug, for example for file names,
	shouldn't have quotes, as in some_package.adb:123:45.
	* treepr.adb (Print_Name): Add double quotes around the name
	using the above Quote parameters.
nstester pushed a commit to nstester/gcc that referenced this pull request Apr 28, 2023
This patch fixes PR rtl-optimization/109476, which is a code quality
regression affecting AVR.  The cause is that the lower-subreg pass is
sometimes overly aggressive, lowering the LSHIFTRT below:

(insn 7 4 8 2 (set (reg:HI 51)
        (lshiftrt:HI (reg/v:HI 49 [ b ])
            (const_int 8 [0x8]))) "t.ii":4:36 557 {lshrhi3}
     (nil))

into a pair of QImode SUBREG assignments:

(insn 19 4 20 2 (set (subreg:QI (reg:HI 51) 0)
        (reg:QI 54 [ b+1 ])) "t.ii":4:36 86 {movqi_insn_split}
     (nil))
(insn 20 19 8 2 (set (subreg:QI (reg:HI 51) 1)
        (const_int 0 [0])) "t.ii":4:36 86 {movqi_insn_split}
     (nil))

but this idiom, SETs of SUBREGs, interferes with combine's ability
to associate/fuse instructions.  The solution, on targets that
have a suitable ZERO_EXTEND (i.e. where the lower-subreg pass
wouldn't itself split a ZERO_EXTEND, so "splitting_zext" is false),
is to split/lower LSHIFTRT to a ZERO_EXTEND.

To answer Richard's question in comment gcc-mirror#10 of the bugzilla PR,
the function resolve_shift_zext is called with one of four RTX
codes, ASHIFTRT, LSHIFTRT, ZERO_EXTEND and ASHIFT, but only with
LSHIFTRT can the setting of low_part and high_part SUBREGs be
replaced by a ZERO_EXTEND.  For ASHIFTRT, we require a sign
extension, so don't set the high_part to zero; if we're splitting
a ZERO_EXTEND then it doesn't make sense to replace it with a
ZERO_EXTEND, and for ASHIFT we've played games to swap the
high_part and low_part SUBREGs, so that we assign the low_part
to zero (for double word shifts by greater than word size bits).

2023-04-28  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR rtl-optimization/109476
	* lower-subreg.cc: Include explow.h for force_reg.
	(find_decomposable_shift_zext): Pass an additional SPEED_P argument.
	If decomposing a suitable LSHIFTRT and we're not splitting
	ZERO_EXTEND (based on the current SPEED_P), then use a ZERO_EXTEND
	instead of setting a high part SUBREG to zero, which helps combine.
	(decompose_multiword_subregs): Update call to resolve_shift_zext.

gcc/testsuite/ChangeLog
	PR rtl-optimization/109476
	* gcc.target/avr/mmcu/pr109476.c: New test case.
Liaoshihua pushed a commit to Liaoshihua/gcc that referenced this pull request Mar 19, 2024
This patch fixes PR rtl-optimization/109476, which is a code quality
regression affecting AVR.  The cause is that the lower-subreg pass is
sometimes overly aggressive, lowering the LSHIFTRT below:

(insn 7 4 8 2 (set (reg:HI 51)
        (lshiftrt:HI (reg/v:HI 49 [ b ])
            (const_int 8 [0x8]))) "t.ii":4:36 557 {lshrhi3}
     (nil))

into a pair of QImode SUBREG assignments:

(insn 19 4 20 2 (set (subreg:QI (reg:HI 51) 0)
        (reg:QI 54 [ b+1 ])) "t.ii":4:36 86 {movqi_insn_split}
     (nil))
(insn 20 19 8 2 (set (subreg:QI (reg:HI 51) 1)
        (const_int 0 [0])) "t.ii":4:36 86 {movqi_insn_split}
     (nil))

but this idiom, SETs of SUBREGs, interferes with combine's ability
to associate/fuse instructions.  The solution, on targets that
have a suitable ZERO_EXTEND (i.e. where the lower-subreg pass
wouldn't itself split a ZERO_EXTEND, so "splitting_zext" is false),
is to split/lower LSHIFTRT to a ZERO_EXTEND.

To answer Richard's question in comment gcc-mirror#10 of the bugzilla PR,
the function resolve_shift_zext is called with one of four RTX
codes, ASHIFTRT, LSHIFTRT, ZERO_EXTEND and ASHIFT, but only with
LSHIFTRT can the setting of low_part and high_part SUBREGs be
replaced by a ZERO_EXTEND.  For ASHIFTRT, we require a sign
extension, so don't set the high_part to zero; if we're splitting
a ZERO_EXTEND then it doesn't make sense to replace it with a
ZERO_EXTEND, and for ASHIFT we've played games to swap the
high_part and low_part SUBREGs, so that we assign the low_part
to zero (for double word shifts by greater than word size bits).

2023-04-28  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR rtl-optimization/109476
	* lower-subreg.cc: Include explow.h for force_reg.
	(find_decomposable_shift_zext): Pass an additional SPEED_P argument.
	If decomposing a suitable LSHIFTRT and we're not splitting
	ZERO_EXTEND (based on the current SPEED_P), then use a ZERO_EXTEND
	instead of setting a high part SUBREG to zero, which helps combine.
	(decompose_multiword_subregs): Update call to resolve_shift_zext.

gcc/testsuite/ChangeLog
	PR rtl-optimization/109476
	* gcc.target/avr/mmcu/pr109476.c: New test case.
Liaoshihua pushed a commit to Liaoshihua/gcc that referenced this pull request Mar 20, 2024
This patch fixes PR rtl-optimization/109476, which is a code quality
regression affecting AVR.  The cause is that the lower-subreg pass is
sometimes overly aggressive, lowering the LSHIFTRT below:

(insn 7 4 8 2 (set (reg:HI 51)
        (lshiftrt:HI (reg/v:HI 49 [ b ])
            (const_int 8 [0x8]))) "t.ii":4:36 557 {lshrhi3}
     (nil))

into a pair of QImode SUBREG assignments:

(insn 19 4 20 2 (set (subreg:QI (reg:HI 51) 0)
        (reg:QI 54 [ b+1 ])) "t.ii":4:36 86 {movqi_insn_split}
     (nil))
(insn 20 19 8 2 (set (subreg:QI (reg:HI 51) 1)
        (const_int 0 [0])) "t.ii":4:36 86 {movqi_insn_split}
     (nil))

but this idiom, SETs of SUBREGs, interferes with combine's ability
to associate/fuse instructions.  The solution, on targets that
have a suitable ZERO_EXTEND (i.e. where the lower-subreg pass
wouldn't itself split a ZERO_EXTEND, so "splitting_zext" is false),
is to split/lower LSHIFTRT to a ZERO_EXTEND.

To answer Richard's question in comment gcc-mirror#10 of the bugzilla PR,
the function resolve_shift_zext is called with one of four RTX
codes, ASHIFTRT, LSHIFTRT, ZERO_EXTEND and ASHIFT, but only with
LSHIFTRT can the setting of low_part and high_part SUBREGs be
replaced by a ZERO_EXTEND.  For ASHIFTRT, we require a sign
extension, so don't set the high_part to zero; if we're splitting
a ZERO_EXTEND then it doesn't make sense to replace it with a
ZERO_EXTEND, and for ASHIFT we've played games to swap the
high_part and low_part SUBREGs, so that we assign the low_part
to zero (for double word shifts by greater than word size bits).

2023-04-28  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR rtl-optimization/109476
	* lower-subreg.cc: Include explow.h for force_reg.
	(find_decomposable_shift_zext): Pass an additional SPEED_P argument.
	If decomposing a suitable LSHIFTRT and we're not splitting
	ZERO_EXTEND (based on the current SPEED_P), then use a ZERO_EXTEND
	instead of setting a high part SUBREG to zero, which helps combine.
	(decompose_multiword_subregs): Update call to resolve_shift_zext.

gcc/testsuite/ChangeLog
	PR rtl-optimization/109476
	* gcc.target/avr/mmcu/pr109476.c: New test case.
Liaoshihua pushed a commit to Liaoshihua/gcc that referenced this pull request Mar 25, 2024
This patch fixes PR rtl-optimization/109476, which is a code quality
regression affecting AVR.  The cause is that the lower-subreg pass is
sometimes overly aggressive, lowering the LSHIFTRT below:

(insn 7 4 8 2 (set (reg:HI 51)
        (lshiftrt:HI (reg/v:HI 49 [ b ])
            (const_int 8 [0x8]))) "t.ii":4:36 557 {lshrhi3}
     (nil))

into a pair of QImode SUBREG assignments:

(insn 19 4 20 2 (set (subreg:QI (reg:HI 51) 0)
        (reg:QI 54 [ b+1 ])) "t.ii":4:36 86 {movqi_insn_split}
     (nil))
(insn 20 19 8 2 (set (subreg:QI (reg:HI 51) 1)
        (const_int 0 [0])) "t.ii":4:36 86 {movqi_insn_split}
     (nil))

but this idiom, SETs of SUBREGs, interferes with combine's ability
to associate/fuse instructions.  The solution, on targets that
have a suitable ZERO_EXTEND (i.e. where the lower-subreg pass
wouldn't itself split a ZERO_EXTEND, so "splitting_zext" is false),
is to split/lower LSHIFTRT to a ZERO_EXTEND.

To answer Richard's question in comment gcc-mirror#10 of the bugzilla PR,
the function resolve_shift_zext is called with one of four RTX
codes, ASHIFTRT, LSHIFTRT, ZERO_EXTEND and ASHIFT, but only with
LSHIFTRT can the setting of low_part and high_part SUBREGs be
replaced by a ZERO_EXTEND.  For ASHIFTRT, we require a sign
extension, so don't set the high_part to zero; if we're splitting
a ZERO_EXTEND then it doesn't make sense to replace it with a
ZERO_EXTEND, and for ASHIFT we've played games to swap the
high_part and low_part SUBREGs, so that we assign the low_part
to zero (for double word shifts by greater than word size bits).

2023-04-28  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR rtl-optimization/109476
	* lower-subreg.cc: Include explow.h for force_reg.
	(find_decomposable_shift_zext): Pass an additional SPEED_P argument.
	If decomposing a suitable LSHIFTRT and we're not splitting
	ZERO_EXTEND (based on the current SPEED_P), then use a ZERO_EXTEND
	instead of setting a high part SUBREG to zero, which helps combine.
	(decompose_multiword_subregs): Update call to resolve_shift_zext.

gcc/testsuite/ChangeLog
	PR rtl-optimization/109476
	* gcc.target/avr/mmcu/pr109476.c: New test case.
trcrsired pushed a commit to trcrsired/gcc that referenced this pull request May 20, 2024
trcrsired pushed a commit to trcrsired/gcc that referenced this pull request May 26, 2024
NinaRanns referenced this pull request in NinaRanns/gcc Jul 3, 2024
pr113968 and minor natural syntax parsing fix
fjtrujy added a commit to fjtrujy/gcc that referenced this pull request Jul 15, 2024
Using POSIX atomic implementation
hubot pushed a commit that referenced this pull request Oct 24, 2024
This patch folds svindex with constant arguments into a vector series.
We implemented this in svindex_impl::fold using the function build_vec_series.
For example,
svuint64_t f1 ()
{
  return svindex_u642 (10, 3);
}
compiled with -O2 -march=armv8.2-a+sve, is folded to {10, 13, 16, ...}
in the gimple pass lower.
This optimization benefits cases where svindex is used in combination with
other gimple-level optimizations.
For example,
svuint64_t f2 ()
{
    return svmul_x (svptrue_b64 (), svindex_u64 (10, 3), 5);
}
has previously been compiled to
f2:
        index   z0.d, #10, #3
        mul     z0.d, z0.d, #5
        ret
Now, it is compiled to
f2:
        mov     x0, 50
        index   z0.d, x0, #15
        ret

We added test cases checking
- the application of the transform during gimple for constant arguments,
- the interaction with another gimple-level optimization.

The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>

gcc/
	* config/aarch64/aarch64-sve-builtins-base.cc
	(svindex_impl::fold): Add constant folding.

gcc/testsuite/
	* gcc.target/aarch64/sve/index_const_fold.c: New test.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants