-
-
Notifications
You must be signed in to change notification settings - Fork 15
[D] Implement d_system_linkage hook #5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Contributor
jpf91
commented
Mar 2, 2019
b1ed9c7
to
396782a
Compare
0ba02b8
to
46b4537
Compare
96bbd63
to
2b97491
Compare
jpf91
pushed a commit
that referenced
this pull request
Oct 19, 2019
This is the first step of early splitting all the DImode comparison operations. We start by factoring the DImode handling out of arm_gen_compare_reg into its own function. Simple DImode equality comparisions (such as equality with zero, or equality with a constant that is zero in one of the two word values that it comprises) can be done using a single subtract followed by an ORRS instruction. This avoids the need for conditional execution. For example, (r0 != 5) can be written as SUB Rt, R0, #5 ORRS Rt, Rt, R1 The ORRS is now expanded using an SImode pattern that already exists in the MD file and this gives the register allocator more freedom to select registers (consecutive pairs are no-longer required). Furthermore, we can then delete the arm_cmpdi_zero pattern as it is no-longer required. We use SUB for the value adjustment as this has a generally more flexible range of immediates than XOR and what's more has the opportunity to be relaxed in thumb2 to a 16-bit SUBS instruction. * config/arm/arm.c (arm_select_cc_mode): For DImode equality tests return CC_Zmode if comparing against a constant where one word is zero. (arm_gen_compare_reg): Split DImode handling to ... (arm_gen_dicompare_reg): ... here. Handle equality comparisons against simple constants. * config/arm/arm.md (arm_cmpdi_zero): Delete pattern. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@277177 138bc75d-0d04-0410-961f-82ee72b054a4
jpf91
pushed a commit
that referenced
this pull request
Oct 19, 2019
This patch improves the expansion of usubvsi4 by allowing suitable constants to be passed directly. Unlike normal subtraction, either operand may be a constant (and indeed I have seen cases where both can be with LTO enabled). One interesting testcase that improves as a result of this is: unsigned f6 (unsigned a) { unsigned x; return __builtin_sub_overflow (5U, a, &x) ? 0 : x; } Which previously compiled to: rsbs r3, r0, #5 cmp r0, #5 movls r0, r3 movhi r0, #0 but now generates the optimal sequence: rsbs r0, r0, #5 movcc r0, #0 * config/arm/arm.md (usubv<mode>4): Delete expansion. (usubvsi4): New pattern. Allow some immediate values for inputs. (usubvdi4): New pattern. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@277187 138bc75d-0d04-0410-961f-82ee72b054a4
jpf91
pushed a commit
that referenced
this pull request
Jun 18, 2020
Made apparent by recent commit dc70315 "openmp: Implement discovery of implicit declare target to clauses": +FAIL: libgomp.c/target-39.c (internal compiler error) +FAIL: libgomp.c/target-39.c (test for excess errors) +UNRESOLVED: libgomp.c/target-39.c compilation failed to produce executable This is in a '--enable-offload-targets=[...],hsa' build, with '-foffload=hsa' enabled (by default). during GIMPLE pass: hsagen source-gcc/libgomp/testsuite/libgomp.c/target-39.c: In function ‘main._omp_fn.0.hsa.0’: source-gcc/libgomp/testsuite/libgomp.c/target-39.c:23:11: internal compiler error: Segmentation fault 23 | #pragma omp target map(from:err) | ^~~ [...] GDB: Program received signal SIGSEGV, Segmentation fault. fndecl_built_in_p (node=0x0, name=BUILT_IN_PREFETCH) at [...]/source-gcc/gcc/tree.h:6267 6267 return (fndecl_built_in_p (node, BUILT_IN_NORMAL) (gdb) bt #0 fndecl_built_in_p (node=0x0, name=BUILT_IN_PREFETCH) at [...]/source-gcc/gcc/tree.h:6267 #1 0x0000000000b19739 in gen_hsa_insns_for_call (stmt=stmt@entry=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at [...]/source-gcc/gcc/hsa-gen.c:5304 #2 0x0000000000b1aca7 in gen_hsa_insns_for_gimple_stmt (stmt=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at [...]/source-gcc/gcc/hsa-gen.c:5770 #3 0x0000000000b1bd21 in gen_body_from_gimple () at [...]/source-gcc/gcc/hsa-gen.c:5999 #4 0x0000000000b1dbd2 in generate_hsa (kernel=<optimized out>) at [...]/source-gcc/gcc/hsa-gen.c:6596 #5 0x0000000000b1de66 in (anonymous namespace)::pass_gen_hsail::execute (this=0x2a2aac0) at [...]/source-gcc/gcc/hsa-gen.c:6680 #6 0x0000000000d06f90 in execute_one_pass (pass=pass@entry=0x2a2aac0) at [...]/source-gcc/gcc/passes.c:2502 [...] (gdb) up #1 0x0000000000b19739 in gen_hsa_insns_for_call (stmt=stmt@entry=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at /home/thomas/tmp/source/gcc/build/track-slim-omp/source-gcc/gcc/hsa-gen.c:5304 5304 if (fndecl_built_in_p (function_decl, BUILT_IN_PREFETCH)) (gdb) print function_decl $1 = (tree) 0x0 (gdb) list 5299 if (!gimple_call_builtin_p (stmt, BUILT_IN_NORMAL)) 5300 { 5301 tree function_decl = gimple_call_fndecl (stmt); 5302 /* Prefetch pass can create type-mismatching prefetch builtin calls which 5303 fail the gimple_call_builtin_p test above. Handle them here. */ 5304 if (fndecl_built_in_p (function_decl, BUILT_IN_PREFETCH)) 5305 return; 5306 5307 if (function_decl == NULL_TREE) 5308 { The problem is present already since 2016-11-23 commit 56b1c60 (r242761) "Merge from HSA branch to trunk", and the fix obvious enough. gcc/ * hsa-gen.c (gen_hsa_insns_for_call): Move 'function_decl == NULL_TREE' check earlier. gcc/testsuite/ * c-c++-common/gomp/hsa-indirect-call-1.c: New file.
jpf91
pushed a commit
that referenced
this pull request
Oct 13, 2020
Prevents the following UBSAN error: ./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/torture/pr49770.C -O2 -c /home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482:22: runtime error: load of value 2, which is not a valid value for type 'bool' #0 0x1fdb4d1 in modref_tree<int>::merge(modref_tree<int>*, vec<modref_parm_map, va_heap, vl_ptr>*) /home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482 #1 0x1fcadaa in merge_call_side_effects(modref_summary*, gimple*, modref_summary*, bool) /home/marxin/Programming/gcc2/gcc/ipa-modref.c:511 #2 0x1fcbadd in analyze_call /home/marxin/Programming/gcc2/gcc/ipa-modref.c:642 #3 0x1fcc061 in analyze_stmt /home/marxin/Programming/gcc2/gcc/ipa-modref.c:732 #4 0x1fccf31 in analyze_function /home/marxin/Programming/gcc2/gcc/ipa-modref.c:823 #5 0x1fd17e5 in execute /home/marxin/Programming/gcc2/gcc/ipa-modref.c:1441 #6 0x25cca6e in execute_one_pass(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2509 #7 0x25cd39b in execute_pass_list_1 /home/marxin/Programming/gcc2/gcc/passes.c:2597 #8 0x25cd450 in execute_pass_list_1 /home/marxin/Programming/gcc2/gcc/passes.c:2598 #9 0x25cd4ee in execute_pass_list(function*, opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2608 #10 0x25c7a5a in do_per_function_toporder(void (*)(function*, void*), void*) /home/marxin/Programming/gcc2/gcc/passes.c:1726 #11 0x25cfa3f in execute_ipa_pass_list(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2941 #12 0x173572d in ipa_passes /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2642 #13 0x17364ee in symbol_table::compile() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2777 #14 0x17372d9 in symbol_table::finalize_compilation_unit() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:3022 #15 0x2a1f00a in compile_file /home/marxin/Programming/gcc2/gcc/toplev.c:485 #16 0x2a27dc8 in do_compile /home/marxin/Programming/gcc2/gcc/toplev.c:2321 #17 0x2a283cc in toplev::main(int, char**) /home/marxin/Programming/gcc2/gcc/toplev.c:2460 #18 0x54f21cd in main /home/marxin/Programming/gcc2/gcc/main.c:39 #19 0x7ffff6f0de09 in __libc_start_main ../csu/libc-start.c:314 #20 0x9eac09 in _start (/home/marxin/Programming/gcc2/objdir/gcc/cc1plus+0x9eac09) gcc/ChangeLog: * ipa-modref.c (merge_call_side_effects): Clear modref_parm_map fields in the vector.
jpf91
pushed a commit
that referenced
this pull request
Oct 20, 2020
It fixes: /home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482:22: runtime error: load of value 255, which is not a valid value for type 'bool' #0 0x18e5df3 in modref_tree<int>::merge(modref_tree<int>*, vec<modref_parm_map, va_heap, vl_ptr>*) /home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482 #1 0x18dc180 in ipa_merge_modref_summary_after_inlining(cgraph_edge*) /home/marxin/Programming/gcc2/gcc/ipa-modref.c:1779 #2 0x18c1c72 in inline_call(cgraph_edge*, bool, vec<cgraph_edge*, va_heap, vl_ptr>*, int*, bool, bool*) /home/marxin/Programming/gcc2/gcc/ipa-inline-transform.c:492 #3 0x4a3589c in inline_small_functions /home/marxin/Programming/gcc2/gcc/ipa-inline.c:2216 #4 0x4a3b230 in ipa_inline /home/marxin/Programming/gcc2/gcc/ipa-inline.c:2697 #5 0x4a3d902 in execute /home/marxin/Programming/gcc2/gcc/ipa-inline.c:3096 #6 0x1edf831 in execute_one_pass(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2509 #7 0x1ee26af in execute_ipa_pass_list(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2936 #8 0x103f31b in ipa_passes /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2700 #9 0x103fb40 in symbol_table::compile() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2777 #10 0x104092b in symbol_table::finalize_compilation_unit() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:3022 #11 0x235723b in compile_file /home/marxin/Programming/gcc2/gcc/toplev.c:485 #12 0x235fff9 in do_compile /home/marxin/Programming/gcc2/gcc/toplev.c:2321 #13 0x23605fc in toplev::main(int, char**) /home/marxin/Programming/gcc2/gcc/toplev.c:2460 #14 0x4e2b93b in main /home/marxin/Programming/gcc2/gcc/main.c:39 #15 0x7ffff6f0ae09 in __libc_start_main ../csu/libc-start.c:314 #16 0x9a0be9 in _start (/home/marxin/Programming/gcc2/objdir/gcc/cc1+0x9a0be9) gcc/ChangeLog: * ipa-modref.c (compute_parm_map): Clear vector.
jpf91
pushed a commit
that referenced
this pull request
Feb 24, 2021
/home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.cpp:77:25: runtime error: left shift of 0x0000000000000000fffffffffffffffb by 96 places cannot be represented in type '__int128' #0 0x7ffff754edfe in __ubsan::Value::getSIntValue() const /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.cpp:77 #1 0x7ffff7548719 in __ubsan::Value::isNegative() const /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.h:190 #2 0x7ffff7542a34 in handleShiftOutOfBoundsImpl /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_handlers.cpp:338 #3 0x7ffff75431b7 in __ubsan_handle_shift_out_of_bounds /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_handlers.cpp:370 #4 0x40067f in main (/home/marxin/Programming/testcases/a.out+0x40067f) #5 0x7ffff72c8b24 in __libc_start_main (/lib64/libc.so.6+0x27b24) #6 0x4005bd in _start (/home/marxin/Programming/testcases/a.out+0x4005bd) Differential Revision: https://reviews.llvm.org/D97263 Cherry-pick from 16ede0956cb1f4b692dfa619ccfa6ab1de28e19b.
ibuclaw
pushed a commit
that referenced
this pull request
Apr 23, 2021
Made apparent by recent commit dc70315 "openmp: Implement discovery of implicit declare target to clauses": +FAIL: libgomp.c/target-39.c (internal compiler error) +FAIL: libgomp.c/target-39.c (test for excess errors) +UNRESOLVED: libgomp.c/target-39.c compilation failed to produce executable This is in a '--enable-offload-targets=[...],hsa' build, with '-foffload=hsa' enabled (by default). during GIMPLE pass: hsagen source-gcc/libgomp/testsuite/libgomp.c/target-39.c: In function ‘main._omp_fn.0.hsa.0’: source-gcc/libgomp/testsuite/libgomp.c/target-39.c:23:11: internal compiler error: Segmentation fault 23 | #pragma omp target map(from:err) | ^~~ [...] GDB: Program received signal SIGSEGV, Segmentation fault. fndecl_built_in_p (node=0x0, name=BUILT_IN_PREFETCH) at [...]/source-gcc/gcc/tree.h:6267 6267 return (fndecl_built_in_p (node, BUILT_IN_NORMAL) (gdb) bt #0 fndecl_built_in_p (node=0x0, name=BUILT_IN_PREFETCH) at [...]/source-gcc/gcc/tree.h:6267 #1 0x0000000000b19739 in gen_hsa_insns_for_call (stmt=stmt@entry=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at [...]/source-gcc/gcc/hsa-gen.c:5304 #2 0x0000000000b1aca7 in gen_hsa_insns_for_gimple_stmt (stmt=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at [...]/source-gcc/gcc/hsa-gen.c:5770 #3 0x0000000000b1bd21 in gen_body_from_gimple () at [...]/source-gcc/gcc/hsa-gen.c:5999 #4 0x0000000000b1dbd2 in generate_hsa (kernel=<optimized out>) at [...]/source-gcc/gcc/hsa-gen.c:6596 #5 0x0000000000b1de66 in (anonymous namespace)::pass_gen_hsail::execute (this=0x2a2aac0) at [...]/source-gcc/gcc/hsa-gen.c:6680 #6 0x0000000000d06f90 in execute_one_pass (pass=pass@entry=0x2a2aac0) at [...]/source-gcc/gcc/passes.c:2502 [...] (gdb) up #1 0x0000000000b19739 in gen_hsa_insns_for_call (stmt=stmt@entry=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at /home/thomas/tmp/source/gcc/build/track-slim-omp/source-gcc/gcc/hsa-gen.c:5304 5304 if (fndecl_built_in_p (function_decl, BUILT_IN_PREFETCH)) (gdb) print function_decl $1 = (tree) 0x0 (gdb) list 5299 if (!gimple_call_builtin_p (stmt, BUILT_IN_NORMAL)) 5300 { 5301 tree function_decl = gimple_call_fndecl (stmt); 5302 /* Prefetch pass can create type-mismatching prefetch builtin calls which 5303 fail the gimple_call_builtin_p test above. Handle them here. */ 5304 if (fndecl_built_in_p (function_decl, BUILT_IN_PREFETCH)) 5305 return; 5306 5307 if (function_decl == NULL_TREE) 5308 { The problem is present already since 2016-11-23 commit 56b1c60 (r242761) "Merge from HSA branch to trunk", and the fix obvious enough. gcc/ * hsa-gen.c (gen_hsa_insns_for_call): Move 'function_decl == NULL_TREE' check earlier. gcc/testsuite/ * c-c++-common/gomp/hsa-indirect-call-1.c: New file. (cherry picked from commit 973bce0)
jpf91
pushed a commit
that referenced
this pull request
May 8, 2021
gcc/ada/ * raise-gcc.c (__gnat_others_value): Remove const qualifier. (__gnat_all_others_value): Likewise. (__gnat_unhandled_others_value): Likewise. (GNAT_OTHERS): Cast to Exception_Id instead of _Unwind_Ptr. (GNAT_ALL_OTHERS): Likewise. (GNAT_UNHANDLED_OTHERS): Likewise. (Is_Handled_By_Others): Change parameter type to Exception_Id. (Language_For): Likewise. (Foreign_Data_For): Likewise. (is_handled_by): Likewise. Adjust throughout, remove redundant line and fix indentation. * libgnat/a-exexpr.adb (Is_Handled_By_Others): Remove pragma and useless qualification from parameter type. (Foreign_Data_For): Likewise. (Language_For): Likewise.
jpf91
pushed a commit
that referenced
this pull request
Jun 15, 2021
The fixed error is: ==21166==ERROR: AddressSanitizer: alloc-dealloc-mismatch (operator new [] vs operator delete) on 0x60300000d900 #0 0x7367d7 in operator delete(void*, unsigned long) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/libsanitizer/asan/asan_new_delete.cpp:172 #1 0x3b82e6e in pointer_equiv_analyzer::~pointer_equiv_analyzer() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:161 #2 0x3b83387 in hybrid_folder::~hybrid_folder() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:517 #3 0x3b83387 in execute_early_vrp /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:686 #4 0x1790611 in execute_one_pass(opt_pass*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2567 #5 0x1792003 in execute_pass_list_1 /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2656 #6 0x1792029 in execute_pass_list_1 /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2657 #7 0x179209f in execute_pass_list(function*, opt_pass*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2667 #8 0x178a5f3 in do_per_function_toporder(void (*)(function*, void*), void*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:1773 #9 0x1792fac in do_per_function_toporder(void (*)(function*, void*), void*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/plugin.h:191 #10 0x1792fac in execute_ipa_pass_list(opt_pass*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:3001 #11 0xc525fc in ipa_passes /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2154 #12 0xc525fc in symbol_table::compile() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2289 #13 0xc5a096 in symbol_table::compile() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2269 #14 0xc5a096 in symbol_table::finalize_compilation_unit() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2537 #15 0x1a7a17c in compile_file /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/toplev.c:482 #16 0x69c758 in do_compile /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/toplev.c:2210 #17 0x69c758 in toplev::main(int, char**) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/toplev.c:2349 #18 0x6a932a in main /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/main.c:39 #19 0x7ffff7820b34 in __libc_start_main ../csu/libc-start.c:332 #20 0x6aa5fd in _start (/home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/objdir/gcc/cc1+0x6aa5fd) 0x60300000d900 is located 0 bytes inside of 32-byte region [0x60300000d900,0x60300000d920) allocated by thread T0 here: #0 0x735ab7 in operator new[](unsigned long) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/libsanitizer/asan/asan_new_delete.cpp:102 #1 0x3b82dac in pointer_equiv_analyzer::pointer_equiv_analyzer(gimple_ranger*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:156 gcc/ChangeLog: * gimple-ssa-evrp.c (pointer_equiv_analyzer::~pointer_equiv_analyzer): Use delete[].
jpf91
pushed a commit
that referenced
this pull request
Jul 9, 2021
…atch.pd All of the optimizations/transformations mentioned in bugzilla for PR tree-optimization/40210 are already implemented in mainline GCC, with one exception. In comment #5, there's a suggestion that (bswap64(x)>>56)&0xff can be implemented without the bswap as (unsigned char)x, or equivalently x&0xff. This patch implements the above optimization, and closely related variants. For any single bit, (bswap(X)>>C1)&1 can be simplified to (X>>C2)&1, where bit position C2 is the appropriate permutation of C1. Similarly, the bswap can eliminated if the desired set of bits all lie within the same byte, hence (bswap(x)>>8)&255 can always be optimized, as can (bswap(x)>>8)&123. Previously, int foo(long long x) { return (__builtin_bswap64(x) >> 56) & 0xff; } compiled with -O2 to foo: movq %rdi, %rax bswap %rax shrq $56, %rax ret with this patch, it now compiles to foo: movzbl %dil, %eax ret 2021-07-08 Roger Sayle <roger@nextmovesoftware.com> Richard Biener <rguenther@suse.de> gcc/ChangeLog PR tree-optimization/40210 * match.pd (bswap optimizations): Simplify (bswap(x)>>C1)&C2 as (x>>C3)&C2 when possible. Simplify bswap(x)>>C1 as ((T)x)>>C2 when possible. Simplify bswap(x)&C1 as (x>>C2)&C1 when 0<=C1<=255. gcc/testsuite/ChangeLog PR tree-optimization/40210 * gcc.dg/builtin-bswap-13.c: New test. * gcc.dg/builtin-bswap-14.c: New test.
jpf91
pushed a commit
that referenced
this pull request
Nov 27, 2021
Fixes: ==129444==ERROR: AddressSanitizer: global-buffer-overflow on address 0x00000666ca5c at pc 0x000000ef094b bp 0x7fffffff8180 sp 0x7fffffff8178 READ of size 4 at 0x00000666ca5c thread T0 #0 0xef094a in parse_optimize_options ../../gcc/d/d-attribs.cc:855 #1 0xef0d36 in d_handle_optimize_attribute ../../gcc/d/d-attribs.cc:916 #2 0xef107e in d_handle_optimize_attribute ../../gcc/d/d-attribs.cc:887 #3 0xff85b1 in decl_attributes(tree_node**, tree_node*, int, tree_node*) ../../gcc/attribs.c:829 #4 0xef2a91 in apply_user_attributes(Dsymbol*, tree_node*) ../../gcc/d/d-attribs.cc:427 #5 0xf7b7f3 in get_symbol_decl(Declaration*) ../../gcc/d/decl.cc:1346 #6 0xf87bc7 in get_symbol_decl(Declaration*) ../../gcc/d/decl.cc:967 #7 0xf87bc7 in DeclVisitor::visit(FuncDeclaration*) ../../gcc/d/decl.cc:808 #8 0xf83db5 in DeclVisitor::build_dsymbol(Dsymbol*) ../../gcc/d/decl.cc:146 for the following test-case: gcc/testsuite/gdc.dg/attr_optimize1.d. gcc/d/ChangeLog: * d-attribs.cc (parse_optimize_options): Check index before accessing cl_options.
jpf91
pushed a commit
that referenced
this pull request
Dec 4, 2021
Fixes: ==129444==ERROR: AddressSanitizer: global-buffer-overflow on address 0x00000666ca5c at pc 0x000000ef094b bp 0x7fffffff8180 sp 0x7fffffff8178 READ of size 4 at 0x00000666ca5c thread T0 #0 0xef094a in parse_optimize_options ../../gcc/d/d-attribs.cc:855 #1 0xef0d36 in d_handle_optimize_attribute ../../gcc/d/d-attribs.cc:916 #2 0xef107e in d_handle_optimize_attribute ../../gcc/d/d-attribs.cc:887 #3 0xff85b1 in decl_attributes(tree_node**, tree_node*, int, tree_node*) ../../gcc/attribs.c:829 #4 0xef2a91 in apply_user_attributes(Dsymbol*, tree_node*) ../../gcc/d/d-attribs.cc:427 #5 0xf7b7f3 in get_symbol_decl(Declaration*) ../../gcc/d/decl.cc:1346 #6 0xf87bc7 in get_symbol_decl(Declaration*) ../../gcc/d/decl.cc:967 #7 0xf87bc7 in DeclVisitor::visit(FuncDeclaration*) ../../gcc/d/decl.cc:808 #8 0xf83db5 in DeclVisitor::build_dsymbol(Dsymbol*) ../../gcc/d/decl.cc:146 for the following test-case: gcc/testsuite/gdc.dg/attr_optimize1.d. gcc/d/ChangeLog: * d-attribs.cc (parse_optimize_options): Check index before accessing cl_options.
jpf91
pushed a commit
that referenced
this pull request
Dec 31, 2021
…imize or target pragmas [PR103012] The following testcases ICE when an optimize or target pragma is followed by a long line (4096+ chars). This is because on such long lines we can't use columns anymore, but the cpp_define calls performed by c_cpp_builtins_optimize_pragma or from the backend hooks for target pragma are done on temporary buffers and expect to get columns from whatever line they appear on (which happens to be the long line after optimize/target pragma), and we run into: #0 fancy_abort (file=0x3abec67 "../../libcpp/line-map.c", line=502, function=0x3abecfc "linemap_add") at ../../gcc/diagnostic.c:1986 #1 0x0000000002e7c335 in linemap_add (set=0x7ffff7fca000, reason=LC_RENAME, sysp=0, to_file=0x41287a0 "pr103012.i", to_line=3) at ../../libcpp/line-map.c:502 #2 0x0000000002e7cc24 in linemap_line_start (set=0x7ffff7fca000, to_line=3, max_column_hint=128) at ../../libcpp/line-map.c:827 #3 0x0000000002e7ce2b in linemap_position_for_column (set=0x7ffff7fca000, to_column=1) at ../../libcpp/line-map.c:898 #4 0x0000000002e771f9 in _cpp_lex_direct (pfile=0x40c3b60) at ../../libcpp/lex.c:3592 #5 0x0000000002e76c3e in _cpp_lex_token (pfile=0x40c3b60) at ../../libcpp/lex.c:3394 #6 0x0000000002e610ef in lex_macro_node (pfile=0x40c3b60, is_def_or_undef=true) at ../../libcpp/directives.c:601 #7 0x0000000002e61226 in do_define (pfile=0x40c3b60) at ../../libcpp/directives.c:639 #8 0x0000000002e610b2 in run_directive (pfile=0x40c3b60, dir_no=0, buf=0x7fffffffd430 "__OPTIMIZE__ 1\n", count=14) at ../../libcpp/directives.c:589 #9 0x0000000002e650c1 in cpp_define (pfile=0x40c3b60, str=0x2f784d1 "__OPTIMIZE__") at ../../libcpp/directives.c:2513 #10 0x0000000002e65100 in cpp_define_unused (pfile=0x40c3b60, str=0x2f784d1 "__OPTIMIZE__") at ../../libcpp/directives.c:2522 #11 0x0000000000f50685 in c_cpp_builtins_optimize_pragma (pfile=0x40c3b60, prev_tree=<optimization_node 0x7fffea042000>, cur_tree=<optimization_node 0x7fffea042020>) at ../../gcc/c-family/c-cppbuiltin.c:600 assertion that LC_RENAME doesn't happen first. I think the right fix is emit those predefined macros upon optimize/target pragmas with BUILTINS_LOCATION, like we already do for those macros at the start of the TU, they don't appear in columns of the next line after it. Another possibility would be to force them at the location of the pragma. 2021-12-30 Jakub Jelinek <jakub@redhat.com> PR c++/103012 gcc/ * config/i386/i386-c.c (ix86_pragma_target_parse): Perform cpp_define/cpp_undef calls with forced token locations BUILTINS_LOCATION. * config/arm/arm-c.c (arm_pragma_target_parse): Likewise. * config/aarch64/aarch64-c.c (aarch64_pragma_target_parse): Likewise. * config/s390/s390-c.c (s390_pragma_target_parse): Likewise. gcc/c-family/ * c-cppbuiltin.c (c_cpp_builtins_optimize_pragma): Perform cpp_define_unused/cpp_undef calls with forced token locations BUILTINS_LOCATION. gcc/testsuite/ PR c++/103012 * g++.dg/cpp/pr103012.C: New test. * g++.target/i386/pr103012.C: New test.
jpf91
pushed a commit
that referenced
this pull request
Jan 26, 2022
…imize or target pragmas [PR103012] The following testcases ICE when an optimize or target pragma is followed by a long line (4096+ chars). This is because on such long lines we can't use columns anymore, but the cpp_define calls performed by c_cpp_builtins_optimize_pragma or from the backend hooks for target pragma are done on temporary buffers and expect to get columns from whatever line they appear on (which happens to be the long line after optimize/target pragma), and we run into: #0 fancy_abort (file=0x3abec67 "../../libcpp/line-map.c", line=502, function=0x3abecfc "linemap_add") at ../../gcc/diagnostic.c:1986 #1 0x0000000002e7c335 in linemap_add (set=0x7ffff7fca000, reason=LC_RENAME, sysp=0, to_file=0x41287a0 "pr103012.i", to_line=3) at ../../libcpp/line-map.c:502 #2 0x0000000002e7cc24 in linemap_line_start (set=0x7ffff7fca000, to_line=3, max_column_hint=128) at ../../libcpp/line-map.c:827 #3 0x0000000002e7ce2b in linemap_position_for_column (set=0x7ffff7fca000, to_column=1) at ../../libcpp/line-map.c:898 #4 0x0000000002e771f9 in _cpp_lex_direct (pfile=0x40c3b60) at ../../libcpp/lex.c:3592 #5 0x0000000002e76c3e in _cpp_lex_token (pfile=0x40c3b60) at ../../libcpp/lex.c:3394 #6 0x0000000002e610ef in lex_macro_node (pfile=0x40c3b60, is_def_or_undef=true) at ../../libcpp/directives.c:601 #7 0x0000000002e61226 in do_define (pfile=0x40c3b60) at ../../libcpp/directives.c:639 #8 0x0000000002e610b2 in run_directive (pfile=0x40c3b60, dir_no=0, buf=0x7fffffffd430 "__OPTIMIZE__ 1\n", count=14) at ../../libcpp/directives.c:589 #9 0x0000000002e650c1 in cpp_define (pfile=0x40c3b60, str=0x2f784d1 "__OPTIMIZE__") at ../../libcpp/directives.c:2513 #10 0x0000000002e65100 in cpp_define_unused (pfile=0x40c3b60, str=0x2f784d1 "__OPTIMIZE__") at ../../libcpp/directives.c:2522 #11 0x0000000000f50685 in c_cpp_builtins_optimize_pragma (pfile=0x40c3b60, prev_tree=<optimization_node 0x7fffea042000>, cur_tree=<optimization_node 0x7fffea042020>) at ../../gcc/c-family/c-cppbuiltin.c:600 assertion that LC_RENAME doesn't happen first. I think the right fix is emit those predefined macros upon optimize/target pragmas with BUILTINS_LOCATION, like we already do for those macros at the start of the TU, they don't appear in columns of the next line after it. Another possibility would be to force them at the location of the pragma. 2021-12-30 Jakub Jelinek <jakub@redhat.com> PR c++/103012 gcc/ * config/i386/i386-c.c (ix86_pragma_target_parse): Perform cpp_define/cpp_undef calls with forced token locations BUILTINS_LOCATION. * config/arm/arm-c.c (arm_pragma_target_parse): Likewise. * config/aarch64/aarch64-c.c (aarch64_pragma_target_parse): Likewise. * config/s390/s390-c.c (s390_pragma_target_parse): Likewise. gcc/c-family/ * c-cppbuiltin.c (c_cpp_builtins_optimize_pragma): Perform cpp_define_unused/cpp_undef calls with forced token locations BUILTINS_LOCATION. gcc/testsuite/ PR c++/103012 * g++.dg/cpp/pr103012.C: New test. * g++.target/i386/pr103012.C: New test. (cherry picked from commit 1dbe26b)
jpf91
pushed a commit
that referenced
this pull request
Feb 23, 2022
…04617] On #define A(n) int foo1##n(void) { return 1##n; } #define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9) #define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9) #define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9) #define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) D(n##8) D(n##9) E(0) E(1) E(2) D(30) D(31) C(320) C(321) C(322) C(323) C(324) C(325) B(3260) B(3261) B(3262) B(3263) A(32640) A(32641) A(32642) testcase with ./xgcc -B ./ -c -g -fpic -ffat-lto-objects -flto -O0 -o foo1.o foo1.c -ffunction-sections ./xgcc -B ./ -shared -g -fpic -flto -O0 -o foo1.so foo1.o /tmp/ccTW8mBm.debug.temp.o: file not recognized: file format not recognized (testcase too slow to be included into testsuite). The problem is clearly reported by readelf: readelf: foo1.o.debug.temp.o: Warning: Section 2 has an out of range sh_link value of 65321 readelf: foo1.o.debug.temp.o: Warning: Section 5 has an out of range sh_link value of 65321 readelf: foo1.o.debug.temp.o: Warning: Section 10 has an out of range sh_link value of 65323 readelf: foo1.o.debug.temp.o: Warning: [ 2]: Link field (65321) should index a symtab section. readelf: foo1.o.debug.temp.o: Warning: [ 5]: Link field (65321) should index a symtab section. readelf: foo1.o.debug.temp.o: Warning: [10]: Link field (65323) should index a string section. because simple_object_elf_copy_lto_debug_sections doesn't adjust sh_info and sh_link fields in ElfNN_Shdr if they are in between SHN_{LO,HI}RESERVE inclusive. Not adjusting those is incorrect though, SHN_{LO,HI}RESERVE range is only relevant to the 16-bit fields, mainly st_shndx in ElfNN_Sym where if one needs >= SHN_LORESERVE section number, SHN_XINDEX should be used instead and .symtab_shndx section should contain the real section index, and in ElfNN_Ehdr e_shnum and e_shstrndx fields, where if >= SHN_LORESERVE value is needed it should put those into Shdr[0].sh_{size,link}. But, sh_{link,info} are 32-bit fields which can contain any section index. Note, as simple-object-elf.c mentions, binutils from 2.12 to 2.18 (so before 2011) used to mishandle the > 63.75K sections case and assumed there is a hole in between the sections, but what simple_object_elf_copy_lto_debug_sections does wouldn't help in that case for the debug temp object creation, we'd need to detect the case also in that routine and take it into account in the remapping etc. I think it is not worth it given that it is over 10 years, if somebody needs 63.75K or more sections, better use more recent binutils. 2022-02-22 Jakub Jelinek <jakub@redhat.com> PR lto/104617 * simple-object-elf.c (simple_object_elf_match): Fix up URL in comment. (simple_object_elf_copy_lto_debug_sections): Remap sh_info and sh_link even if they are in the SHN_LORESERVE .. SHN_HIRESERVE range (inclusive).
jpf91
pushed a commit
that referenced
this pull request
Mar 25, 2022
Here we weren't respecting SFINAE when evaluating a call to a consteval function, which caused us to reject the new testcase below. This patch fixes this by making build_over_call use the SFINAE-friendly version of cxx_constant_value. This change causes us to no longer diagnose ahead of time a couple of non-constant non-dependent consteval calls in consteval-if2.C with -fchecking=2. These errors were apparently coming from the call to fold_non_dependent_expr in build_non_dependent_expr (for the RHS of the +=) despite complain=tf_none being passed. Now that build_over_call respects the value of complain during constant evaluation of a consteval call, the errors are gone. That the errors are also gone without -fchecking=2 is a regression caused by r12-7264-gc19f317a78c0e4 and is the subject of PR104620. As described in comment #5, I think it's basically an accident that we were diagnosing these two calls correctly before r12-7264, so perhaps we can live without these errors for GCC 12. Thus this patch just XFAILs the two tests. PR c++/104620 gcc/cp/ChangeLog: * call.cc (build_over_call): Use cxx_constant_value_sfinae instead of cxx_constant_value to evaluate a consteval call. * constexpr.cc (cxx_constant_value_sfinae): Add decl parameter and pass it to cxx_eval_outermost_constant_expr. * cp-tree.h (cxx_constant_value_sfinae): Add decl parameter. * pt.cc (fold_targs_r): Pass NULL_TREE as decl parameter to cxx_constant_value_sfinae. gcc/testsuite/ChangeLog: * g++.dg/cpp23/consteval-if2.C: XFAIL two dg-error tests where the argument to the non-constant non-dependent consteval call is wrapped by NON_DEPENDENT_EXPR. * g++.dg/cpp2a/consteval30.C: New test.
jpf91
pushed a commit
that referenced
this pull request
Mar 30, 2022
…04617] On #define A(n) int foo1##n(void) { return 1##n; } #define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9) #define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9) #define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9) #define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) D(n##8) D(n##9) E(0) E(1) E(2) D(30) D(31) C(320) C(321) C(322) C(323) C(324) C(325) B(3260) B(3261) B(3262) B(3263) A(32640) A(32641) A(32642) testcase with ./xgcc -B ./ -c -g -fpic -ffat-lto-objects -flto -O0 -o foo1.o foo1.c -ffunction-sections ./xgcc -B ./ -shared -g -fpic -flto -O0 -o foo1.so foo1.o /tmp/ccTW8mBm.debug.temp.o: file not recognized: file format not recognized (testcase too slow to be included into testsuite). The problem is clearly reported by readelf: readelf: foo1.o.debug.temp.o: Warning: Section 2 has an out of range sh_link value of 65321 readelf: foo1.o.debug.temp.o: Warning: Section 5 has an out of range sh_link value of 65321 readelf: foo1.o.debug.temp.o: Warning: Section 10 has an out of range sh_link value of 65323 readelf: foo1.o.debug.temp.o: Warning: [ 2]: Link field (65321) should index a symtab section. readelf: foo1.o.debug.temp.o: Warning: [ 5]: Link field (65321) should index a symtab section. readelf: foo1.o.debug.temp.o: Warning: [10]: Link field (65323) should index a string section. because simple_object_elf_copy_lto_debug_sections doesn't adjust sh_info and sh_link fields in ElfNN_Shdr if they are in between SHN_{LO,HI}RESERVE inclusive. Not adjusting those is incorrect though, SHN_{LO,HI}RESERVE range is only relevant to the 16-bit fields, mainly st_shndx in ElfNN_Sym where if one needs >= SHN_LORESERVE section number, SHN_XINDEX should be used instead and .symtab_shndx section should contain the real section index, and in ElfNN_Ehdr e_shnum and e_shstrndx fields, where if >= SHN_LORESERVE value is needed it should put those into Shdr[0].sh_{size,link}. But, sh_{link,info} are 32-bit fields which can contain any section index. Note, as simple-object-elf.c mentions, binutils from 2.12 to 2.18 (so before 2011) used to mishandle the > 63.75K sections case and assumed there is a hole in between the sections, but what simple_object_elf_copy_lto_debug_sections does wouldn't help in that case for the debug temp object creation, we'd need to detect the case also in that routine and take it into account in the remapping etc. I think it is not worth it given that it is over 10 years, if somebody needs 63.75K or more sections, better use more recent binutils. 2022-02-22 Jakub Jelinek <jakub@redhat.com> PR lto/104617 * simple-object-elf.c (simple_object_elf_match): Fix up URL in comment. (simple_object_elf_copy_lto_debug_sections): Remap sh_info and sh_link even if they are in the SHN_LORESERVE .. SHN_HIRESERVE range (inclusive). (cherry picked from commit 2f59f06)
jpf91
pushed a commit
that referenced
this pull request
May 13, 2022
…04617] On #define A(n) int foo1##n(void) { return 1##n; } #define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9) #define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9) #define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9) #define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) D(n##8) D(n##9) E(0) E(1) E(2) D(30) D(31) C(320) C(321) C(322) C(323) C(324) C(325) B(3260) B(3261) B(3262) B(3263) A(32640) A(32641) A(32642) testcase with ./xgcc -B ./ -c -g -fpic -ffat-lto-objects -flto -O0 -o foo1.o foo1.c -ffunction-sections ./xgcc -B ./ -shared -g -fpic -flto -O0 -o foo1.so foo1.o /tmp/ccTW8mBm.debug.temp.o: file not recognized: file format not recognized (testcase too slow to be included into testsuite). The problem is clearly reported by readelf: readelf: foo1.o.debug.temp.o: Warning: Section 2 has an out of range sh_link value of 65321 readelf: foo1.o.debug.temp.o: Warning: Section 5 has an out of range sh_link value of 65321 readelf: foo1.o.debug.temp.o: Warning: Section 10 has an out of range sh_link value of 65323 readelf: foo1.o.debug.temp.o: Warning: [ 2]: Link field (65321) should index a symtab section. readelf: foo1.o.debug.temp.o: Warning: [ 5]: Link field (65321) should index a symtab section. readelf: foo1.o.debug.temp.o: Warning: [10]: Link field (65323) should index a string section. because simple_object_elf_copy_lto_debug_sections doesn't adjust sh_info and sh_link fields in ElfNN_Shdr if they are in between SHN_{LO,HI}RESERVE inclusive. Not adjusting those is incorrect though, SHN_{LO,HI}RESERVE range is only relevant to the 16-bit fields, mainly st_shndx in ElfNN_Sym where if one needs >= SHN_LORESERVE section number, SHN_XINDEX should be used instead and .symtab_shndx section should contain the real section index, and in ElfNN_Ehdr e_shnum and e_shstrndx fields, where if >= SHN_LORESERVE value is needed it should put those into Shdr[0].sh_{size,link}. But, sh_{link,info} are 32-bit fields which can contain any section index. Note, as simple-object-elf.c mentions, binutils from 2.12 to 2.18 (so before 2011) used to mishandle the > 63.75K sections case and assumed there is a hole in between the sections, but what simple_object_elf_copy_lto_debug_sections does wouldn't help in that case for the debug temp object creation, we'd need to detect the case also in that routine and take it into account in the remapping etc. I think it is not worth it given that it is over 10 years, if somebody needs 63.75K or more sections, better use more recent binutils. 2022-02-22 Jakub Jelinek <jakub@redhat.com> PR lto/104617 * simple-object-elf.c (simple_object_elf_match): Fix up URL in comment. (simple_object_elf_copy_lto_debug_sections): Remap sh_info and sh_link even if they are in the SHN_LORESERVE .. SHN_HIRESERVE range (inclusive). (cherry picked from commit 2f59f06)
jpf91
pushed a commit
that referenced
this pull request
May 13, 2022
…04617] On #define A(n) int foo1##n(void) { return 1##n; } #define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9) #define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9) #define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9) #define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) D(n##8) D(n##9) E(0) E(1) E(2) D(30) D(31) C(320) C(321) C(322) C(323) C(324) C(325) B(3260) B(3261) B(3262) B(3263) A(32640) A(32641) A(32642) testcase with ./xgcc -B ./ -c -g -fpic -ffat-lto-objects -flto -O0 -o foo1.o foo1.c -ffunction-sections ./xgcc -B ./ -shared -g -fpic -flto -O0 -o foo1.so foo1.o /tmp/ccTW8mBm.debug.temp.o: file not recognized: file format not recognized (testcase too slow to be included into testsuite). The problem is clearly reported by readelf: readelf: foo1.o.debug.temp.o: Warning: Section 2 has an out of range sh_link value of 65321 readelf: foo1.o.debug.temp.o: Warning: Section 5 has an out of range sh_link value of 65321 readelf: foo1.o.debug.temp.o: Warning: Section 10 has an out of range sh_link value of 65323 readelf: foo1.o.debug.temp.o: Warning: [ 2]: Link field (65321) should index a symtab section. readelf: foo1.o.debug.temp.o: Warning: [ 5]: Link field (65321) should index a symtab section. readelf: foo1.o.debug.temp.o: Warning: [10]: Link field (65323) should index a string section. because simple_object_elf_copy_lto_debug_sections doesn't adjust sh_info and sh_link fields in ElfNN_Shdr if they are in between SHN_{LO,HI}RESERVE inclusive. Not adjusting those is incorrect though, SHN_{LO,HI}RESERVE range is only relevant to the 16-bit fields, mainly st_shndx in ElfNN_Sym where if one needs >= SHN_LORESERVE section number, SHN_XINDEX should be used instead and .symtab_shndx section should contain the real section index, and in ElfNN_Ehdr e_shnum and e_shstrndx fields, where if >= SHN_LORESERVE value is needed it should put those into Shdr[0].sh_{size,link}. But, sh_{link,info} are 32-bit fields which can contain any section index. Note, as simple-object-elf.c mentions, binutils from 2.12 to 2.18 (so before 2011) used to mishandle the > 63.75K sections case and assumed there is a hole in between the sections, but what simple_object_elf_copy_lto_debug_sections does wouldn't help in that case for the debug temp object creation, we'd need to detect the case also in that routine and take it into account in the remapping etc. I think it is not worth it given that it is over 10 years, if somebody needs 63.75K or more sections, better use more recent binutils. 2022-02-22 Jakub Jelinek <jakub@redhat.com> PR lto/104617 * simple-object-elf.c (simple_object_elf_match): Fix up URL in comment. (simple_object_elf_copy_lto_debug_sections): Remap sh_info and sh_link even if they are in the SHN_LORESERVE .. SHN_HIRESERVE range (inclusive). (cherry picked from commit 2f59f06)
jpf91
pushed a commit
that referenced
this pull request
Mar 22, 2024
Consider constexpr int VAL = 1; struct foo { template <int B> void bar(typename std::conditional<B==VAL, int, float>::type arg) { } }; template void foo::bar<1>(int arg); where we since r11-291 fail to emit the code for the explicit instantiation. That's because cp_walk_subtrees/TYPENAME_TYPE now walks TYPE_CONTEXT ('conditional' here) as well, and in a template finds the B==VAL template argument. VAL is constexpr, which implies const, which in the global scope implies static. constrain_visibility_for_template then makes "struct conditional<(B == VAL), int, float>" non-TREE_PUBLIC. Then symtab_node::needed_p checks TREE_PUBLIC, sees it's 0, and we don't emit any code. I thought the fix would be some ODR-esque check to not consider constexpr variables/fns that are used just for their value. But it turned out to be tricky. For instance, we can't skip determine_visibility in a template; we can't even skip it for value-dep expressions. For example, no-linkage-expr1.C has using P = struct {}*; template <int N> void f(int(*)[((P)0, N)]) {} where ((P)0, N) is value-dep, but N is not relevant here: we have to ferret out the anonymous type. When instantiating, it's already gone. This patch uses decl_constant_var_p. This is to implement (an approximation) [basic.def.odr]#14.5.1 and [basic.def.odr]#5.2. PR c++/110323 gcc/cp/ChangeLog: * decl2.cc (min_vis_expr_r) <case VAR_DECL>: Do nothing for decl_constant_var_p VAR_DECLs. gcc/testsuite/ChangeLog: * g++.dg/template/explicit-instantiation6.C: New test. * g++.dg/template/explicit-instantiation7.C: New test.
jpf91
pushed a commit
that referenced
this pull request
Sep 8, 2024
…o_debug_section [PR116614] cat abc.C #define A(n) struct T##n {} t##n; #define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9) #define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9) #define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9) #define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) D(n##8) D(n##9) E(1) E(2) E(3) int main () { return 0; } ./xg++ -B ./ -o abc{.o,.C} -flto -flto-partition=1to1 -O2 -g -fdebug-types-section -c ./xgcc -B ./ -o abc{,.o} -flto -flto-partition=1to1 -O2 (not included in testsuite as it takes a while to compile) FAILs with lto-wrapper: fatal error: Too many copied sections: Operation not supported compilation terminated. /usr/bin/ld: error: lto-wrapper failed collect2: error: ld returned 1 exit status The following patch fixes that. Most of the 64K+ section support for reading and writing was already there years ago (and especially reading used quite often already) and a further bug fixed in it in the PR104617 fix. Yet, the fix isn't solely about removing the if (new_i - 1 >= SHN_LORESERVE) { *err = ENOTSUP; return "Too many copied sections"; } 5 lines, the missing part was that the function only handled reading of the .symtab_shndx section but not copying/updating of it. If the result has less than 64K-epsilon sections, that actually wasn't needed, but e.g. with -fdebug-types-section one can exceed that pretty easily (reported to us on WebKitGtk build on ppc64le). Updating the section is slightly more complicated, because it basically needs to be done in lock step with updating the .symtab section, if one doesn't need to use SHN_XINDEX in there, the section should (or should be updated to) contain SHN_UNDEF entry, otherwise needs to have whatever would be overwise stored but couldn't fit. But repeating due to that all the symtab decisions what to discard and how to rewrite it would be ugly. So, the patch instead emits the .symtab_shndx section (or sections) last and prepares the content during the .symtab processing and in a second pass when going just through .symtab_shndx sections just uses the saved content. 2024-09-07 Jakub Jelinek <jakub@redhat.com> PR lto/116614 * simple-object-elf.c (SHN_COMMON): Align comment with neighbouring comments. (SHN_HIRESERVE): Use uppercase hex digits instead of lowercase for consistency. (simple_object_elf_find_sections): Formatting fixes. (simple_object_elf_fetch_attributes): Likewise. (simple_object_elf_attributes_merge): Likewise. (simple_object_elf_start_write): Likewise. (simple_object_elf_write_ehdr): Likewise. (simple_object_elf_write_shdr): Likewise. (simple_object_elf_write_to_file): Likewise. (simple_object_elf_copy_lto_debug_section): Likewise. Don't fail for new_i - 1 >= SHN_LORESERVE, instead arrange in that case to copy over .symtab_shndx sections, though emit those last and compute their section content when processing associated .symtab sections. Handle simple_object_internal_read failure even in the .symtab_shndx reading case.
jpf91
pushed a commit
that referenced
this pull request
Sep 20, 2024
…o_debug_section [PR116614] cat abc.C #define A(n) struct T##n {} t##n; #define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9) #define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9) #define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9) #define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) D(n##8) D(n##9) E(1) E(2) E(3) int main () { return 0; } ./xg++ -B ./ -o abc{.o,.C} -flto -flto-partition=1to1 -O2 -g -fdebug-types-section -c ./xgcc -B ./ -o abc{,.o} -flto -flto-partition=1to1 -O2 (not included in testsuite as it takes a while to compile) FAILs with lto-wrapper: fatal error: Too many copied sections: Operation not supported compilation terminated. /usr/bin/ld: error: lto-wrapper failed collect2: error: ld returned 1 exit status The following patch fixes that. Most of the 64K+ section support for reading and writing was already there years ago (and especially reading used quite often already) and a further bug fixed in it in the PR104617 fix. Yet, the fix isn't solely about removing the if (new_i - 1 >= SHN_LORESERVE) { *err = ENOTSUP; return "Too many copied sections"; } 5 lines, the missing part was that the function only handled reading of the .symtab_shndx section but not copying/updating of it. If the result has less than 64K-epsilon sections, that actually wasn't needed, but e.g. with -fdebug-types-section one can exceed that pretty easily (reported to us on WebKitGtk build on ppc64le). Updating the section is slightly more complicated, because it basically needs to be done in lock step with updating the .symtab section, if one doesn't need to use SHN_XINDEX in there, the section should (or should be updated to) contain SHN_UNDEF entry, otherwise needs to have whatever would be overwise stored but couldn't fit. But repeating due to that all the symtab decisions what to discard and how to rewrite it would be ugly. So, the patch instead emits the .symtab_shndx section (or sections) last and prepares the content during the .symtab processing and in a second pass when going just through .symtab_shndx sections just uses the saved content. 2024-09-07 Jakub Jelinek <jakub@redhat.com> PR lto/116614 * simple-object-elf.c (SHN_COMMON): Align comment with neighbouring comments. (SHN_HIRESERVE): Use uppercase hex digits instead of lowercase for consistency. (simple_object_elf_find_sections): Formatting fixes. (simple_object_elf_fetch_attributes): Likewise. (simple_object_elf_attributes_merge): Likewise. (simple_object_elf_start_write): Likewise. (simple_object_elf_write_ehdr): Likewise. (simple_object_elf_write_shdr): Likewise. (simple_object_elf_write_to_file): Likewise. (simple_object_elf_copy_lto_debug_section): Likewise. Don't fail for new_i - 1 >= SHN_LORESERVE, instead arrange in that case to copy over .symtab_shndx sections, though emit those last and compute their section content when processing associated .symtab sections. Handle simple_object_internal_read failure even in the .symtab_shndx reading case. (cherry picked from commit bb8dd09)
jpf91
pushed a commit
that referenced
this pull request
Oct 25, 2024
This patch folds svindex with constant arguments into a vector series. We implemented this in svindex_impl::fold using the function build_vec_series. For example, svuint64_t f1 () { return svindex_u642 (10, 3); } compiled with -O2 -march=armv8.2-a+sve, is folded to {10, 13, 16, ...} in the gimple pass lower. This optimization benefits cases where svindex is used in combination with other gimple-level optimizations. For example, svuint64_t f2 () { return svmul_x (svptrue_b64 (), svindex_u64 (10, 3), 5); } has previously been compiled to f2: index z0.d, #10, #3 mul z0.d, z0.d, #5 ret Now, it is compiled to f2: mov x0, 50 index z0.d, x0, #15 ret We added test cases checking - the application of the transform during gimple for constant arguments, - the interaction with another gimple-level optimization. The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression. OK for mainline? Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com> gcc/ * config/aarch64/aarch64-sve-builtins-base.cc (svindex_impl::fold): Add constant folding. gcc/testsuite/ * gcc.target/aarch64/sve/index_const_fold.c: New test.
jpf91
pushed a commit
that referenced
this pull request
Nov 5, 2024
We can make use of the integrated rotate step of the XAR instruction to implement most vector integer rotates, as long we zero out one of the input registers for it. This allows for a lower-latency sequence than the fallback SHL+USRA, especially when we can hoist the zeroing operation away from loops and hot parts. This should be safe to do for 64-bit vectors as well even though the XAR instructions operate on 128-bit values, as the bottom 64-bit results is later accessed through the right subregs. This strategy is used whenever we have XAR instructions, the logic in aarch64_emit_opt_vec_rotate is adjusted to resort to expand_rotate_as_vec_perm only when it's expected to generate a single REV* instruction or when XAR instructions are not present. With this patch we can gerate for the input: v4si G1 (v4si r) { return (r >> 23) | (r << 9); } v8qi G2 (v8qi r) { return (r << 3) | (r >> 5); } the assembly for +sve2: G1: movi v31.4s, 0 xar z0.s, z0.s, z31.s, #23 ret G2: movi v31.4s, 0 xar z0.b, z0.b, z31.b, #5 ret instead of the current: G1: shl v31.4s, v0.4s, 9 usra v31.4s, v0.4s, 23 mov v0.16b, v31.16b ret G2: shl v31.8b, v0.8b, 3 usra v31.8b, v0.8b, 5 mov v0.8b, v31.8b ret Bootstrapped and tested on aarch64-none-linux-gnu. Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com> gcc/ * config/aarch64/aarch64.cc (aarch64_emit_opt_vec_rotate): Add generation of XAR sequences when possible. gcc/testsuite/ * gcc.target/aarch64/rotate_xar_1.c: New test.
jpf91
pushed a commit
that referenced
this pull request
Jun 15, 2025
…o_debug_section [PR116614] cat abc.C #define A(n) struct T##n {} t##n; #define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9) #define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9) #define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9) #define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) D(n##8) D(n##9) E(1) E(2) E(3) int main () { return 0; } ./xg++ -B ./ -o abc{.o,.C} -flto -flto-partition=1to1 -O2 -g -fdebug-types-section -c ./xgcc -B ./ -o abc{,.o} -flto -flto-partition=1to1 -O2 (not included in testsuite as it takes a while to compile) FAILs with lto-wrapper: fatal error: Too many copied sections: Operation not supported compilation terminated. /usr/bin/ld: error: lto-wrapper failed collect2: error: ld returned 1 exit status The following patch fixes that. Most of the 64K+ section support for reading and writing was already there years ago (and especially reading used quite often already) and a further bug fixed in it in the PR104617 fix. Yet, the fix isn't solely about removing the if (new_i - 1 >= SHN_LORESERVE) { *err = ENOTSUP; return "Too many copied sections"; } 5 lines, the missing part was that the function only handled reading of the .symtab_shndx section but not copying/updating of it. If the result has less than 64K-epsilon sections, that actually wasn't needed, but e.g. with -fdebug-types-section one can exceed that pretty easily (reported to us on WebKitGtk build on ppc64le). Updating the section is slightly more complicated, because it basically needs to be done in lock step with updating the .symtab section, if one doesn't need to use SHN_XINDEX in there, the section should (or should be updated to) contain SHN_UNDEF entry, otherwise needs to have whatever would be overwise stored but couldn't fit. But repeating due to that all the symtab decisions what to discard and how to rewrite it would be ugly. So, the patch instead emits the .symtab_shndx section (or sections) last and prepares the content during the .symtab processing and in a second pass when going just through .symtab_shndx sections just uses the saved content. 2024-09-07 Jakub Jelinek <jakub@redhat.com> PR lto/116614 * simple-object-elf.c (SHN_COMMON): Align comment with neighbouring comments. (SHN_HIRESERVE): Use uppercase hex digits instead of lowercase for consistency. (simple_object_elf_find_sections): Formatting fixes. (simple_object_elf_fetch_attributes): Likewise. (simple_object_elf_attributes_merge): Likewise. (simple_object_elf_start_write): Likewise. (simple_object_elf_write_ehdr): Likewise. (simple_object_elf_write_shdr): Likewise. (simple_object_elf_write_to_file): Likewise. (simple_object_elf_copy_lto_debug_section): Likewise. Don't fail for new_i - 1 >= SHN_LORESERVE, instead arrange in that case to copy over .symtab_shndx sections, though emit those last and compute their section content when processing associated .symtab sections. Handle simple_object_internal_read failure even in the .symtab_shndx reading case. (cherry picked from commit bb8dd09)
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.