Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[upstream] [2024-02-09] Prepare commits #41

Open
wants to merge 79 commits into
base: gcc-patch-dev
Choose a base branch
from

Conversation

gerris-rs
Copy link
Collaborator

@gerris-rs gerris-rs commented Feb 9, 2024

This pull-request aims to help upstreaming commits to the GCC repository by formatting them and checking that they can be cherry-picked/rebased properly.

The last commit upstreamed was:

gccrs: Fix macro parsing for trait items.

The list of commits prepared is as follows:

Commit Build Test
da5e5ee
1d8f5d5

🐙

jakubjelinek and others added 30 commits February 6, 2024 12:57
On the following testcase we emit invalid stmt:
error: type mismatch in ‘widen_mult_plus_expr’
    6 | foo (int c, int b)
      | ^~~
unsigned long
int
unsigned int
unsigned long
_31 = WIDEN_MULT_PLUS_EXPR <b_5(D), 2, _30>;

The recent PR113560 r14-8680 changes tweaked convert_mult_to_widen,
but didn't change convert_plusminus_to_widen for the
TREE_TYPE (rhsN) != typeN cases, but looking at this, it was already
before that change quite weird.

Earlier in those functions it determines actual_precision and from_unsignedN
and wants to use that precision and signedness for the operands and
it used build_and_insert_cast for that (which emits a cast stmt, even for
INTEGER_CSTs) and later on for INTEGER_CST arguments fold_converted them
to typeN (which is unclear to me why, because it seems to have assumed
that TREE_TYPE (rhsN) is typeN, for the actual_precision or from_unsignedN
cases it would be wrong except that build_and_insert_cast forced a SSA_NAME
and so it doesn't trigger anymore).
Now, since r14-8680 it is possible that rhsN also has some other type from
typeN and we again want to cast.

The following patch changes this, so that for the differences in
actual_precision and/or from_unsignedN we actually update typeN and then use
it as the type to convert the arguments to if it isn't useless, for
INTEGER_CSTs by just fold_converting, otherwise using build_and_insert_cast.
And uses useless_type_conversion_p test so that we don't convert unless
necessary.  Plus by doing that effectively also doing the important part of
the r14-8680 convert_mult_to_widen changes in convert_plusminus_to_widen.

2024-02-06  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/113759
	* tree-ssa-math-opts.cc (convert_mult_to_widen): If actual_precision
	or from_unsignedN differs from properties of typeN, update typeN
	to build_nonstandard_integer_type.  If TREE_TYPE (rhsN) is not
	uselessly convertible to typeN, convert it using fold_convert or
	build_and_insert_cast depending on if rhsN is INTEGER_CST or not.
	(convert_plusminus_to_widen): Likewise.

	* gcc.c-torture/compile/pr113759.c: New test.
…PR113736]

As discussed in the PR, e.g. build_fold_addr_expr needs TYPE_ADDR_SPACE
on the outermost reference rather than just on the base, so the
following patch makes sure to propagate the address space from
the accessed var to the MEM_REFs and/or VIEW_CONVERT_EXPRs used to
access those.

2024-02-06  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/113736
	* gimple-lower-bitint.cc (bitint_large_huge::limb_access): Use
	var's address space for MEM_REF or VIEW_CONVERT_EXPRs.

	* gcc.dg/bitint-86.c: New test.
The UB on the following testcase isn't diagnosed by -fsanitize=address,
because we see that the array has a single element and optimize the
strlen to 0.  I think it is fine to assume e.g. for range purposes the
lower bound for the strlen as long as we don't try to optimize
strlen (str)
where we know that it returns [26, 42] to
26 + strlen (str + 26), but for the upper bound we really want to punt
on optimizing that for -fsanitize=address to read all the bytes of the
string and diagnose if we run to object end etc.

2024-02-06  Jakub Jelinek  <jakub@redhat.com>

	PR sanitizer/110676
	* gimple-fold.cc (gimple_fold_builtin_strlen): For -fsanitize=address
	reset maxlen to sizetype maximum.

	* gcc.dg/asan/pr110676.c: New test.
This patch fixes issue reported by Jeff.

Testing is running. Ok for trunk if I passed the testing with no regression ?

gcc/ChangeLog:

	* config/riscv/riscv-vsetvl.cc (pre_vsetvl::emit_vsetvl): Fix inifinite compilation.
	(pre_vsetvl::remove_vsetvl_pre_insns): Ditto.
gcc/cp/ChangeLog:

	* method.cc (early_check_defaulted_comparison): Add
	auto_diagnostic_group.
std::pair ctor used in tiles constexpr variable is only constexpr in C++14
and later, it works with libstdc++ because it is marked constexpr there even
in C++11 mode.

The following patch fixes it by using an unnamed local class instead of
std::pair, and additionally changes the first element from unsigned int to
unsigned char because 0xff has to fit into unsigned char on all hosts.

2024-02-06  Jakub Jelinek  <jakub@redhat.com>

	PR target/113763
	* config/aarch64/aarch64.cc (aarch64_output_sme_zero_za): Change tiles
	element from std::pair<unsigned int, char> to an unnamed struct.
	Adjust uses of tile range variable.
It would be neater if the middle end for target_clones used a target
hook for version name mangling, so we only do version name mangling
once.  However, that would require more intrusive refactoring that will
have to wait till Stage 1.

gcc/ChangeLog:

	* config/aarch64/aarch64.cc (aarch64_mangle_decl_assembler_name):
	Move before new caller, and add ".default" suffix.
	(get_suffixed_assembler_name): New.
	(make_resolver_func): Use get_suffixed_assembler_name.
	(aarch64_generate_version_dispatcher_body): Redo name mangling.

gcc/testsuite/ChangeLog:

	* g++.target/aarch64/mv-symbols1.C: New test.
	* g++.target/aarch64/mv-symbols2.C: Ditto.
	* g++.target/aarch64/mv-symbols3.C: Ditto.
	* g++.target/aarch64/mv-symbols4.C: Ditto.
	* g++.target/aarch64/mv-symbols5.C: Ditto.
	* g++.target/aarch64/mvc-symbols1.C: Ditto.
	* g++.target/aarch64/mvc-symbols2.C: Ditto.
	* g++.target/aarch64/mvc-symbols3.C: Ditto.
	* g++.target/aarch64/mvc-symbols4.C: Ditto.
I was suprised to find out that r14-8759 fixed this accepts-invalid.

clang version 17.0.6 rejects the test as well; clang version 19.0.0git
crashes for which I opened
<llvm/llvm-project#80869>.

	PR c++/94231

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/deleted17.C: New test.
If we can't find a scratch register for large model profiling, return
R10_REG.

	PR target/113689
	* config/i386/i386.cc (x86_64_select_profile_regnum): Return
	R10_REG after sorry.
…788]

The deducing this patchset added parsing of this specifier to
cp_parser_decl_specifier_seq unconditionally, but in the C++ grammar
this[opt] only appears in the parameter-declaration non-terminal, so
rather than checking in all the callers of cp_parser_decl_specifier_seq
except for cp_parser_parameter_declaration that this specifier didn't
appear I think it is far easier and closer to what the standard says
to only parse this specifier when called from
cp_parser_parameter_declaration.

2024-02-06  Jakub Jelinek  <jakub@redhat.com>

	PR c++/113788
	* parser.cc (CP_PARSER_FLAGS_PARAMETER): New enumerator.
	(cp_parser_decl_specifier_seq): Parse RID_THIS only if
	CP_PARSER_FLAGS_PARAMETER is set in flags.
	(cp_parser_parameter_declaration): Or in CP_PARSER_FLAGS_PARAMETER
	when calling cp_parser_decl_specifier_seq.

	* g++.dg/parse/pr113788.C: New test.
There is one corn case when similar as below example:

void test (void)
{
  __riscv_vfredosum_tu ();
}

It will meet ICE because of the implement details of overloaded function
in gcc.  According to the rvv intrinisc doc, we have no such overloaded
function with empty args.  Unfortunately, we register the empty args
function as overloaded for avoiding conflict.  Thus, there will be actual
one register function after return NULL_TREE back to the middle-end,
and finally result in ICE when expanding.  For example:

1. First we registered void __riscv_vfredmax () as the overloaded function.
2. Then resolve_overloaded_builtin (this func) return NULL_TREE.
3. The functions register in step 1 bypass the args check as empty args.
4. Finally, fall into expand_builtin with empty args and meet ICE.

Here we report error when overloaded function with empty args.  For example:

test.c: In function 'foo':
test.c:8:3: error: no matching function call to '__riscv_vfredosum_tu' with empty args
    8 |   __riscv_vfredosum_tu();
      |   ^~~~~~~~~~~~~~~~~~~~

Below test are passed for this patch.

* The riscv regression tests.

	PR target/113766

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (resolve_overloaded_builtin): Adjust
	the signature of func.
	* config/riscv/riscv-c.cc (riscv_resolve_overloaded_builtin): Ditto.
	* config/riscv/riscv-vector-builtins.cc (resolve_overloaded_builtin): Make
	overloaded func with empty args error.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr113766-1.c: New test.
	* gcc.target/riscv/rvv/base/pr113766-2.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
As the following testcases show, the overflow (needs_overflow) and high
handling in wi::mul_internal seem to only work correctly for either
small precisions (less than or equal to 32, that is handled by earlier
simpler code, not the full Knuth's multiplication algorithm) or for
precisions which are multiple of HOST_BITS_PER_WIDE_INT, so it happened
to work fine in most pre-_BitInt era types (and for large bitfields we
probably didn't have good coverage or were lucky and nothing was asking
if there was overflow or not; I think high multiplication is typically
used only when we have optab in corresponding mode).
E.g. on the gcc.dg/bitint-87.c testcase, there were overflow warnings
emitted only the the / 2wb * 3wb _BitInt(128) cases, but not in the
other precisions.

I found 3 issues when prec > HOST_BITS_PER_HALF_WIDE_INT and
(prec % HOST_BITS_PER_WIDE_INT) != 0:
1) the checking for overflow was simply checking the values of the
   r array members from half_blocks_needed to 2 * half_blocks_needed - 1,
   for UNSIGNED overflow checking if any of them is non-zero, for
   SIGNED comparing them if any is different from top where top is computed
   from the sign bit of the result (see below); similarly, the highpart
   multiplication simply picks the half limbs at r + half_blocks_needed
   offset; and furthermore, for SIGNED there is an adjustment if either
   operand was negative which also just walks r half-limbs from
   half_blocks_needed onwards;
   this works great for precisions like 64 or 128, but for precisions like
   129, 159, 160 or 161 doesn't, it would need to walk the bits in the
   half-limbs starting right above the most significant bit of the base
   precision; that can be up to a whole half-limb and all but one bit from
   the one below it earlier
2) as the comment says, the multiplication is originally done as unsigned
   multiplication, with adjustment of the high bits which subtracts the
   other operand once:
      if (wi::neg_p (op1))
        {
          b = 0;
          for (i = 0; i < half_blocks_needed; i++)
            {
              t = (unsigned HOST_WIDE_INT)r[i + half_blocks_needed]
                - (unsigned HOST_WIDE_INT)v[i] - b;
              r[i + half_blocks_needed] = t & HALF_INT_MASK;
              b = t >> (HOST_BITS_PER_WIDE_INT - 1);
            }
        }
   and similarly for the other one.  Now, this also only works nicely if
   a negative operand has just a single sign bit set in the given precision;
   but we were unpacking the operands with wi_unpack (..., SIGNED);, so
   say for the negative operand in 129-bit precision, that means the least
   significant bit of u[half_blocks_needed - 2] (or v instead of u depending
   on which argument it is) is the set sign bit, but then there are 31
   further copies of the sign bit in u[half_blocks_needed - 2] and
   further 32 copies in u[half_blocks_needed - 1]; the above adjustment
   for signed operands doesn't really do the right job in such cases, it
   would need to subtract many more times the other operand
3) the computation of top for SIGNED
          top = r[(half_blocks_needed) - 1];
          top = SIGN_MASK (top << (HOST_BITS_PER_WIDE_INT / 2));
          top &= mask;
   also uses the most significant bit which fits into prec of the result
   only if prec is multiple of HOST_BITS_PER_WIDE_INT, otherwise we need
   to look at a different bit and sometimes it can be also a bit in
   r[half_blocks_needed - 2]

For 1), while for UNSIGNED overflow it could be fairly easy to check
the bits above prec in r half-limbs for being non-zero, doing all the
shifts also in the SIGNED adjustment code in 2 further locations and finally
for the high handling (unless we want to assert one doesn't do the highpart
multiply for such precisions) would be quite ugly and hard to maintain, so
I instead chose (implemented in the second hunk) to shift the
beyond-precision bits up such that the expectations of the rest of the
code are met, that is the LSB of r[half_blocks_needed] after adjustment
is the bit immediately above the precision, etc.  We don't need to care
about the bits it shifts out, because the multiplication will yield at most
2 * prec bits.

For 2), the patch changes the wi_unpack argument from SIGNED to UNSIGNED,
so that we get all zero bits above the precision.

And finally for 3) it does shifts and perhaps picks lower r half-limb so
that it uses the actual MSB of the result within prec.

2024-02-07  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/113753
	* wide-int.cc (wi::mul_internal): Unpack op1val and op2val with
	UNSIGNED rather than SIGNED.  If high or needs_overflow and prec is
	not a multiple of HOST_BITS_PER_WIDE_INT, shift left bits above prec
	so that they start with r[half_blocks_needed] lowest bit.  Fix up
	computation of top mask for SIGNED.

	* gcc.dg/torture/bitint-56.c: New test.
	* gcc.dg/bitint-87.c: New test.
gcc.dg/debug/dwarf2/inline5.c has been XPASSing on Solaris (both SPARC
and x86, 32 and 64-bit) with the native assembler since 20210429.
According to gcc-testresults postings, the same is true on AIX.

XPASS: gcc.dg/debug/dwarf2/inline5.c scan-assembler-not \\\\(DIE \\\\(0x([0-9a-f]*)\\\\) DW_TAG_lexical_block\\\\)[^#/!@;\\\\|]*[#/!@;\\\\|]+ +DW_AT.*DW_TAG_lexical_block\\\\)[^#/!@;\\\\|x]*x\\\\1[^#/!@;\\\\|]*[#/!@;\\\\|] +DW_AT_abstract_origin

This is obviously due to

commit 16683ce
Author: Alexandre Oliva <oliva@adacore.com>
Date:   Wed Apr 28 14:07:41 2021 -0300

    fix asm-not pattern in dwarf2/inline5.c

This patch thus removes the xfail.

Tested on i386-pc-solaris2.11 (as and gas), sparc-sun-solaris2.11 (as
and gas), and i686-pc-linux-gnu.

2024-02-06  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	gcc/testsuite:
	* gcc.dg/debug/dwarf2/inline5.c: Don't xfail scan-assembler-not on
	{ aix || solaris2 } && !gas.
ABSU_EXPR unary expr is special because it has a signed integer
argument and unsigned integer result (of the same precision).

The following testcase is miscompiled since ABSU_EXPR handling has
been added to range-op because it uses widest_int::from with the
result sign (i.e. UNSIGNED) rather than the operand sign (i.e. SIGNED),
so e.g. for the 32-bit int argument mask ends up 0xffffffc1 or something
similar and even when it has most significant bit in the precision set,
in widest_int (tree-ssa-ccp.cc really should stop using widest_int, but
that is I think stage1 task) it doesn't appear to be negative and so
bit_value_unop ABSU_EXPR doesn't set the resulting mask/value from
oring of the argument and its negation.

Fixed thusly, not doing that for GIMPLE_BINARY_RHS because I don't know
about a binary op that would need something similar.

2024-02-06  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/113756
	* range-op.cc (update_known_bitmask): For GIMPLE_UNARY_RHS,
	use TYPE_SIGN (lh.type ()) instead of sign for widest_int::from
	of lh_bits value and mask.

	* gcc.dg/pr113756.c: New test.
This just adds an additional runtime testcase for the fixed issue.

gcc/testsuite/ChangeLog:

	PR tree-optimization/113467
	* gcc.dg/vect/vect-early-break_110-pr113467.c: New test.
We use gsi_move_before (&stmt_gsi, &dest_gsi); to request that the new statement
be placed before any other statement.  Typically this then moves the current
pointer to be after the statement we just inserted.

However it looks like when the BB is empty, this does not happen and the CUR
pointer stays NULL.   There's a comment in the source of gsi_insert_before that
explains:

/* If CUR is NULL, we link at the end of the sequence (this case happens

This adds a default parameter to gsi_move_before to allow us to control where
the insertion happens.

gcc/ChangeLog:

	PR tree-optimization/113731
	* gimple-iterator.cc (gsi_move_before): Take new parameter for update
	method.
	* gimple-iterator.h (gsi_move_before): Default new param to
	GSI_SAME_STMT.
	* tree-vect-loop.cc (move_early_exit_stmts): Call gsi_move_before with
	GSI_NEW_STMT.

gcc/testsuite/ChangeLog:

	PR tree-optimization/113731
	* gcc.dg/vect/vect-early-break_111-pr113731.c: New test.
…l [PR113750]

The report shows that if the FE leaves a label as the first thing in the dest
BB then we ICE because we move the stores before the label.

This is easy to fix if we know that there's still only one way into the BB.
We would have already rejected the loop if there was multiple paths into the BB
however I added an additional check just for early break in case the other
constraints are relaxed later with an explanation.

After that we fix the issue just by getting the GSI after the labels and I add
a bunch of testcases for different positions the label can be added.  Only the
vect-early-break_112-pr113750.c one results in the label being kept.

gcc/ChangeLog:

	PR tree-optimization/113750
	* tree-vect-data-refs.cc (vect_analyze_early_break_dependences): Check
	for single predecessor when doing early break vect.
	* tree-vect-loop.cc (move_early_exit_stmts): Get gsi at the start but
	after labels.

gcc/testsuite/ChangeLog:

	PR tree-optimization/113750
	* gcc.dg/vect/vect-early-break_112-pr113750.c: New test.
	* gcc.dg/vect/vect-early-break_113-pr113750.c: New test.
	* gcc.dg/vect/vect-early-break_114-pr113750.c: New test.
	* gcc.dg/vect/vect-early-break_115-pr113750.c: New test.
	* gcc.dg/vect/vect-early-break_116-pr113750.c: New test.
gcc/rust/ChangeLog:

	* rust-lang.cc (run_rust_tests): Add test.
	* rust-system.h: Add <algorithm>.
	* util/make-rust-unicode.py: Output NFC_Quick_Check table.
	* util/rust-codepoint.h (struct Codepoint): Add is_supplementary
	method.
	* util/rust-unicode-data.h: Generated.
	* util/rust-unicode.cc (binary_search_sorted_array): Removed.
	(lookup_cc): Remove namespace.
	(is_alphabetic): Use std::binary_search
	(nfc_quick_check): New function.
	(nfc_normalize): Use nfc_quick_check.
	(is_nfc_qc_maybe): New function.
	(is_nfc_qc_no): New function.
	(rust_nfc_qc_test): New test.
	* util/rust-unicode.h (is_nfc_qc_no): New function.
	(is_nfc_qc_maybe): New function.
	(enum class): New enum class.
	(nfc_quick_check): New function.
	(rust_nfc_qc_test): New test.

Signed-off-by: Raiki Tamura <tamaron1203@gmail.com>
gcc/rust/ChangeLog:

	* typecheck/rust-hir-type-check.h (class Lifetime): add interned lifetime class
	* typecheck/rust-typecheck-context.cc (TypeCheckContext::TypeCheckContext): add
	resolution tool
	(TypeCheckContext::intern_lifetime): add method to intern lifetime from tyctx
	(TypeCheckContext::lookup_lifetime): add method to lookup lifetime from tyctx
	(TypeCheckContext::intern_and_insert_lifetime): add a helper method

Signed-off-by: Jakub Dupak <dev@jakubdupak.com>
gcc/rust/ChangeLog:

	* typecheck/rust-tyty-region.h: New file.

Signed-off-by: Jakub Dupak <dev@jakubdupak.com>
gcc/rust/ChangeLog:

	* hir/tree/rust-hir-item.h: Add missing getter

Signed-off-by: Jakub Dupak <dev@jakubdupak.com>
gcc/rust/ChangeLog:

	* typecheck/rust-hir-trait-resolve.cc: add regions
	* typecheck/rust-hir-type-check-base.cc (TypeCheckBase::resolve_literal):
	add regions, resolve generic lifetimes
	* typecheck/rust-hir-type-check-expr.cc (TypeCheckExpr::visit): add regions
	* typecheck/rust-hir-type-check-implitem.cc (TypeCheckTopLevelExternItem::visit):
	add regions, resolve lifetimes
	(TypeCheckImplItem::visit): add regions, resove lifetimes
	* typecheck/rust-hir-type-check-implitem.h: add default value for result
	* typecheck/rust-hir-type-check-item.cc (TypeCheckItem::visit): add regions,
	resove lifetimes
	(TypeCheckItem::resolve_impl_block_substitutions): add regions, resove lifetimes
	* typecheck/rust-hir-type-check-path.cc (TypeCheckExpr::visit): add regions,
	resove lifetimes
	(TypeCheckExpr::resolve_root_path): add regions, resove lifetimes
	(TypeCheckExpr::resolve_segments): add regions, resove lifetimes
	* typecheck/rust-hir-type-check-type.cc (TypeCheckType::visit): add regions,
	resove lifetimes
	(TypeCheckType::resolve_root_path): add regions, resove lifetimes
	(ResolveWhereClauseItem::Resolve): add regions, resove lifetimes
	(ResolveWhereClauseItem::visit): add regions, resove lifetimes
	* typecheck/rust-hir-type-check.cc (TypeCheckContext::LifetimeResolver::resolve):
	add regions, resolve lifetimes
	(TraitItemReference::get_type_from_fn): add regions, resove lifetimes
	* typecheck/rust-hir-type-check.h: add regions, resove lifetimes
	* typecheck/rust-substitution-mapper.cc (SubstMapper::SubstMapper): add regions,
	resove lifetimes
	(SubstMapper::Resolve): add regions, resove lifetimes
	(SubstMapper::InferSubst): add regions, resove lifetimes
	(SubstMapper::visit): add regions, resove lifetimes
	* typecheck/rust-substitution-mapper.h: add regions, resove lifetimes
	* typecheck/rust-typecheck-context.cc (TypeCheckContext::TypeCheckContext):
	lifetime resolution
	(TypeCheckContext::regions_from_generic_args): lifetime resolution helper
	* typecheck/rust-tyty-bounds.cc (TypeBoundPredicate::TypeBoundPredicate):
	add regions, resove lifetimes
	(TypeBoundPredicate::operator=): add regions, resove lifetimes
	(TypeBoundPredicate::apply_generic_arguments): add regions, resove lifetimes
	(TypeBoundPredicateItem::get_tyty_for_receiver): add regions, resove lifetimes
	* typecheck/rust-tyty-subst.cc (SubstitutionArgumentMappings::get_regions):
	add regions, resove lifetimes
	(SubstitutionArgumentMappings::get_mut_regions): getter
	(SubstitutionArgumentMappings::error): split error and empty
	(SubstitutionArgumentMappings::empty): split error and empty
	(SubstitutionArgumentMappings::find_symbol): helper
	(SubstitutionRef::get_num_lifetime_params): getter
	(SubstitutionRef::get_num_type_params): getter
	(SubstitutionRef::needs_substitution): extend to regions
	(SubstitutionRef::get_mappings_from_generic_args): helper
	(SubstitutionRef::infer_substitions): add regions
	* typecheck/rust-tyty-subst.h (class RegionParamList): region param handler
	* typecheck/rust-tyty.cc (BaseType::monomorphized_clone): add regions, resove lifetimes
	(InferType::default_type): add regions, resove lifetimes
	(FnType::clone): add regions, resove lifetimes
	(ReferenceType::ReferenceType): add regions
	(ReferenceType::get_region): getter
	(ReferenceType::clone): add regions
	* typecheck/rust-tyty.h: add regions, resove

Signed-off-by: Jakub Dupak <dev@jakubdupak.com>
gcc/rust/ChangeLog:

	* typecheck/rust-hir-type-check-implitem.cc (TypeCheckTopLevelExternItem::visit):
	Add region constraints.
	(TypeCheckImplItem::visit): Add region constraints.
	* typecheck/rust-hir-type-check-implitem.h: Add region constraints.
	* typecheck/rust-hir-type-check-item.cc (TypeCheckItem::ResolveImplBlockSelf):
	Add region constraints.
	(TypeCheckItem::visit): Add region constraints.
	(TypeCheckItem::resolve_impl_item): Add region constraints.
	(TypeCheckItem::resolve_impl_block_substitutions): Add region constraints.
	* typecheck/rust-hir-type-check-item.h: Add region constraints.
	* typecheck/rust-hir-type-check-type.cc (ResolveWhereClauseItem::Resolve):
	Add region constraints.
	(ResolveWhereClauseItem::visit): Add region constraints.
	* typecheck/rust-hir-type-check-type.h (class ResolveWhereClauseItem):
	Add region constraints.
	* typecheck/rust-hir-type-check.cc (TraitItemReference::get_type_from_fn):
	Add region constraints.
	* typecheck/rust-tyty-bounds.cc (TypeBoundPredicate::TypeBoundPredicate):
	Add region constraints.
	* typecheck/rust-tyty-subst.cc (SubstitutionRef::get_region_constraints):
	Add region constraints.
	* typecheck/rust-tyty-subst.h (class BaseType): Add region constraints.
	(struct RegionConstraints): Add region constraints.
	* typecheck/rust-tyty.cc (BaseType::monomorphized_clone): Add region constraints.
	(ADTType::clone): Add region constraints.
	(FnType::clone): Add region constraints.
	(ProjectionType::clone): Add region constraints.
	* typecheck/rust-tyty.h: Add region constraints.

Signed-off-by: Jakub Dupak <dev@jakubdupak.com>
gcc/rust/ChangeLog:

	* typecheck/rust-tyty.cc (BaseType::BaseType): Store orig ref.
	(BaseType::get_orig_ref): Add getter.
	* typecheck/rust-tyty.h: Store orig ref.

Signed-off-by: Jakub Dupak <dev@jakubdupak.com>
Previously, the default ABI was set to Rust, which is not correct for
extern blocks and extern functions. This patch changes the default
ABI to C for these cases.

gcc/rust/ChangeLog:

	* hir/rust-ast-lower-base.cc (ASTLoweringBase::lower_qualifiers):
	Change default ABI to C for extern functions
	(ASTLoweringBase::lower_extern_block): Likewise

Signed-off-by: Nobel Singh <nobel2073@gmail.com>
Fixes Rust-GCC#1483

gcc/testsuite/ChangeLog:

	* rust/compile/issue-1483.rs: New test.
Fixes Rust-GCC#2772

gcc/testsuite/ChangeLog:

	* rust/compile/issue-2772-1.rs: New test.
	* rust/compile/issue-2772-2.rs: New test.
Fixes Rust-GCC#2747

gcc/rust/ChangeLog:

	* typecheck/rust-tyty-subst.cc (SubstitutionRef::get_mappings_from_generic_args): fix

gcc/testsuite/ChangeLog:

	* rust/compile/issue-2747.rs: New test.
…s seg

This patch introduces one regression because generics are getting better
understood over time. The code here used to apply generics with the same
symbol from previous segments which was a bit of a hack with out limited
inference variable support. The regression looks like it will be related
to another issue which needs to default integer inference variables much
more aggresivly to default integer.

Fixes Rust-GCC#2723

gcc/rust/ChangeLog:

	* typecheck/rust-hir-type-check-path.cc (TypeCheckExpr::resolve_segments): remove hack

gcc/testsuite/ChangeLog:

	* rust/compile/issue-1773.rs: Moved to...
	* rust/compile/issue-1773.rs.bak: ...here.
	* rust/compile/issue-2723-1.rs: New test.
	* rust/compile/issue-2723-2.rs: New test.
GCC Administrator and others added 30 commits February 8, 2024 00:18
The new testcase from P2308 crashed trying to expand 'this' without an
object to refer to, because we stripped the TARGET_EXPR in
create_template_parm_object.  So let's leave it on for giving an error.

gcc/cp/ChangeLog:

	* pt.cc (create_template_parm_object): Pass TARGET_EXPR to
	cxx_constant_value.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/nontype-class64.C: New test.
…pression [PR113776]

My fix for bug 111059 and bug 111911 caused a conversion of a floating
constant to boolean to wrongly no longer be considered an integer
constant expression, because logic to insert a NOP_EXPR in
c_objc_common_truthvalue_conversion for an argument not an integer
constant expression itself now took place after rather than before the
conversion to bool.  In the specific case of casting a floating
constant to bool, the result is an integer constant expression even
though the argument isn't (build_c_cast deals with ensuring that casts
to integer type of anything of floating type more complicated than a
single floating constant don't get wrongly treated as integer constant
expressions even if they fold to constants), so fix the logic in
c_objc_common_truthvalue_conversion to handle that special case.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

	PR c/113776

gcc/c
	* c-typeck.cc (c_objc_common_truthvalue_conversion): Return an
	integer constant expression for boolean conversion of floating
	constant.

gcc/testsuite/
	* gcc.dg/pr113776-1.c, gcc.dg/pr113776-2.c, gcc.dg/pr113776-3.c,
	gcc.dg/pr113776-4.c: New tests.
There is another corn case when similar as below example:

void test (void)
{
  __riscv_vaadd ();
}

We report error when overloaded function with empty args.  For example:

test.c: In function 'foo':
test.c:8:3: error: no matching function call to '__riscv_vaadd' with empty args
    8 |   __riscv_vaadd ();
      |   ^~~~~~~~~~~~~~~~~~~~

Unfortunately, it will meet another ICE similar to below after above
message.  The underlying build function checker will have zero args
and break some assumption of the function checker.  For example, the
count of args is not less than 2.

ice.c: In function ‘foo’:
ice.c:8:3: internal compiler error: in require_immediate, at
config/riscv/riscv-vector-builtins.cc:4252
    8 |   __riscv_vaadd ();
      |   ^~~~~~~~~~~~~
0x20b36ac riscv_vector::function_checker::require_immediate(unsigned
int, long, long) const
        .../__RISC-V_BUILD__/../gcc/config/riscv/riscv-vector-builtins.cc:4252
0x20b890c riscv_vector::alu_def::check(riscv_vector::function_checker&) const
        .../__RISC-V_BUILD__/../gcc/config/riscv/riscv-vector-builtins-shapes.cc:387
0x20b38d7 riscv_vector::function_checker::check()
        .../__RISC-V_BUILD__/../gcc/config/riscv/riscv-vector-builtins.cc:4315
0x20b4876 riscv_vector::check_builtin_call(unsigned int, vec<unsigned int, va_heap, vl_ptr>,
        .../__RISC-V_BUILD__/../gcc/config/riscv/riscv-vector-builtins.cc:4605
0x2069393 riscv_check_builtin_call
        .../__RISC-V_BUILD__/../gcc/config/riscv/riscv-c.cc:227

Below test are passed for this patch.

* The riscv regression tests.

	PR target/113766

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins-shapes.cc (struct alu_def): Make
	sure the c.arg_num is >= 2 before checking.
	(struct build_frm_base): Ditto.
	(struct narrow_alu_def): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr113766-1.c: Add new cases.

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/
	* config/avr/gen-avr-mmcu-specs.cc: Rename spec cc1_misc to
	cc1_rodata_in_ram.  Rename spec link_misc to link_rodata_in_ram.
	Remove spec asm_misc.
	* config/avr/specs.h: Same.
…110-pr113467.c

I had missed a conversion from unsigned long to uint64_t.
This fixes the failing test on -m32.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/vect-early-break_110-pr113467.c: Change unsigned long *
	to uint64_t *.
I've reconsidered my last change to dr_may_alias_p and decided
it was correct before.  The following reverts that change.

	* tree-vect-data-refs.cc (vect_analyze_early_break_dependences):
	Revert last change to dr_may_alias_p.
… has it.

gcc/
	* config/avr/gen-avr-mmcu-specs.cc (print_mcu) <*cpp_mcu>: Spec always
	defines __AVR_PM_BASE_ADDRESS__ if the core has it.
…[PR113808]

There's a bug in vectorizable_live_operation that restart_loop is defined
outside the loop.

This variable is supposed to indicate whether we are doing a first or last
index reduction.  The problem is that by defining it outside the loop it becomes
dependent on the order we visit the USE/DEFs.

In the given example, the loop isn't PEELED, but we visit the early exit uses
first.  This then sets the boolean to true and it can't get to false again.

So when we visit the main exit we still treat it as an early exit for that
SSA name.

This cleans it up and renames the variables to something that's hopefully
clearer to their intention.

gcc/ChangeLog:

	PR tree-optimization/113808
	* tree-vect-loop.cc (vectorizable_live_operation): Don't cache the
	value cross iterations.

gcc/testsuite/ChangeLog:

	PR tree-optimization/113808
	* gfortran.dg/vect/vect-early-break_1-PR113808.f90: New test.
gcc/
	PR target/113824
	* config/avr/avr-mcus.def (ata5797): Move from avr5 to avr4.
	* doc/avr-mmcu.texi: Rebuild.
Rename pr to lowercase and drop lastprivate.

gcc/testsuite/ChangeLog:

	PR tree-optimization/113808
	* gfortran.dg/vect/vect-early-break_1-PR113808.f90: Moved to...
	* gfortran.dg/vect/vect-early-break_1-pr113808.f90: ...here.
1. The only supported TLS code sequence with ADD is

	addq foo@gottpoff(%rip),%reg

Change je constraint to a memory operand in APX NDD ADD pattern with
register source operand.

2. The instruction length of APX NDD instructions with immediate operand:

op imm, mem, reg

may exceed the size limit of 15 byes when non-default address space,
segment register or address size prefix are used.

Add jM constraint which is a memory operand valid for APX NDD instructions
with immediate operand and add jO constraint which is an offsetable memory
operand valid for APX NDD instructions with immediate operand.  Update
APX NDD patterns with jM and jO constraints.

gcc/

	PR target/113711
	PR target/113733
	* config/i386/constraints.md: List all constraints with j prefix.
	(j>): Change auto-dec to auto-inc in documentation.
	(je): Changed to a memory constraint with APX NDD TLS operand
	check.
	(jM): New memory constraint for APX NDD instructions.
	(jO): Likewise.
	* config/i386/i386-protos.h (x86_poff_operand_p): Removed.
	* config/i386/i386.cc (x86_poff_operand_p): Likewise.
	* config/i386/i386.md (*add<dwi>3_doubleword): Use rjO.
	(*add<mode>_1[SWI48]): Use je and jM.
	(addsi_1_zext): Use jM.
	(*addv<dwi>4_doubleword_1[DWI]): Likewise.
	(*sub<mode>_1[SWI]): Use jM.
	(@add<mode>3_cc_overflow_1[SWI]): Likewise.
	(*add<dwi>3_doubleword_cc_overflow_1): Use rjO.
	(*and<dwi>3_doubleword): Likewise.
	(*anddi_1): Use jM.
	(*andsi_1_zext): Likewise.
	(*and<mode>_1[SWI24]): Likewise.
	(*<code><dwi>3_doubleword[any_or]): Use rjO
	(*code<mode>_1[any_or SWI248]): Use jM.
	(*<code>si_1_zext[zero_extend + any_or]): Likewise.
	* config/i386/predicates.md (apx_ndd_memory_operand): New.
	(apx_ndd_add_memory_operand): Likewise.

gcc/testsuite/

	PR target/113711
	PR target/113733
	* gcc.target/i386/apx-ndd-2.c: New test.
	* gcc.target/i386/apx-ndd-base-index-1.c: Likewise.
	* gcc.target/i386/apx-ndd-no-seg-global-1.c: Likewise.
	* gcc.target/i386/apx-ndd-seg-1.c: Likewise.
	* gcc.target/i386/apx-ndd-seg-2.c: Likewise.
	* gcc.target/i386/apx-ndd-seg-3.c: Likewise.
	* gcc.target/i386/apx-ndd-seg-4.c: Likewise.
	* gcc.target/i386/apx-ndd-seg-5.c: Likewise.
	* gcc.target/i386/apx-ndd-tls-1a.c: Likewise.
	* gcc.target/i386/apx-ndd-tls-2.c: Likewise.
	* gcc.target/i386/apx-ndd-tls-3.c: Likewise.
	* gcc.target/i386/apx-ndd-tls-4.c: Likewise.
	* gcc.target/i386/apx-ndd-x32-1.c: Likewise.
Some information was (re-)computed in different places.
This patch computes them in new struct McuInfo and passes
it around in order to provide the information.

gcc/
	* config/avr/gen-avr-mmcu-specs.cc (struct McuInfo): New.
	(main, print_mcu, diagnose_mrodata_in_ram): Pass it down.
The relation oracle grows the internal vector of SSAs as needed, but
due to an oversight was not growing the basic block vector.  This
fixes the oversight.

	PR tree-optimization/113735

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/pr113735.c: New test.

gcc/ChangeLog:

	* value-relation.cc (equiv_oracle::add_equiv_to_block): Call
	limit_check().
Since template argument coercion happens relative to the most general
template (for a class template at least), during NTTP type CTAD we might
need to consider outer arguments particularly if the CTAD template is from
the current instantiation (and so depends on outer template parameters).

This patch makes do_class_deduction substitute as many levels of outer
template arguments into a CTAD template (from the current instantiation)
as it can take.

	PR c++/113649

gcc/cp/ChangeLog:

	* pt.cc (do_class_deduction): Add outer_targs parameter.
	Substitute outer arguments into the CTAD template.
	(do_auto_deduction): Pass outer_targs to do_class_deduction.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/nontype-class65.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
The v*_fp16_xN_1.c tests on Arm have been unstable since they were
added.  This is not a problem with the tests themselves, or even the
patches that were added, but with the testsuite infrastructure.  It
turned out that another set of dg- tests for fp16 were corrupting the
cached set of options used by the new tests, leading to running the
tests with incorrect flags.

So the primary goal of this patch is to fix the incorrect internal
caching of the options needed to enable fp16 alternative format on
Arm: the code was storing the result in the same variable that was
being used for neon_fp16 and this was leading to testsuite instability
for tests that were checking for neon with fp16.

But in cleaning this up I also noted that we weren't then applying the
flags correctly having detected what they were, so we also address
that.

I suspect there are still some further issues to address here, since
the framework does not correctly test that the multilibs and startup
code enable alternative format; but this is still an improvement over
what we had before.

gcc/testsuite/ChangeLog:

	* lib/target-supports.exp
	(check_effective_target_arm_fp16_alternative_ok_nocache): Use
	et_arm_fp16_alternative_flags to cache the result.  Improve test
	for FP16 availability.
	(add_options_for_arm_fp16_alternative): Use
	et_arm_fp16_alternative_flags.
	* g++.dg/ext/arm-fp16/arm-fp16-ops-3.C: Update dg-* flags.
	* g++.dg/ext/arm-fp16/arm-fp16-ops-4.C: Likewise.
	* gcc.dg/torture/arm-fp16-int-convert-alt.c: Likewise.
	* gcc.dg/torture/arm-fp16-ops-3.c: Likewise.
	* gcc.dg/torture/arm-fp16-ops-4.c: Likewise.
	* gcc.target/arm/fp16-aapcs-3.c: Likewise.
	* gcc.target/arm/fp16-aapcs-4.c: Likewise.
	* gcc.target/arm/fp16-compile-alt-1.c: Likewise.
	* gcc.target/arm/fp16-compile-alt-10.c: Likewise.
	* gcc.target/arm/fp16-compile-alt-11.c: Likewise.
	* gcc.target/arm/fp16-compile-alt-12.c: Likewise.
	* gcc.target/arm/fp16-compile-alt-2.c: Likewise.
	* gcc.target/arm/fp16-compile-alt-3.c: Likewise.
	* gcc.target/arm/fp16-compile-alt-4.c: Likewise.
	* gcc.target/arm/fp16-compile-alt-5.c: Likewise.
	* gcc.target/arm/fp16-compile-alt-6.c: Likewise.
	* gcc.target/arm/fp16-compile-alt-7.c: Likewise.
	* gcc.target/arm/fp16-compile-alt-8.c: Likewise.
	* gcc.target/arm/fp16-compile-alt-9.c: Likewise.
	* gcc.target/arm/fp16-rounding-alt-1.c: Likewise.
These non-standard extensions use GCC-specific built-ins. Use
__has_builtin to avoid errors when Clang compiles this header.

See llvm/llvm-project#24289

libstdc++-v3/ChangeLog:

	* include/tr2/type_traits (bases, direct_bases): Use
	__has_builtin to check if required built-ins are supported.
Adding rvv related flags (i.e. --param=riscv-autovec-preference) to
non vector targets bypassed the dejagnu skip test directive. Change the
target selector to skip if rvv is enabled

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/abi-1.c: change selector
	* gcc.target/riscv/rvv/base/pragma-2.c: ditto
	* gcc.target/riscv/rvv/base/pragma-3.c: ditto

Signed-off-by: Edwin Lu <ewlu@rivosinc.com>
There's no need to check for self-assignment here, it would just add
extra code for an unlikely case. Add a comment saying so.

libstdc++-v3/ChangeLog:

	PR libstdc++/100147
	* include/bits/gslice.h (operator=): Add comment about lack of
	self-assignment check.
libstdc++-v3/ChangeLog:

	* include/bits/shared_ptr_atomic.h: Fix typo in comment.
When running the testsuite for newlib nano, the --specs=nano.specs
option is used.  This option prepends cpp_unique_options with
"-isystem =/include/newlib-nano" so that the newlib nano header files
override the newlib standard ones.  As the -isystem option is prepended,
the -quiet option is no longer the first option to cc1.  Adjust the test
accordingly.

Patch has been verified on Windows and Linux.

gcc/testsuite/ChangeLog:

	* gcc.misc-tests/options.exp: Allow other options before the
	-quite option for cc1.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
…st [PR113774]

The following patch is the optimization part of PR113774, where in
handle_cast we emit some conditionals which are always true and presumably
VRP would figure that out later and clean it up, except that instead
thread1 is invoked and threads everything through the conditions, so we end
up with really ugly code which is hard to be cleaned up later and then
run into PR113831 VN bug and miscompile stuff.

handle_cast computes low and high as limb indexes, where idx < low
doesn't need any special treatment, just uses the operand's limb,
idx >= high cases all the bits in the limb are an extension (so, for
unsigned widening cast all those bits are 0, for signed widening cast
all those bits are equal to the in earlier code computed sign mask,
narrowing cast don't trigger this code) and then the idx == low && idx <
high case if it exists need special treatment (some bits are copied, others
extended, or all bits are copied but sign mask needs to be computed).

The code already attempted to optimize away some unneeded casts, in the
first hunk below e.g. for the case like 257 -> 321 bit extension, where
low is 4 and high 5 and we use a loop handling the first 4 limbs (2
iterations) with m_upwards_2limb 4 - no special handling is needed in the
loop, and the special handling is done on the first limb after the loop
and then the last limb after the loop gets the extension only, or
in the second hunk where can emit a single comparison instead of
2 e.g. for the low == high case - that must be a zero extension from
multiple of limb bits, say 192 -> 328, or for the case where we know
the idx == low case happens in the other limb processed in the loop, not
the current one.

But the testcase shows further cases where we always know some of the
comparisons can be folded to true/false, in particular there is
255 -> 257 bit zero extension, so low 3, high 4, m_upwards_2limb 4.
The loop handles 2 limbs at the time and for the first limb we were
emitting idx < low ? operand[idx] : 0; but because idx goes from 0
with step 2 2 iterations, idx < 3 is always true, so we can just
emit operand[idx].  This is handled in the first hunk.  In addition
to fixing it (that is the " - m_first" part in there) I've rewritten
it using low to make it more readable.

Similarly, in the other limb we were emitting
idx + 1 <= low ? (idx + 1 == low ? operand[idx] & 0x7ff....ff : operand[idx]) : 0
but idx + 1 <= 3 is always true in the loop, so all we should emit is
idx + 1 == low ? operand[idx] & 0x7ff....ff : operand[idx],
Unfortunately for the latter, when single_comparison is true, we emit
just one comparison, but the code which fills the branches will fill it
with the operand[idx] and 0 cases (for zero extension, for sign extension
similarly), not the operand[idx] (aka copy) and operand[idx] & 0x7ff....ff
(aka most significant limb of the narrower precision) cases.  Instead
of making the code less readable by using single_comparison for that and
handling it in the code later differently I've chosen to just emit
a condition which will be always true and let cfg cleanup clean it up.

2024-02-09  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/113774
	* gimple-lower-bitint.cc (bitint_large_huge::handle_cast): Don't
	emit any comparison if m_first and low + 1 is equal to
	m_upwards_2limb, simplify condition for that.  If not
	single_comparison, not m_first and we can prove that the idx <= low
	comparison will be always true, emit instead of idx <= low
	comparison low <= low such that cfg cleanup will optimize it at
	the end of the pass.

	* gcc.dg/torture/bitint-57.c: New test.
Due to -fnon-call-exceptions the bitint lowering adds new EH edges
in various places, so that the EH edge points from handling (e.g. load or
store) of each of the limbs.  The problem is that the EH edge destination
as shown in the testcase can have some PHIs.  If it is just a virtual
PHI, no big deal, the pass uses TODO_update_ssa_only_virtuals, but if
it has other PHIs, I think we need to copy the values from the preexisting
corresponding EH edge (which is from the original stmt to the EH pad)
to the newly added EH edge, so that the PHI arguments are the same rather
than missing (which ICEs during checking at the end of the pass).

This patch adds a function to do that and uses it whenever adding EH edges.

2024-02-09  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/113818
	* gimple-lower-bitint.cc (add_eh_edge): New function.
	(bitint_large_huge::handle_load,
	bitint_large_huge::lower_mergeable_stmt,
	bitint_large_huge::lower_muldiv_stmt): Use it.

	* gcc.dg/bitint-89.c: New test.
The asm goto expansion ICEs on the following testcase (which normally
is rejected later), because expand_asm_stmt emits the code to copy
the large var out of the out operand to its memory location into
after_rtl_seq ... after_rtl_end sequence and because it is asm goto,
it duplicates the sequence on each successor edge of the asm goto.
The problem is that with -mstringop-strategy=byte_loop that sequence
contains loops, so CODE_LABELs, JUMP_INSNs, with other strategies
could contain CALL_INSNs etc.
But the copying is done using a loop doing
emit_insn (copy_insn (PATTERN (curr)));
which does the right thing solely for INSNs, it will do the wrong thing
for JUMP_INSNs, CALL_INSNs, CODE_LABELs (with RTL checking even ICE on
them), BARRIERs and the like.

The following patch partially fixes it (with the hope that such stuff only
occurs in asms that really can't be accepted; if one uses say "=rm" or
"=g" constraint then the operand uses the memory directly and nothing is
copied) by using the
duplicate_insn_chain function which is used e.g. in RTL loop unrolling and
which can handle JUMP_INSNs, CALL_INSNs, BARRIERs etc.
As it is meant to operate on sequences inside of basic blocks, it doesn't
handle CODE_LABELs (well, it skips them), so if we need a solution that
will be correct at runtime here for those cases, we'd need to do further
work (e.g. still use duplicate_insn_chain, but if we notice any CODE_LABELs,
walk the sequence again, add copies of the CODE_LABELs and then remap
references to the old CODE_LABELs in the copied sequence to the new ones).
Because as is now, if the code in one of the sequence copies (where the
CODE_LABELs have been left out) decides to jump to such a CODE_LABEL, it
will jump to the CODE_LABEL which has been in the original sequence (which
the code emits on the last edge, after all, duplicating the sequence
EDGE_COUNT times and throwing away the original was wasteful, compared to
doing that just EDGE_COUNT - 1 times and using the original.

2024-02-09  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/113415
	* cfgexpand.cc (expand_asm_stmt): For asm goto, use
	duplicate_insn_chain to duplicate after_rtl_seq sequence instead
	of hand written loop with emit_insn of copy_insn and emit original
	after_rtl_seq on the last edge.

	* gcc.target/i386/pr113415.c: New test.
Some exports were missed from the GCC-13 cycle, these are added here
along with the bitint-related ones added in GCC-14.

libgcc/ChangeLog:

	* config/i386/libgcc-darwin.ver: Export bf and bitint-related
	synbols.
build_conflict_bit_table uses %ld format string for
(long) some_int_expression * sizeof (something)
argument, that doesn't work on LLP64 hosts because the
expression has then size_t aka unsigned long long type there.
It can be fixed with
(long) (some_int_expression * sizeof (something))
but it means the value is truncated if it doesn't fit into long.
Ideally we'd use %zd or %zu modifiers here, but it is unclear if we
can rely on it on all hosts, it has been introduced in C99 and C++11
includes C99 by reference, but in reality whether this works or not
depends on the host C library and some of them are helplessly obsolete.

This patch instead introduces new macros HOST_SIZE_T_PRINT_* which
one can use in *printf family function format strings and cast to
fmt_size_t type.

2024-02-09  Jakub Jelinek  <jakub@redhat.com>

	* hwint.h (GCC_PRISZ, fmt_size_t, HOST_SIZE_T_PRINT_DEC,
	HOST_SIZE_T_PRINT_UNSIGNED, HOST_SIZE_T_PRINT_HEX,
	HOST_SIZE_T_PRINT_HEX_PURE): Define.
	* ira-conflicts.cc (build_conflict_bit_table): Use it.  Formatting
	fixes.
…ed huge INTEGER_TYPEs [PR113783]

On the following testcases memcpy lowering folds the calls to
reading and writing of MEM_REFs with huge INTEGER_TYPEs - uint256_t
with OImode or uint512_t with XImode.  Further optimization turn
the load from MEM_REF from the large/huge _BitInt var into VIEW_CONVERT_EXPR
from it to the uint256_t/uint512_t.  The backend doesn't really
support those except for "movoi"/"movxi" insns, so it isn't possible
to handle it like casts to supportable INTEGER_TYPEs where we can
construct those from individual limbs - there are no OImode/XImode shifts
and the like we can use.
So, the following patch makes sure for such VCEs that the SSA_NAME operand
of the VCE lives in memory and then turns it into a VIEW_CONVERT_EXPR so
that we actually load the OImode/XImode integer from memory (i.e. a mov).
We need to make sure those aren't merged with other
operations in the gimple_lower_bitint hunks.
For SSA_NAMEs which have underlying VAR_DECLs that is all we need, those
VAR_DECL have ARRAY_TYPEs.
For SSA_NAMEs which have underlying PARM_DECLs or RESULT_DECLs those have
BITINT_TYPE and I had to tweak expand_expr_real_1 for that so that it
doesn't try convert_modes on those when one of the modes is BLKmode - we
want to fall through into the adjust_address on the MEM.

2024-02-09  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/113783
	* gimple-lower-bitint.cc (bitint_large_huge::lower_stmt): Look
	through VIEW_CONVERT_EXPR for final cast checks.  Handle
	VIEW_CONVERT_EXPRs from large/huge _BitInt to > MAX_FIXED_MODE_SIZE
	INTEGER_TYPEs.
	(gimple_lower_bitint): Don't merge mergeable operations or other
	casts with VIEW_CONVERT_EXPRs to > MAX_FIXED_MODE_SIZE INTEGER_TYPEs.
	* expr.cc (expand_expr_real_1): Don't use convert_modes if either
	mode is BLKmode.

	* gcc.dg/bitint-88.c: New test.
gcc/rust/ChangeLog:

	* typecheck/rust-hir-type-check-implitem.h: Fix typo in field
	(region_costraints -> region_constraints).
This adds a testcase for issue Rust-GCC#2129.

gcc/testsuite/ChangeLog:

	* rust/execute/torture/matches_macro.rs: New test.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.