SVE intrinsics: Add constant folding for svindex.
This patch folds svindex with constant arguments into a vector series.
We implemented this in svindex_impl::fold using the function build_vec_series.
For example,
svuint64_t f1 ()
{
  return svindex_u64 (10, 3);
}
compiled with -O2 -march=armv8.2-a+sve, is folded to {10, 13, 16, ...}
in the gimple 'lower' pass.
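With -fdump-tree-optimized, the folded series appears as an ordinary vector
constant in the dump; the statement below is the pattern the new test scans
for (shown here for illustration):
  return { 10, 13, 16, ... };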
This optimization benefits cases where svindex is used in combination with
other gimple-level optimizations.
For example,
svuint64_t f2 ()
{
  return svmul_x (svptrue_b64 (), svindex_u64 (10, 3), 5);
}
was previously compiled to
f2:
        index   z0.d, #10, #3
        mul     z0.d, z0.d, #5
        ret
Now, it is compiled to
f2:
        mov     x0, 50
        index   z0.d, x0, #15
        ret
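The two forms are equivalent lane by lane: lane i of svindex_u64 (10, 3) is
10 + 3*i, and 5 * (10 + 3*i) = 50 + 15*i, which is lane i of an index with
base 50 and step 15. A minimal scalar check of this identity (illustrative
only, not part of the patch):

#include <stdint.h>
#include <assert.h>

int main (void)
{
  /* Lane i of svindex (10, 3) scaled by 5 must equal lane i of
     svindex (50, 15).  */
  for (uint64_t i = 0; i < 16; i++)
    assert ((10 + 3 * i) * 5 == 50 + 15 * i);
  return 0;
}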

We added test cases checking
- the application of the transform at the gimple level for constant arguments,
- the interaction with another gimple-level optimization.

The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>

gcc/
	* config/aarch64/aarch64-sve-builtins-base.cc
	(svindex_impl::fold): Add constant folding.

gcc/testsuite/
	* gcc.target/aarch64/sve/index_const_fold.c: New test.
Jennifer Schmitz committed Oct 24, 2024
1 parent 078f7c4 commit 90e38c4
Showing 2 changed files with 49 additions and 0 deletions.
14 changes: 14 additions & 0 deletions gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -1301,6 +1301,20 @@ class svdup_neonq_impl : public function_base

class svindex_impl : public function_base
{
public:
  gimple *
  fold (gimple_folder &f) const override
  {
    /* Apply constant folding if base and step are integer constants.  */
    tree vec_type = TREE_TYPE (f.lhs);
    tree base = gimple_call_arg (f.call, 0);
    tree step = gimple_call_arg (f.call, 1);
    if (TREE_CODE (base) != INTEGER_CST || TREE_CODE (step) != INTEGER_CST)
      return NULL;
    return gimple_build_assign (f.lhs,
                                build_vec_series (vec_type, base, step));
  }

public:
  rtx
  expand (function_expander &e) const override
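A note on the mechanics: build_vec_series (in gcc/tree.cc) returns a constant
when both base and step are constant, and a VEC_SERIES_EXPR otherwise. Since
the fold above only fires for INTEGER_CST arguments, the result here is a
stepped VECTOR_CST, i.e. an ordinary vector constant that later gimple passes
can fold further, which is what enables the svmul fold in f2.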
35 changes: 35 additions & 0 deletions gcc/testsuite/gcc.target/aarch64/sve/index_const_fold.c
@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-optimized" } */

#include <arm_sve.h>
#include <stdint.h>

#define INDEX_CONST(TYPE, TY)             \
  sv##TYPE f_##TY##_index_const ()        \
  {                                       \
    return svindex_##TY (10, 3);          \
  }

#define MULT_INDEX(TYPE, TY)              \
  sv##TYPE f_##TY##_mult_index ()         \
  {                                       \
    return svmul_x (svptrue_b8 (),        \
                    svindex_##TY (10, 3), \
                    5);                   \
  }

#define ALL_TESTS(TYPE, TY)               \
  INDEX_CONST (TYPE, TY)                  \
  MULT_INDEX (TYPE, TY)

ALL_TESTS (uint8_t, u8)
ALL_TESTS (uint16_t, u16)
ALL_TESTS (uint32_t, u32)
ALL_TESTS (uint64_t, u64)
ALL_TESTS (int8_t, s8)
ALL_TESTS (int16_t, s16)
ALL_TESTS (int32_t, s32)
ALL_TESTS (int64_t, s64)

/* { dg-final { scan-tree-dump-times "return \\{ 10, 13, 16, ... \\}" 8 "optimized" } } */
/* { dg-final { scan-tree-dump-times "return \\{ 50, 65, 80, ... \\}" 8 "optimized" } } */
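For reference, the new test should be runnable on its own from the gcc build
directory with something like:
make check-gcc RUNTESTFLAGS="aarch64-sve.exp=index_const_fold.c"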
