Vector library cleanup #473

Merged
merged 10 commits into from
Jul 17, 2024

Conversation

solidpixel
Contributor

@solidpixel solidpixel commented Jun 7, 2024

The astcenc vector library effectively implements two different class APIs:

  • An explicit 4-wide API which is used via explicit 4-wide types (e.g. vfloat4) in the codec.
  • A vector-length agnostic (VLA) API, which is used via N-wide types (e.g. vfloat) in the codec, and where the width is resolved at compile time.
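
As a rough illustration of the second API style, the sketch below shows one way a width-agnostic type can be resolved at compile time. The `vfloat_n` template, the `SKETCH_WIDE_VECTORS` switch, the `SKETCH_SIMD_WIDTH` constant, and `sum_all()` are hypothetical names invented for this example; they are stand-ins and not the library's actual classes or macros.

```cpp
#include <cstddef>

// Hypothetical fixed-width vector; a stand-in, not the real astcenc class.
template <std::size_t N>
struct vfloat_n
{
    float m[N];
};

using vfloat4 = vfloat_n<4>;  // explicit 4-wide type, named directly by callers
using vfloat8 = vfloat_n<8>;  // wider type, only reached through the alias below

// Hypothetical build switch: the width is fixed when the code is compiled.
#if defined(SKETCH_WIDE_VECTORS)
constexpr std::size_t SKETCH_SIMD_WIDTH = 8;
using vfloat = vfloat8;
#else
constexpr std::size_t SKETCH_SIMD_WIDTH = 4;
using vfloat = vfloat4;
#endif

// Width-agnostic caller: it uses only SKETCH_SIMD_WIDTH, never a literal lane count.
inline float sum_all(const float* data, std::size_t count)
{
    float total = 0.0f;
    std::size_t i = 0;

    // Full-vector iterations
    for (; i + SKETCH_SIMD_WIDTH <= count; i += SKETCH_SIMD_WIDTH)
    {
        vfloat v;
        for (std::size_t l = 0; l < SKETCH_SIMD_WIDTH; l++)
        {
            v.m[l] = data[i + l];
        }

        for (std::size_t l = 0; l < SKETCH_SIMD_WIDTH; l++)
        {
            total += v.m[l];
        }
    }

    // Scalar tail for any leftover elements
    for (; i < count; i++)
    {
        total += data[i];
    }

    return total;
}
```

Code written against `vfloat` and `SKETCH_SIMD_WIDTH` compiles unchanged for either width, whereas code that names `vfloat4` directly is tied to four lanes.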

For historical reasons, the classes that are only used as VLA classes (e.g. vfloat8 for AVX2) implement a lot of functionality that was inherited from the original 4-wide implementation and is not actually used in the VLA parts of the codec. This makes adding a new VLA implementation (e.g. Arm SVE) more expensive than it needs to be.

This PR doesn't add SVE support, but does some cleanup to minimize the vector library API as a precursor to doing so. The main changes are:

  • Remove VLA indexable lane read functions, as with true VLA code the lane count isn't known.
  • Replace VLA use of .lane<0>() with dedicated scalar function returns, e.g. use hmax_s() rather than hmax().lane<0>(). This was already being done in places, but not consistently; the pattern is now used everywhere.
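
A minimal sketch of the scalar-return pattern described in the list above. The `vfloat_sketch` type and `SKETCH_WIDTH` constant are hypothetical stand-ins using plain loops, not the library's SIMD classes; only the `hmax()` and `hmax_s()` names come from the description.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical lane count and vector type; stand-ins, not the real astcenc class.
constexpr std::size_t SKETCH_WIDTH = 4;

struct vfloat_sketch
{
    float m[SKETCH_WIDTH];
};

// Horizontal max that broadcasts the result to every lane.
inline vfloat_sketch hmax(const vfloat_sketch& a)
{
    float best = a.m[0];
    for (std::size_t i = 1; i < SKETCH_WIDTH; i++)
    {
        best = std::max(best, a.m[i]);
    }

    vfloat_sketch r;
    for (std::size_t i = 0; i < SKETCH_WIDTH; i++)
    {
        r.m[i] = best;
    }

    return r;
}

// Scalar variant: the caller gets a float back and never names a lane index.
inline float hmax_s(const vfloat_sketch& a)
{
    return hmax(a).m[0];
}
```

With a helper like `hmax_s()`, calling code contains no lane index at all, so the vector type does not need to expose an indexable lane read for this use.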

@solidpixel solidpixel marked this pull request as draft June 7, 2024 20:41
@solidpixel solidpixel self-assigned this Jun 7, 2024
@solidpixel solidpixel added this to the 5.0.0 milestone Jun 7, 2024
@solidpixel solidpixel modified the milestones: 5.0.0, 4.9.0 Jun 7, 2024
@solidpixel solidpixel changed the title from "VLA vector library cleanup" to "Vector library cleanup" Jul 2, 2024
@solidpixel solidpixel marked this pull request as ready for review July 2, 2024 08:01
@@ -1307,7 +1307,7 @@ unsigned int compute_ideal_endpoint_formats(
 vmask lanes_min_error = vbest_ep_error == hmin(vbest_ep_error);
 vbest_error_index = select(vint(0x7FFFFFFF), vbest_error_index, lanes_min_error);
 vbest_error_index = hmin(vbest_error_index);
-int best_error_index = vbest_error_index.lane<0>();
+int best_error_index = vbest_error_index.lane0();

You squished out the other lane0 accesses into hmax_s etc... any reason not to wrap this one?

Never mind, you fixed it later.

@@ -169,8 +203,8 @@ TEST(vfloat, ChangeSign)
 /** @brief Test VLA atan. */
 TEST(vfloat, Atan)
 {
-vfloat a(-0.15f, 0.0f, 0.9f, 2.1f);
-vfloat r = atan(a);
+vfloa4 a(-0.15f, 0.0f, 0.9f, 2.1f);

vfloa4

Never mind, you fixed it later.

@bengaineyarm bengaineyarm left a comment

LGTM

@solidpixel solidpixel merged commit 69bc17b into main Jul 17, 2024
4 checks passed
@solidpixel solidpixel deleted the vla_cleanup branch July 17, 2024 14:42