New subtraction family of vector functions #364

BeeverFeever · 2023-12-04T12:40:57Z

Addresses: #363

I believe that this should be most of what is required for these functions to be implemented. I don't think I have missed anything but please take a close look at the vec4 functions to ensure they are correct as I wasn't entirely certain about what I was doing.

I wrote tests but when run the new functions do not show up. I don't know if I am just missing something or if I wrote the tests wrong. I am also unable to test on multiple platforms, I only have access to linux. However, I don't believe these changes are major enough to break anything platform specific.

I wasn't able to generate the documentation for the new functions. I am unfamiliar with sphinx and when I runt the commands from the README, I get an error:

Theme error:
no theme named 'sphinx_rtd_theme' found (missing theme.conf?)

Please inform me of anything that requires changing or adjustments, especially the tests and docs on how to get them working correctly.

Add subtraction family of functions for vectors.

gottfriedleibniz · 2023-12-04T23:31:46Z

While glancing over this, noticed a few things:

The glm_vec4_addsub documentation and/or name do not match its implementation (reference glm_vec3_addsub).
glm_vec4_mulsub's glmm_fmsub use may require a second look, i.e., glmm_fnmadd.
The assert statement in TEST_IMPL(GLM_PREFIX, vec4_subsub) may require some parenthesis (a - (b - c)).

BeeverFeever · 2023-12-04T23:57:48Z

Ok cool, I will address 1 and 3 later today. As for 2, I'm not very familiar with simd so could you expand on this a little bit please?

gottfriedleibniz · 2023-12-05T00:26:20Z

glm_vec4_mulsub is documented as: dest -= a * b. Referencing these functions (without __FMA__):

glmm_fmsub is defined as: _mm_sub_ps(_mm_mul_ps(a, b), c): a*b - c
glmm_fnmadd is defined as: _mm_sub_ps(c, _mm_mul_ps(a, b)): c - a*b

BeeverFeever · 2023-12-05T00:45:47Z

Ahh ok i see. Thanks for the explanation. I will change that later today as well.

BeeverFeever · 2023-12-05T08:54:50Z

I think I fixed all the things you mentioned. I'm still unable to get the tests to run for the new functions though.

recp · 2023-12-06T07:29:45Z

I'm still unable to get the tests to run for the new functions though.

You must add the tests declared with TEST_DECLARE() to TEST_LIST { like

TEST_LIST {
  TEST_ENTRY(glm_vec3_addadd)
  TEST_ENTRY(glm_vec3_subadd)
  TEST_ENTRY(glm_vec3_muladd)

maybe we should make the process of defining tests more easy later. For instance TEST_IMPL could add itself to test-list dynamically.

Comment top of the file :)

/*
 * To register a test:
 *   1. use TEST_DECLARE() to forward declare test
 *   2. use TEST_ENTRY() to add test to list
 */

BeeverFeever · 2023-12-06T21:43:06Z

Yeah making it so you only have to edit one place for the tests would make it easier. However, I am just dumb and blind and didnt read comments so thats my fault sorry.

I will fix that up later. Everything else should be good to be merged though right?

recp · 2023-12-07T18:52:49Z

@BeeverFeever thanks yes it looks ok to me at first glance 👍

BeeverFeever · 2023-12-10T04:42:50Z

So I got the tests working however the vec4 mulsub and mulsubs fail and I can not figure out what is wrong with them.

After v1, v2 and v4 are run through the mulsub function (in the tests), v4[0] = 5 when it should be 7. I've looked at it but the code is exactly the same as all the other mulsub functions and I don't know what is wrong. I hope I'm just missing something simple but I don't know what else to do.

recp · 2023-12-10T07:07:36Z

@BeeverFeever as @gottfriedleibniz suggested glmm_fnmadd does c - (a * b) which is correct operation for mulsub but you used glmm_fnmsub which does -c -(a * b).

I also fixed glmm_fmsub(): 8a1d1cf and will check other sub operations too asap

gottfriedleibniz · 2023-12-10T12:56:40Z

I also fixed glmm_fmsub

Both implementations still look incorrect (we are looking for (a * b) - c):

vfmsq_f32(c, a, b) -> -(a * b) + c.
vmlsq_f32(vnegq_f32(c), a, b) -> -(a * b) - c

BeeverFeever · 2023-12-10T13:51:22Z

@recp Oh interesting, I forgot to change those around after @gottfriedleibniz's comment. After changing those the tests pass. I didn't realise simd was being tested in those tests. Anyway, everything should be good now.

recp · 2023-12-10T14:06:36Z

@gottfriedleibniz just tried to quick fix but seems wrong, what about this?:

static inline
float32x4_t
glmm_fmsub(float32x4_t a, float32x4_t b, float32x4_t c) {
#if CGLM_ARM64
  return vfmsq_f32(c, vnegq_f32(a), b);
#else
  return vmlsq_f32(c, vnegq_f32(a), b);
#endif
}

it should produce a * b -c now

recp · 2023-12-10T14:09:59Z

@BeeverFeever can you add test (TEST_ENTRY) and push latest changes?

Also It would be nice to register glmc_ versions in TESTs you can see similar tests. glmc_ is just wrapper but sometimes there can be copy-paste errors or calling wrong inline functions...

BeeverFeever · 2023-12-10T14:13:40Z

@recp Just did.

gottfriedleibniz · 2023-12-10T15:18:05Z

Another alternative could be: vfmaq_f32/vmlaq_f32 with vnegq_f32(c)

recp · 2023-12-10T17:29:07Z

@gottfriedleibniz just did like:

static inline
float32x4_t
glmm_fmsub(float32x4_t a, float32x4_t b, float32x4_t c) {
  return glmm_fmadd(vnegq_f32(c), a, b);
}

thanks

recp · 2023-12-10T17:37:40Z

@BeeverFeever the PR is merged thanks for your contributions 🚀

Also any chance to get these updates for existing ivec[2|3|4] too? Otherwise I will add to TODOs

BeeverFeever · 2023-12-10T21:11:05Z

Yeah sure, I can do the ivec's.

gottfriedleibniz · 2023-12-11T23:23:38Z

return glmm_fmadd(vnegq_f32(c), a, b);

Should this not be glmm_fmadd(a, b, vnegq_f32(c))?

recp · 2023-12-12T08:22:23Z

AVX/FMA:

__m128 _mm_fmadd_ps (__m128 a, __m128 b, __m128 c)
dst[i+31:i] := (a[i+31:i] * b[i+31:i]) + c[i+31:i]

static inline
__m128
glmm_fmadd(__m128 a, __m128 b, __m128 c) {
#ifdef __FMA__
  return _mm_fmadd_ps(a, b, c);
#else
  return _mm_add_ps(c, _mm_mul_ps(a, b));
#endif
}

static inline
__m128
glmm_fmsub(__m128 a, __m128 b, __m128 c) {
#ifdef __FMA__
  return _mm_fmsub_ps(a, b, c);
#else
  return _mm_sub_ps(_mm_mul_ps(a, b), c);
#endif
}

NEON:

vmlaq_f32  (float32x4_t a, float32x4_t b, float32x4_t c)
RESULT[I] = a[i] + (b[i] * c[i]) for i = 0 to 3

static inline
float32x4_t
glmm_fmadd(float32x4_t a, float32x4_t b, float32x4_t c) {
#if CGLM_ARM64
  return vfmaq_f32(c, a, b); /* why vfmaq_f32 is slower than vmlaq_f32 ??? */
#else
  return vmlaq_f32(c, a, b);
#endif
}

static inline
float32x4_t
glmm_fmsub(float32x4_t a, float32x4_t b, float32x4_t c) {
  return glmm_fmadd(vnegq_f32(c), a, b);
}

Wow, I thought glmm_fmadd as a + b * c as NEON recently but glmm_fmadd works as AVX/FMA: a * b + c. In this case yes it should be glmm_fmadd(a, b, vnegq_f32(c)).

I'll add intents of operations of glmm_ funs as documentation for the future, TODOs.

@gottfriedleibniz many thanks for helping to fix things

BeeverFeever added 3 commits December 4, 2023 21:32

Vector subtraction functions

7696374

Add subtraction family of functions for vectors.

Fixed missed function

15f29e0

Write tests for new functions

f8b6573

BeeverFeever added 2 commits December 5, 2023 18:49

Fix vec4 functions

39602ef

Fix test

22140d8

Fix simd versions

865faad

Should be all

4c4a69d

recp merged commit 4f88a02 into recp:master Dec 10, 2023

recp added a commit that referenced this pull request Dec 12, 2023

fix glmm_fmsub() on arm-neon: #364 thanks to @gottfriedleibniz

97575bd

Porkepix mentioned this pull request Dec 31, 2023

cglm 0.9.2 Homebrew/homebrew-core#158655

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

New subtraction family of vector functions #364

New subtraction family of vector functions #364

BeeverFeever commented Dec 4, 2023

gottfriedleibniz commented Dec 4, 2023 •

edited

Loading

BeeverFeever commented Dec 4, 2023

gottfriedleibniz commented Dec 5, 2023

BeeverFeever commented Dec 5, 2023

BeeverFeever commented Dec 5, 2023

recp commented Dec 6, 2023 •

edited

Loading

BeeverFeever commented Dec 6, 2023

recp commented Dec 7, 2023

BeeverFeever commented Dec 10, 2023

recp commented Dec 10, 2023

gottfriedleibniz commented Dec 10, 2023 •

edited

Loading

BeeverFeever commented Dec 10, 2023

recp commented Dec 10, 2023

recp commented Dec 10, 2023 •

edited

Loading

BeeverFeever commented Dec 10, 2023

gottfriedleibniz commented Dec 10, 2023

recp commented Dec 10, 2023

recp commented Dec 10, 2023

BeeverFeever commented Dec 10, 2023

gottfriedleibniz commented Dec 11, 2023

recp commented Dec 12, 2023

New subtraction family of vector functions #364

New subtraction family of vector functions #364

Conversation

BeeverFeever commented Dec 4, 2023

gottfriedleibniz commented Dec 4, 2023 • edited Loading

BeeverFeever commented Dec 4, 2023

gottfriedleibniz commented Dec 5, 2023

BeeverFeever commented Dec 5, 2023

BeeverFeever commented Dec 5, 2023

recp commented Dec 6, 2023 • edited Loading

BeeverFeever commented Dec 6, 2023

recp commented Dec 7, 2023

BeeverFeever commented Dec 10, 2023

recp commented Dec 10, 2023

gottfriedleibniz commented Dec 10, 2023 • edited Loading

BeeverFeever commented Dec 10, 2023

recp commented Dec 10, 2023

recp commented Dec 10, 2023 • edited Loading

BeeverFeever commented Dec 10, 2023

gottfriedleibniz commented Dec 10, 2023

recp commented Dec 10, 2023

recp commented Dec 10, 2023

BeeverFeever commented Dec 10, 2023

gottfriedleibniz commented Dec 11, 2023

recp commented Dec 12, 2023

gottfriedleibniz commented Dec 4, 2023 •

edited

Loading

recp commented Dec 6, 2023 •

edited

Loading

gottfriedleibniz commented Dec 10, 2023 •

edited

Loading

recp commented Dec 10, 2023 •

edited

Loading