-
-
Notifications
You must be signed in to change notification settings - Fork 243
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New subtraction family of vector functions #364
Conversation
Add subtraction family of functions for vectors.
While glancing over this, noticed a few things:
|
Ok cool, I will address 1 and 3 later today. As for 2, I'm not very familiar with simd so could you expand on this a little bit please? |
|
Ahh ok i see. Thanks for the explanation. I will change that later today as well. |
I think I fixed all the things you mentioned. I'm still unable to get the tests to run for the new functions though. |
You must add the tests declared with TEST_LIST {
TEST_ENTRY(glm_vec3_addadd)
TEST_ENTRY(glm_vec3_subadd)
TEST_ENTRY(glm_vec3_muladd) maybe we should make the process of defining tests more easy later. For instance Comment top of the file :) /*
* To register a test:
* 1. use TEST_DECLARE() to forward declare test
* 2. use TEST_ENTRY() to add test to list
*/ |
Yeah making it so you only have to edit one place for the tests would make it easier. However, I am just dumb and blind and didnt read comments so thats my fault sorry. I will fix that up later. Everything else should be good to be merged though right? |
@BeeverFeever thanks yes it looks ok to me at first glance 👍 |
So I got the tests working however the vec4 mulsub and mulsubs fail and I can not figure out what is wrong with them. After v1, v2 and v4 are run through the mulsub function (in the tests), v4[0] = 5 when it should be 7. I've looked at it but the code is exactly the same as all the other mulsub functions and I don't know what is wrong. I hope I'm just missing something simple but I don't know what else to do. |
@BeeverFeever as @gottfriedleibniz suggested I also fixed glmm_fmsub(): 8a1d1cf and will check other sub operations too asap |
Both implementations still look incorrect (we are looking for
|
@recp Oh interesting, I forgot to change those around after @gottfriedleibniz's comment. After changing those the tests pass. I didn't realise simd was being tested in those tests. Anyway, everything should be good now. |
@gottfriedleibniz just tried to quick fix but seems wrong, what about this?: static inline
float32x4_t
glmm_fmsub(float32x4_t a, float32x4_t b, float32x4_t c) {
#if CGLM_ARM64
return vfmsq_f32(c, vnegq_f32(a), b);
#else
return vmlsq_f32(c, vnegq_f32(a), b);
#endif
} it should produce |
@BeeverFeever can you add test (TEST_ENTRY) and push latest changes? Also It would be nice to register |
@recp Just did. |
Another alternative could be: |
@gottfriedleibniz just did like: static inline
float32x4_t
glmm_fmsub(float32x4_t a, float32x4_t b, float32x4_t c) {
return glmm_fmadd(vnegq_f32(c), a, b);
} thanks |
@BeeverFeever the PR is merged thanks for your contributions 🚀 Also any chance to get these updates for existing ivec[2|3|4] too? Otherwise I will add to TODOs |
Yeah sure, I can do the ivec's. |
Should this not be |
AVX/FMA:
static inline
__m128
glmm_fmadd(__m128 a, __m128 b, __m128 c) {
#ifdef __FMA__
return _mm_fmadd_ps(a, b, c);
#else
return _mm_add_ps(c, _mm_mul_ps(a, b));
#endif
}
static inline
__m128
glmm_fmsub(__m128 a, __m128 b, __m128 c) {
#ifdef __FMA__
return _mm_fmsub_ps(a, b, c);
#else
return _mm_sub_ps(_mm_mul_ps(a, b), c);
#endif
} NEON:
static inline
float32x4_t
glmm_fmadd(float32x4_t a, float32x4_t b, float32x4_t c) {
#if CGLM_ARM64
return vfmaq_f32(c, a, b); /* why vfmaq_f32 is slower than vmlaq_f32 ??? */
#else
return vmlaq_f32(c, a, b);
#endif
}
static inline
float32x4_t
glmm_fmsub(float32x4_t a, float32x4_t b, float32x4_t c) {
return glmm_fmadd(vnegq_f32(c), a, b);
} Wow, I thought I'll add intents of operations of glmm_ funs as documentation for the future, TODOs. @gottfriedleibniz many thanks for helping to fix things |
Addresses: #363
I believe that this should be most of what is required for these functions to be implemented. I don't think I have missed anything but please take a close look at the vec4 functions to ensure they are correct as I wasn't entirely certain about what I was doing.
I wrote tests but when run the new functions do not show up. I don't know if I am just missing something or if I wrote the tests wrong. I am also unable to test on multiple platforms, I only have access to linux. However, I don't believe these changes are major enough to break anything platform specific.
I wasn't able to generate the documentation for the new functions. I am unfamiliar with sphinx and when I runt the commands from the README, I get an error:
Please inform me of anything that requires changing or adjustments, especially the tests and docs on how to get them working correctly.