Commit
Merge pull request #646 from sergeyvfx/mm_mul_ps_inline
Fix performance regression after OPTNONE changes
jserv committed Aug 16, 2024
2 parents 29716df + ab292f2 commit 227cc41
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion sse2neon.h
@@ -2187,7 +2187,7 @@ FORCE_INLINE int _mm_movemask_ps(__m128 a)
 // Multiply packed single-precision (32-bit) floating-point elements in a and b,
 // and store the results in dst.
 // https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=_mm_mul_ps
-FORCE_INLINE_OPTNONE __m128 _mm_mul_ps(__m128 a, __m128 b)
+FORCE_INLINE __m128 _mm_mul_ps(__m128 a, __m128 b)
 {
     return vreinterpretq_m128_f32(
         vmulq_f32(vreinterpretq_f32_m128(a), vreinterpretq_f32_m128(b)));
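
For readers unfamiliar with the two qualifiers, the following is a minimal, portable sketch of the difference this change relies on. Everything prefixed with DEMO_ or demo_ is hypothetical: the macro bodies are assumptions modeled on common GCC/Clang attributes, not copied from sse2neon.h, and a plain four-float struct stands in for __m128 so the sketch builds without NEON. Both functions compute the same lane-wise product; the only difference is that the OPTNONE-style variant asks the compiler not to optimize it, which would plausibly account for the slowdown on a hot intrinsic such as _mm_mul_ps.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for FORCE_INLINE and FORCE_INLINE_OPTNONE.
 * Assumption: the real macros in sse2neon.h may be defined differently. */
#if defined(__clang__)
#define DEMO_FORCE_INLINE static inline __attribute__((always_inline))
#define DEMO_FORCE_INLINE_OPTNONE static inline __attribute__((optnone))
#elif defined(__GNUC__)
#define DEMO_FORCE_INLINE static inline __attribute__((always_inline))
#define DEMO_FORCE_INLINE_OPTNONE static inline __attribute__((optimize("O0")))
#else
#define DEMO_FORCE_INLINE static inline
#define DEMO_FORCE_INLINE_OPTNONE static inline
#endif

/* Hypothetical four-lane float vector standing in for __m128. */
typedef struct {
    float lane[4];
} demo_m128;

/* Lane-wise multiply that the compiler is asked to always inline. */
DEMO_FORCE_INLINE demo_m128 demo_mul_ps(demo_m128 a, demo_m128 b)
{
    demo_m128 dst;
    for (size_t i = 0; i < 4; i++)
        dst.lane[i] = a.lane[i] * b.lane[i];
    return dst;
}

/* Same computation, but the compiler is asked not to optimize the body. */
DEMO_FORCE_INLINE_OPTNONE demo_m128 demo_mul_ps_optnone(demo_m128 a,
                                                        demo_m128 b)
{
    demo_m128 dst;
    for (size_t i = 0; i < 4; i++)
        dst.lane[i] = a.lane[i] * b.lane[i];
    return dst;
}
```

The two variants return identical results, so the choice between them affects code generation only. Comparing the compiler's assembly output for a caller of each variant at -O2 is one way to see the effect.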