The HADD/HSUB horizontal intrinsics can all be used in constexpr with suitable handling of the __builtin_ia32_phadd/sub* builtins inside VectorExprEvaluator::VisitCallExpr and InterpBuiltin.cpp:
_mm_hadd_pi16 _mm_hadd_epi16 _mm256_hadd_epi16
_mm_hadd_pi32 _mm_hadd_epi32 _mm256_hadd_epi32
_mm_hadds_pi16 _mm_hadds_epi16 _mm256_hadds_epi16
_mm_hsub_pi16 _mm_hsub_epi16 _mm256_hsub_epi16
_mm_hsub_pi32 _mm_hsub_epi32 _mm256_hsub_epi32
_mm_hsubs_pi16 _mm_hsubs_epi16 _mm256_hsubs_epi16
_mm_hadd_pd _mm256_hadd_pd
_mm_hadd_ps _mm256_hadd_ps
_mm_hsub_pd _mm256_hsub_pd
_mm_hsub_ps _mm256_hsub_ps
These are horizontal pairwise operations (adjacent elements within each source are combined), so getting the correct src/dst element indices is vital. They use a mixture of regular add/sub, fadd/fsub and signed saturating add/sub arithmetic.
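
A minimal scalar sketch of the index mapping, written as a standalone reference model and not as Clang code. All names here (`horizontalOp`, `wrapAdd16`, `wrapSub16`, `satAdd16`) are hypothetical helpers made up for illustration. It assumes the usual x86 layout: within each 128-bit lane (or the whole 64-bit vector for the MMX forms), the low half of the result holds the pairwise results from the first operand and the high half holds those from the second operand.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical reference model: applies `op` to adjacent element pairs.
// Per lane, dst[0 .. Half) comes from `a` and dst[Half .. Lane) comes from `b`.
template <typename T, std::size_t N, typename Op>
constexpr std::array<T, N> horizontalOp(const std::array<T, N> &a,
                                        const std::array<T, N> &b, Op op) {
  // A lane is 128 bits, or the whole vector if it is narrower (MMX, 64 bits).
  constexpr std::size_t Lane = N < 16 / sizeof(T) ? N : 16 / sizeof(T);
  constexpr std::size_t Half = Lane / 2;
  static_assert(N % Lane == 0, "vector must be a whole number of lanes");
  std::array<T, N> dst{};
  for (std::size_t base = 0; base < N; base += Lane) {
    for (std::size_t i = 0; i < Half; ++i) {
      dst[base + i] = op(a[base + 2 * i], a[base + 2 * i + 1]);
      dst[base + Half + i] = op(b[base + 2 * i], b[base + 2 * i + 1]);
    }
  }
  return dst;
}

// Wrapping 16-bit add/sub (hadd/hsub): compute in unsigned, then narrow.
constexpr std::int16_t wrapAdd16(std::int16_t x, std::int16_t y) {
  return static_cast<std::int16_t>(static_cast<std::uint16_t>(
      static_cast<std::uint16_t>(x) + static_cast<std::uint16_t>(y)));
}
constexpr std::int16_t wrapSub16(std::int16_t x, std::int16_t y) {
  return static_cast<std::int16_t>(static_cast<std::uint16_t>(
      static_cast<std::uint16_t>(x) - static_cast<std::uint16_t>(y)));
}
// Signed saturating 16-bit add (hadds): clamp the exact sum to the i16 range.
constexpr std::int16_t satAdd16(std::int16_t x, std::int16_t y) {
  int s = static_cast<int>(x) + static_cast<int>(y);
  return static_cast<std::int16_t>(s > 32767 ? 32767 : s < -32768 ? -32768 : s);
}

// 128-bit example in the shape of _mm_hadd_epi16 / _mm_hsub_epi16.
constexpr std::array<std::int16_t, 8> A{1, 2, 3, 4, 5, 6, 7, 8};
constexpr std::array<std::int16_t, 8> B{10, 20, 30, 40, 50, 60, 70, 80};
constexpr auto HAdd = horizontalOp(A, B, wrapAdd16);
constexpr auto HSub = horizontalOp(A, B, wrapSub16);
static_assert(HAdd[0] == 3 && HAdd[3] == 15, "low half comes from A");
static_assert(HAdd[4] == 30 && HAdd[7] == 150, "high half comes from B");
static_assert(HSub[0] == -1 && HSub[4] == -10, "even element minus odd element");
```

The same helper models the 256-bit forms by passing 16-element arrays: the loop over `base` then repeats the A-pairs/B-pairs split independently for each 128-bit lane, which is the detail most easily lost when treating the AVX2 variants as a simple widening of the SSE ones.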