Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve abs() on floats for more correct and faster results #4797

Merged
merged 4 commits into from
Feb 8, 2025

Conversation

laytan
Copy link
Collaborator

@laytan laytan commented Feb 5, 2025

Fixes #4787

Seems like this is more correct (abs of -0 is +0), and also optimises better (into fabs and similar instructions, instead of a cmp&select)

draft because not handling endian types, want to add some tests, and need to apply it to gb_abs so constants also show the same behaviour.

@laytan laytan requested a review from gingerBill February 5, 2025 20:21
@laytan laytan marked this pull request as draft February 5, 2025 20:22
@laytan laytan marked this pull request as ready for review February 6, 2025 21:03
@@ -2174,7 +2174,32 @@ gb_internal lbValue lb_build_builtin_proc(lbProcedure *p, Ast *expr, TypeAndValu
case 128: return lb_emit_runtime_call(p, "abs_complex128", args);
}
GB_PANIC("Unknown complex type");
} else if (is_type_float(t)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an LLVM intrinsic we could use rather than doing this manually?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is (https://llvm.org/docs/LangRef.html#llvm-fabs-intrinsic), I thought you didn't use it initially for a good reason, I can use that instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot remember why I didn't use it. Maybe it required linking with libm?

Copy link
Collaborator Author

@laytan laytan Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried it and it does not require libm, it does generate the exact same code as what I do here (once optimized, just calling the intrinsic in non-optimized gives better results). But for the endian types we would need to byteswap, call intrinsic, and byteswap back, which may be worse.

What do you want me to do, call the intrinsic, or keep it like this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep it like this, I guess.

@gingerBill gingerBill merged commit fbee045 into odin-lang:master Feb 8, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Absolute value of negative zero returns negative zero should be positive zero
2 participants