-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new unsigned proposal #341
base: master
Are you sure you want to change the base?
Conversation
* `bits`: bit operations are defined, arithmetic operations do not | ||
wraparound or are not defined | ||
* `modular`: arithmetic operations wraparound using modular 2^n | ||
arithmetic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was also discussing of clamped or saturating semantics. Which would be a bad default, but if you're going to list all of the choices, you missed that one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Btw, I added the appropriate intrinsics in 0e6544d.
|
||
An UNSIGNED with n bits has a value range between 0 and 2^n-1. | ||
(Note that Fortran model integers have values between -2^(n-1)+1 and | ||
2^(n-1)-1). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And in practice, the range is
(Note that Fortran model integers have values between -2^(n-1)+1 and | ||
2^(n-1)-1). | ||
|
||
## 2.2 Arithmetic overflow is undefined |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section is contrary to actual hardware behavior (truncation), 50+ years of expectations from C/C++, and current implementations of UNSIGNED (Sun, gfortran 15, LLVM Flang, and XLF's unsigned vector elements).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This proposal still leaves the door open to later add default wraparound on overflow, if the committee/community wishes to do that.
(Note that you literally just implemented a prototype yesterday into LLVM Flang as a Draft PR: llvm/llvm-project#113504, so I don't know how much this qualifies as "current implementation".)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the time your glacial process gets around to making any kind of progress, it will have been available for years.
Weren't you the guy who wanted the committee to validate concepts with prototypes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the time your glacial process gets around to making any kind of progress, it will have been available for years.
It's not my process and I don't agree with it. I tried to change things, including this very repository or running for committee's leadership or starting a compiler. I don't know what else you want me to do.
Weren't you the guy who wanted the committee to validate concepts with prototypes?
Yes, I've been pushing for that for years. Every proposal needs to have a prototype.
But you should implement the right prototype, and try to reach a consensus with it. So far you implemented a design that you like, and are hoping that if LLVM Flang is successful it will become the de-facto standard. That is no different to how the committee has been operating, not always, but sometimes the decision is effectively done by one person without input from others on the committee or wider community. You rightfully object to that, and I agree. But if your solution to this is problem is to simply push things to LLVM Flang because you have been one of the main authors, without feedback from others, then this is no better than what the committee does, and I object to that too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But you should implement the right prototype, and try to reach a consensus with it. So far you implemented a design that you like, and are hoping that if LLVM Flang is successful it will become the de-facto standard.
Note that LLVM is not alone in implementing this, this is coordinated (and compatible) with what gfortran does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- you do not arbitrarily add modulo operation (wraparound) in arithmetic, so that you can safely use
size - 1
and other common operations. If you make a mistake, the compiler will tell you, instead of being required to wraparound and not tell you.
I think that you're assuming two things here: that compilers will always catch over/underflow in constant expression if the language doesn't define the behavior to be modular, and that they can't support options to emit a warning in those cases if the language does define the behavior to be modular.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that compilers will always catch over/underflow in constant expression if the language doesn't define the behavior to be modular
I am assuming the compiler can always catch it, and indeed that seems to be possible to implement, if you play with such expressions in Rust, you will always get either compile-time or runtime errors. However in "Release" mode users would typically disable such checks (unlike in Rust where I think it's not possibly to disable it). Also some compilers can decide not to implement such checks and it would not break anything.
that they can't support options to emit a warning in those cases if the language does define the behavior to be modular.
The underflow size-1
check is a fundamentally runtime check. If we standardize that size-1
must wraparound, how can the compiler insert runtime checks for this? The KISS example would stop working then. If the wraparound is standardized, I think it is not possible for the compiler to tell if you want modular arithmetic or not for expression like size-1
. Please correct me if I am wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The underflow
size-1
check is a fundamentally runtime check. If we standardize thatsize-1
must wraparound, how can the compiler insert runtime checks for this?
Introduce a derived type like the cu
below.
module checked_unsigned
implicit none
private
public :: uc, operator(+), assignment(=)
type uc
unsigned :: u
end type uc
interface operator(+)
procedure checked_add_uc
procedure checked_add_cu
procedure checked_add_cc
end interface operator(+)
interface assignment(=)
procedure set_uc
procedure set_cu
end interface assignment(=)
contains
recursive elemental function checked_add_uc(a,b) result(c)
unsigned, intent(in) :: a
type(uc), intent(in) :: b
type(uc) :: c
c%u = a + b%u
if (c%u < a) error stop
end function checked_add_uc
recursive elemental function checked_add_cu(a,b) result(c)
type(uc), intent(in) :: a
unsigned, intent(in) :: b
type(uc) :: c
c%u = a%u + b
if (c%u < a%u) error stop
end function checked_add_cu
recursive elemental function checked_add_cc(a,b) result(c)
type(uc), intent(in) :: a, b
type(uc) :: c
c%u = a%u + b%u
if (c%u < a%u) error stop
end function checked_add_cc
recursive elemental subroutine set_uc(a,b)
type(uc), intent(in) :: b
unsigned, intent(out) :: a
a = b%u
end subroutine set_uc
recursive elemental subroutine set_cu(a,b)
unsigned, intent(in) :: b
type(uc), intent(out) :: a
a%u = b
end subroutine set_cu
end module checked_unsigned
program main
use checked_unsigned
implicit none
type(uc) :: a
a = huge(a%u)
print *,a + 1u
end program main
That took me a about 10 minutes to type up, because Fortran is a verbose language. The above could also easily be extended as a PDT, which I didn't fell like doing tonight.
To be really effective for checking, this could use two intrinsics: One more easily check for overflow (that would also greatly help integers) and something like gcc_unreachable, let's say UNREACHABLE()
. Depending on the settings, the compiler could ignore it completely, make optimizations on the assumption that it will never be reached, or treat it as an ERROR STOP
. This could even be an intrinsic module for people who are concerned about users getting it wrong.
And my point of laying traps for people who are used to C still stands. Yes, we have deviations from what C does, but these are caught at compile-time. A compiler doing time-travel optimization based on the assumption that overflow cannot happen will bite people, and believe me - that will happen.
You keep bringing up Rust. Rust is a bad example - Fortran is not rust, its behavior of trapping on debug and prescrbing wrapping on non-debug builds is silly, and Fortran should never be allowed to use unsigned integers for array indices, like Rust does. Everybody has to convert to an INTEGER
anyway before using an UNSIGNED
as an array index or for a character substring, so the danger of users falling into that particular trap is not something that I am concerned about at all. Everything which counts things in intrinsics etc is left well alone, for that reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You keep bringing up Rust. Rust is a bad example - Fortran is not rust,
And you keep bringing up C ("Fortran unsigned should be modular becasue C unsigned is"), but Fortran is not C. Whether the Rust approach isgood or not, at least they didn't try just copying C.
Introduce a derived type like the
cu
below [...] That took me a about 10 minutes to type up,
It took you 10mn because you just wrote the addition part. One also need the subraction, the multiplication, the division, and the DT versions of all the intrinsics that work with unsigned
. Granted, once written it's OK (*). But if we want to avoid everyone reinventing the wheel by writing their own version (and incompatible with each other version) it really should be an intrinsic module iso_checked_unsigned
that provide type(uc)
and all the needed routines. Then at this point:
- This is now very close to natively providing two intrinsic types
modular unsigned
andunchecked unsigned
, that was the option 1 of @certik. Why having only one of them as intrinsic and the other one as derived type ? - Why not the opposite, the intrinsic type being the (un)checked one, and the derived type being for the wrapped behavior?
About the performances: what you wrote above is a derived type that always check the overflows, which is not what people want most of time. One need something as fast as possible in "release mode", without any check, and possibly enabled checks by the compiler in debug mode. For that you are proposing new intrinsics that would be dedicated to overflow checking (and that should be standardized first, which is definitely not granted), and that would be active or not depending on some compiler switches... Now it gets extremely confusing IMO, especially if it's buried inside an intrinsic module: one has to choose between a type with checked overflows or unchecked overflows, it cannot be "processor dependent".
(*) which doesn't mean it's as convenient as an intrinsic type anyway... e.g. it's not possible to directly write a literal constant with a derived type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@PierUgit It seems you did not read the entirety of my post. Please re-read it in its entirety, and you will find that two possible solutions to the "always checking" part are already contained in there.
Arguments must still be added.
@certik : You are not addressing the point that making this undefined does, in fact, prohibit use of modular arithmetic for portable code. In the guise of allowing both solutions, you are in fact only allowing one. Do you fail to understand this point? |
@tkoenig1 It's addressed here: https://github.com/j3-fortran/fortran_proposals/pull/341/files#diff-93e619e2d42ff60c769ace351e37bb16d99221ea09ee27f907d88cc71920cfd8R205:
I should probably add examples of the random number generator and other examples to make it clear that this proposal in fact does allow you to use the modular arithmetic in a portable way. Here is an example of code that I wrote in the past: subroutine lcg(x)
! Linear congruential generator https://en.wikipedia.org/wiki/Linear_congruential_generator
use iso_fortran_env, only: int32
integer(int32), intent(out) :: x
integer(int32), save :: s = 26250493
s = modulo(s * 48271, 2147483647)
x = s
end subroutine I think this is not portable, since it depends on Using subroutine lcg(x)
! Linear congruential generator https://en.wikipedia.org/wiki/Linear_congruential_generator
use iso_fortran_env, only: uint32
integer(uint32), intent(out) :: x
integer(uint32), save :: s = 26250493u
s = modulo(mul_wrapping(s, 48271u), 2147483647u)
x = s
end subroutine This should be portable. I think one can write lcg to be as simple as |
|
||
- Add the following new intrinsics (names to be decided): | ||
- ADD_WRAPPING(UNSIGNED, UNSIGNED) (or ADD_MODULAR) for modular | ||
arithmetics; the same for SUB, MUL, DIV |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why DIV
? unsigned division, unlike signed, can't "overflow".
|
||
- We can consider also adding the following intrinsics: | ||
- Possibly add ADD_SATURATING: saturating at the numeric bounds | ||
instead of overflowing; the same for SUB, MUL, DIV |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unsigned division never results in a value greater than the numerator
instead of overflowing; the same for SUB, MUL, DIV | ||
- Possibly add ADD_OVERFLOWING (or ADD_CHECKED): indicates with a | ||
flag that overflow occurred, wrapped value is returned; the same | ||
for SUB, MUL, DIV |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would this intrinsic handle the case of division by zero? if not, it's redundant
@kargl thank you again for the code, this is exactly the kind of discussion that moves us forward. This thread is a PR with the proposal itself that I plan to submit to the committee, after incorporating all your feedback. |
@certik : Please also mention that gfortran and f18 have already implemented (are about to implement) J3/24-116.txt, and that your proposal will create incompatibility if the name |
This would actually be wise to use the name |
Completely disagree. gfortran and flang implemented the feature as proposed in 24-116.txt. These are prototypes. Should @certik change his prototype implementation for templates to use |
Nope, this is about much more. Consider
Run it through the development version of gfortran, and you will get
(note that |
The situation is different: 24-116.txt has been rejected at the WG5 level, so there's a chance that, should an And after all, if it's "just" a prototype, why warning the committee about a possible incompatibility?
OK, it's also about C interoperability. But since the unsigned types in C are modular, using the name |
flang and gfortran are two different compilers. So, instead of implementing 24-116.txt as written With regards to |
@kargl your example with templates is a good one: the very paper 24-159r4.txt you quoted actually standardized a different syntax than LFortran implemented, as we implemented a previous proposal. It's not a big deal to change or support both. I was planning to create some poll, and maybe I'll still do it, but at least there is still the door open to later do the syntax that I think is better, if we want to. No codes will break, it would be a smooth "upgrade" like the With this unsigned type, I am trying to achieve something similar, so that we can have a smooth upgrade later (if we want to), and yet at the same time we get the job done today to actually have an unsigned type in Fortran. Couple points that came above:
|
The new proposal is in the
unsigned.txt
. I also committed the J3/24-116 proposal for reference.Go ahead and suggest improvements.