proposal: spec: change int to be arbitrary precision #19623
Comments
I'm a big fan of this, myself. It would elevate In many cases (loop iteration variables) the compiler may be able to prove that an |
Let's put this and #19624 in the Thunderdome! Two proposals enter, one proposal leaves... |
A minor related point about this: The |
Representing an int in a single machine word will be tricky. We run across the same problem we had with scalars being in direct interfaces - the GC sees a word which is sometimes a pointer and sometimes isn't. |
Actually, I think they're completely compatible. I recommend both! |
Could we fix that with better stack or heap maps? Instead of each word being "either a pointer or a non-pointer", it would be "a pointer, a non-pointer, or an int". I suppose that would require two bits instead of one per address, though, which might bloat the maps a bit. FWIW, the ML family of languages worked around that issue by making the native types |
@bcmills Yes, two bits per word should work. That just buys us more flexibility to put the distinguishing bits somewhere other than the top bits of the word - not sure if that is worth it. |
I love this proposal in the abstract, but I'm very concerned about the performance impact. I think it's not by chance that "no language in its domain has this feature". If we use bound-checking elimination has a similar problem, the Go compiler isn't very good at it even nowadays; it basically just handles obvious cases, and doesn't even have a proper VRP pass (the one proposed was abandoned because of compile time concerns). Stuff like a simple multiplication would become a call into the runtime in the general case, and I would be surprised if the Go compiler could avoid them in most cases, if we exclude obvious cases like clearly bounded for loops. |
@rasky Languages like Smalltalk and Lisp (and more recently, JavaScript) have pioneered the implementation of such integers - they can be implemented surprisingly efficiently in the common case. A typical implementation reserves the 2 least significant bits as "tags" - making an One way of using the tag bits is as follows: Given this encoding, if we have two It might be worthwhile performing a little experiment where one generates this additional code for each integer addition, using dummy conditional branches that will never be taken (or just jump to the end of the instruction sequence) and see what the performance impact is. |
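To make the tagging idea concrete, here is a minimal sketch in Go. The details of the encoding described in the comment above did not survive, so everything here is an assumption chosen for illustration: the `word` type, the helper names, and the convention that a tag of 0 in the two low bits marks an inline small integer. It shows only the fast path (add the tagged words directly) and the overflow check that would send the operation to a `math/big` slow path.

```go
package main

import (
	"fmt"
	"math/big"
)

// tagBits is the number of low bits reserved as a tag. This is an
// assumption for the sketch; tag 0 marks an inline "small" integer.
const tagBits = 2

// word is a hypothetical tagged machine word.
type word int64

// fromSmall tags v as a small integer. It reports false if v does not
// fit in the 62 bits left after reserving the tag.
func fromSmall(v int64) (word, bool) {
	w := v << tagBits
	return word(w), w>>tagBits == v
}

// small returns the integer stored in a small-tagged word.
func (w word) small() int64 { return int64(w) >> tagBits }

// add adds two small-tagged words. On the fast path the tagged words
// are added directly, since (x<<2)+(y<<2) == (x+y)<<2. If the machine
// addition overflows, it falls back to math/big and returns the exact
// result in slow.
func add(x, y word) (fast word, slow *big.Int) {
	sum := x + y
	// Signed overflow occurred iff both operands differ in sign from sum.
	if (x^sum)&(y^sum) < 0 {
		return 0, new(big.Int).Add(big.NewInt(x.small()), big.NewInt(y.small()))
	}
	return sum, nil
}

func main() {
	a, _ := fromSmall(40)
	b, _ := fromSmall(2)
	s, _ := add(a, b)
	fmt.Println(s.small()) // 42

	hi, _ := fromSmall(1<<61 - 1) // largest value that fits in 62 bits
	_, bigSum := add(hi, hi)
	fmt.Println(bigSum) // 4611686018427387902
}
```

In a real implementation the slow path would produce a pointer-tagged word rather than a separate return value, and the garbage collector would need to understand the tags, which is the concern raised earlier in the thread.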
Not a fan of this proposal - currently it's quite simple to reason about the resulting code and its performance characteristics when doing simple arithmetic. Also - even if losing two bits on a 64-bit platform is not important, on a 32-bit one it is. Maybe we could have arbitrary precision ints implemented as a new built-in type (like we do with complex currently)? |
Can you discuss how such Encoding to JSON should be easy and map really well. As far as I know, JSON spec does not place restrictions on size of numbers, so a really large {"number": 12312323123131231312312312312312321312313123123123123123123} Would map to an What about something like |
Re: 3, note that the compiler's import/export format already encodes arbitrary-sized integer values because of precise constants. |
@shurcooL @griesemer I believe encoding/gob already uses a variable-length encoding for all integer types. |
Go should have an arbitrary precision number type that is more convenient than math/big. That type should not attempt to masquerade as int/uint, as these aren't just used semantically as "number" but more so as "number compatible with c/foolang code that uses the natural local word size". The root problem here is that golang's design prevents a library from defining an arbitrary precision type that is as semantically and syntactically convenient as a type provided by the runtime/compiler with internal knowledge and kludges. Solve that problem with golang 2.0, or instead we will find ourselves years from now with countless ad hoc accretions like this. Edit: I'm a fan of this design/feature in scripting languages. I don't see how it works as the base int type in a systems language to replace c/c++/java. I absolutely think we should have a great and convenient arbitrary precision number type, and think the road to that is a golang where library defined types are not second class to ad hoc, boundary-flouting extensions of the runtime and compiler provided types. |
It's true that one of the roles of Perhaps more importantly, C APIs don't actually use |
It does matter. People make design decisions concerning the size of the address space and how indexes into it can be represented in serializations or other transformed values. Making the type that spans the address space arb precision offloads many complexities to anyone that wants to ship data to other systems. What does a c abi compatible library consumer of golang look like in a world where int/uint is arb precision? Is it really better if the golang side is blind to any issue at the type level and the c side has no choice but panic? I do see the value of these types, I just don't want them conflated with int/uint. I'm entirely in favor of a numeric tower in golang being the default numeric type, I just don't want it to pretend to have the same name as the 32bit/64bit machine/isa determined types. |
Can you give some examples? I'm having a hard time thinking of anything other than a syscall that spans a process boundary but should be sized to the address space, and programs using raw syscalls generally need to validate their arguments anyway.
The same thing it looks like today: using |
@robpike |
I want to capture that in the types, not in a dynamic value range check. I don't think this is an unreasonable expectation of a language that markets itself as a replacement for c/c++/java for systems programming. |
I honestly thought it's 11 days more than it really is. I don't want to lose normal I have nothing against adding an arbitrary precision |
What about floats? JSON can have values like Personally, I would keep |
Please do not make int operations slower by default. I do not need some like |
I don't understand the performance concerns - even if the performance hit would be noticeable in some scenarios, could you not use sized types like |
I am 100% for this proposal. Worrying about the performance impact of arbitrary precision ints is like worrying about the performance impact of array bounds checks, in my opinion. It's negligible if implemented correctly. I also like this because of how inconvenient the big package is and how I always fall back to Python when I'm dealing with big integers. |
Re |
Is this proposal still being considered? How does it play with the decision that there will be no Go2? |
In order to be backwards compatible, it would require adding a new type ( |
This comment was marked as duplicate.
How does this statement impact this proposal?
I'm wondering the same in light of Go 1.21's new forward compatibility features. A few people have suggested a distinct For cases where arbitrary precision math is important, what @griesemer outlined with 2 reserved bits should be faster than the current If that is possible to build as a library, we would be one step closer to making this proposal a reality. The further step of providing a built-in type in the language would make arbitrary precision math more ergonomic for complex calculations. As a thought experiment, let's imagine we do that.
Then, would it be possible, down the road, to modify the type inference to make If that change was restricted to a module based on If this is feasible, the transition could be done if/when we are satisfied with the performance of the If it turned out to work really well, then perhaps we could even deprecate I'm sure there are things I haven't thought of, so I look forward to your thoughts. |
It means it is extremely unlikely to happen. The value of the proposal only really becomes realized if we'd change how
I don't think it is practical to build that as a library. It requires interaction with the garbage collector, which has to know which memory contains pointers and which doesn't. So, at the very least, the compiler has to "magically" recognize if you use that type and insert instrumentation for the GC. Given that you could then also do
I don't think so. For example
func main() {
	for v := (1<<63)-1; v > 0; v++ {
	}
}
Currently (when
I agree with @robpike that a lot (probably most) of the value comes from redefining what |
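The behavior difference in a loop of that shape can be checked directly. The sketch below is an illustration, not the commenter's code: it uses an explicit int64 so the result does not depend on the platform's int size, and the function name `iterations` is made up here. With fixed-size integers the increment wraps to a negative value and the loop stops; with an arbitrary-precision counter the condition v > 0 would presumably never become false.

```go
package main

import (
	"fmt"
	"math"
)

// iterations counts how often the loop body runs when the counter
// starts at the largest int64 and is incremented.
func iterations() int {
	n := 0
	for v := int64(math.MaxInt64); v > 0; v++ {
		n++ // runs once; v++ then wraps to math.MinInt64
	}
	return n
}

func main() {
	fmt.Println(iterations()) // 1
}
```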
Thanks @Merovius. I think there may still be some value in arbitrary precision integers being first class as a separate type (e.g. If this proposal isn't feasible, here's hoping for checked integers. #30613 |
We can't change Finding a name is hard, though. It should be short, memorable, etc. I give you: |
FWIW, I believe this has been mentioned before, but adding u128/i128 and u256/i256, and maybe u512/i512 would satisfy 99% of what folks really want here. |
Borrow from the Pascal/Modula world, and call it |
We could borrow from math/German and call it Zahlen, or ( |
This is a really interesting idea, but the change as proposed is very much an incompatible one: today, users can assume that x + y has modular-arithmetic semantics, with an implicit modulo operation, but if this proposal were implemented, any code that depends on that assumption would no longer work correctly. More practical objections include:
Adding a new Therefore this is a likely decline. Leaving open for four weeks for final comments. @adonovan, for the language change review committee. |
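Hash functions are a common example of code that leans on the implicit modulo. The sketch below is FNV-1a written by hand and compared against `hash/fnv` from the standard library; the function names are chosen for this example. It uses uint64 explicitly, so it would keep working under the proposal, but the same loop written with plain uint would rely on the wraparound that the proposal removes.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// fnv1a is a hand-written 64-bit FNV-1a hash. The multiplication is
// expected to wrap modulo 2^64 on every step.
func fnv1a(s string) uint64 {
	h := uint64(0xcbf29ce484222325) // FNV-1a 64-bit offset basis
	for i := 0; i < len(s); i++ {
		h ^= uint64(s[i])
		h *= 0x100000001b3 // FNV 64-bit prime; wraps on overflow
	}
	return h
}

// stdlibFNV1a computes the same hash with the standard library.
func stdlibFNV1a(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s))
	return h.Sum64()
}

func main() {
	fmt.Println(fnv1a("gopher") == stdlibFNV1a("gopher")) // true
}
```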
I don't fully agree with this assessment:
In short, I believe this change is possible; the primary unknown is the performance impact. That impact may be small with sufficient engineering effort. The latter is perhaps the crux: we don't have the bandwidth/capacity for this engineering effort, so this may be our limiting factor. |
The size of int may vary across architectures, but it's easy to know which size it is, and existing code may assume that the value is representable in a field of that size.
Any function may be called from inside the loop of another function (and perhaps inlined into it), so I don't really understand this argument. My point was that any arithmetic expression formerly as cheap as a single cycle now costs a conditional branch. The choice of looping construct is not important.
I agree; still, it would require an audit and a clean-up of existing correct code.
Agreed; let's ignore this objection since the runtime is entirely within our control. |
Had we but world enough and time, we could check some of these things empirically. Knowing the shape and frequency of the behavior and performance changes might provide clarity, even with a naive, relatively unoptimized implementation. We even have the GOEXPERIMENT mechanism to enable it. (But at my back I always hear…) |
In particular, you can currently assume that |
As far as loop efficiency, independent of this proposal it would be good if you could declare the type of a loop index, like you can in other languages. e.g. in Java you can |
You can achieve that by adding a type conversion:
package main

import "fmt"

func main() {
	var sum uint8 = 0
	for i := uint8(0); i < 10; i++ {
		sum += i
	}
	fmt.Println(sum)
} |
Thanks, I don't think I've ever seen that done, though it's obvious in retrospect. (It would probably be worth mentioning in A Tour of Go or Effective Go.) So that suggests that the possible loop performance issue doesn't have to be a deal-killer for this proposal. |
Question- is this proposal for |
This proposal is for the type
|
Although there are reasons for making this change, it could break existing Go packages. Any package that serializes an integer value, that takes an |
An idea that has been kicking around for years, but never written down:
The current definition of int (and correspondingly uint) is that it is either 32 or 64 bits. This causes a variety of problems that are small but annoying and add up:
- [...] int type
- int values can overflow silently, yet no one depends on this working. (Those who want overflow use sized types.)

I propose that for Go 2 we make a profound change to the language and have int and uint be arbitrary precision. It can be done efficiently - many other languages have done so - and with the new compiler it should be possible to avoid the overhead completely in many cases. (The usual solution is to represent an integer as a word with one bit reserved; for instance if clear, the word points to a big.Int or equivalent, while if set the bit is just cleared or shifted out.)

The advantages are many:
- int (and uint, but I'll stop mentioning it now) become very powerful types
- len etc. can now capture any size without overflow
- [...] int without ceremony, simplifying some arithmetical calculations

Most important, I think it makes Go a lot more interesting. No language in its domain has this feature, and the advantages of security and simplicity it would bring are significant.