proposal: Go 2: use unsigned integers for all lengths #27460

Closed
ProhtMeyhet opened this issue Sep 3, 2018 · 24 comments
Labels: FrozenDueToAge, LanguageChange (suggested changes to the Go language), Proposal, v2 (an incompatible library change)

@ProhtMeyhet

I propose that a future major update use unsigned integers for all lengths. Values <0 have been used in other languages to indicate an error, but Go has multiple return values and uses them extensively to return an error value alongside the actual result. Negative values for indicating an error are therefore not only unnecessary, but often plainly wrong. len(), for example, can never return a negative value, yet its result type is int:

hello := "multiverse"
length := len(hello)
t := reflect.TypeOf(length)
fmt.Println(t.Name())  // prints: int

yes, this will require some rethinking, but maybe a few language additions can solve these cases. an easy corner case, if len() returned uint, is going through lists/arrays backwards:

if len() returned uint, this loop would not stop after index 0, because i >= 0 is always true for an unsigned i:

	a := []string{ "1", "2" }
	for i := len(a)-1; i >= 0; i-- {
		fmt.Println(a[i])
	}

fixing it directly is possible, though in a somewhat unexpected way:

a := []string{ "1", "2" }
for i := len(a)-1; ; i-- {
	fmt.Println(a[i])
	if i == 0 { break }
}
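
To make the corner case concrete, here is a minimal sketch that simulates an unsigned length by converting len(a) to uint. The helper name reverseUint is made up for illustration. Note that the break-at-zero loop above still misbehaves on an empty slice, because len(a)-1 wraps around before the first access. Counting down from the length and indexing with i-1 is one way to avoid that:

```go
package main

import "fmt"

// reverseUint returns the elements of a in reverse order using an
// unsigned index. It is a hypothetical helper for illustration only.
func reverseUint(a []string) []string {
	out := make([]string, 0, len(a))
	// uint(len(a)) stands in for a len() that returns uint.
	for i := uint(len(a)); i > 0; i-- {
		// i runs from the length down to 1, so i-1 never wraps below zero.
		out = append(out, a[i-1])
	}
	return out
}

func main() {
	fmt.Println(reverseUint([]string{"1", "2"})) // [2 1]
	fmt.Println(reverseUint(nil))                // []
}
```

The loop condition i > 0 is the important part: unlike i >= 0, it can actually become false for an unsigned index.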

but since there is already range, go could then support reverse range:

a := []string{ "1", "2" }
for _, value := range a {
	fmt.Println(value)
}

a := []string{ "1", "2" }
for _, value := reverse a {
	fmt.Println(value)
}
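
Note that reverse is the proposed syntax and does not exist in Go. As a rough sketch of how a similar effect can be had without a language change, a small helper can hide the index arithmetic; eachReversed is a hypothetical name, not a standard library function:

```go
package main

import "fmt"

// eachReversed calls fn for every element of a, from last to first.
// Hypothetical helper; not part of the Go standard library.
func eachReversed(a []string, fn func(index int, value string)) {
	for i := len(a) - 1; i >= 0; i-- {
		fn(i, a[i])
	}
}

func main() {
	a := []string{"1", "2"}
	eachReversed(a, func(_ int, value string) {
		fmt.Println(value) // prints 2, then 1
	})
}
```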

i understand this proposal is controversial and would make go behave vastly differently, in a key low-level area, than probably any other language. yet this would be correct and in line with go's other language tools (mostly multiple return values), and would make go more robust against errors caused by wrong return values and wrong assumptions.

@DisposaBoy

FWIW, I've read this a couple times and I don't think you discuss what problem this would fix... but you do mention some of the obvious drawbacks?

Yes, you mention len(), but presumably it returns int just because that's the default integer type, and if it returned uint, you'd end up having to convert it to int everywhere.

@komuw
Contributor

komuw commented Sep 3, 2018

@gopherbot add label Proposal

@ghost

ghost commented Sep 3, 2018

FYI, unsigned numbers have caused so many bugs in the C language, especially when used for bounds checking, and I think they would do the same in Go.

ianlancetaylor changed the title from "[Language Change] Use unsigned integers for all lengths" to "proposal: Go 2: use unsigned integers for all lengths" Sep 3, 2018
gopherbot added this to the Proposal milestone Sep 3, 2018
ianlancetaylor added the LanguageChange and v2 labels Sep 3, 2018

@ianlancetaylor
Member

Fixed size computer integers always have odd behavior at boundaries. For signed integers the odd behavior occurs at values with very large absolute values. For unsigned integers the odd behavior occurs at zero. In real programs zero is much more common. Therefore it is normally better to use signed than unsigned integers.

@DemiMarie
Contributor

@ianlancetaylor SML, Ada, and (in debug mode) Rust solve this problem by trapping on overflow (Ada only does this for signed integers, IIRC). IIRC, Ada only takes a penalty of a few % in performance by doing so, though the error messages on trapping are bad.

@bcmills
Contributor

bcmills commented Sep 6, 2018

I certainly wouldn't want to do this without #19624: the overflow bugs that @Ctriple mentions are subtle and often difficult to diagnose.

@bcmills
Contributor

bcmills commented Sep 6, 2018

I don't think #19624 would suffice either, though. It's often useful to add a signed offset to a length, and Go does not allow expressions to mix signed and unsigned values.

I believe that it would be possible to expand the Go spec to do that safely, but that's a much broader and more invasive change to the language. It would need to provide a huge benefit, and the benefit of this proposal is already not entirely clear to me.
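
As an illustration of the mixing restriction, here is a small sketch of what applying a signed offset to an unsigned length looks like under today's rules. The function applyOffset is hypothetical; the point is only that the conversions have to be written out and the negative case handled by hand:

```go
package main

import "fmt"

// applyOffset adds a signed offset to an unsigned length.
// Writing `length + offset` directly does not compile, because Go
// does not mix uint and int operands; explicit conversions are needed.
// It returns false if the result would be negative.
func applyOffset(length uint, offset int) (uint, bool) {
	// Assumption: length fits in an int, as any real slice length does.
	result := int(length) + offset
	if result < 0 {
		return 0, false
	}
	return uint(result), true
}

func main() {
	fmt.Println(applyOffset(10, -3)) // 7 true
	fmt.Println(applyOffset(2, -3))  // 0 false
}
```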

@ProhtMeyhet
Author

> FWIW, I've read this a couple times and I don't think you discuss what problem this would fix... but you do mention some of the obvious drawbacks?

i tried to be brief and i really can't be more to the point: lengths shouldn't be represented by a type that can be negative. even c, after decades, acknowledged this and introduced size_t.

> Fixed size computer integers always have odd behavior at boundaries. For signed integers the odd behavior occurs at values with very large absolute values. For unsigned integers the odd behavior occurs at zero. In real programs zero is much more common.

having a length represented by a type that can be <0 isn't "odd" behavior, it's simply and plainly wrong.

> Therefore it is normally better to use signed than unsigned integers.

i disagree.

@dominikh
Member

dominikh commented Sep 8, 2018

Both lengths less than zero and lengths that wrap around are wrong. However, make([]byte, len(x) - 1) fails a lot more sanely with signed lengths.

@ProhtMeyhet
Author

> Both lengths less than zero and lengths that wrap around are wrong.

no. a length less than zero cannot happen. 0 - 1 is still actually 0.

i am arguing on the grounds of length cannot be <0 and -1;0;1 is counterproductive in go and should be discouraged, because <0 cannot happen and we've got multiple return values.

@dominikh
Member

dominikh commented Sep 8, 2018

I feel like we're getting our wires crossed here. 0 - 1 is not "still actually 0" in the way integers work. uint(0) - 1 wraps around to the largest number representable by uint. If len were changed to return uint, then the following code

n := len(y)
_ = make([]byte, n - 1)

would potentially allocate gigabytes (on 32 bit) or a ridiculous number (on 64 bit) of bytes if len(y) == 0. That's not any different for C's size_t, either.

That fact won't change unless you redefine how integers work, or invent a special type just for sizes. But in either case, preventing wrap around would add a cost to all size computations, both in terms of CPU time and program size.
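
A minimal sketch of the difference, using an explicit uint conversion to stand in for an unsigned len. The helper makePanics is made up for this example. The unsigned make call itself is deliberately not executed, since its outcome depends on the platform:

```go
package main

import "fmt"

// makePanics reports whether make([]byte, n) panics for the given n.
// Hypothetical helper for illustration.
func makePanics(n int) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	_ = make([]byte, n)
	return false
}

func main() {
	var y []byte

	// Signed, as in Go today: len(y)-1 is -1 and make fails right away.
	fmt.Println(len(y)-1, makePanics(len(y)-1)) // -1 true

	// Unsigned: the same subtraction wraps to the largest uint value.
	n := uint(len(y))
	fmt.Println(n-1 == ^uint(0)) // true
}
```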

@ProhtMeyhet
Author

ProhtMeyhet commented Sep 8, 2018

> I feel like we're getting our wires crossed here.

we are, mainly because you did not acknowledge the second paragraph i wrote. i did not, and do not, want to redefine how integers work. all i am saying is that go has outgrown the need to use negative return values to indicate errors, because it supports multiple return values. i hope this is clear now, but it should have been clear from my previous comment:

> i am arguing on the grounds of length cannot be <0 and -1;0;1 is counterproductive in go and should be discouraged, because <0 cannot happen and we've got multiple return values.

@dominikh
Member

dominikh commented Sep 8, 2018

Sizes being signed in Go has absolutely nothing to do with indicating errors.

Edit: edited for clarity.

@ProhtMeyhet
Author

> Sizes being signed in Go has absolutely nothing to do with indicating errors.

which is why they shouldn't be signed as i have pointed out numerous times.

@ianlancetaylor
Member

> i tried to be brief and i really can't be more to the point: lengths shouldn't be represented by a type that can be negative. even c, after decades, acknowledged this and introduced size_t.

That is a statement but it's not an argument. Some of us think that size_t was a mistake. Clearly lengths can not be negative. But it does not follow that lengths shouldn't be represented by a type that can be negative. Those are two separate statements.

What is the advantage of representing lengths in a type that can not be negative?

@zigo101

zigo101 commented Sep 9, 2018

@dominikh

> Sizes being signed in Go has absolutely nothing to do with indicating errors.

That is not absolutely so. Sometimes a negative size can indicate an error (as a result).

@ProhtMeyhet

Besides indicating an error, a negative size can sometimes be used as an argument to signal something.

@ProhtMeyhet
Author

> > i tried to be brief and i really can't be more to the point: lengths shouldn't be represented by a type that can be negative. even c, after decades, acknowledged this and introduced size_t.

> That is a statement but it's not an argument. Some of us think that size_t was a mistake. Clearly lengths can not be negative. But it does not follow that lengths shouldn't be represented by a type that can be negative. Those are two separate statements.

honestly, i've never faced a more hostile environment by just suggesting a change. why would it matter, if my answer is a statement or an argument? an answer is an answer...

in my opinion size_t was a great addition, but too little, too late and lacking a recognizable name.

> What is the advantage of representing lengths in a type that can not be negative?

as with many approaches in go: do the right and logical thing for the purpose of correct code instead of choosing the easy way out (panic vs. exceptions, duck typing etc.). a length <0 is simply illogical, as such it is prone to confuse and generate errors. for that matter: this is only in go, because multiple return values were added much much later to the language specification and as such, as in c, negative values were actually required. this is no longer the case.

@dominikh
Member

dominikh commented Sep 9, 2018

> because multiple return values were added much much later to the language specification

Multiple return values have been part of Go since at least early 2008, in the earliest versions of the Go specification, before Go had even been announced to the public.

The "hostile environment" you may be experiencing stems from the fact that you have repeatedly ignored the majority of feedback. You have not once provided useful responses to any of the commenters who pointed out that sizes are signed to avoid bugs. Not because of multiple return values, not because "sizes can be negative", not to return -1 from a function, but to avoid harmful underflow. How does your proposal address this fact? Rote repetition of "I'd like sizes to be unsigned" does not make for a proposal, an argument or a discussion.

I'll summarize the main reason against your proposal, again. Consider the following piece of code:

n := len(y)
_ = make([]byte, n - 1)

and consider its behavior with signed vs unsigned lengths when len(y) == 0.

Feel free to also respond to other counter arguments, such as the effect of the combination of unsigned lengths and signed offsets in a language that lacks implicit conversions.

@ProhtMeyhet
Author

> The "hostile environment" you may be experiencing stems from the fact that you have repeatedly ignored the majority of feedback.

well, the best thing is, that at least you are acknowledging that there is a hostile environment here. that's sad.

and no, i have not ignored the majority of feedback, i have simply not given each feedback the same amount of counter-feedback, because i already said in my opening statement that this proposal is not without flaws. i actually hoped this would turn into a serious discussion between professionals, but it seems i was mistaken.

> I'll summarize the main reason against your proposal, again. Consider the following piece of code:
>
> n := len(y)
> _ = make([]byte, n - 1)
>
> and consider its behavior with signed vs unsigned lengths when len(y) == 0.

consider the following piece of code:

n := 18446744073709551615
_ = make([]byte, n - 1)

together with the following piece of knowledge:

UNIX was not designed to stop you from doing stupid things, because that would also stop you from doing clever things.
— Doug Gwyn

golang locked and limited conversation to collaborators Sep 9, 2018

@dominikh
Member

dominikh commented Sep 9, 2018

Apparently everything that can be said has been said. Multiple times.

@ianlancetaylor
Member

> > > i tried to be brief and i really can't be more to the point: lengths shouldn't be represented by a type that can be negative. even c, after decades, acknowledged this and introduced size_t.

> > That is a statement but it's not an argument. Some of us think that size_t was a mistake. Clearly lengths can not be negative. But it does not follow that lengths shouldn't be represented by a type that can be negative. Those are two separate statements.

> honestly, i've never faced a more hostile environment by just suggesting a change. why would it matter, if my answer is a statement or an argument? an answer is an answer...

I apologize for my response appearing to be hostile. That was in no way my intent.

Here is what I was trying to say. I hope that this response seems less hostile.

@DisposaBoy asked you to explain the problem that you are trying to solve with this suggestion. Your reply was to say, as quoted above, "lengths shouldn't be represented by a type that can be negative." What I was trying to say, too briefly, was that that does not answer the question. The question is: what is the problem? Your reply simply restated the suggestion: the type of a length should not be negative. But that is not a problem. It is what you are suggesting, but it is not in itself a problem.

I am still trying to understand what problem you are solving.

I hope that this response does not seem hostile. Thanks.

I understand your example of using a very large number when calling make, and I understand that that does not work today. But it also will not work if we adopt your suggested change; it will simply fail in a different way. So that too does not seem to me to be a clear expression of the problem.

golang unlocked this conversation Sep 9, 2018

@bcmills
Contributor

bcmills commented Sep 12, 2018

n := len(y)
_ = make([]byte, n - 1)

It's not the 0-length cases I'm worried about: those are easy enough to handle with checked overflow (#19624), and if C had gone that route then the size_t problem would at least be detectable with a fuzzer and -fsanitize=undefined.

My bigger concern with Go is in producing and applying signed offsets, where the individual operands are nonnegative but the overall expression can still be negative:

if delta := len(x) - len(y); delta > 0 {
	[…]
}
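
The concern can be sketched as follows, with uint conversions standing in for an unsigned len. Both function names are hypothetical. With signed lengths the difference goes negative and the check behaves as expected; with unsigned lengths the difference wraps and the check passes even though x is shorter:

```go
package main

import "fmt"

// longerSigned reports whether x is longer than y, using signed lengths
// as Go does today.
func longerSigned(x, y []byte) bool {
	delta := len(x) - len(y)
	return delta > 0
}

// longerUnsigned does the same comparison with the lengths converted to
// uint, simulating a len() that returns an unsigned value.
func longerUnsigned(x, y []byte) bool {
	delta := uint(len(x)) - uint(len(y)) // wraps when len(x) < len(y)
	return delta > 0
}

func main() {
	x, y := make([]byte, 2), make([]byte, 5)
	fmt.Println(longerSigned(x, y))   // false
	fmt.Println(longerUnsigned(x, y)) // true, although x is shorter
}
```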

@ianlancetaylor
Member

Checked overflow helps in some ways but hurts in others. You have to be careful to avoid writing len(p) - b + a and instead write len(p) + (a - b).

@ianlancetaylor
Member

This proposal would be a very subtle change to existing semantics. Even if this were a good idea in itself, and there are reasonable questions raised about that in the discussion above, it is too late to make this change without potentially silently breaking existing Go code. Even for Go 2, where we are permitting ourselves to break code, we do not want to break code silently.

It's interesting to note #19113, which proposes moving in the opposite direction: using a signed type for a value that must be non-negative.

golang locked and limited conversation to collaborators Oct 10, 2019