Native-sized integers proposal #2833

Merged
merged 18 commits into from
Apr 14, 2020

Member

@cston cston commented Sep 27, 2019

No description provided.

@cston cston changed the title Native int proposal Native-sized integers proposal Sep 27, 2019
@cston cston marked this pull request as ready for review September 27, 2019 15:34

### Constants

There is no direct syntax for native int literals. Explicit casts of other integral constant values can be used instead: `(nint)42`.
Member

@tannergooding tannergooding Sep 27, 2019

Given that there is an implicit conversion from int to nint, nint x = 42 and nint x = (nint)42 would both be valid, correct?

I presume const nint x = 42 would also work? #Resolved

Member Author

Yes, const nint x = 42 would work.
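
To make the exchange above concrete, here is a minimal sketch of the three forms mentioned. It assumes a C# 9 or later compiler with top-level statements; the names `FortyTwo`, `a`, and `b` are illustrative only and do not come from the proposal.

```C#
using System;

// Assumes C# 9 or later; identifier names are illustrative only.
const nint FortyTwo = 42;   // constant declaration, as confirmed in the reply above
nint a = 42;                // implicit conversion from int
nint b = (nint)42;          // explicit cast of an integral constant

// All three hold the same native-sized value.
Console.WriteLine(a == b && b == FortyTwo);   // prints True
```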



New contextual keywords `nint` and `nuint` represent native signed and unsigned integer types.
The identifiers are only treated as keywords when used in a type context and when the identifier does not otherwise bind to a symbol at that program location.

The types `nint` and `nuint` are essentially aliases to underlying types `System.IntPtr` and `System.UIntPtr`, where the compiler can surface additional conversions and operations for native ints.
Member

Presumably a secondary motivation is to reduce confusion? I have always found it confusing that IntPtr is often not representing a pointer.


### Metadata

`nint` and `nuint` are represented in metadata as `System.IntPtr` and `System.UIntPtr`.
Member

Do I understand correctly that there is no longer any reason to use IntPtr and UIntPtr? Even public API can be changed, it seems. We could search and replace our entire codebase?

Member

There may be some reasoning to continue using IntPtr if users want to clearly differentiate something that is meant to be an "opaque handle" vs something that is meant to be a "native integer". Something that is meant to be a "pointer" could fall into either category, depending on what operations you need to perform on it.

Member

An opaque handle is often not a pointer, even under the covers. So the confusion still exists (possibly in my mind only)

Member

Right, opaque handles are often something you never want to perform arithmetic with; and so they are good candidates to stay as IntPtr.

Things which are actually pointers you may want to make nint, if performing arithmetic is something needed.

Things which are actually numbers you probably want to make nint, since you want to treat them as numbers.
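
A small sketch of the split described in this comment, assuming C# 9 or later. The handle value `0x1234` and the variable names are hypothetical; real handles would come from an interop call.

```C#
using System;

// Hypothetical opaque handle: stored, passed around, and compared,
// but never used in arithmetic, so IntPtr states the intent.
IntPtr handle = new IntPtr(0x1234);

// A native-sized number: ordinary integer arithmetic applies, so nint fits.
nint offset = 16;
nint next = offset * 2 + 4;

Console.WriteLine(handle != IntPtr.Zero);   // prints True
Console.WriteLine(next);                    // prints 36
```

Both variables have the same representation at run time; the choice of spelling only signals how the value is meant to be used.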

Member

Right, hence we're continuing to recommend something that contains Ptr in the name for representing things that are not necessarily pointers. That's a shame, but perhaps it isn't a problem this set out to solve.

```C#
public static readonly IntPtr Zero;
public static int Size { get; }
public static IntPtr Add(IntPtr pointer, int offset);
```
Member

Is it worth calling out that these members keep their existing IntPtr behavior, so some (like Add and Subtract) may differ in behavior from the nint operators?

@marek-safar marek-safar Oct 3, 2019

I find that confusing. What is the rationale for Add not to return native type? #ByDesign

Member

These are existing members on System.IntPtr with a particular behavior; changing that behavior would be a breaking change.

Member Author

The corresponding nint operator +(nint, nint) is included in the Operators section above.
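
For readers following this thread, a minimal sketch contrasting the existing `IntPtr.Add` method with the compiler-provided operator, assuming C# 9 or later. The two are expected to agree for small in-range values like these. They may differ at the overflow boundary, since `IntPtr.Add` is documented not to throw on overflow while the operator honors `checked` contexts; that case is not exercised here.

```C#
using System;

nint x = 40;

// Existing BCL method: takes an IntPtr and an int offset.
nint viaMethod = IntPtr.Add(x, 2);

// Predefined operator +(nint, nint); the int literal converts implicitly.
nint viaOperator = x + 2;

Console.WriteLine(viaMethod == viaOperator);   // prints True
```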




### Miscellaneous

`native int` types cannot be used as enum underlying types.
Member

Why not? Given the constraints for constants (values must lie within the range int.MinValue to int.MaxValue) and that the runtime explicitly supports native int enums, it seems like this would be beneficial for some interop code?

Member

@tannergooding tannergooding left a comment

LGTM

[design]: #design

New contextual keywords `nint` and `nuint` represent native signed and unsigned integer types.
The identifiers are only treated as keywords when used in a type context and when the identifier does not otherwise bind to a symbol at that program location.
Member

@gafter gafter Oct 2, 2019

when the identifier does not otherwise bind to a symbol at that program location

This needs to be specified as a modification to name lookup. There is no existing spec concept of when an identifier "does not otherwise bind" to a symbol. #Resolved

New contextual keywords `nint` and `nuint` represent native signed and unsigned integer types.
The identifiers are only treated as keywords when used in a type context and when the identifier does not otherwise bind to a symbol at that program location.

The types `nint` and `nuint` are essentially aliases to underlying types `System.IntPtr` and `System.UIntPtr`, where the compiler can surface additional conversions and operations for native ints.
Member

@gafter gafter Oct 2, 2019

essentially aliases

The first half of this sentence and the second half contradict each other (if it is an alias then it has precisely the same set of operations). I think you want to say that they are represented by the underlying types in the implementation, and that there is an identity conversion in both directions between these types. #Resolved


### Constants

There is no direct syntax for native int literals. Explicit casts of other integral constant values can be used instead: `(nint)42`.
Member

@gafter gafter Oct 2, 2019

direct syntax for native int literals

Do you intend that constants of these new types exist? If so, the spec for "constant expressions" would need to be amended. #Resolved

There are no `MinValue` or `MaxValue` fields on `nint` or `nuint` because, other than `nuint.MinValue`, those values cannot be emitted as constants.

Constant folding is supported for all operators: { (unary)`-`, `~`, `+`, `-`, `*`, `/`, `%`, `==`, `!=`, `<`, `<=`, `>`, `>=`, `&`, `|`, `<<`, `>>` }.
Constant folding operations are evaluated with `Int32` and `UInt32` operands rather than native ints for consistent behavior regardless of compiler platform.
Member

@gafter gafter Oct 2, 2019

are evaluated with Int32 and UInt32 operands

In other words, they produce different values than are produced at runtime on some platforms? #Resolved

Copy link

Combinations that could produce different values should be disallowed without an explicit unchecked context, just as int.MaxValue + 1 is disallowed today for int constants.

Member

I believe the decision in the last LDM meeting was that nint x = int.MaxValue + 1 would be illegal, even with unchecked due to the platform specific differences.

Instead, constant folding would only be allowed if the operation stayed within the 32-bit range limitation put forth by the compiler.

It would probably be worth explaining this limitation or elaborating on how folding works if I'm misremembering.

Member Author

Constant expressions that exceed 32 bits will be reported as errors.
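
A sketch of how the 32-bit limit discussed above might look in practice, assuming C# 9 or later. The commented-out line is an assumption about what the compiler rejects, based on the reply above; the exact diagnostic is not stated in this thread.

```C#
using System;

const nint A = 1_000_000;
const nint B = A * 2;               // result fits in 32 bits: folded at compile time

// Assumed to be rejected as a constant, since 10^12 does not fit in 32 bits:
// const nint C = A * 1_000_000;

nint big = A;
nint product = big * 1_000;         // non-constant operand: evaluated at run time

Console.WriteLine(B);               // prints 2000000
Console.WriteLine(product);         // prints 1000000000
```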



@Zenexer Zenexer Mar 2, 2020

Constant expressions that exceed 32 bits will be reported as errors.

I don't see how unary ~ could ever be valid given that constraint.
#Resolved

Member Author

Good catch, thanks.




### Operators

The following operators are supported.
Member

@gafter gafter Oct 2, 2019

are supported

Do you mean to say that these are built-in operators considered when seeking to determine the semantics of an operation in source, like the built-in ones described in https://github.com/dotnet/csharplang/blob/master/spec/expressions.md#addition-operator ? #Resolved

| `>=` | `native` | `native` | `native` | `bge` / `bge.un` |
| `&` | `native` | `native` | `native` | `and` |
| `\|` | `native` | `native` | `native` | `or` |
| `<<` | `native` | `int` | `native` | `shl` |
Member

@gafter gafter Oct 2, 2019

<<

What are the semantics of the shift operators? For comparison, the spec in https://github.com/dotnet/csharplang/blob/master/spec/expressions.md#shift-operators has text that begins "For the predefined operators, the number of bits to shift is computed as follows". The compiler produces code to mask off any "extra" bits because the IL instructions that perform the shift are undefined outside the valid range of shift values. What mask do you imagine the C# compiler would produce for the shift operators? #Resolved

## Unresolved questions
[unresolved]: #unresolved-questions

What parts of the design are still undecided?
Member

What parts of the design are still undecided

What is the common type between a value of type nint and a value of type IntPtr? The common type between G<nint> and G<IntPtr>? When both are bounds in type inference, which wins?

Constant folding operations are evaluated with `Int32` and `UInt32` operands rather than native ints for consistent behavior regardless of compiler platform.

### Conversions
Types that differ only by `nint` and `IntPtr` and by `nuint` and `UIntPtr` are considered equivalent. That applies to the primitive types as well as arrays, `Nullable<>`, constructed types, and tuples.
Member

@gafter gafter Oct 2, 2019

are considered equivalent

I presume you mean there is an identity conversion between them. #Resolved

| `>` | `native` | `native` | `native` | `bgt` / `cgt` / `bgt.un` / `cgt.un` |
| `>=` | `native` | `native` | `native` | `bge` / `bge.un` |
| `&` | `native` | `native` | `native` | `and` |
| `\|` | `native` | `native` | `native` | `or` |
Member

@gafter gafter Oct 2, 2019

\|

This should be <code>&#124;</code> to get it to render properly #Resolved

Member

gafter commented Oct 2, 2019

I don't have any further comments (Iteration 5). #Resolved

Member

gafter commented Oct 3, 2019

From #435 (comment): please specify that these types are atomic. #Resolved


The optional attribute argument contains a bit for each primitive type in the type reference.
If there is a single primitive type, the parameter-less constructor can be used.

I think this needs better wording to be clear for cases like (IntPtr, nint) B

nietras commented Oct 3, 2019

Good to see a proposal for adding this to C# 👍

Just to plug my own exercises in futility you might find https://github.com/DotNetCross/NativeInts interesting with regards to API. This lists all operators and includes tests that can be viewed for inspiration.


(The IL for each operator includes the variants for `unchecked` and `checked` contexts if different.)

| Unary | Operator Signature | IL |
Member

@gafter gafter Dec 5, 2019

Per the shape of the language specification, this should be something like

nint operator +(nint value); // nop
nuint operator +(nuint value); // nop
nint operator -(nint value); // neg
nint operator ~(nint value); // not
nuint operator ~(nuint value); // not

They are definitely not considered members of the types as indicated here. #Resolved


### Operators

The following operators are provided by the compiler.
Member

@gafter gafter Dec 5, 2019

The following operators are provided by the compiler

The predefined operators are as follows. #Resolved

| `nuint` | `char` | ExplicitNumeric | `conv.u2` / `conv.ovf.u2.un` |
| `nuint` | `float` | ImplicitNumeric | `conv.r.un conv.r4` |
| `nuint` | `double` | ImplicitNumeric | `conv.r.un conv.r8` |
| `nuint` | `decimal` | ExplicitNumeric | `ulong decimal.op_Explicit(decimal) UIntPtr UIntPtr.op_Explicit(ulong)` |
Member

@gafter gafter Dec 5, 2019

ExplicitNumeric

Since the conversion from ulong to decimal is an implicit conversion, this should be too. #Resolved

| `nint` | `char` | ExplicitNumeric | `conv.u2` / `conv.ovf.u2` |
| `nint` | `float` | ImplicitNumeric | `conv.r4` |
| `nint` | `double` | ImplicitNumeric | `conv.r8` |
| `nint` | `decimal` | ExplicitNumeric | `long decimal.op_Explicit(decimal) IntPtr IntPtr.op_Explicit(long)` |
Member

@gafter gafter Dec 5, 2019

ExplicitNumeric

Since the conversion from long to decimal is an implicit conversion, this should be too. #Resolved

Copy link

From a future-proofing perspective, I don't think that makes sense. long and decimal both have fixed sizes, and a long can always be represented as a decimal without any loss in precision--that's a strong guarantee. That isn't the case for nint/nuint -> decimal. If .NET Core ever lands on a platform with 128-bit native ints, that decision is going to be hard to correct.

Constant folding operations are evaluated with `Int32` and `UInt32` operands rather than native ints for consistent behavior regardless of compiler platform.

### Conversions
There are identity conversions between native ints and the underlying types in both directions.
Member

@gafter gafter Jan 10, 2020

between native ints and the underlying types

Please make these explicit. Specifically, please say that "There is an identity conversion between nint and IntPtr, and between nuint and UIntPtr." #Resolved
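
A brief sketch of what the identity conversions requested here permit, assuming C# 9 or later. The variable names are illustrative; the `List<>` case follows from the quoted rule that constructed types differing only by `nint` and `IntPtr` are treated as the same type.

```C#
using System;
using System.Collections.Generic;

nint n = 5;
IntPtr p = n;        // identity conversion nint -> IntPtr, no cast needed
nint back = p;       // and IntPtr -> nint

// Constructed types that differ only by nint/IntPtr are the same type.
List<nint> list = new List<nint> { 1, 2, 3 };
List<IntPtr> same = list;

Console.WriteLine(back == n);                    // prints True
Console.WriteLine(ReferenceEquals(list, same));  // prints True
```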

```C#
}
```

`nint` and `nuint` can be used as an `enum` base type.
Member

@gafter gafter Jan 10, 2020

enum

What is the reason for this change? How is it helpful to users? #Resolved

Member Author

The decision at https://github.com/dotnet/csharplang/blob/master/meetings/2019/LDM-2019-10-23.md was to allow as enum underlying type if it is "not too much implementation work".



crowo commented Mar 2, 2020

Also, there is a need to specify FieldOffset in explicit StructLayout classes/structs which contain pointer-sized fields. So we need compiler support to specify word sizes in attributes and/or constants. Also, something like const void* invalidHandle = (void*)-1; should be valid, as IntPtr isn't treated like a primitive.

Member

jcouv commented Apr 8, 2020

@cston Can this be merged, so that I can link to it?

Constant folding is supported for all unary operators { `+`, `-` } and binary operators { `+`, `-`, `*`, `/`, `%`, `==`, `!=`, `<`, `<=`, `>`, `>=`, `&`, `|`, `^`, `<<`, `>>` }.
Constant folding operations are evaluated with `Int32` and `UInt32` operands rather than native ints for consistent behavior regardless of compiler platform.
If the operation results in a constant value in 32-bits, constant folding is performed at compile-time.
Otherwise the operation is executed at runtime and not considered a constant. (The unary operator `~` in particular cannot be used in constant expressions.)
Member

@gafter gafter Apr 13, 2020

The unary operator ~ in particular cannot be used in constant expressions

I think this is not correct. ~ can be used in constant expressions for all nint values (e.g. ~(nint)0 is (nint)-1; in general ~(nint)(int)N is (nint)(~(int)N)). This sentence is correct for nuint, as the result value always depends on sizeof(nuint). #Closed
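
The distinction drawn in this comment can be checked at run time with a short sketch, assuming C# 9 or later. It only evaluates the expressions; whether each one is accepted as a compile-time constant is the point under review and is not asserted here.

```C#
using System;

// Signed case: the result is -1 regardless of pointer size.
nint allOnes = ~(nint)0;
Console.WriteLine(allOnes == -1);   // prints True

// Unsigned case: the result depends on sizeof(nuint).
nuint mask = ~(nuint)0;
ulong expected = IntPtr.Size == 8 ? ulong.MaxValue : uint.MaxValue;
Console.WriteLine(mask == expected);   // prints True
```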

### Operators

The predefined operators are as follows.
These operators are considered during overload resolution based on normal rules for implicit conversions of arguments.
Member

@gafter gafter Apr 13, 2020

based on normal rules for implicit

These operators are only considered if one of the operands is of type nint or nuint. #Closed

Compound assignment operations `x op= y` where `x` or `y` are native ints follow the same rules as with other primitive types with pre-defined operators.
Specifically the expression is bound as `x = (T)(x op y)` where `T` is the type of `x` and where `x` is only evaluated once.

The shift operators should mask the number of bits to shift appropriately
Member

@gafter gafter Apr 13, 2020

appropriately

You should clarify that "appropriately" means to five bits if sizeof(nint) is 4, and to six bits if sizeof(nint) is 8. We cannot get that from the quoted section of the spec. #Closed
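
A sketch of the masking rule as requested in this comment, assuming C# 9 or later and assuming the compiler implements the mask exactly as described (five bits when `sizeof(nint)` is 4, six bits when it is 8). The shift count of 33 is chosen because it distinguishes the two cases.

```C#
using System;

nint one = 1;
int count = 33;
nint shifted = one << count;

// Expected value under the assumed rule: count & 0x1F on a 32-bit platform,
// count & 0x3F on a 64-bit platform.
nint expected = IntPtr.Size == 4
    ? (nint)(1 << (count & 0x1F))
    : (nint)(1L << (count & 0x3F));

// Should print True if the masking rule above is what the compiler emits.
Console.WriteLine(shifted == expected);
```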

```C#
nint x = 3;
var y = nameof(nuint);
var z = nint.Zero;
```
Member

@gafter gafter Apr 14, 2020

Zero

Disagrees with line 214, which says that Zero is not a member. #Closed

```C#
public string ToString(string format);
```

Interfaces implemented by `System.IntPtr` and `System.UIntPtr` _are implicitly included_ in `nint` and `nuint`.
Member

@gafter gafter Apr 14, 2020

included_

...with occurrences of the underlying type replaced by the corresponding native type. #Closed


### Miscellaneous

`nint` and `nuint` expressions used as array indices are emitted without conversion.
Member

emitted without conversion

It isn't clear what "emitted without conversion" means. The specification for array indexing requires a conversion, and the array indexing IL instructions have specific requirements for the type of the index.

Member

array indexing IL instructions have specific requirements for the type of the index

IL instructions like ldelem and ldelema take either a native int or an int32
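
At the source level, the case under discussion looks like the sketch below, assuming C# 9 or later. It shows only that a `nint` index is accepted; which IL the compiler emits for the index (with or without a conversion) is the open point in this thread and is not demonstrated here.

```C#
using System;

int[] data = { 10, 20, 30 };
nint i = 2;

// A nint expression used directly as an array index.
Console.WriteLine(data[i]);   // prints 30
```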


### Constants

There is no direct syntax for native int literals. Explicit casts of other integral constant values can be used instead: `(nint)42`.
Member

@gafter gafter Apr 14, 2020

direct syntax for native int literals

Do you intend that constant expressions can be of these new types? If so, you need to explicitly say so, as the spec for "constant expressions" excludes these types from the list of types for which constants exist. #Closed

Member

gafter commented Apr 14, 2020

Finished reviewing (Commit 17)

Member

@gafter gafter left a comment

:shipit:

Member

gafter commented Apr 14, 2020

Open question: What is the common type between a value of type nint and a value of type IntPtr? The common type between G<nint> and G<IntPtr>? When both are bounds in type inference, which wins?

Member

@gafter gafter left a comment

:shipit:

@cston cston merged commit 583bdd2 into dotnet:master Apr 14, 2020
@cston cston deleted the NativeInt branch April 14, 2020 19:39