Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Champion "Native-Sized Number Types" (VS 16.8, .NET 5) #435

Open
3 of 5 tasks
Tracked by #829
gafter opened this issue Apr 13, 2017 · 81 comments
Open
3 of 5 tasks
Tracked by #829

Champion "Native-Sized Number Types" (VS 16.8, .NET 5) #435

gafter opened this issue Apr 13, 2017 · 81 comments
Assignees
Labels
Design Review Implemented Needs ECMA Spec This feature has been implemented in C#, but still needs to be merged into the ECMA specification Proposal champion
Milestone

Comments

@gafter
Copy link
Member

gafter commented Apr 13, 2017

Summary: nint i = 1; and nuint i2 = 2;

Shipped in preview in 16.7p1.


/cc @KrzysztofCwalina @jaredpar

[jcouv update:] as part of this feature, we should consider what Interlocked overloads the BCL/runtime should provide for those new types.

@jkotas
Copy link
Member

jkotas commented Apr 17, 2017

Proposal: dotnet/corefxlab#1471

@jkotas jkotas mentioned this issue Apr 17, 2017
5 tasks
@CyrusNajmabadi
Copy link
Member

nuint... :)

@jnm2
Copy link
Contributor

jnm2 commented Apr 17, 2017

Actually, could C# add nint, nunit, and nfloat?

@CyrusNajmabadi
Copy link
Member

CyrusNajmabadi commented Apr 17, 2017

As i mentioned in the other proposal, i don't see a lot of value in C# adding any keywords here. IntN, FloatN and the like are totally reasonable names to just use as is. And if you really want a all-lower name, then just add using nint = System.IntN.

@jnm2
Copy link
Contributor

jnm2 commented Apr 17, 2017

Do you foresee people opting to use IntN over int as iteration variables?

@CyrusNajmabadi
Copy link
Member

@jnm2 i'm not sure what you mean? Can you give an example? I don't generally think about iteration as being related to native sized ints... so i'm not sure what the connection is. Thanks!

@jnm2
Copy link
Contributor

jnm2 commented Apr 18, 2017

@CyrusNajmabadi Just a plain for loop for example. Everyone defaults to int for the iteration variable type, but would IntN potentially eke out more performance than int on x64?

@CyrusNajmabadi
Copy link
Member

I would really hope not :) But i'll let the CLR guys weigh in on that.

@jkotas
Copy link
Member

jkotas commented Apr 18, 2017

would IntN potentially eke out more performance than int on x64?

native int is more performant than int32 for loops that iterate over memory on 64-bit platforms. int32 tend to require int32->native int casts that are unnecessary overhead. E.g. Check how some methods on Span<T> have been optimized to avoid the extra overhead: https://github.com/dotnet/corefx/blob/master/src/System.Memory/src/System/SpanHelpers.byte.cs#L76

@jveselka
Copy link

jveselka commented Apr 18, 2017

I wonder, should nint/nuint be allowed as underlying type of enums as well? And nint/nuint consts?

@bbarry
Copy link
Contributor

bbarry commented Apr 18, 2017

@zippec that would be a breaking change in struct layouts wouldn't it?

@tannergooding
Copy link
Member

@bbarry, how would that be a breaking change? Only new enums would support being subclassed from nint/nuint and the runtime already supports said functionality.

The runtime also supports native sized constants (although they are actually 32-bit constants that are implicitly converted)

@bbarry
Copy link
Contributor

bbarry commented Apr 18, 2017

I misread that comment as changing the default underlying type.

@gafter
Copy link
Member Author

gafter commented Apr 18, 2017

If we add this feature, how is the compiler supposed to know if the following code is valid or not (because it would overflow nint during compile-time constant folding)?

nint t = (nint)int.MaxValue + (nint)1;

@tannergooding
Copy link
Member

tannergooding commented Apr 18, 2017

@gafter , I would think the compiler could determine the constant is larger than 32-bits and spit a warning that this will result in different behavior on 32-bit vs 64-bit platform.

Essentially the compiler just does all folding as the largest size (long or ulong) then explicitly downcasts to 32-bits for the store. If the downcast causes an overflow/underflow the compiler can warn/error.

@sharwell
Copy link
Member

sharwell commented Apr 19, 2017

@gafter

how is the compiler supposed to know if the following code is valid or not (because it would overflow nint during compile-time constant folding)?

For compile-time constant folding, my instinct is to define the feature in the following way:

When evaluating a constant value of native size at compile time in checked mode, the computation is performed as though the native size is the preferred size for intermediate values. In this mode, it is an error for an intermediate value to overflow the range of the preferred size. In unchecked mode, the computation is performed as though the native size is the maximum allowable value for native-size integers for the compilation at runtime.

The preferred and maximum size for native integers is implementation-dependent. Examples of the preferred and maximum size for native integers for csc.exe are shown in the following table:

Argument Preferred Size Maximum Size
Default/unspecified 64-bit 64-bit
/platform:anycpu 64-bit 64-bit
/platform:anycpu32bitpreferred 32-bit 64-bit
/platform:x64 64-bit 64-bit
/platform:x86 32-bit 32-bit

@sharwell
Copy link
Member

I added some additional thoughts in the GitHub issue for the original proposal. For the case of native int (IntPtr) and native unsigned int (UIntPtr), I am in favor of defining and implementing this feature as a change to the language specification and the C# compiler, with no changes made to the standard library.

@tannergooding
Copy link
Member

@sharwell, that would be my existing proposal #48

However, it suffers from the back-compat concerns listed on the other thread, due to:

  • The constructor and cast operators having explicit checked behavior on 32-bit platforms
  • The existing add/sub operators defined in the standard library

@gafter
Copy link
Member Author

gafter commented Apr 19, 2017

@gafter , I would think the compiler could determine the constant is larger than 32-bits and spit a warning that this will result in different behavior on 32-bit vs 64-bit platform.

Essentially the compiler just does all folding as the largest size (long or ulong) then explicitly downcasts to 32-bits for the store. If the downcast causes an overflow/underflow the compiler can warn/error.

I don't see how that works:

const nint a = unchecked((nint)uint.MaxValue + (nint)1);
const bool c = a == (nint)0;

The bool c must be true if run on a 64-bit platform, and false if run on a 32-bit platform. But since it is a compile-time constant, it must be fixed to one of those values before it is ever run on any platform.

@sharwell
Copy link
Member

@gafter Interesting example. Outside of situations where the bit size is known at compile time (e.g. /platform:x86 or /platform:x64), I don't see how the use of const would be allowed on native-sized integers. I would not be opposed to disallowing it altogether. Since you could inline the entire expression in the initializer for bool c, I think this mandates a restriction that the outcome of constant folding as a compiler optimization (allowed but not required) is not allowed to be dependent on the bit width of the target machine.

@gafter
Copy link
Member Author

gafter commented Apr 19, 2017

@sharwell Constant folding is not currently an "optimization" - it is required by the specification to occur under certain circumstances, and under no other circumstances. It is observable without const declarations:

void M(nint value)
{
    switch (value)
    {
        case (nint)0:
        case unchecked((nint)uint.MaxValue + (nint)1): // allowed, or duplicate case?
            break;
    }
}

@tannergooding
Copy link
Member

@gafter, The CLI spec explicitly states that native int constants are only allowed to be 32-bit values. So the problem is already solved, you can't have a native int constant with a value of 4294967296.

@tannergooding
Copy link
Member

tannergooding commented Apr 19, 2017

The specific section is (I.12.1.4 CIL instructions and numeric types):

The load constant (ldc.*) instructions are used to load constants of type int32,
int64, float32, or float64. Native size constants (type native int) shall be created by
conversion from int32 (conversion from int64 would not be portable) using conv.i or conv.u.

@tannergooding
Copy link
Member

I would assume that means any native int constant would have to be created as two instructions:

ldc.i4 <num>
conv.i

The language spec would then need to determine whether or not (nint)uint.MaxValue + 1 is folded to be 0 (not portable) or if folding was not done and it was emitted as a constant plus an add:

ldc.i4 4294967295
conv.i
ldc.i4.1
add

It might worth noting that (nint)uint.MaxValue + 1 and (nint)uint.MaxValue + (nint)1 should produce the same result, but the former requires one less instruction (since you do not need to run conv.i on the secondary value).

@sharwell
Copy link
Member

sharwell commented Apr 19, 2017

@gafter If nint is not allowed to be used where a constant expression is required, then it would be an optimization to fold constants anyway.

@MadsTorgersen MadsTorgersen modified the milestone: 7.2 candidate May 16, 2017
@tannergooding
Copy link
Member

That is just an elegance issue, you still can compile for AnyCPU, the exposed structure just isn't as nice to use.

FieldOffset is only required if the type has a union and if it has a union, you can workaround things by declaring an explicit struct that just has the members at offset 0. It certainly isn't as "nice" to use, but it easily solves the problem without needing to worry about cross-compiling.
For anything that doesn't contain a C style union, you shouldn't need to use FieldOffset unless you are using some non-standard over alignment which can't be expressed via Pack=?

That is, for something which has a different offset for 32-bit vs 64-bit:

struct S {
    int32_t x;

    union {
        void*    a;
        intptr_t b;
    };

    int32_t y;
};

You can just define:

public struct S
{
    public int x;
    public AnonymousStruct Anonymous;
    public int y;

    [StructLayout(LayoutKind.Explicit)]
    public unsafe struct AnonymousStruct
    {
        [FieldOffset(0)]
        public void* a;

        [FieldOffset(0)]
        public nint b;
    }
}

You can see (and experiment with) the struct layouts for different struct types here: https://godbolt.org/z/NDsS3h

@pylorak
Copy link

pylorak commented Jun 13, 2020

Yes, this is about unions. While the way you propose works, it stupidifies FieldOffsetAttribute, because your solution is the same as recommending:
"You should only ever use zero as the FieldOffset parameter, otherwise you might be required to compile separate binaries for every architecture you support, depending on the embedding type of your union."

Also, even if you classify this as an elegance issue, what's wrong with that? There are countless language features already in C# that are purely elegance: auto-generated properties, null-conditional operator, discards, local functions, switch expression, and the list goes on for long...
This would also be a feature that simplifies code and improves readability.

@CyrusNajmabadi
Copy link
Member

This would also be a feature that simplifies code and improves readability.

Tanner's code seems very readable to me...

@pylorak
Copy link

pylorak commented Jun 13, 2020

Tanner's code seems very readable to me...

That's only because of the specific example. It is easy to show that in other cases, supporting nint-sized FieldOffsets would result in simpler code. Here's a minimal counterexample:

Tanner's version:

public struct S
{
    public nint a;
    public U u;
}

[StructLayout(LayoutKind.Explicit)]
public struct U
{
    [FieldOffset(0)]
    public int b;
    [FieldOffset(0)]
    public char c;
}

My proposal:

[StructLayout(LayoutKind.Explicit)]
public struct S
{
    [FieldOffset(0)]
    public nint a;
    [FieldOffset(nint.Size)]
    public int b;
    [FieldOffset(nint.Size)]
    public char c;
}

@tannergooding
Copy link
Member

Also, even if you classify this as an elegance issue, what's wrong with that?

Absolutely nothing. It's just important to clarify whether something is truly blocking or if it is mainly a "nice to have".

Function pointers, for example, are the only way to remove allocations and overhead for a good deal of interop code. It is impactful enough that dev's went and did post-compillation IL rewriting or other hacks to workaround the issue.

While adding support for declaring non-constant field offsets is really a nice to have. It doesn't block anything, it just means you have to write more code to achieve the same thing and otherwise the perf, allocations, etc are all the same.
It isn't exclusively a language feature either. You have to get buy off from the both the language team and the runtime team to make it work.

Strictly speaking it also isn't correct. An anonymous struct/union is still a new struct/union and is surfaced as such by the C/C++ AST, even if the members in source appear as though they are part of the containing struct/union. Exposing it as an actual nested struct is the technically correct thing to do, even if in most (or possibly all) scenarios they result in the same ultimate layout.
A more interesting feature, IMO, would be expanding the support for ref returns so that you can return a reference to a field of a struct. This would allow the "correctness" around having the anonymous struct/union be an actual struct/union, would not require runtime support/changes, would allow the ease of use and mutability of the property as if it were a direct member, and is also more generally applicable for other scenarios (such as supporting fixed sized buffers of arbitrary types, value arrays, or ref fields). It, however, requires more thought on how ref lifetimes work as the current rules aren't valid for such a scenario.

Here's a minimal counterexample

At the same time, that doesn't work for more complex scenarios, such as if the nint isn't at offset 0 or if you have multiple nint.

struct S
{
    nint x;

    [FieldOffset(nint.Size)]
    char y;

    // How do you define field offset for this, which will be 8 or 16
    nint z;

    // What about here, which will be 12 or 24
    char w;
}

To cover all applicable scenarios, the runtime either needs to support saying this is the 32-bit field offset and this is the 64-bit or for some way to say "these members are part of union 1" and "these members are part of union 2".
You then get into more complex scenarios with things like C long which is 32-bits on Windows and ptr-bits (same size as nint) on Unix which this wouldn't correctly handle either.

@pylorak
Copy link

pylorak commented Jun 13, 2020

At the same time, that doesn't work for more complex scenarios, such as if the nint isn't at offset 0 or if you have multiple nint.
[...]
To cover all applicable scenarios, the runtime either needs to support saying this is the 32-bit field offset and this is the 64-bit

Actually, it does work and you could express that too very simply with my proposal. With default packing and your previous example:

struct S
{
    [FieldOffset(0)]
    nint x;
    [FieldOffset(nint.Size)]
    char y;
    [FieldOffset(2 * nint.Size)]
    nint z;
    [FieldOffset(3 * nint.Size)]
    char w;
}

Sure, if the struct with the union is complex enough, this can also get ugly. Nobody wants to see an offset specified as [FieldOffset(2*nint.size+sizeof(some-blittable-type))]. In my experience such cases are uncommon though, and the overwhelming majority of the native Windows API would be well-served without having to resolve to complex expressions in FieldOffsets.

Of course, if you ask me, the whole idea of FieldOffset is just an ugly hack from the early days of C# that we have to live with today. Optimally, C# would support real unions like C, even if they'd only serve native-interop scenarios. Of course "support for native unions" is completely off-topic in the issue here for "Native-Sized Number Types", but I also don't think pushing unions even in their own issue is worth it. I'm 100% sure the C#-Team has already discussed extensively years ago whether they want unions in the language, and they obviously arrived at a "NO". So I thought supporting nint-sized FieldOffsets would be a way to fill in the most common gaps while making both sides happy.

@pylorak
Copy link

pylorak commented Jun 13, 2020

I do understand this would need runtime support. But also language support. Obviously here we're talking about design and general acceptance from the language side, as it couples tightly with nint-sized integers.

@tannergooding
Copy link
Member

FieldOffset itself is a magic attribute and doesn't actually get a .custom attribute slot, instead it is encoded inline as part of the field metadata: https://sharplab.io/#v2:EYLgtghgzgLgpgJwDQxASwDYB8ACAGAAhwEYA6AJQFcA7GNMOUgSVsQHsAHAZUQDc0AxnCgBuALAAoSQG0uMBJQEwAMhACebSjAAUqjVoDSaagBNSAUQAeHDILQwAlAF1JOAMwFYCpQS4EA3pIEwQTSAGJocBgmAPIAZnFQcDrEzkEh7gTAavAEvBAYlHDiEgC+QA===

.field [1] public uint8 'value'

Assuming this was done with some other attribute instead (which would likely be a must), you'd end up having to support encoding arbitrary expressions into the metadata to support that mechanism.
It seems "simple" at first, but quickly explodes into a much more complex feature once you examine how things work today and how they might need to work in order for things to be encoded and decoded in a recoverable mechanism.

This is also probably getting out of scope for this issue in particular and I'd recommend opening a new proposal (which would be required anyways) to cover this.
I'd also suggest spending some time looking at how things work today so that the proposal can start off from a standpoint of "here is how it could work and how the runtime would interpret such data".

This might also be more appropriate for the dotnet/runtime repo, as it is primarily a runtime feature. I'm not even sure it is a language feature and is instead more likely a compiler feature (in the same way that FieldOffset is a compiler feature, not a language feature).

@pylorak
Copy link

pylorak commented Jun 13, 2020

Ah, okay, obviously that's an implementation detail that I (as a humble user) had no idea about. I just presented my wish that I thought (purely based on the user-visible code) would integrate most easily with the least amount of changes to the current ecosystem. I see now it's not so simple. Thanks for the pointers, I'll drop the topic here and will think about if and how I could raise the issue the best to the compiler or runtime team.

@CyrusNajmabadi
Copy link
Member

Tanner's version:
...
My proposal:
...

Completely honest here: the former is more readable and clearer to me.

@pylorak
Copy link

pylorak commented Jun 14, 2020

Considering all the feedback I received, I'm withdrawing my proposal to support nint-sized FieldOffsets. After searching around, I've found #687, which addresses all aspects of a preferred solution discussed here:

  • Allows to define unions with less code and no clutter
  • Is purely a language feature, no runtime-support needed
  • Mirrors closely the C/C++ AST
  • Can be used in structures of any complexity
  • Cannot be used in an "incorrect" way such that the resulting binary is made architecture-dependent

@jmaine
Copy link

jmaine commented Jun 18, 2020

Is this going to include half-native int types also?

An example of this is sem_t type defined by headers in some Unix(R) like systems.

@gafter
Copy link
Member Author

gafter commented Jun 18, 2020

This includes nint and nuint in C# 9.0 and nothing else.

@rolfbjarne
Copy link
Member

The spec proposal says:

The identifiers nint and nuint are new contextual keywords that represent native signed and unsigned integer types. The identifiers are only treated as keywords when name lookup does not find a viable result at that program location.

But is there a way to use them if there is a viable result at that program location for another identically named type?

Example:

struct nint {
     public string NoIntegerHere;
}

enum MyEnum : nint
{
    MyValue = 1
}
class MyClass {
    const nint MyConstant = 2;
    nint GetConstant (nint parameter = 3)
    {
         return 123;
    }
}

According to the spec, this wouldn't compile, because the nint references would resolve to the custom nint struct.

Would it be possible to use IntPtr in these scenarios? Or would @nint work?

@PathogenDavid
Copy link

Would it be possible to use IntPtr in these scenarios?

No, IntPtr has the same semantics it had before.

Or would @nint work?

Looking at this test, it's specifically a non-solution. (The test ensures @nint still refers to that class nint.)

This would be backwards of how verbatim identifiers normally work. While maybe useful as a workaround for this issue, I think you end up flipping the issue on its head and you'd break anyone using @nint today (there's no requirement that the @ prefix is only used with language keywords. -- I can definitely see someone writing their code generator to emit all identifiers as as verbatim identifiers out of laziness.)


This is in line with how other contextual keywords work. So the solution is "Don't make a type called nint/nuint." Although I do worry this isn't a viable solution for anyone who might be consuming a library that exposes its own nint type, which I don't think is all that unrealistic or even unreasonable.

Unfortunately as far as I can find, this particular issue was never addressed in the LDM notes.

@rolfbjarne
Copy link
Member

Although I do worry this isn't a viable solution for anyone who might be consuming a library that exposes its own nint type

That includes everybody using Xamarin.iOS or Xamarin.Mac...

@nietras
Copy link

nietras commented Jul 17, 2020

consuming a library that exposes its own nint type, which I don't think is all that unrealistic or even unreasonable.

https://github.com/DotNetCross/NativeInts 🙄 I don't think any actually use though 😅

@PathogenDavid
Copy link

@nietras In the case of that library I'd expect the users to just stop using the library or omit the using DotNetCross.NativeInts. I was referring more to @rolfbjarne's Xamarin.iOS example where there's a type System.nint that is basically impossible to avoid.

@NN---
Copy link

NN--- commented Aug 20, 2020

According to specification there are no native integer MinValue and MaxValue since they are not constant.

Is there any solution for this ?
For instance can they be declared as static readonly or any other class introduced such as NIntLimits ?

@Zenexer
Copy link

Zenexer commented Aug 20, 2020

There are no MinValue or MaxValue fields on nint or nuint because, other than nuint.MinValue, those values cannot be emitted as constants.

That doesn't seem like a great reason to avoid them. They can still be calculated at runtime, but it's a bit tedious, and I can definitely see use cases for wanting to know the minimum and maximum when dealing with native interop.

nuint.MinValue = (nuint)0;
nuint.MaxValue = nuint.MinValue - (nuint)1;
uint.MinValue = (uint)1 << IntPtr.Size;
uint.MaxValue = uint.MinValue - (uint)1;

@jkotas
Copy link
Member

jkotas commented Aug 20, 2020

MinValue/MaxValue were added as static properties to IntPtr/UIntPtr in dotnet/runtime#21943

@NN---
Copy link

NN--- commented Aug 20, 2020

Why they cannot be added to nint and nuint ?
IntPtr.MaxValue has type of IntPtr and not nint.

@jkotas
Copy link
Member

jkotas commented Aug 20, 2020

nint and nuint are backed by IntPtr and UIntPtr. They are not real types. Adding these properties to IntPtr/UIntPtr makes them show up on nint and nuint too.

@jcouv jcouv changed the title Champion "Native-Sized Number Types" Champion "Native-Sized Number Types" (VS 16.8, .NET 5) Sep 1, 2020
@MadsTorgersen MadsTorgersen modified the milestones: 9.0 candidate, 9.0 Sep 9, 2020
@333fred 333fred added the Implemented Needs ECMA Spec This feature has been implemented in C#, but still needs to be merged into the ECMA specification label Oct 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Design Review Implemented Needs ECMA Spec This feature has been implemented in C#, but still needs to be merged into the ECMA specification Proposal champion
Projects
None yet
Development

No branches or pull requests