[API Proposal]: InlineArrayAttribute #61135
Comments
I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label. |
Tagging subscribers to this area: @dotnet/area-system-runtime
Issue Details
Background and motivation
There were multiple proposals and ideas in the past asking for a no-indirection primitive for inline data. Example1(inline strings), Example2(inline arrays), Example3(generalized fixed-buffers)
In the recent hackathon we explored the possibility of introducing a primitive for efficient, type-safe, overrun-safe indexable/sliceable inline data, and it appears to be possible with relatively modest investments in the runtime.
API Proposal
namespace System.Runtime.CompilerServices
{
    [AttributeUsage(AttributeTargets.Struct, AllowMultiple = false)]
    public sealed class InlineArrayAttribute : Attribute
    {
        public InlineArrayAttribute(int length)
        {
            Length = length;
        }

        public int Length { get; }
    }
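}
When the InlineArray attribute is applied to a struct with one instance field, it is interpreted by the runtime as a directive to replicate the layout of the struct Length times. That includes replicating GC tracking information if the struct happens to contain managed pointers.
Unlike the "field0; field1; field2;..." approach, the resulting layout would be guaranteed to have the same order and packing details as elements of an array, with element [0] matching the location of the single specified field.
API Usage
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
// runtime replicates the layout of the struct 42 times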
// Length has no setter, so the length is passed as the constructor argument
[InlineArray(42)]
struct MyArray<T>
{
    private T _element0;

    public Span<T> SliceExample()
    {
        return MemoryMarshal.CreateSpan(ref _element0, 42);
    }
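}
For more details on how this attribute could be used on the Roslyn side, see the Low Level Struct Improvements document.
Alternative Designs
During the hackathon we also explored an approach where the length is specified via a type parameter and arrays of concrete lengths are created by way of generic instantiation: ValueArray<T, Size>.
Such an approach moves some of the effort of creating concrete struct arrays from the user to the runtime.
However, in the current .NET IL/metadata standard, the Size parameter would need to be an integer constant encoded as a type. While such encodings are possible, finding one that everybody could comfortably accept has been an issue that we could not resolve.
Risks
InlineArrayAttribute requires runtime cooperation to function correctly. On downlevel runtimes the attribute will not have any effect, contrary to the assumptions in the code that implements slicing or indexing.
As such, the use of the attribute should be versioned with a mechanism such as runtime feature flags.
FAQ:
Why do we put the attribute on the struct and not on the field?
Allowing the attribute on individual fields introduces numerous additional scenarios and combinations that make the feature considerably more complex.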
All the above issues can be solved, but at a cost to the implementation complexity while most additional scenarios appear to be less common and easily solvable by providing a wrapper struct.
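The layout guarantee described in the API proposal can be checked with a small program. This is only a sketch: it assumes the attribute in the form that later shipped in .NET 8, together with the C# 12 language support for indexing an inline array and converting it to a span. The names Int8Buffer and LayoutDemo are made up for the illustration.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical example type; the name Int8Buffer is not from the proposal.
[InlineArray(8)]
public struct Int8Buffer
{
    private int _element0;
}

public static class LayoutDemo
{
    // Size of the whole aggregate in bytes.
    public static int BufferSize() => Unsafe.SizeOf<Int8Buffer>();

    // Writes through the indexer, then reads the same storage back as a span.
    public static int SumOfSquares()
    {
        Int8Buffer buffer = default;
        for (int i = 0; i < 8; i++)
        {
            buffer[i] = i * i;      // bounds-checked element access
        }

        Span<int> span = buffer;    // conversion provided by the C# 12 compiler (assumed)
        int sum = 0;
        foreach (int value in span)
        {
            sum += value;
        }
        return sum;
    }
}
```

If the runtime honors the attribute, BufferSize() should equal eight times the size of an int, which is the same packing an int[8] would use for its elements.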
|
I think this is ready for API review. |
|
|
I do not think we have a choice. We cannot remove the field that is already there, thus we cannot follow the directive to have a completely empty array.
It must be a struct with a single instance field and must not have an explicit layout.
I am almost 50%/50% on how we handle nonconforming situations: whether we ignore them or throw an exception.
[InlineArray(10)]
struct Tenner<T> where T : class {
    public T element0;
}
How about:
[InlineArray(10)]
struct Tenner<T> {
    public T element0;
}
Yes. |
Why do you have to remove the field? The intent behind 0 length fixed sized buffers in C/C++ is:
We should be able to support and special case both
I think this is better stated as "must be a sequential layout struct with a single instance field", otherwise that leaves room for interpreting "auto layout" as being ok.
I do think this is somewhat unfortunate. There are numerous patterns we could allow, particularly in the numerics space (and therefore with opportunities to improve the ecosystem for ML, Tensors, and Vector acceleration) by actually having this. I thought the idea of using multidimensional array metadata + a special recognized constraint was a clever solution here and would allow this without having to revise IL metadata.
|
I think field layout errors are usually TypeLoadException (TLE) or BadImageFormatException (BIFE) when you actually try to use the type, but I don't think we're totally precise about when the exception is raised. I guess the alternative is that any nonconforming use of the attribute is just ignored? In that case it would be nice to have a C# analyzer flag the bad attribute usage. Unexpected layout in conjunction with pointer-ful code seems like a source of hard-to-diagnose bugs. |
@tannergooding - I think auto layout is ok. Since we require only one instance field, only explicit could be a problem. |
Right. Invalid uses of the attribute should produce TLE. |
I'm a bit skeptical. C# defaults to sequential structs so most users would have to go out of their way to define |
I am ok with TLE when attribute cannot work. I assume |
The intent behind length 0 isn't to remove the field, just to remove it from size calculations when it isn't the only field.
That is, for:
#include <stddef.h>
#include <stdint.h>
struct S0 { };
size_t SizeOf_S0() { return sizeof(S0); } // 1
struct S1A { int32_t x[0]; };
struct S1B { int64_t x[0]; };
size_t SizeOf_S1A() { return sizeof(S1A); } // 1 on Win32; 4 on Win64; 0 on Clang Unix; 1 on GCC Unix
size_t SizeOf_S1B() { return sizeof(S1B); } // 1 on Win32; 8 on Win64; 0 on Clang Unix; 1 on GCC Unix
struct S2A { size_t l; int32_t x[0]; };
struct S2B { size_t l; int64_t x[0]; };
size_t SizeOf_S2A() { return sizeof(S2A); } // 4 on 32-bit; 8 on 64-bit
size_t SizeOf_S2B() { return sizeof(S2B); } // 8 on Win and Unix64; 4 on Unix32
struct S3A { size_t l; int32_t x[1]; };
struct S3B { size_t l; int64_t x[1]; };
size_t SizeOf_S3A() { return sizeof(S3A); } // 8 on 32-bit, 16 on 64-bit
size_t SizeOf_S3B() { return sizeof(S3B); } // 16 on Win and Unix64; 12 on Unix32
There should be nothing preventing the type loader from handling this, other than potential complexity, although I'd expect the complexity isn't that bad. |
And you would like this attribute to behave like clang on Unix for |
Not quite. Clang's behavior here is a bug for The layout algorithm already special cases scenarios where the computed size is We then later take into account the native size being
I expect we will be tracking that it's a That leaves us with:
This ends up:
Given the typical usage scenario for
Which would also end up with:
|
As I understand the suggestion is that I think it would be ok to allow 'Length == 0' on structs with no fields. + the bizarreness with native size computation. Are we sure this is not UB in C? |
The Windows SDK and various GNU/Linux/MacOS native libraries use it fairly extensively, so I'm not really sure it matters if the language says "well defined" or "not" (not quite as much as There are real type definitions in A list of the found 0-length inline arrays in the Windows SDK v10.0.20348.0 (from a basic grep, may not be exhaustive)
|
One part that worries me is that the pattern here of slicing the inline array is hard to write correctly. It's very easy though to write as the example in the issue does and that creates a convenient yet large type safety hole.
Span<int> Oops() {
    MyArray<int> array = default;
    return array.SliceExample();
}
This will compile just fine and it's very unsafe. In general I think if our API reviews need As we look to approve this in API review I think we should also be looking to promoting the correct and safe usage patterns around it. It's tricky though because the nature of this API essentially goes against the protections in the span safety rules. We may need to consider pairing this with a language change. |
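One way the hole described above can be narrowed is sketched here, assuming C# 11 or later, where UnscopedRefAttribute (System.Diagnostics.CodeAnalysis) lets a struct method declare that its result may refer to the instance. The type Int4 and the class LifetimeDemo are hypothetical names used only for this illustration, and this is not necessarily the pattern the thread settled on.

```csharp
using System;
using System.Diagnostics.CodeAnalysis;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

[InlineArray(4)]
public struct Int4
{
    private int _element0;

    // [UnscopedRef] tells the compiler that the returned span may refer to
    // 'this', so callers are expected to be stopped from returning it past
    // the lifetime of the struct it was created from.
    [UnscopedRef]
    public Span<int> AsSpan() => MemoryMarshal.CreateSpan(ref _element0, 4);
}

public static class LifetimeDemo
{
    public static int FillAndSum()
    {
        Int4 buffer = default;
        Span<int> span = buffer.AsSpan();   // span stays local to this method
        for (int i = 0; i < span.Length; i++)
        {
            span[i] = i + 1;
        }

        int sum = 0;
        foreach (int value in span)
        {
            sum += value;
        }

        // Returning 'span' itself from here should be rejected by the compiler,
        // unlike the Oops() example, because AsSpan is marked [UnscopedRef].
        return sum;
    }
}
```

The span is used only while the buffer is alive and only the computed value leaves the method, which is the usage the span safety rules are meant to enforce.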
My assumption was that this is enabling feature for https://github.com/dotnet/csharplang/blob/main/proposals/low-level-struct-improvements.md#safe-fixed-size-buffers and the buffer will be typically accessed by auto-generated accessor that returns |
I agree the primary use would be to enable that feature. I'm also trying to think of the general use case here. Essentially it's an API we're offering and what will be the implications of direct usage? Other features in that proposal, particularly offering the ability to invert our lifetime defaults, will make usage of this safe. I worry a bit though about taking this without the other though because the inclination is likely to be to go for the unsafe uses. |
FWIW, using Spans with fixed buffers has the lifetime type safety hole today. Example from this repo: runtime/src/libraries/Common/src/Interop/Windows/Kernel32/Interop.WIN32_FIND_DATA.cs Lines 25 to 28 in 57bfe47
I agree that we should commit to take both or neither for the release. |
Note: Asking the below because of my own interest in this, but also because I've seen a number of community members and other people interested in this area asking questions (such as on the C# Community Discord) where the answers don't seem to be clear/concise. Could I get a couple questions answered as to how a few things are impacted.
Another question is specifically around the proposed but "we couldn't agree on an approach" for some The approach of allowing numbers as part of a template is a powerful feature and one that is much more broadly applicable/useful to the .NET ecosystem, especially in the numerics and perf oriented domains. Trying to shuffle in the That is, why aren't we using a multidimensional array or even "Size" parameter: #60519 (comment) -- That is, IL already allows |
Regarding this, several approaches for how to encode constant values in generic parameters were discussed in the C# Discord. To support For primitives that fit in 32-bits or less, it might be worth foregoing using a metadata token and using the array length/rank value directly. A runtime-recognized This still requires agreeing on which method is appropriate for storing these values in IL, though. It would be very interesting to hear more about the discussions that occurred on this subject for |
I agree it's much more powerful, it's also much more expensive. That is a significant revamping of the C# and IL type systems that is simply not necessary to solve the set of challenges that we're looking at right now. The current solution allows us to solve a number of outstanding problems such as safe fixed fields and The proposed solution does not preclude expanding to numeric template arguments in the future. Practically any feature we discuss can be done in a more expanded and powerful form so long as we're willing to up the complexity and resource investment.
This is part of the details we still need to work out. The current proposal suggests that the fields are not directly accessible but rather accessible through
By using the types which are annotated with Note: I'm somewhat ambivalent to the implementation here but this solution has a pretty understood mapping to the key C# scenarios we're trying to solve. |
I assume this means the proposal is still approved as-is? Or is more discussion needed and we should reset it to api-needs-work? |
I think it's probably "fine" as-is, but I think there needs to be a FAQ explaining the "why" this route is being done (ideally as part of the top post so it's easily findable). This may particularly be the case given the additional number of hearts/thumbs up on my comment above. Ideally such a FAQ would also clearly cover things like how this will be exposed so that it doesn't prevent migration to some future |
I just have a question on this point specifically. I understand that adding generic constants would be more work, but given that that would completely supersede the proposed |
It would be close to two orders of magnitude more work all up. The throw-away work for |
Just putting a note that by using InlineArray approach you are accumulating tech debt that will need to be paid for eventually. |
@VSadov, there was a small discussion on Could you remind me on what was the outcome of that? I gave some examples on how it is spec'd to work for the Windows vs Linux ABI and listed some examples of Win32 APIs that utilize it. I don't, however, see any follow up on whether we decided to explicitly say that we're going to keep the restriction that |
There was no change for that. The purpose of the attribute is to replicate the storage of the existing field and it will require
|
It won't work "as expected" from an ABI perspective if the runtime doesn't special case it. I think that's fine, just wanted to confirm there was no change and that we're explicitly considering it out of scope. |
Allowing InlineArray to be exposed using In assembly A:
class Foo1
{
    public static void Use(int buffer[4]) { }
}
and in assembly B you want to call
class Foo2
{
    public static void Test(int buffer[4])
    {
        Foo1.Use(buffer); // error: because the buffer[4] here is lowered to a compiler-generated
                          // InlineArray type defined in the assembly B, while the buffer[4] in Foo1.Use is
                          // lowered to a compiler-generated InlineArray type defined in the assembly A,
                          // so, they are not the same type and are not compatible.
    }
}
This is leading to a critical problem for the ABI and type system and is completely unacceptable. Definitely, you can predefine some limited types in the BCL, such as If we intend to allow exposing fixed buffer types in the public API, we must consider a more general solution, for example, supporting integer values in the type parameters so we can make a
// in assembly A
class Foo1
{
    public static void Use(int buffer[4]) { } // lowered to FixedBuffer<int, 4>
}
// in assembly B
class Foo2
{
    public static void Test(int buffer[4]) // lowered to FixedBuffer<int, 4>
    {
        Foo1.Use(buffer); // ok
    }
}
|
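The type identity concern can be reproduced without the proposed int buffer[4] syntax. In this sketch, BufferFromA and BufferFromB are hand-written, hypothetical stand-ins for the compiler-generated types that would live in assemblies A and B. It assumes the attribute and the C# 12 span conversions as shipped in .NET 8.

```csharp
using System;
using System.Runtime.CompilerServices;

// Stand-ins for the compiler-generated types; in the scenario above each
// would be defined in a different assembly. Both names are hypothetical.
[InlineArray(4)]
public struct BufferFromA
{
    private int _element0;
}

[InlineArray(4)]
public struct BufferFromB
{
    private int _element0;
}

public static class IdentityDemo
{
    // Same element type and length, but still two unrelated types.
    public static bool SameType() => typeof(BufferFromA) == typeof(BufferFromB);

    public static bool SameSize() =>
        Unsafe.SizeOf<BufferFromA>() == Unsafe.SizeOf<BufferFromB>();

    // Without a shared type, crossing the boundary means copying the elements.
    public static BufferFromB Copy(ref BufferFromA source)
    {
        BufferFromB target = default;
        ReadOnlySpan<int> from = source;
        Span<int> to = target;
        from.CopyTo(to);
        return target;
    }
}
```

SameType() is false while SameSize() is true, which is the mismatch the comment objects to: the layouts agree but the type system, and therefore reflection, sees two different types.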
If a parameter of a method has an inline buffer type, and the corresponding argument has a different inline buffer type, but the element type and the size are the same, then could the compiler solve the type mismatch by emitting a call to a method like Unsafe.As<TFrom, TTo>(ref TFrom), except modified to verify binary compatibility at JIT compilation time? |
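A rough sketch of the reinterpretation idea follows. The helper Reinterpret is hypothetical: it uses the existing Unsafe.As<TFrom, TTo>(ref TFrom) and adds a run-time size check, which is weaker than the JIT-time verification suggested above because it cannot confirm that the element types match. CallerBuffer and CalleeBuffer are made-up stand-ins for the two compiler-generated types.

```csharp
using System;
using System.Runtime.CompilerServices;

[InlineArray(4)]
public struct CallerBuffer
{
    private int _element0;
}

[InlineArray(4)]
public struct CalleeBuffer
{
    private int _element0;
}

public static class ReinterpretDemo
{
    // Hypothetical helper: views one buffer type as another after checking
    // that both occupy the same number of bytes. It does not (and cannot)
    // verify that the element types are the same.
    public static ref TTo Reinterpret<TFrom, TTo>(ref TFrom source)
        where TFrom : struct
        where TTo : struct
    {
        if (Unsafe.SizeOf<TFrom>() != Unsafe.SizeOf<TTo>())
        {
            throw new InvalidOperationException("Buffer sizes differ.");
        }
        return ref Unsafe.As<TFrom, TTo>(ref source);
    }

    // Stands in for a method compiled against the other assembly's type.
    public static int Use(ref CalleeBuffer buffer) => buffer[0] + buffer[3];
}
```

A call site would look like ReinterpretDemo.Use(ref ReinterpretDemo.Reinterpret<CallerBuffer, CalleeBuffer>(ref buffer)). No copy is made, so writes through the reinterpreted reference are visible to the caller.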
I believe you won't want this workaround. For example, how to handle reflection? Should all existing dependency injection frameworks introduce exceptions for InlineArray types? And not only the runtime, but all existing and new code which needs to resolve types at runtime will need to special-case the InlineArray types. Or you want to special case it in the runtime so that we can pass a value of |
If a method of an interface takes an inline array parameter, and DI injects an implementation of this interface, then the implementing method must already use the same inline array type; otherwise the implementing type could not have been compiled. If a constructor takes an inline array parameter and DI is supposed to provide an inline array instance as an argument, then a conversion could be needed; but that seems an unlikely scenario, as the element type and size of the array would not sufficiently identify the purpose of the instance. If a constructor takes a Func<SomeCustomType, byte[32]> parameter (where SomeCustomType is distinctive enough) and DI is supposed to provide a delegate instance as an argument, then a conversion could be needed; and to implement that conversion, DI might have to wrap the delegate behind another delegate. But the developer can replace the delegate type with an interface and then it's the same as the first case. So no, I don't think DI would need to support the conversion. It would be a bit like casting int[] to uint[], which is supported by the runtime but AFAIK not by DI frameworks.
No, I'm only thinking about special-casing it in the compiler. Although I don't know if such a typecast hack would still cause some problem in the runtime, even when the element type and the size match. |
DI is only a use case of reflection. This behavior will affect all reflection usage. My point is simple: if we can introduce integer const support in generic parameter in IL directly, all those problems are automatically gone and no workaround is needed here. Rust also didn't have const generics a while ago and it used macros to generate types just like what C# is doing now, this approach has been proved by Rust that it's limited and has issues around ABI. |
Did Rust get binary compatibility or source compatibility between the macro-based system and the generics-based system, or do libraries have to choose which system they support in their APIs? Thinking about how, if .NET libraries first deploy InlineArrayAttribute-based APIs and a generics-based alternative becomes available later, the libraries might be unable to migrate because of compatibility requirements. |
We currently don't have the proposed fixed buffer language syntax support so it's okay that the InlineArray is for internal use. But if we expose InlineArray to the public and ship the syntax in the language, it will be an uncorrectable mistake and we will end up living with the issue forever. |
Closing as this was implemented and is in .NET 8 |
Background and motivation
There were multiple proposals and ideas in the past asking for a no-indirection primitive for inline data. Example1(inline strings), Example2(inline arrays), Example3(generalized fixed-buffers)
Our existing offering in this area – unsafe fixed-sized buffers has multiple constraints, in particular it works only with blittable value types and provides no overrun/type safety, which considerably limits its use.
In the recent hackathon we have explored a possibility of introducing a primitive for efficient, type-safe, overrun-safe indexable/sliceable inline data and it appears to be possible with relatively modest investments in the runtime.
API Proposal
When the InlineArray attribute is applied to a struct with one instance field, it is interpreted by the runtime as a directive to replicate the layout of the struct Length times. That includes replicating GC tracking information if the struct happens to contain managed pointers.
Unlike the "field0; field1; field2;..." approach, the resulting layout would be guaranteed to have the same order and packing details as elements of an array, with element [0] matching the location of the single specified field.
That will allow the whole aggregate to be safely indexable/sliceable.
Length must be greater than 0.
The struct must not have explicit layout.
In cases when the attribute cannot have effect, it is an error case handled in the same way as the given platform handles cases when a type layout cannot be constructed.
Generally, it would be a TypeLoadException thrown at the time of layout construction.
API Usage
For more details on how this attribute could be used on the Roslyn side, see the Low Level Struct Improvements document.
Alternative Designs
During the hackathon we also explored an approach where the length is specified via a type parameter and arrays of concrete lengths are created by way of generic instantiation: ValueArray<T, Size>.
Such an approach moves some of the effort of creating concrete struct arrays from the user to the runtime.
However, in the current .NET IL/metadata standard, the Size parameter would need to be an integer constant encoded as a type. While such encodings are possible, finding one that everybody could comfortably accept has been an issue that we could not resolve.
Risks
InlineArrayAttribute requires runtime cooperation to function correctly. On downlevel runtimes the attribute will not have any effect, contrary to the assumptions in the code that implements slicing or indexing.
As such, the use of the attribute should be versioned with a mechanism such as runtime feature flags.
FAQ:
Why do we put the attribute on the struct and not on the field?
Allowing the attribute on individual fields introduces numerous additional scenarios and combinations that make the feature considerably more complex.
All the above issues can be solved, but at a cost to the implementation complexity while most additional scenarios appear to be less common and easily solvable by providing a wrapper struct.