Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get Span view of embedded resource data #26101

Open
yaakov-h opened this issue May 8, 2018 · 44 comments
Open

Get Span view of embedded resource data #26101

yaakov-h opened this issue May 8, 2018 · 44 comments
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Reflection tenet-performance Performance related issue
Milestone

Comments

@yaakov-h
Copy link
Member

yaakov-h commented May 8, 2018

Background and Motivation

I'd like to be able to get a ReadOnlySpan into an Assembly's Embedded Resources, in order to minimise memory allocations and copying when dealing with such resources.

At the moment the only way to get at an Embedded Resource is either via the Stream APIs, or by using unsafe code to read from the UnmanagedMemoryStream's pointer.

Proposed API

namespace System.Reflection
{
    public class Assembly
    {
        // ...
        public virtual Stream GetManifestResourceStream(string name);
        public virtual Stream GetManifestResourceStream(Type type, string name);
+       public virtual ReadOnlySpan<byte> GetManifestResourceSpan(string name);
+       public virtual ReadOnlySpan<byte> GetManifestResourceSpan(Type type, string name);
        // ...
    }

Usage Examples

Usage mirrors existing GetManifestResourceStream use, but with spans:

ReadOnlySpan<byte> ros1 = GetType().Assembly.GetManifestResourceSpan("Namespace.ResourceName.txt");
// do something with ros1
ReadOnlySpan<byte> ros2 = typeof(X).Assembly.GetManifestResourceSpan(typeof(X), "ResourceName.txt");
// do something with ros2

Alternative Designs

An alternative was discussed below where the ReadOnlySpan could be created from the UnmanagedMemoryStream, but this was deemed to be unsafe as the pointer from the UnmanagedMemoryStream does not contain a live reference to the assembly. If a Span is created directly from the pointer and then the assembly is unloaded, the application can crash when accessing the Span.

Risks

None known.


Original post

Hi,

Would it be possible to add an API to Assembly that allows applications to get a ReadOnlySpan<> view of an embedded resource, rather than a stream?

If this has been discussed before, please just point me at that issue.

Thanks.

@ghost
Copy link

ghost commented May 10, 2018

What is the scenario that is driving this?

I have several concerns about this:

  1. Assembly is a user-subclassable type. Any time you add an api that cannot have a default implemention based on existing members, you're creating another api whose functionality depends on implementation types that we don't control.

  2. On UWP scenarios, the resource bytes are in files, not in memory. An api returning ReadOnlySpan would have to read the entire resource into memory. In another words, a giant allocation which goes against the implicit promise of a Span returning api.

  3. Resources are not actually part of metadata and ideally are not the responsibility of Reflection. While we've had to carry over this burden from .NETFX, I'm not eager to add to it.

@jkotas
Copy link
Member

jkotas commented May 10, 2018

An api returning ReadOnlySpan would have to read the entire resource into memory.

It does not need to read it. It can just map the file.

There is a ton of code out there that expects the Stream returned by GetManifestResourceStream to be UnmanagedMemoryStream, gets a pointer out of it and parties on it. I do not think we would be ever able to change this implementation detail.

I see this suggestion as a desire to make this flow more Span friendly. I agree that adding Span returning GetResource method on Assembly does not sound right. Maybe adding Span returning property on UnmanagedMemoryStream would help?

@ghost ghost changed the title Get Span view of Embedded Resource Get Span view of UnmanagedMemoryStream May 10, 2018
@ghost
Copy link

ghost commented May 10, 2018

It can just map the file

In order to have uniform API between UWP and non-UWP, does it make sense to propose MemoryMappedFile MemoryMappedFile.CreateFromMemory (in addition to CreateFromFile) API addition?

@ghost
Copy link

ghost commented May 10, 2018

Let's pursue this as a potential UnmanagedMemoryStream property then. (a CreateFromMemory should probably be tracked separately.)

Though it appears, the flow can already be simplified by a simple user extension method:

ReadOnlySpan<byte> GetManifestResourceStreamAsSpan(this Assembly a, string name)
{
     UnmanagedMemoryStream s = (UnmanagedMemoryStream)a.GetManifestResourceStream(name);
     return new ReadOnlySpan<byte>(s.PositionPointer, checked((int)s.Length)));
}

@panost
Copy link

panost commented Dec 28, 2018

@atsushikan
The UnmanagedMemoryStream should be disposed, according to this

so the extension method should be

        public static unsafe ReadOnlySpan<byte> GetManifestResource(this Assembly a, string name) {
            using(var s = (UnmanagedMemoryStream)a.GetManifestResourceStream(name)) {
                return new ReadOnlySpan<byte>( s.PositionPointer, checked((int)s.Length) );
            }
        }

@stephentoub
Copy link
Member

according to this

That comment is stale or wrong. Stream.Dispose(bool) is a nop:
https://source.dot.net/#System.Private.CoreLib/shared/System/IO/Stream.cs,230

@GrabYourPitchforks
Copy link
Member

This hasn't been touched in over a year. What's the actual API suggestion here?

@panost
Copy link

panost commented Jan 29, 2020

@GrabYourPitchforks To add a method to the assembly that returns directly a ReadOnlySpan of a Manifest Resource, without using hacks (or creating UnmanagedMemoryStream), but most importantly safe, so you dont have to mark your project as unsafe, just for using the proposed hack

@GrabYourPitchforks
Copy link
Member

but most importantly safe, so you dont have to mark your project as unsafe, just for using the proposed hack

I don't know if it's possible to do this "safely". Imagine a scenario where an assembly is loaded, the resource is returned as a ReadOnlySpan<byte>, then the assembly is unloaded. Attempting to access the ReadOnlySpan<byte> after this point could have undefined behavior, including AVing the process. There might be some potential workarounds we could do from the runtime, such as forbidding unload of the assemblies once a resource stream has been projected to a span, but these workarounds might not be desirable in all situations.

In order to move the proposal forward somebody (anybody, really) would need to write down a proposed API signature and the behaviors that we'd expect that API to have. Once that's down we can move this forward to the next step.

@brian-reichle
Copy link

@GrabYourPitchforks What if the method returned an IMemoryOwner<byte> which kept the assembly loaded at least until disposed?

@GrabYourPitchforks
Copy link
Member

Neither Memory<T> nor Span<T> is "safe" when pointer-backed. The reason UnmanagedMemoryStream is "safe" (aside from the scenario where you write unsafe code to call the pointer getter manually) is that all read / write operations go through a memcpy operation, so they can be guarded appropriately. Such a guard cannot be implemented if the consumer is given a raw projection of the backing memory.

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@maryamariyan maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 23, 2020
@JeremyKuhne JeremyKuhne removed the untriaged New issue has not been triaged by the area owner label Mar 3, 2020
@KevinCathcart
Copy link
Contributor

@GrabYourPitchforks said:

Neither Memory<T> nor Span<T> is "safe" when pointer-backed. The reason UnmanagedMemoryStream is "safe" (aside from the scenario where you write unsafe code to call the pointer getter manually) is that all read / write operations go through a memcpy operation, so they can be guarded appropriately. Such a guard cannot be implemented if the consumer is given a raw projection of the backing memory.

Unfortunately, your statement that all read / write operations go through a memcpy operation for pointer based UnmanagedMemoryStream, is not true (Specifically the ReadByte method uses the pointer directly to read the byte, and does not use memcpy/memmove).

Also the memcopy does not seem to actually be preventing access violations in the first place. Sure perhaps it would be possible to add checks for that, but right now it does not seem to be doing anything of the sort.

This has also led me to discover a way to crash .NET 5 with an AV without using unsafe code, and without framework methods that are obviously unsafe (Marshal classes, and or Unsafe class, etc). This is by getting the framework to return an UnmanagedMemoryStream pointing into an assembly that gets unloaded. This was fixed for RuntimeAssembly.GetManifestResourceStream back in #22925, by making the stream reference the RuntimeAssembly. But it was not fixed for ResourceManager.GetStream, when returning the default version of a localizable resource, which then points inside the already mapped dll file. I've filed #52061 for this part.


Back to the current issue: I think the bottom line for this issue is that pointer based Span cannot keep an assembly alive via a pointer into the memory mapped PE file. Technically Memory could keep the assembly alive via a custom IMemoryOwner, but that would be unsafe, as nothing will ensure the spans created from It's .Span property won't outlive the Memory instance. If one really needs to to avoid an extra copy while reading a manifest resource using Span APIs, then something like the following is actually safe, despite the use of unsafe, and is probably the best that would actually happen:

var s = (UnmanagedMemoryStream)a.GetManifestResourceStream(name))
unsafe {
var span =ReadOnlySpan<byte>( s.PositionPointer, checked((int)s.Length) );
//use span here. in a way that it cannot outlive this method.
}
GC.KeepAlive(s); // ensure the stream remains alive. 

I really want to say that a using block could function as an alternative to GC.KeepAlive, but I'm not 100% sure that a combination of aggressive inlining (allowing the exact stream type to be discovered), de-virtualization, and escape analysis to conclude that the field writes in dispose are safe to omit (as the object cannot be visible to other threads), allowing the lifetime to end earlier. This would obviously be more aggressive than the current JIT, but potentially a sufficiently aggressive AOT compiler (think a .NET Native like heavily optimizing AOT compiler) might do something like that, unless I'm missing something. The fact that the ECMA spec severely underspecifies what optimizations are legal does not help this analysis.

@panost
Copy link

panost commented May 3, 2021

I am still using embedded resources instead of inline them in the source code (a large byte array with some thousands lines of source code). It might be stupid or pointless, but I have the feeling that they slow down the compilation and they mess with the version control.

I stopped commenting on this issue because I realized after the response of @GrabYourPitchforks that I do not know many low level details. It is not clear to me the nature of the pointer returned by the UnmanagedMemoryStream. It is a read only pointer to the memory mapped PE file ? Is that the case also in Linux and the other platforms? I did try to follow the source code to find answers for this, but it was hard to me.

Also reading the source code of the net core, I see several examples (ie here) where it seems that in-lining a large resource and getting a ReadOnlySpan<byte> is the preferred method.

How this pointer differs from the pointer returned by the UnmanagedMemoryStream, is still unclear to me.
Also the argument that the assembly might be unloaded, is a valid point in .net core (no AppDomains here) and that, affects the GetManifestResourceStream pointers but not the pointers obtained by this new trick

private static ReadOnlySpan<byte> CategoryCasingLevel1Index => new byte[2176]

@GrabYourPitchforks
Copy link
Member

The compiler and runtime together ensure that the syntax ReadOnlySpan<byte> span = new byte[...]; will keep the assembly from being unloaded. The GC essentially sees the ROS<byte> as a live reference to the assembly, so you can think of this as equivalent to "we won't let a reachable object be collected." It is completely safe from the consumer's perspective.

Unfortunately there is no generalized way to accomplish the same thing for an arbitrary UnmanagedMemoryStream instance. The closest you can get is to call UnmanagedMemoryStream.PositionPointer to get the byte* and to call new ReadOnlySpan<byte>(byte* ptr, int length), but you are manually responsible for keeping the underlying stream alive. The runtime cannot guarantee that the GC will see the resulting ROS<byte> as a live reference to the stream, and it will not be able to detect if you've disposed of the stream but are still holding on to the ROS<byte> and later try to dereference it.

Note: as an implementation detail, the particular stream returned by GetManifestResourceStream might be able to make stronger statements about lifetime management. But that guarantee would absolutely not extend to arbitrary UnmanagedMemoryStream instances, which means that exposing a span property on UnmanagedMemoryStream would undoubtedly lead to a pit of failure.

If the very particular scenario is "I want to be able to read an embedded resource as a ROS<byte>," then propose an API specifically for that scenario. Something like Assembly.GetManifestResourceSpan(string resourceName) : ReadOnlySpan<byte>. Trying to slap this behavior on to UnmanagedMemoryStream (which GetManifestResourceStream doesn't even guarantee as its return value!) seems like a losing proposition.

@KevinCathcart
Copy link
Contributor

Ah, from the sounds of things (per #40346), the entirely PE file gets treated specially for spans pointing into them. That should make such a Assembly.GetManifestResourceSpan(string resourceName) : ReadOnlySpan<byte> method rather feasible to implement.

Ideally there would a similar API for localizable resources accessed via ResourceManager, too. I'm not sure what name makes sense there. Currently GetStream is used. GetSpan would be misleading, because it only works on resources of "stream" type.

For the case where the resource has fallen back to the main assembly it works exactly the same was as ManifestResourceSpan would. For the satellite assembly case, the code already reads into an array. For the stream version it pins the array, since GetStream requires returning an UnmanagedMemoryStream. That code could be simplified by using the pinned object heap for this array, but I'm not sure if these streams typically stick around long enough for that to make sense. In any case since the underlying memory is a managed array, returning a ReadOnlySpan for those is not a problem at all.

@yaakov-h
Copy link
Member Author

yaakov-h commented May 4, 2021

If the very particular scenario is "I want to be able to read an embedded resource as a ROS," then propose an API specifically for that scenario.

Well that's what I started off by asking for, until ghost (whoever that was, I don't recall) redirect the discussion towards the already-existing UnmanagedMemoryStream object and renamed this issue...

@KevinCathcart
Copy link
Contributor

Its your issue, so I'm pretty sure you should be able to rename it back. And we have now determined this is actually implementable and safely, which is also good. This also only became obviously possible to implement safely when the byte array case become safe for the unload case back in August.

Jan favored the something like a property on UnmanagedMemoryStream, but Levi has pointed out that is unsafe.

I'm not sure if there is a better place for the method to live than Assembly. I mean RuntimeAssembly would be correct, but that is internal by design. I think proposing it as a method on Assembly, and letting FXDC decide if there is some better place for it is the best we can do unless somebody else comes along with a clever idea.

So creating a formal API proposal sounds like the next step. You can update your first post to use the template from https://github.com/dotnet/runtime/issues/new?template=02_api_proposal.md to make this easy for the reviewers. If you do that this can potentially move forward. For an example of another simple API proposal that add members to an existing type, see #49407.

Hope this helps.

@yaakov-h yaakov-h changed the title Get Span view of UnmanagedMemoryStream Get Span view of embedded resource data May 4, 2021
@yaakov-h
Copy link
Member Author

yaakov-h commented May 4, 2021

Done and done.

@yaakov-h
Copy link
Member Author

yaakov-h commented May 4, 2021

Potentially the name parameter(s) could be ReadOnlySpan<char> rather than string but I don't personally see the need for it.

@jkotas
Copy link
Member

jkotas commented May 4, 2021

Jan favored the something like a property on UnmanagedMemoryStream, but Levi has pointed out that is unsafe.

I do not think it is worth it to add a new Span returning method to Assembly for this scenario. I think that doing nothing and recommending that anybody who needs this writes a bit of unsafe code is better than adding a new method to Assembly.

Note that Span returning virtual method on Assembly would not solve the scenarios where the data are needed more than once: The slow lookup of the data stream by name would have to be done each time the data is needed; or the code would have to cache the stream and use the unsafe code to convert it to Span.

@KevinCathcart
Copy link
Contributor

KevinCathcart commented May 4, 2021

Ok, but now that we understand that it is possible to implement such a feature safely in some cases, but it is not possible to do so for a general UnmanagedMemoryStream, there are potentially more options. Both ManifestResourceStream and PinnedBufferMemoryStream (and WPF has its own version of the latter) are safe to return spans from, which means we have a situation in which it is not uncommon for this to be safe, but it is not always safe. So a TryGetSpan on UnmanagedMemoryStream, that returns false when it is not safe is one option.

The other two subclasses in the framework at the moment are safe if you keep the stream alive for the duration of using Span.

Of the uses of just plain UnmanagedMemoryStream, we have two right now that can destabilize the runtime, even when only used as a stream. (Amusingly both of which would be completely safe if exposed as a span.) The last usage I'm not entirely sure about, but I'd not be shocked if it was also at risk of causing access violations.

So to summarize:

  • 2 (+1 variant) subclasses that could safely expose span. These seem like they could be used nicely with a TryGetSpan() that only returns true when safe.
  • some more subclasses that cannot do so safely, but can be used as spans if you keep the streams alive. (could be candidates for a DangerousGetSpan()).
  • 2 non-subclasses uses of UnmanagedMemoryStream that can definitely crash the runtime even when used only as streams due to assemblies getting unloaded. (Once fixed, these would be in the "can safely expose spans category".)
  • 1 use that I'm not sure about, but am concerned is at risk of crashing the runtime even when used as a stream, since It is not obvious that the memory will always outlive the stream.

@panost
Copy link

panost commented May 4, 2021

How we returned to streams?
If the PE image is memory mapped, then a resource (as any other object of the PE file) has a starting address and a length.
Using just that, the runtime found a safe way to return that pointer to in-lined byte arrays, as we see above.

Can use the same trick, to return other objects?
For example is this feasible ?

private static ReadOnlySpan<byte> CategoryCasingLevel1Index => Resources.System.Globalization.CategoryCasingLevel1.bin;

where Resources is a pseudo static class that has every embedded resource of the current assembly and "System.Globalization.CategoryCasingLevel1.bin" denotes an embedded resource with the file name "CategoryCasingLevel1.bin" in the /System/Globalization folder.

Also, no wasted search time to find a named resource, since all those pointers can be resolved once, at mapping time.

@KevinCathcart
Copy link
Contributor

KevinCathcart commented May 4, 2021

That approach would basically be a C Sharp compiler feature. The compiler certainly could do something like that, where it generates an RVA static that points into the bytes of the embedded resource. This might be slightly risky if anything tried to edit the assembly afterwards, since it might not understand RVA statics pointing into .mresource data, but that ought to be possible to overcome for things like the trimmer. I’m not certain if it is possible to make that safe for an Ildasm/Ilasm round trip.

Also, starting the pseudo class name “Resources” is basically guaranteed to clash with the code behind helper for the resx file used by the resources tab in the project level property page. It also just generally gives the impression of being related to “resources” which annoyingly is the complete official name of the localizable resources generated from resx files, rather than being about manifest resources. (Which is what a build action of “Embedded Resource” results in).

A different alternative is not to have compiler magic, but to have a code generator that can provide a similar experience. The problem with that though, is that at best it could implement the unsafe code pattern for you. It could cache the pointer and length safely though, since by being embedded in the same assembly, it cannot be accessed unless the assembly is reachable, and thus there is no risk that the cached pointer is stale, since it property that reads from it would go away as part of assembly unloading.

Such a code generator would also rely on the fact that right now the runtime implements the whole PE file as a valid target for interior pointers (for assemblies loaded from disk, as opposed to dynamically generated ones). While I think it is fine for System.Private.CoreLib to to rely on such details, I’m not sure it is reasonable to generate code into user assemblies that rely on such runtime details.

@GrabYourPitchforks
Copy link
Member

GrabYourPitchforks commented May 4, 2021

Jan had a good comment about caching (see #26101 (comment)). Assume for the sake of argument that we want to add a new API to Assembly. Would it make sense to return an object instead of a span and to have that object fully responsible for lifetime management?

public class Assembly
{
    public virtual EmbeddedResourceInfo? GetEmbeddedResourceInfo(string resourceName);
}

public abstract class EmbeddedResourceInfo : IDisposable
{
    public abstract Stream GetResourceStream(); // creates a new Stream instance on each call
    public abstract ReadOnlySpan<byte> GetResourceContents();
}

The EmbeddedResourceInfo type could be cacheable by the caller. I don't expect the Assembly-derived class to cache these instances.

For a RuntimeAssembly, the internal EmbeddedResourceInfo-derived class could forward to Assembly.GetManifestResourceStream() and could also make sure that the returned ROS<byte> points to within the assembly image itself, which would keep the GC from unloading it. For any other Assembly, we provide a default implementation which turns the embedded resource into a byte[] and returns a span into that array. That way, the span returned by GetResourceContents will always be valid, even if somebody called EmbeddedResourceInfo.Dispose or tried to unload the assembly.

(EmbeddedResourceInfo.Dispose wouldn't actually dispose of any unmanaged resources. The only thing it would do is break the link between the ERI instance and the Assembly, so holding on to the ERI wouldn't prevent the Assembly from being unloaded. I'm not sold on this interface even being necessary to be honest.)

@jkotas
Copy link
Member

jkotas commented May 4, 2021

Or make the existing ManifestResourceStream type public and expose the Span returning method that way.

public class ManifestResourceStream : UnmanagedMemoryStream
{
    public ReadOnlySpan<byte> GetContents();
}

Example of use: ReadOnlySpan<byte> span = ((ManifestResourceStream)a.GetManifestResourceStream(name)).GetContents()

@GrabYourPitchforks
Copy link
Member

@jkotas Do you know if Mono also guarantees that the returned unmanaged memory stream points to a GC-trackable region of the image?

@jkotas
Copy link
Member

jkotas commented May 4, 2021

Mono does not support unloadable code today.

@KevinCathcart
Copy link
Contributor

KevinCathcart commented May 4, 2021

#40394 (despite its title) tracks the various known cases that must keep an assembly alive that we don't yet have tests in the test suite for for, so that when mono implements that, they can ensure compatibility. So if we take advantage of this behavior we should document it over there.

@GrabYourPitchforks
Copy link
Member

If we were to expose a new type for this, should we name it ModuleImageMemoryStream? The name indicates "this is a stream over an in-memory module image." We can also expose the Module instance as a first-class property on the type. The behavior would be that any reference to the stream (or a span retrieved from it) will keep the underlying Module alive. Dropping "manifest resource" from the name also allows us to repurpose this stream type for later non-resource use if we so choose.

@xoofx
Copy link
Member

xoofx commented Dec 20, 2022

Hey stumbled on this issue, as I was looking for a code generator based solution that could bake the resource into the PE file directly for NativeAOT scenarios and this library EmbeddingResourceCSharp does the job nicely.
Though, not for the dotnet/runtime, but it would be great to have this built-in in C# 12+!

@ghost
Copy link

ghost commented Dec 20, 2022

Tagging subscribers to this area: @dotnet/area-system-reflection
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and Motivation

I'd like to be able to get a ReadOnlySpan into an Assembly's Embedded Resources, in order to minimise memory allocations and copying when dealing with such resources.

At the moment the only way to get at an Embedded Resource is either via the Stream APIs, or by using unsafe code to read from the UnmanagedMemoryStream's pointer.

Proposed API

namespace System.Reflection
{
    public class Assembly
    {
        // ...
        public virtual Stream GetManifestResourceStream(string name);
        public virtual Stream GetManifestResourceStream(Type type, string name);
+       public virtual ReadOnlySpan<byte> GetManifestResourceSpan(string name);
+       public virtual ReadOnlySpan<byte> GetManifestResourceSpan(Type type, string name);
        // ...
    }

Usage Examples

Usage mirrors existing GetManifestResourceStream use, but with spans:

ReadOnlySpan<byte> ros1 = GetType().Assembly.GetManifestResourceSpan("Namespace.ResourceName.txt");
// do something with ros1
ReadOnlySpan<byte> ros2 = typeof(X).Assembly.GetManifestResourceSpan(typeof(X), "ResourceName.txt");
// do something with ros2

Alternative Designs

An alternative was discussed below where the ReadOnlySpan could be created from the UnmanagedMemoryStream, but this was deemed to be unsafe as the pointer from the UnmanagedMemoryStream does not contain a live reference to the assembly. If a Span is created directly from the pointer and then the assembly is unloaded, the application can crash when accessing the Span.

Risks

None known.


Original post

Hi,

Would it be possible to add an API to Assembly that allows applications to get a ReadOnlySpan<> view of an embedded resource, rather than a stream?

If this has been discussed before, please just point me at that issue.

Thanks.

Author: yaakov-h
Assignees: -
Labels:

api-suggestion, area-System.Reflection

Milestone: Future

@DeafMan1983
Copy link

DeafMan1983 commented Oct 17, 2023

@xoofx great idea but problem with byte or sbyte for ClangSharpPInvokeGenerator

Example embedded texture for DeafMan1983.Interop.SDL2

       [EmbedResourceCSharp.FolderEmbed("../data/", "*.png")]
        private static partial System.ReadOnlySpan<sbyte> GetResouceFileContent(System.ReadOnlySpan<char> path);
        public static void Main()
        {
            var tex_data = GetResouceFileContent("data/tex_data.png");
            ...
            SDL_Texture* tex_data_tex
            fixed (sbyte* tex_data_ptrs = tex_data)
            {
                tex_data_tex = IMG_LoadTexture_RW(..., SDL_RWFromMen(tex_data_ptrs, ...), ..);
            }
            ...
        }

Is it correct or wrongly? But my library DeafMan1983.Interop.SDL2 is working in progress.

@steveharter steveharter added the tenet-performance Performance related issue label Oct 18, 2023
@steveharter
Copy link
Member

The code-gen approaches mentioned are a feasible approach for these high-performance cases.

I don't see how the ability to return the raw backing memory of the assembly file on disk will work with NativeAOT and trimming.

Closing since this issue is > 5 years old with no concrete proposal that addresses the original ask (safe; no Streams; no cache needed; need to expose raw memory safely).

@pentp
Copy link
Contributor

pentp commented Nov 3, 2023

I don't understand the reasons for closing this - there is a concrete API proposal (although no full agreement on that and multiple alternatives also).

How is having a span directly to the embedded resource in the assembly fundamentally different from having a span to a .field/.data? Why would NativeAOT have any trouble with that? AFAIK the open questions here have been about unloadable assemblies and I wouldn't really mind if this API throwed for unloadable assemblies for example.

The fact that this is 5 years old just means that it hasn't been resolved in 5 years, not a reason for closing this.
People have used hacks (UnmanagedMemoryStream.PositionPointer) to work around this issue for 5 years and will continue to do so until a proper API is implemented.

@xoofx
Copy link
Member

xoofx commented Nov 3, 2023

I don't understand the reasons for closing this - there is a concrete API proposal (although no full agreement on that and multiple alternatives also).

What is not working with EmbeddingResourceCSharp for your use case?

@pentp
Copy link
Contributor

pentp commented Nov 3, 2023

I'm not going to add an additional package that generates an inefficient solution to replace a 3 line hack (using UnmanagedMemoryStream.PositionPointer) that already works. I'm just asking for a proper API so that this hack wouldn't be needed.

@xoofx
Copy link
Member

xoofx commented Nov 3, 2023

I'm not going to add an additional package that generates an inefficient solution to replace a 3 line hack (using UnmanagedMemoryStream.PositionPointer) that already works. I'm just asking for a proper API so that this hack wouldn't be needed.

The package is only used at compile time, it doesn't flow at runtime. Also, not sure to follow what do you mean by inefficient solution? The ROS<byte> static property is the fastest solution, doesn't allocate any object, is trimmable/compatible with NativeAOT, resolves directly to a constant address at JIT/AOT compilation time when using it, which is definitely not the case for GetManifestResourceStream + UnmanagedMemoryStream.

@DeafMan1983
Copy link

DeafMan1983 commented Nov 3, 2023

I tried file with EmbedResouceCSharp and it works fine for me because I have tested with Half-Life sky inside my SDL2 Wrapper and it works fine in NativeAot because I already pack to native executable. Look example:
image
That's using data for EmbedResouceCSharp. And NativeAot is okay. You understand that 💯

@steveharter
Copy link
Member

I don't understand the reasons for closing this - there is a concrete API proposal (although no full agreement on that and multiple alternatives also).

I appreciate the feedback here about closing since there has been little activity or progress in the last year, and there is still not an API provided that addresses the requirements along with assembly unloading concern. I'll re-open the issue here for discussion purposes.

Why would NativeAOT have any trouble with that?

I was thinking about trimming in general. If there's not a reference to either a generated resource property name or a resource name as a literal string passed to well-known method(s), then the linker\trimmer would (or could) trim the resource since it wouldn't detect usage of it. However, at this time I don't think resources are trimmed but I can see wanting this in the future. In any case, a new API should consider being trimmer-friendly in these regards.

So moving forward, IMO a nice approach would be to leverage the C# work to reference raw memory which supports trimming:

by creating an RVA static field for each embedded resource. This was also mentioned above in #26101 (comment) and #26101 (comment) and also has the advantage of not having to scan for a resource name.

A RVA static field would work nicely with source generation of binary resources, although I imagine huge resources might slow down compilation time.

@steveharter steveharter reopened this Nov 3, 2023
@DeafMan1983
Copy link

DeafMan1983 commented Nov 3, 2023

@steveharter why do you open issue? Check my proof of my picture that's embedding tga file inside data as byte[] ( from Span<byte> and I have written in fixed statement with embedded data (tga) byte array then I have tested dotnet build and dotnet run and output directory doesn't have directory data/desertbk.tga ( Half-Life's Sky ) and my app runs fine without errors.

@steveharter
Copy link
Member

I re-opened this issue since I think there is a path forward with a built-in source-gen feature and to continue discussion on that. Typically, we don't have discussions on closed issues.

@yaakov-h
Copy link
Member Author

yaakov-h commented Nov 4, 2023

A built-in source generator for files that are already marked as embedded resources sounds like a much more intuitive system than the third-party source generator referenced here, which (as far as I can tell) requires adding an attribute to C# source code with a relative file path that isn't necessarily included in any csproj.

@DeafMan1983
Copy link

DeafMan1983 commented Nov 4, 2023

@yaakov-h that's correct like I tell about packing to native aot executable. Embedded resources load inside in native executable like read data.

Native executable means after dotnet publish -c Release -r <rid> -p:PublishAot=true -p:StripSymbols=true --self-contained=true

If you want know to load embedded resources like texture or picture from native library.

You can try out AppWithPlugin for NativeAot.

That's all. But I never embed resources in native library. We could test with dotnet's native library ( static or shared )

// UPDATE:
Great news: I have tested with load image from native library = It works fine like in C/C++.

image

See my new repository!

@PJB3005
Copy link
Contributor

PJB3005 commented Apr 30, 2024

The existing source-gen as shown, which AFAICT just emits a public static ReadOnlySpan<byte> Data => new byte[] { ... }, would be extremely bad to have in the runtime. Embedding large binary files would nuke compile performance easily. This really is not a path forward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Reflection tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests