Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unused generic type parameter should not cause loader failure #6924

Closed
gafter opened this issue Nov 2, 2016 · 50 comments · Fixed by #83995
Closed

Unused generic type parameter should not cause loader failure #6924

gafter opened this issue Nov 2, 2016 · 50 comments · Fixed by #83995

Comments

@gafter
Copy link
Member

gafter commented Nov 2, 2016

Consider the following code:

static class Program
{
    static void Main()
    {
        var m = new M();
    }
}

struct N<T> { }
struct M { public N<M> E; }

This code will compile with C# 6.0 and is legal according to the CLI spec. The type definition is recursive but the layout of the struct is not because the field involved here is an empty struct. The CLR is unable to handle this though and fails at runtime with a TypeLoadException:

'M:Program.Main' failed: Could not load type 'M' from assembly 'ConsoleApplication1, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null'.
System.TypeLoadException: Could not load type 'M' from assembly 'ConsoleApplication1, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null'.
at Program.Main()

See also #5479

@kouvel kouvel removed their assignment Oct 31, 2017
@MartinJohns
Copy link

Stumbled over exactly this issue the last days. 🙁 Really thought it would work, and it would have been a nice complement to the new System.Memory<T> type.

@verelpode
Copy link

verelpode commented Aug 9, 2018

because the field involved here is an empty struct.

People might reply, "True but nobody needs an empty struct in practice", therefore here is an example that throws TypeLoadException with a non-empty struct. i.e. a more severe example of the failure.

class Program
{
	static void Main(string[] args)
	{
		Test();
	}
	static SNode Test()
	{
		return new SNode();
	}
	struct ArrayElementReference<T>
	{
		public T[] Array;
		public int Index;
	}
	struct SNode
	{
		ArrayElementReference<SNode> x;
		int y;
	}
} 

@improbable-nickkrempel
Copy link

Suggest changing the title from "Unused generic type parameter ..." to "Self-referencing generic type parameter ...", as it's not related to whether the type parameter is used or unused - see @verelpode's example above, and also https://github.com/dotnet/coreclr/issues/20220, which now seems to be a duplicate of this.

@Korporal
Copy link

Korporal commented May 3, 2019

This issue becomes less academic and much more significant when one begins to leverage recently added support for generic pointers, the unmanaged generic constraint and ref.

Here is a concrete and realistic example (which comes from actual code I've developed) dotnet/roslyn#35324.

@Korporal
Copy link

Korporal commented Jun 1, 2019

Still no progress on this bug. I can only assume that a few users are raising this, because a truly fundamental failure to process completely legal generated code strikes me as something that would get a little more attention.

@GSPP
Copy link

GSPP commented Jun 1, 2019

@Korporal the 3.0 release is getting closer and the team is becoming more selective in what issues to take on. That might be the reason. (I'm not a team member.)

@karelz
Copy link
Member

karelz commented Jun 5, 2019

Also, this issue was reported by 3 people so far. So while your realistic example may be bad experience (I didn't check it yet), the reality is not too many people hit it yet, which means lower priority. Hardly something as must-have for 3.0 (just my opinion).

@Korporal
Copy link

Korporal commented Jun 5, 2019

It's been around for three and a half year though!

@gafter
Copy link
Member Author

gafter commented Jun 6, 2019

the reality is not too many people hit it yet

We've had to work around this issue multiple times in Roslyn. Did you need separate reports each time we hit this to realize that it is a persistent problem?

@Korporal
Copy link

Korporal commented Jun 19, 2019

@karelz I understand this seems to be relativly small impact but it is very fundamental in nature, a very foundational capability. In my case I'm working on a C# rewrite of a custom memory allocator that sits on top of an application's unmanaged memory space. Recent advances to C# in areas like type constraints and the ref keyword have made this possible. However this bug is a solid roadblock because we have a generic struct type SmartPointer<T> which supports lists and trees that cannot be implemented because code compiles but fails at runtime.

The struct type SmartPointer<T> when being used in a list, appears as a member field in a struct type T that can "point" to another instance of a T, this is where the recursive reference comes up. For reasons I can't go into here, we cannot use a conventional unsafe pointer.

(Just to be clear these struct instances are not in managed memory but allocated from unmanaged area of a process's addess space using the custom allocator).

Is there any possibility the bug could be given to a competent intern to see if they can do anything with it? I dont care about it being in 3.0, but currently its going nowhere and this could continue for years more if Microsoft continue to treat it as low importance.

Leaving it unfixed for year after year could also make it harder to fix because the CLR may get refactored or restructured in ways that make the fix more difficult. This feature really is a basic expectation.

Id love to look at it myself even but will not pretend I have the internals knowledge or experience to actually deliver.

@grandseiken
Copy link

grandseiken commented Jun 25, 2019

FWIW, this is an issue we hit repeatedly, particularly in the context of generated C# code, where it is often not easy or feasible to work around (changing output of the code generator wholesale to avoid this problem would be API breaking or have unacceptable knock-on effects, detecting and avoiding on a case-by-case basis would lead to inconsistent or incompatible APIs in some cases).

@Korporal
Copy link

Korporal commented Oct 21, 2019

@karelz
I see there's no movement on this, is anyone in a position to say that this will or won't be addressed and if so when approx it might be fixed?

@karelz
Copy link
Member

karelz commented Oct 22, 2019

No idea - maybe @davidwrighton or @jkotas can comment?

I still believe it is not serious enough to prioritize it. If you have a fix, I think we would be open to a contribution (if it is not overly complex, it is high-quality and not breaking).
General rule is Future = maybe one day.

@davidwrighton
Copy link
Member

This has been a known issue in the CLR typesystem since generics were added many years ago. The general problem is that the field type of fields is computed at type load time during initial creation of the type data structures, and that would have to change as part of fixing this issue. In general, it is believed that changes to logic in this system are somewhat high in risk, and without a clear justification for risky work, we can't justify the work. There is a related similar issue with static fields, where some ECMA permitted static fields are not loadable by the runtime.

To the point that @Korporal was making that leaving this issue around will make it harder to fix in the future, unfortunately that isn't really the case. The reasons for this being difficult in the CLR have been part of the type loader design since the late 90's when the runtime was first implemented before generics were even designed in the first place, and so I don't believe we are in significant risk of making it a more difficult fix.

@Korporal
Copy link

Korporal commented Nov 1, 2019

@davidwrighton

Thanks for elaborating.

I'm certainly pleased that the perceived risk isn't increasing over time, I'm also extremely interested to read "The reasons for this being difficult in the CLR have been part of the type loader design since the late 90's when the runtime was first implemented before generics were even designed in the first place".

So the fundamental flaw is not a result of generic support but something much more fundamental?

Can anyone tell me what I must basically do to be able to begin exploring this and perhaps attempting to address it on a fork of this repo?

Thanks

@improbablejan
Copy link

@Korporal

If it's at all helpful (and it might not be since I saw the codebase exactly once), when we ran into this I did try to understand where the issue comes from and how difficult it would be to fix: https://github.com/dotnet/coreclr/issues/20220#issuecomment-430148594

@Korporal
Copy link

Korporal commented Nov 2, 2019

@improbablejan - That is helpful any information like that is very helpful. Is it possible to point me to the source files specifically? I know it's not a simple matter of editing a few files and moving on but seeing the code will help me.

From what you wrote in that earlier comment (20220) it may be that a fundamental change is needed to enable this, and that may be why it's deemed very risky.

Thanks

@jkotas
Copy link
Member

jkotas commented Nov 2, 2019

Is it possible to point me to the source files specifically?

This is the specific line that is the root cause of problem: https://github.com/dotnet/coreclr/blob/0a80d79fb64a5878a5163345ac182917f12c95c5/src/vm/methodtablebuilder.cpp#L4035

it may be that a fundamental change is needed to enable this

Yep.

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@kekekeks
Copy link

kekekeks commented Feb 9, 2020

Apparently the "full-fledged" CLR is the only runtime where it doesn't work. Mono and experimental CoreRT have no issues with running this code.

@mangod9
Copy link
Member

mangod9 commented Sep 16, 2020

Closing since there doesnt seem to be any particular reasons to be fix in the near term.

@mangod9 mangod9 closed this as completed Sep 16, 2020
@MartinJohns
Copy link

MartinJohns commented Sep 16, 2020

@mangod9 Is it really a good approach to simply close issues regarding bugs, just because you don't plan to fix them in the near time? Especially considering that even the Roslyn team had to workaround this bug several times. Personally I'm still waiting and hoping for this to be fixed, and having an open issue is a nice way to at least track it.

I understand that it can be discouraging to have eternally open bug issues... But simply closing them without any resolution just seems weird and the wrong way to handle things.

edit: One of my most frustrating situations in recent times is to look up a bug... Only to find an issue closed due to no activity, while the bug still remains to the present day. Depending on the project this happens fairly frequently. It saddens me to see .NET go the same way.

@mangod9
Copy link
Member

mangod9 commented Sep 16, 2020

I understand your concern @MartinJohns. From the discussion above its unlikely it will be fixed in 6 (or 7) given the risk. Perhaps CoreRT is a possible way around? But happy to keep it open given the interest in this issue :)

@mangod9 mangod9 reopened this Sep 16, 2020
@Korporal
Copy link

This is just not going away is it! I've raised this several times over the past year or two. We've created code that absolutely relies on this mechanism and that work is just stuck, not going anywhere. The system in question works very closely with mapped memory on Windows and this one bug causes huge headaches for what could be short, sweet, elegant logic.

I wish I had the time and skills to dig into this but I simply don't.

@pengweiqhca
Copy link

pengweiqhca commented Jul 28, 2022

I resolved.

var node = (Node*)Marshal.AllocHGlobal(sizeof(Node)).ToPointer();

public unsafe struct Node
{
    private StructList _children;

    public ref StructList<Node> Children => ref Unsafe.AsRef<StructList<Node>>(Unsafe.AsPointer(ref _children));

    [StructLayout(LayoutKind.Sequential)]
    private struct StructList
    {
        private IntPtr _array;
        private int _size;
        private int _length;
    }
}

[StructLayout(LayoutKind.Sequential)]
public struct StructList<T>
{
    private IntPtr _array;
    private int _size;
    private int _length;
}

@teo-tsirpanis
Copy link
Contributor

To give an update on the demand to fix this one year after @davidwrighton's last message, the number of 👍s has more than doubled from 24 to 60, making this issue by far the most upvoted open or closed bug of this repo, and three duplicate issues have been opened since that time.

@DaZombieKiller
Copy link
Contributor

@pengweiqhca That code contains a GC hole and isn't safe to use. Unsafe.AsPointer will cause the GC to lose track of the reference, leading to memory corruption. This is how you can avoid the GC hole:

public ref StructList<Node> Children => ref Unsafe.As<StructList, StructList<Node>>(ref _children);

The general rule is that you should almost never convert a managed ref to an unmanaged pointer without fixed, unless the ref is referring to unmanaged or already-pinned memory.

@pengweiqhca
Copy link

@DaZombieKiller Yes, I use native memory + pointer.

@DaZombieKiller
Copy link
Contributor

@pengweiqhca In your example you're using a class, and the ref is pointing to one of its fields. So in this scenario, the ref is not pointing to unmanaged/native memory. As a result it's not safe to convert it to an unmanaged pointer because the GC can relocate it between the AsRef and AsPointer calls.

@pengweiqhca
Copy link

@DaZombieKiller Sorry, I forgot to modify it, but it has been fixed now. In my actual code I use struct.

@tokizr
Copy link

tokizr commented Feb 8, 2023

Whether this issue will be resolved or not, the user experience is currently really bad. We just ran into this issue because a colleague wanted to keep a LUT inside a value type using ImmutableArray<T> (company coding rules say no mutable static members so T[] would be forbidden).

All I get is a TypeLoadException with no hint as to what is going wrong. When it first showed up it was in the middle of a massive file that was affected by several different source generators and it was very hard to know what it was causing the problem.

I even tried using dotnet-trace in hopes of getting a better idea, but it just crashed and produced a truncated trace.

I eventually built runtime to debug it myself and found the exception thrown in clsload.cpp:3451

if (PendingTypeLoadHolder::CheckForDeadLockOnCurrentThread(pLoadingEntry))
{
    // Attempting recursive load
    ClassLoader::ThrowTypeLoadException(pTypeKey, IDS_CLASSLOAD_GENERAL);
}

So even here there is not indication as to what is actually going wrong.

If the runtime will not support this then Roslyn should not allow the code to compile in the first place. Internally we should be able to add this case to our analyzers so people are not caught off guard in the future, but we shouldn't be having to to do that ourselves.

@sunkin351
Copy link

sunkin351 commented Feb 13, 2023

I decided, seeing as it's been forever and likely gonna take forever more, that I'd put careful thought into an analyzer of my own design to help deal with this issue.

https://github.com/sunkin351/GenericStructCyclesAnalyzer

Using this would deal with any user experience issues revolving around this. The analyzer itself is incredibly simple, single file and catches every situation I could think of.

I would like to propose adding it to the existing suite of .NET Analyzers, but I'm not sure it would stand up to their guidelines. Maybe some of you could check that for me.

@jarnmo
Copy link

jarnmo commented Mar 21, 2023

I run into this again when using a nested struct of a generic class, not immediately realizing I was dealing with this issue. I guess this happens because a nested type of a generic type is essentially a generic type as well.

So, the following code throws a TypeLoadException:

var tmp = new Other();

public class Container<T> {
    public struct Nested { }
}

public struct Other {
    public Container<Other>.Nested Nested;
}

davidwrighton added a commit to davidwrighton/runtime that referenced this issue Mar 27, 2023
- Take advantage of work done a few years ago to simplify the interaction with the interop subsystem
- In the problematic case, simulate loads with two different types as instantiations which are unrelated in field layout, and see if they match up. Only enable this code in a few very small isolated parts of the runtime
- Filter more of the logic through the byvalue class cache which should improve performance a bit
- Similarly when considering blittability, tweak the logic to use the special load path

- The current fix is expected to break Unix X64, but I believe the rest of the OS/Architecture pairs should work
- Handling of static fields also needs examination for correctness
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Mar 28, 2023
davidwrighton added a commit that referenced this issue Apr 4, 2023
…canonical type (#83995)

- Take advantage of work done a few years ago to simplify the interaction with the interop subsystem
- In the problematic case, simulate loads with two different types as instantiations which are unrelated in field layout, and see if they match up. Only enable this code in a few very small isolated parts of the runtime
- Filter more of the type loader logic through the byvalue class cache which should improve performance a bit
- Similarly when considering blittability, tweak the logic to use the special load path

- Support for self-recursive generics is not enabled for static fields, as that requires a somewhat different tweak, and there is less apparent demand. (For that scenario self-referential generics really should support having fields of type T.)
- Support for indirect self-recursive generics is also not enabled. The approach taken here is not practical for that, and there does not appear to be significant demand for that either.

Fixes #6924
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Apr 4, 2023
@teo-tsirpanis teo-tsirpanis modified the milestones: Future, 8.0.0 Apr 4, 2023
@DaZombieKiller
Copy link
Contributor

Should this be reopened since there are still cases not covered by the fix, such as the one mentioned here?

struct N<T> { }
struct M { public N<Nullable<M>> E; }

@MichalStrehovsky
Copy link
Member

Should this be reopened since there are still cases not covered by the fix, such as the one #83995 (comment)?

That should probably be a new issue so that it has a new thumbs up counter with people who're blocked by this. We can wait until someone runs into it.

The counter is used to prioritize the bug against any other type loader work - i.e. there's a finite number of people who can fix such bug and working on fixing the bug means other things will not be worked on. It's zero sum.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet