Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delete DomainAssembly #108618

Closed
wants to merge 9 commits into from
Closed

Delete DomainAssembly #108618

wants to merge 9 commits into from

Conversation

am11
Copy link
Member

@am11 am11 commented Oct 7, 2024

Closes #104590.

@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Oct 7, 2024
@am11
Copy link
Member Author

am11 commented Oct 7, 2024

Where disambiguation was necessary, renamed to RootAssembly, and the reset are renamed to just Assembly.

@elinor-fung
Copy link
Member

@dotnet/dotnet-diag for src/coreclr/debug/ changes

src/coreclr/vm/assembly.hpp Outdated Show resolved Hide resolved
src/coreclr/vm/assemblynative.cpp Outdated Show resolved Hide resolved
@@ -541,16 +539,6 @@ class CollectibleAssemblyHolderBase
}

private:
LoaderAllocator * GetLoaderAllocator(DomainAssembly * pDomainAssembly)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it is only for Assembly now, we can probably remove the templating from CollectibleAssemblyHolder/CollectibleAssemblyHolderBase and collapse them (in a separate change).

src/coreclr/vm/loaderallocator.hpp Outdated Show resolved Hide resolved
src/coreclr/vm/ceeload.cpp Outdated Show resolved Hide resolved
@@ -4001,18 +4001,18 @@ void DacDbiInterfaceImpl::GetEnCHangingFieldInfo(const EnCHangingFieldInfo * pEn
//- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -


void DacDbiInterfaceImpl::GetAssemblyFromDomainAssembly(VMPTR_DomainAssembly vmDomainAssembly, VMPTR_Assembly *vmAssembly)
void DacDbiInterfaceImpl::GetAssemblyFromRootAssembly(VMPTR_Assembly vmRootAssembly, VMPTR_Assembly *vmAssembly)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can just remove this, since it is just returning the same assembly it is given.

@@ -416,7 +416,7 @@ CordbAssembly * CordbAppDomain::CacheAssembly(VMPTR_Assembly vmAssembly)
{
INTERNAL_API_ENTRY(GetProcess());

RSInitHolder<CordbAssembly> pAssembly(new CordbAssembly(this, vmAssembly, VMPTR_DomainAssembly()));
RSInitHolder<CordbAssembly> pAssembly(new CordbAssembly(this, vmAssembly, VMPTR_Assembly()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove CacheRootAssembly above and remove the second VMPTR_Assembly parameter and member on CordbAssembly, since it should be the same as the first.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not too sure about vmAssembly->SetHostPtr(pAssembly);, since later it reads from GetDacPtr/GetRawPtr as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm missing the concern, but setting ptr2->SetHostPtr(ptr1.GetDacPtr()) should get you a ptr2 with the same target address as ptr1, so GetDacPtr/GetRawPtr should be the same with ptr1 or ptr2?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@elinor-fung, I think the confusion is if we delete CacheRootAssembly, then where would we call vmAssembly->SetHostPtr(pAssembly); (which basically what GetAssemblyFromRuntimeAssembly does)? Going by the number of places GetDacPtr and GetRawPtr are called suggests that we would need to do more refactoring to straighten it out. Shall we keep this DAC ptr 'self reference' for now?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove CacheRootAssembly above and remove the second VMPTR_Assembly parameter and member on CordbAssembly, since it should be the same as the first.

Even without DomainAssembly existing any more, CacheAssembly and CacheRootAssembly still have intentionally different behavior because ICorDebug's domain-neutral assembly concept hasn't been removed yet. CacheRootAssembly creates the normal AppDomain-bound CordbAssembly object that is observed in most of the debugging API. This CordbAssembly previously has one VMPTR_Assembly field and one VMPTR_DomainAssembly field (m_vmDomainAssembly), neither of which is NULL. Contrast this with CacheAssembly() which creates ICorDebug's domain-neutral representation of the same assembly. That CordbAssembly object has a valid VMPTR_Assembly for the first field but it has m_vmDomainAssembly = NULL. The domain-bound and domain-neutral versions of the object have some minor differences in behavior because of m_vmDomainAssembly being NULL or not and we need to preserve it if we are solely claiming this is a refactor.

I think the ideal thing to do is get rid of ICorDebug's domain-neutral assembly concept entirely, but I recommend doing it in a separate PR that we would describe as a behavior change, not a refactor. In particular ICorDebugProcess::GetTypeFromTypeID and GetTypeFromObject should cease producing domain-neutral types and start producing domain affinitized types instead. Once that is done then CacheAssembly should be a dead code path that can be removed.

src/coreclr/debug/di/rspriv.h Outdated Show resolved Hide resolved
src/coreclr/debug/daccess/dacdbiimpl.h Outdated Show resolved Hide resolved
src/coreclr/debug/daccess/dacdbiimpl.h Outdated Show resolved Hide resolved
src/coreclr/vm/ceeload.h Outdated Show resolved Hide resolved
src/coreclr/vm/loaderallocator.hpp Outdated Show resolved Hide resolved
src/coreclr/debug/inc/dbgipcevents.h Outdated Show resolved Hide resolved
src/coreclr/debug/inc/dbgipcevents.h Outdated Show resolved Hide resolved
@@ -2018,13 +2017,13 @@ struct MSLAYOUT DebuggerIPCEvent
struct MSLAYOUT
{
// Module that was just loaded.
VMPTR_DomainAssembly vmDomainAssembly;
VMPTR_Assembly vmAssembly;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on their usage, I think a bunch of these can actually be VMPTR_Module instead of VMPTR_Assembly. Then I think we could get rid of the LookupOrCreateModule overloads that take an Assembly instead of Module. And maybe also the assembly pointer on the CordbModule constructor.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you that they probably could be, but if leaving them as Assembly for now reduces the amount of code churn in the PR I'd be fine with that.

Its not clear to me at the runtime level if we still have a reason to maintain separate Module and Assembly concepts? It seems like that might be a whole separate batch of refactoring and consolidation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - definitely fine with deferring to a separate PR.

As far as separate Module and Assembly concepts goes, I agree there's not much difference between them in the runtime now that we don't support multi-module assemblies. However, I think I do like the separation since they do represent different concepts ECMA-wise. In any case, as you mentioned, anything we might want to do there would be a whole separate thing from this PR.

@@ -2021,7 +2021,7 @@ BOOL FileLoadLock::CompleteLoadLevel(FileLoadLevel level, BOOL success)

// Dev11 bug 236344
// In AppDomain::IsLoading, if the lock is taken on m_pList and then FindFileLock returns NULL,
// we depend on the DomainAssembly's load level being up to date. Hence we must update the load
// we depend on the RootAssembly's load level being up to date. Hence we must update the load
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I understand the terminology. Does 'RootAssembly' mean something different than 'Assembly'? One of the PR comments said "Where disambiguation was necessary, renamed to RootAssembly", but I don't understand what two things are being disambiguated.

Copy link
Member Author

@am11 am11 Oct 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "root assembly" refers to the first assembly in the app domain. In assembly.hpp, the SetNextAssemblyInSameALC function links assemblies together as a linked list, with the root assembly serving as the head node.

I intentionally moved away from the "domain assembly" terminology, as it previously encompassed additional responsibilities and complexity, which have been removed in earlier PRs in the series. Now, the root assembly is simply the first assembly, without the extra overhead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely agree 'domain assembly' terminology should go away. Since we basically merged the needed parts of DomainAssembly into Assembly, I think runtime/vm can just use 'assembly' now - without any ambiguity.

Looking at some existing terminology, we already have 'root assembly' on the AppDomain as representing the entry point assembly, which is different from what is being referred to here. For the tracking of SetNextAssemblyInSameALC, it is actually only for collectible assemblies - maybe it should assert or have 'collectible' in the name to make it clearer, but I think calling it 'root assembly' instead of just 'assembly' there may more confusing than helpful.

The one area I think might need disambiguation is in the debug APIs for CordbAssembly vs Assembly (and the case you found at #108618 (comment)). For that, I'd propose 'RuntimeAssembly` / 'runtime Assembly'.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced root assembly with assembly or runtime assembly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The one area I think might need disambiguation is in the debug APIs...

The place I worry about is that ICorDebug represents a CordbAssembly as a tuple of (VMPTR_Assembly, VMPTR_DomainAssembly). For ICorDebug's domain-neutral assembly concept the 2nd component of that tuple must be NULL. This means even if we coerced the two components to both use the same VMPTR_Assembly type, they would still have different semantics. When used as the 1st component the value is never NULL, but when used as the 2nd component it must be NULL sometimes. Any place where VMPTR_DomainAssembly is used throughout ICorDebug and DAC is potentially expected to have that same 2nd component semantic. Although we technically could re-encode the different semantics with a carefully applied naming convention, it seems like a step backwards if we are erasing type information and replacing it with disambiguation via naming convention. I'd rather either:

  • remove ICorDebug's domain neutral concept so there is no longer any need for a 2nd semantic
  • preserve the semantic distinction by keeping the VMPTR_DomainAssembly type for now. As soon we remove ICorDebug's domain neutral concept in the future I'd be fine to get rid of it at that point.

src/coreclr/vm/profilingenumerators.cpp Outdated Show resolved Hide resolved
// DomainAssembly is a base-class for a CLR module, with app-domain affinity.
// For domain-neutral modules (like CoreLib), there is a DomainAssembly instance
// Assembly is a base-class for a CLR module, with app-domain affinity.
// For domain-neutral modules (like CoreLib), there is a Assembly instance
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these old comments may not be technically wrong, but it would be nice if we could clean up these references to domain-neutral and app-domain affinity at the same time we are deleting the DomainAssembly type.

@@ -2018,13 +2017,13 @@ struct MSLAYOUT DebuggerIPCEvent
struct MSLAYOUT
{
// Module that was just loaded.
VMPTR_DomainAssembly vmDomainAssembly;
VMPTR_Assembly vmAssembly;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you that they probably could be, but if leaving them as Assembly for now reduces the amount of code churn in the PR I'd be fine with that.

Its not clear to me at the runtime level if we still have a reason to maintain separate Module and Assembly concepts? It seems like that might be a whole separate batch of refactoring and consolidation.

// for each appdomain the module lives in.
// This is the canonical handle ICorDebug uses to a CLR module.
DEFINE_VMPTR(class DomainAssembly, PTR_DomainAssembly, VMPTR_DomainAssembly);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have places in ICorDebug where VMPTR_DomainAssembly is NULL and VMPTR_Assembly/Module is non-null to indicate ICorDebug's concept of a domain neutral assembly. I'm concerned that if we coerce all DomainAssembly usage to Assembly without first getting rid of ICorDebug's domain neutrality concept it will create a lot of confusion about which of the two types each value represents. I'd suggest doing either:

  1. Eliminate ICorDebug's notion of domain-neutral assemblies. The entire thing stems from CordbProcess::GetTypeForObject and CordbProcess::GetTypeForTypeID which are two spots in the API where we have type information where it was ambiguous (historically) what AppDomain to assign. To address that issue ICorDebug created a sentinel CordbAppDomain that represents the shared domain (GetSharedAppDomain()) and then that shared domain has ICorDebugModules, ICorDebugAssemblies, ICorDebugTypes, etc that are all representations of domain-neutral modules/assemblies/types. Now that coreclr no longer has multiple AppDomains it is straightforward to assume that any type we are creating must belong to the singleton AppDomain and ICorDebug can get rid of the GetSharedAppDomain(). If you go this route you might want to do it as a separate PR that gets merged first, then rebase this one on top of it.

  2. Leave VMPTR_DomainAssembly type in place, but let it store the value of an Assembly* in its internal pointer. If we go this route I'd also leave all the "DomainAssembly" naming intact within DAC and DBI + write a comment that we consider the Assembly* to be a valid VMPTR_DomainAssembly now that the runtime no longer represents them as distinct data structures.

@am11
Copy link
Member Author

am11 commented Oct 10, 2024

While I agree that there’s significant cleanup and refactoring needed in the VM (e.g., the module contains both an assembly and a PE assembly, with the PE assembly being identical to the one tied to the assembly so we can technically fold module into assembly), this PR is already getting quite large. If this change is moving us in the right direction and doesn’t introduce any functional issues, I suggest we merge it as is and tackle further refactoring in follow-up steps. What do you all think?

@elinor-fung
Copy link
Member

elinor-fung commented Oct 10, 2024

Yeah, definitely don't think more major refactoring in the runtime/vm belongs in this change. Aside from functionality, I do want to make sure it is not introducing any new sources of confusion (like #108618 (comment)).

I'll leave it to @noahfalk to chime in on what would be a good state to merge / have follow-ups for the debug side of this change.

@noahfalk
Copy link
Member

If this change is moving us in the right direction and doesn’t introduce any functional issues, I suggest we merge it as is and tackle further refactoring in follow-up steps. What do you all think?

I'm fine with the principle of doing the work in chunks. My main concern is the ICorDebug/DAC conversion of VMPTR_DomainAssembly into VMPTR_Assembly. Right now there is some important information encoded in that distinction and this PR erases the type differences without (yet) handling the underlying domain neutral assembly concept that makes those types matter. Thats why I am suggesting either:

  1. Get rid of that ICorDebug domain-neutral concept first (in a separate PR most likely), then merge this one as-is.
  2. Scale this PR down so that it only refactors out the DomainAssembly in the vm and debug/ee directories but preserve the VMPTR_DomainAssembly in the daccess and di directories.

@am11
Copy link
Member Author

am11 commented Oct 11, 2024

Right now there is some important information encoded in that distinction and this PR erases the type differences without (yet) handling the underlying domain neutral assembly concept that makes those types matter.

What I can tell by reading the code is that the difference comes down to whether GetDacPtr returns NULL. In case of DomainAssembly, it is non-null. This PR is keeping that connection in CacheRuntimeAssembly (renamed from CacheDomainAssembly). The call sites continue to rely on NULL check and getting what they are in main branch, as CacheAssembly variant does not make the DAC ptr connection.

If there is no breakage, and RuntimeAssembly disambiguation is enough, then there should be no risk, correct? Otherwise, if this is breaking something which is not visible with the CI testing, then we shouldn't take this change.

@noahfalk
Copy link
Member

What I can tell by reading the code is that the difference comes down to whether GetDacPtr returns NULL. In case of DomainAssembly, it is non-null.

I don't follow you here. Maybe you are refering to one particular spot where DomainAssembly is used? There are certainly places where we construct NULL valued VMPTR_DomainAssembly such as here and here. GetDacPtr() is also not the only API which is potentially affected by whether the value of a VMPTR_DomainAssembly is NULL. For example there are calls to VMPTR_DomainAssembly.IsNull() and VMmPtrToCookie(VMPTR_DomainAssembly) whose results will change if a VMPTR_DomainAssembly that was supposed to be NULL is incorrectly populated with a non-NULL value.

This PR is keeping that connection in CacheRuntimeAssembly (renamed from CacheDomainAssembly). The call sites continue to rely on NULL check and getting what they are in main branch, as CacheAssembly variant does not make the DAC ptr connection.

I agree that the PR isn't creating a functional problem on its own (at least I don't think it is). My concern is that the PR is making it more confusing to work with the code in the future. As an analogy imagine a large codebase that uses C# nullable reference checking and every type has been correctly annotated with '?' to indicate which reference types are allowed to be NULL and which are not. Then someone opens a PR that deletes all the '?' markings and disables compiler nullability checking. The code still runs correctly after such a change but now going forward none of the devs working on that code can quickly discern where NULL is expected because type information which gave them that clue has been deleted.

If there is no breakage, and RuntimeAssembly disambiguation is enough, then there should be no risk, correct? Otherwise, if this is breaking something which is not visible with the CI testing, then we shouldn't take this change.

My goal is that the semantics of the data should be equally or more clear after the refactor than before. I don't think that is true right now everywhere we coerced VMPTR_DomainAssembly to VMPTR_Assembly.

I like the end-goal of having the VMPTR_DomainAssembly go away, I just don't want to do it while the type still has significance to other portions of the ICorDebug code. As long as ICorDebug has its domain neutral assembly concept then I consider the difference between VMPTR_DomainAssembly and VMPTR_Assembly to be important. So either we can get rid of ICorDebug domain neutral assemblies now (I don't think it would be hard) to unblock the refactoring as-is, or if you decide you don't want to take that on right now then I ask that we hold off on eliminating the VMPTR_DomainAssembly type as well.

@am11
Copy link
Member Author

am11 commented Oct 26, 2024

@noahfalk, thanks for the detailed insights! I attempted to remove the domain-neutral concept from ICorDebug, but it seems we may need to add another API to acquire AppDomain::GetCurrent for process.cpp and module.cpp. I'm not quite sure what the best course of action would be to achieve that, especially since it might require someone more familiar with the intricacies involved.

In the meantime, I’ll put this PR into draft status.

Thanks again for your guidance!

@am11 am11 marked this pull request as draft October 26, 2024 14:16
@noahfalk
Copy link
Member

but it seems we may need to add another API to acquire AppDomain::GetCurrent for process.cpp and module.cpp

Sorry I don't think I followed you here. Is there some spot in the code you could link to where you expect the new API would be used? Could you describe what the problem is not having the new API? I'm happy to help once I've got an understanding of the issue.

I'm not quite sure what the best course of action would be to achieve that, especially since it might require someone more familiar with the intricacies involved.

If you'd like to let this sit until someone with more ICorDebug expertise can dig into it thats fine too. I will warn the team is pretty swamped so it might be waiting a long time in that case.

@am11
Copy link
Member Author

am11 commented Oct 29, 2024

Could you describe what the problem is not having the new API?

e.g. replacing this usage of GetSharedAppDomain:

hr = CordbType::TypeDataToType(GetSharedAppDomain(), &data, &type);

requires access to the AppDomain::GetCurrent(), so we can call LookupOrCreateAppDomain to get the desired type of CorDb* wrapper.

Note that TypeDataToType() then uses it to GetADToken and GetProcess (latter can be satisfied by passing this).

If there is always one appdomain per CordbProcess, then all calls to {type,module}->GetAppDomain() can be simplified to use the shared cached object, but that would be a larger refactoring. Right now CorDb type, module, thread and process, all have their own copy of ICorDebugAppDomain, and eventually all (when non-null) seem to be wrapping the same appdomain instance.

@noahfalk
Copy link
Member

Right now CorDb type, module, thread and process, all have their own copy of ICorDebugAppDomain

I'm not sure what you meant by having their own copy. I'd expect they have their own pointers but all the pointers go to the same object.

Overall I'd expect there are two CordbAppDomain objects that can ever exist right now:

  1. There should be one CordbAppDomain cached inside the CordbProcess::m_appDomains hashtable which maps to the singleton AppDomain* inside the CoreCLR process.
  2. There is a 2nd CordbAppDomain representing ICorDebug's "shared" domain stored in CordbProcess::m_sharedAppDomain.

My suggestion is that we eliminate the 2nd one and make all references point to the first. I think a minimal change that would do that is to replace the logic in CordbProcess:GetSharedAppDomain() with something like this:

CordbProcess::GetSharedAppDomain()
{
    // we no longer have a shared domain that is distinct from other app domains, instead we just
    // have the singleton default app domain. Anything that used to reference the shared domain now
    // references the singleton. The only reason we invented the shared domain was to handle cases
    // where we couldn't resolve the actual app domain and that problem no longer exists in .NET Core.
    ULONG defaultAppDomainId = 1;
    VMPTR_AppDomain vmAppDomain = pDAC->GetAppDomainFromId(appDomainId);
    CordbAppDomain * pAppDomain = LookupOrCreateAppDomain(vmAppDomain);
    return pAppDomain;
}

More refactoring could eventually get rid of this API entirely, change the CordbProcess fields, and adjust some other codepaths that are more complicated than they need to be but all of that should be cosmetic and optional.

I do suggest you make the change its own separate PR because it is behavior altering. If issues do arise in debuggers later on it will be helpful to be able to test against just this bit in isolation.

Copy link
Contributor

Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-AssemblyLoader-coreclr community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove DomainAssembly concept
3 participants