Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Portable PDB support for ilasm #37702

Merged
merged 31 commits into from
Jul 14, 2020

Conversation

ivanpovazan
Copy link
Member

@ivanpovazan ivanpovazan commented Jun 10, 2020

This is my take on adding portable PDB support to ilasm.

I understand that this PR is quite large, which is against the contribution guidelines, but adding portable PDB support requires a lot of changes.

Even though the work is not complete, after talking to @noahfalk I have decided to create this draft pull request to start the discussion and get some feedback on the work that has been done so far. Thus I would really appreciate any comments/suggestions/corrections in order to finalize this feature.

The work is based on the thread discussion (#5051) and mainly follows the idea of creating a portable PDB metadata writer based on extensions to IMetaDataEmit interface. It is still WIP, but the current version can generate portable PDB, which can be loaded by VS debugger and I have successfully managed to debug, step, observe local variables of C# programs by round tripping them: compiling -> disassembling (ildasm) -> assembling(ilasm).

Since portable PDB standard seems to be oriented more towards higher level programming languages, I have tried to define minimum viable set of features that are applicable to ilasm, which are currently supported:

  • Portable PDB CodeView debug directory entry
  • Document table
  • MethodDebugInformation table
  • Local scope table
  • Local variable table

Please let me know if some other features are necessary or required, and how would those be expressed in ilasm.

Ivan Povazan added 11 commits June 10, 2020 14:08
- Currently this just makes a distinction between classic PDB format (not supported on CoreCLR) and portable one (to be added)
…d for portable PDB generation

- The idea is to have 2 emitters in ilasm, 1 for system metadata and 1 for pdb metadata
This change extends the IMetaDataEmit2 interface
StateMachineMethod and CustomDebugInformation tables are left out
Extending IMetaDataEmit2 interface
Extending IMetaDataEmit2 interface
…generation

- This can probably be done in a better (more clever) way, check if/why/when this is necessary
@dnfadmin
Copy link

dnfadmin commented Jun 10, 2020

CLA assistant check
All CLA requirements met.

Fixes build failures under gcc -Wcomment -Werror
Copy link
Member

@davidwrighton davidwrighton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the ideas here, and the implementation generally seems reasonable. I have a few concerns though, that I'd like to make sure are handled before we could accept such a change. At

  1. All of the changes in the MD directory need to be under some #ifdef. We could enable that for building ilasm, and probably ildasm, as ildasm should probably be able to parse these things. However, the size of the coreclr binary must not be affected by these changes.
  2. I'd like to avoid changes to in the public headers Cor.h and Corhdr.h. I'd prefer that all new portable pdb handling in the native api is held in a new header or two.
  3. COM rules on interface changing must be respected. This is a minor detail, and easy to fix.
  4. We should support both embedded and non-embedded portable pdbs.
  5. This will need some amount of testing.

@@ -613,6 +631,41 @@ DECLARE_INTERFACE_(IMetaDataEmit2, IMetaDataEmit)

STDMETHOD(ResetENCLog)() PURE; // S_OK or error.

STDMETHOD(GetReferencedTypeSysTables)( // S_OK or error.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are COM interfaces, which must never change. You will need to create a new COM interface to hold these methods.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your insightful feedback.
I am currently fixing some build issues, but will look into introducing new COM interfaces right after that.

Copy link
Member Author

@ivanpovazan ivanpovazan Jun 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidwrighton could you please elaborate on how would these new COM interfaces be generated?
I was thinking of creating ppdb.idl file which would define IMetaDataEmit3 interface inheriting from IMetaDataEmit2, but I failed to find an .idl file defining IMetaDataEmit2

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These COM interface headers were written by hand. I would recommend following the pattern in cor.h to make another one.

For instance,

// {7DD2F810-CAD1-419F-9964-82912CAC3CD4}
EXTERN_GUID(IID_IMetaDataEmit3, 0x7DD2F810, 0xCAD1, 0x419F, 0x99, 0x64, 0x82, 0x91, 0x2C, 0xAC, 0x3C, 0xD4);
//---
#undef  INTERFACE
#define INTERFACE IMetaDataEmit3
DECLARE_INTERFACE_(IMetaDataEmit3, IMetaDataEmit2)
{
// Insert new api's here!
};

Note, that I have generated a new guid by using the uuidgen tool. Each guid is associated represents a unique interface.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidwrighton could you please check the changes from my last 3 commits?

Based on your previous comments, I have done the following:

  1. Introduced 2 new COM interfaces: IMetaDataEmit3 and IMetaDataDispenserEx2 found in .\md\inc\portablepdbmdi.h
  2. Removed all changes from cor.h and corhdr.h
  3. Introduced new preprocessor directive FEATURE_METADATA_EMIT_PORT_PDB which now surrounds all the changes related to portable PDB metadata generation
  4. Adjusted make files to generate portable PDB metadata support only for ilasm (keeping coreclr intact)

Let me know what do you think about this, and what will be the next steps.

@noahfalk
Copy link
Member

Hi @ivanpovazan - thanks so much for your efforts here! It might be another day or two before I can dig into this properly.
heads up @tmat and @MichalStrehovsky, you guys might be interested in taking a look at this as well?

@noahfalk
Copy link
Member

@janvorli and @davidwrighton - I've had plenty of work continue to come in and I see both of you already reviewed. Let me know if you think it would valuable for me to be a 3rd reviewer here.

@davidwrighton
Copy link
Member

@ivanpovazan I've looked through the change, and I believe it is good enough to checkin, and there are no more meaningful issues to address before checkin with regards to changes to the product code. I am however a bit concerned that it might regress some existing ILasm functionality as we have very few tests for ilasm correctness at this time.

@BruceForstall do you have any ideas on how to regresssion test this? I know we've done some ildasm/ilasm roundtripping testing in the past. Is that a reasonable thing to run?

src/coreclr/src/ilasm/CMakeLists.txt Outdated Show resolved Hide resolved
src/coreclr/src/ilasm/assembler.h Outdated Show resolved Hide resolved
src/coreclr/src/ilasm/main.cpp Outdated Show resolved Hide resolved
memset(m_pdbStream.typeSystemTableRows, 0, sizeof(ULONG) * TBL_COUNT);
time_t now;
time(&now);
m_pdbStream.id.pdbTimeStamp = (ULONG)now;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really want the timestamp in the generated PDB file? That makes the generated file not binary equivalent on equivalent source rebuild. I thought we're trying to get away from timestamps in generated files because of this problem.

time_t now;
time(&now);
m_pdbStream.id.pdbTimeStamp = (ULONG)now;
hr = CoCreateGuid(&m_pdbStream.id.pdbGuid);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the same this applies to this GUID.

@ivanpovazan
Copy link
Member Author

ivanpovazan commented Jul 2, 2020

@davidwrighton @BruceForstall my last commit includes some functional testing related to portable PDB metadata generation for ilasm.

It is true that testing would be even more complete when ildasm supports portable PDB, but I believe I covered significant part of what has been added for portable PDB support.

What I think would be valuable for you (or someone who will perform manual debugger tests) can be found at:
https://github.com/ivanpovazan/runtime/tree/ilasm_portable_pdb/src/coreclr/tests/src/ilasm/PortablePdb/Resources
where I have added a short description of how you can use one of the test files to verify debugging C# source with a debugger.

@ivanpovazan
Copy link
Member Author

I have also fixed the naming issues reported by @BruceForstall.
Can I mark this as 'Ready for review' so this work can be treated as done?

@davidwrighton
Copy link
Member

Please feel free to mark this as ready for review. I think the testing strategy here is good. I'm not sure the tests are in the right place, but I don't have a better place for them quite yet.

@ivanpovazan ivanpovazan marked this pull request as ready for review July 8, 2020 07:40
@ivanpovazan
Copy link
Member Author

@davidwrighton thank you, I have marked it as ready for review but there is still a change request open.

@noahfalk
Copy link
Member

noahfalk commented Jul 8, 2020

Let me know what do you think about this.

This looks fine to me. I didn't notice anything in the portable pdb spec that said the 20 byte id needed to be formed from a GUID and a timestamp, but that behavior matches SRM's implementation when it is operating in non-deterministic mode.

@ivanpovazan
Copy link
Member Author

@noahfalk yes you are right, GUID and timestamp requirement do not come from portable PDB spec, that is why I linked the spec of debug directory as well, which states:

Version Major=any, Minor=0x504d of the data format has the same structure as above. The Age shall be 1. The format of the .pdb file that this PE/COFF file was built with is Portable PDB. The Major version specified in the entry indicates the version of the Portable PDB format. Together 16B of the Guid concatenated with 4B of the TimeDateStamp field of the entry form a PDB ID that should be used to match the PE/COFF image with the associated PDB (instead of Guid and Age). Matching PDB ID is stored in the #Pdb stream of the .pdb file.

@noahfalk
Copy link
Member

noahfalk commented Jul 8, 2020

that is why I linked the spec of debug directory as well...

I have failed at reading it appears, sorry about that : )

Copy link
Member

@noahfalk noahfalk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good to me, thanks for your all your hard work here! Just one comment inline about the verification macros I'd like to fix up.

@@ -554,6 +563,24 @@ HRESULT CLiteWeightStgdbRW::GetSaveSize(// S_OK or error.
}
}

#ifdef FEATURE_METADATA_EMIT_PORTABLE_PDB
// [IMPORTANT]:
// Apparently, string pool must exist in the portable PDB metadata in order to be recognized
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gregg-miskelly @tmat - fyi for you guys. This workaround seems fine to me but let us know if you have any concerns.

src/coreclr/src/md/inc/VerifyLayouts.inc Outdated Show resolved Hide resolved
Copy link
Member

@noahfalk noahfalk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@ivanpovazan
Copy link
Member Author

Any idea how I could track down the unsuccessful checks?

.packages\microsoft.dotnet.helix.sdk\5.0.0-beta.20330.3\tools\Microsoft.DotNet.Helix.Sdk.MultiQueue.targets#L76
.packages\microsoft.dotnet.helix.sdk\5.0.0-beta.20330.3\tools\Microsoft.DotNet.Helix.Sdk.MultiQueue.targets(76,5): error : (NETCORE_ENGINEERING_TELEMETRY=Test) Work item e58d5f52-74bf-4657-85be-7893ea3587d9/System.Collections.Immutable in job e58d5f52-74bf-4657-85be-7893ea3587d9 has failed.
Failure log: https://helix.dot.net/api/2019-06-17/jobs/e58d5f52-74bf-4657-85be-7893ea3587d9/workitems/System.Collections.Immutable/console

@BruceForstall
Copy link
Member

@ivanpovazan It appears your change hit the CI in a period of particularly poor reliability. My suggestion is to rebase and repush your change to trigger a new set of runs. (I've not had good luck simply re-triggering from the UI, or close/re-open to attempt to retrigger in cases like this.)

@ivanpovazan
Copy link
Member Author

Thank you for the explanation, I will try to retrigger the build

@davidwrighton
Copy link
Member

I've manually verified that debugging works, and this scheme for testing looks to be good too. Thank you @ivanpovazan for all your hard work.

@davidwrighton davidwrighton merged commit 99bbb19 into dotnet:master Jul 14, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 8, 2020
@ivanpovazan ivanpovazan deleted the ilasm_portable_pdb branch December 19, 2020 10:33
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants