Phase 1 of refactoring pgo data pipeline #46638

davidwrighton · 2021-01-06T18:06:07Z

Phase 1 of replacing the existing infrastructure for handling PGO data with a more flexible, schema-based approach.

The schema-based approach allows the JIT to define the form of the data needed for instrumentation.

The schema associates four 32-bit integers with each data collection point (ILOffset, InstrumentationKind, Count, and Other):
- Rich meaning is attached to InstrumentationKind and Count
  - InstrumentationKind defines the size and layout of individual instrumentation data items
  - Count allows a single schema item to be repeated
- ILOffset and Other are not processed in any specific way by the infrastructure
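
As a rough illustration of how InstrumentationKind and Count could drive data layout, here is a minimal sketch. Everything named `Sketch...` is hypothetical: the kind values, the per-kind sizes, and the natural-alignment rule are assumptions made for the example, not the definitions in corjit.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical kinds; the real InstrumentationKind values are defined by the JIT.
enum class SketchKind : int32_t
{
    Count32  = 1, // assumed: one 32-bit counter per item
    Handle64 = 2, // assumed: one 64-bit slot per item
};

// The four 32-bit integers associated with each data collection point.
struct SketchSchemaElem
{
    int32_t ILOffset;            // not interpreted by the infrastructure
    int32_t InstrumentationKind; // selects size and layout of each item
    int32_t Count;               // number of repeated items
    int32_t Other;               // not interpreted by the infrastructure
};

// Assumed size table: bytes occupied by one item of the given kind.
inline size_t SketchItemSize(int32_t kind)
{
    switch (static_cast<SketchKind>(kind))
    {
        case SketchKind::Count32:  return 4;
        case SketchKind::Handle64: return 8;
        default:                   return 0; // unknown kinds occupy no space here
    }
}

// Computes the byte offset of each schema element's data and returns the
// total buffer size. Items are placed at their natural alignment (assumed).
inline size_t SketchLayout(const std::vector<SketchSchemaElem>& schema,
                           std::vector<size_t>& offsets)
{
    offsets.clear();
    size_t cursor = 0;
    for (const SketchSchemaElem& elem : schema)
    {
        assert(elem.Count >= 0);
        size_t itemSize = SketchItemSize(elem.InstrumentationKind);
        if (itemSize != 0)
        {
            // Round the cursor up to a multiple of the item size.
            cursor = (cursor + itemSize - 1) / itemSize * itemSize;
        }
        offsets.push_back(cursor);
        cursor += itemSize * static_cast<size_t>(elem.Count);
    }
    return cursor;
}
```

Note that the layout code never reads ILOffset or Other, which mirrors the point that only InstrumentationKind and Count carry meaning for the infrastructure.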

Changes that are part of this phase

- PgoManager holds an arbitrary amount of pgo data instead of a slab
  - Aware of collectible assemblies
  - Matching against pgo data uses a hash of the IL body in addition to IL size information, for greater accuracy
- JIT no longer uses block count apis, and instead uses schema based apis
  - JIT now explicitly defines the shape of data collected for both basic block and type probes
  - The rest of the system handles that without deep knowledge of what those formats are
- Text file format for pgo data updated
- Existing IBC infrastructure adjusted to speak in terms of the schema concept
- Uncompressed and binary encoded implementations of Pgo schema handling
- SuperPMI updated to handle the new apis
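
To illustrate matching pgo data by IL body hash plus IL size, here is a minimal sketch of a lookup table keyed on both values. All names are hypothetical, and FNV-1a is only a stand-in: the stable hash the runtime actually uses is not shown in this description.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical lookup key: both the IL body hash and the IL size must match.
struct SketchMethodKey
{
    uint32_t ilHash;
    uint32_t ilSize;

    bool operator==(const SketchMethodKey& other) const
    {
        return ilHash == other.ilHash && ilSize == other.ilSize;
    }
};

struct SketchMethodKeyHasher
{
    size_t operator()(const SketchMethodKey& key) const
    {
        // Pack both fields into one 64-bit value and hash that.
        uint64_t packed = (static_cast<uint64_t>(key.ilHash) << 32) | key.ilSize;
        return std::hash<uint64_t>()(packed);
    }
};

// FNV-1a, used here only as a stand-in for the runtime's stable IL hash.
inline uint32_t SketchHashIL(const uint8_t* il, size_t size)
{
    uint32_t hash = 2166136261u;
    for (size_t i = 0; i < size; i++)
    {
        hash ^= il[i];
        hash *= 16777619u;
    }
    return hash;
}

// The mapped string stands in for whatever profile payload is stored.
using SketchPgoTable =
    std::unordered_map<SketchMethodKey, std::string, SketchMethodKeyHasher>;

// Returns the stored profile for this IL body, or nullptr if nothing matches.
inline const std::string* SketchFind(const SketchPgoTable& table,
                                     const uint8_t* il, size_t size)
{
    assert(size <= UINT32_MAX);
    SketchMethodKey key{SketchHashIL(il, size), static_cast<uint32_t>(size)};
    auto it = table.find(key);
    return it == table.end() ? nullptr : &it->second;
}
```

With size alone, two different method bodies of the same length would be indistinguishable; including the body hash in the key is what rejects such mismatches in this sketch.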

Future Changes for static Pgo

- Move Pgo type handle histogram processing into JIT
- Extract Pgo data from process using Event infrastructure
- Triggers for controlling Pgo data extraction
- Instrumented Pgo processing as part of dotnet-pgo tool
- Pgo data flow in crossgen2

AndyAyersMS

Overall this looks really good.

Left a few notes.

src/coreclr/jit/compiler.cpp

src/coreclr/jit/flowgraph.cpp

AndyAyersMS · 2021-01-06T21:34:47Z

src/coreclr/inc/corjit.h

+    //  4. Each data entry shall be laid out without extra padding.
+    //
+    //  The intention here is that it becomes possible to describe a C data structure with the alignment for ease of use with 
+    //  inst5rumentation helper functions


Suggested change

// inst5rumentation helper functions

// instrumentation helper functions

AndyAyersMS · 2021-01-06T21:39:20Z

src/coreclr/tools/aot/ILCompiler.ReadyToRun/JitInterface/CorInfoImpl.ReadyToRun.cs

+                return HRESULT.E_NOTIMPL;
+            }
+
+            // Only allocation of PGO data for the current method is supported.


We will want to support this for inlinees, eventually (#44372).

That limitation will remain only for IBC style instrumentation in R2R dlls. The general purpose instrumentation support will only be present in the runtime. If we drop support for IBC style R2R instrumentation, as we will likely do, this code will be removed entirely from crossgen2.

AndyAyersMS · 2021-01-07T01:01:51Z

src/coreclr/vm/pgo.h

+#include "shash.h"
+
+
+enum class PgoInstrumentationKind


Is there some way this can be shared with the version in corjit.h?

My original thought was that this one would only have the minimal number of enum values describing marshalling so that it was clear that the JIT defined the meanings of stuff, but honestly, that's not a major benefit. I'll merge the two.

AndyAyersMS · 2021-01-07T01:56:45Z

src/coreclr/jit/flowgraph.cpp

                        }

-                        return Compiler::WALK_CONTINUE;
+                        // Restore the stub address \on call, whether instrumenting or not.


Suggested change

// Restore the stub address \on call, whether instrumenting or not.

// Restore the stub address on the call, whether instrumenting or not.

AndyAyersMS · 2021-01-07T02:03:19Z

src/coreclr/vm/pgo.cpp

@@ -270,7 +307,7 @@ void PgoManager::ReadPgoData()
        return;
    }

-    char     buffer[256];
+    char     buffer[16384];


Is there any symbolic constant we can use here?

AndyAyersMS · 2021-01-07T02:11:46Z

src/coreclr/vm/pgo.cpp

+        HeaderList *currentHeaderList = m_pgoHeaders;
+        if (currentHeaderList != NULL)
+        {
+            if (!ComparePgoSchemaEquals(currentHeaderList->header.GetData(), currentHeaderList->header.countsOffset, pSchema, countSchemaItems))


Does this need to tie into the IL versioning scheme? I don't think the jit knows which IL version is active for a jit request, but presumably the jit interface does (or could know).

I wouldn't expect so. The pgo instrumentation data structures will be the same if the schemas match up, and the JIT already must tolerate data which isn't completely accurate.

Now, the reading side may be more interesting, but again, that comes down to how bad it is to present incorrectly structured data. If that's really bad, we could easily add the IL code hash into the schema to allow the JIT to check it for correctness or something.
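
For readers following the thread, a minimal sketch of what "the schemas match up" can mean: an element-by-element comparison of two schemas. This is not the actual `ComparePgoSchemaEquals` from pgo.cpp (which takes encoded data and an offset); the struct and function names below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical schema element with the four fields from the PR description.
struct SketchElem
{
    int32_t ILOffset;
    int32_t InstrumentationKind;
    int32_t Count;
    int32_t Other;
};

// Returns true only if both schemas have the same length and every field of
// every element is identical. Note that nothing here identifies which IL
// version produced the schema.
inline bool SketchSchemaEquals(const SketchElem* a, size_t aCount,
                               const SketchElem* b, size_t bCount)
{
    assert(a != nullptr || aCount == 0);
    assert(b != nullptr || bCount == 0);

    if (aCount != bCount)
    {
        return false;
    }
    for (size_t i = 0; i < aCount; i++)
    {
        if (a[i].ILOffset != b[i].ILOffset ||
            a[i].InstrumentationKind != b[i].InstrumentationKind ||
            a[i].Count != b[i].Count ||
            a[i].Other != b[i].Other)
        {
            return false;
        }
    }
    return true;
}
```

Two IL versions that happen to produce identical schemas would compare equal under a check like this, which is the case the question about IL versioning is probing.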

AndyAyersMS · 2021-01-07T02:12:29Z

src/coreclr/vm/pgo.cpp

-            *pBlockCounts = &s_PgoData[index + 2];
-            *pCount       = header->recordCount - 2;
-            *pNumRuns     = 1;
+            if (!ComparePgoSchemaEquals(existingData->header.GetData(), existingData->header.countsOffset, pSchema, countSchemaItems))


Ditto here, do we need to make sure we're looking at the same IL version the jit is compiling?

I wouldn't expect so. The pgo instrumentation data structures will be the same if the schemas match up, and the JIT already must tolerate data which isn't completely accurate.

AndyAyersMS · 2021-01-07T02:15:28Z

src/coreclr/vm/pgo.cpp

-        index += header->recordCount;
-        methodsChecked++;
+        AllocMemTracker loaderHeapAllocation;
+        pHeaderList = (HeaderList*)loaderHeapAllocation.Track(pMD->GetLoaderAllocator()->GetHighFrequencyHeap()->AllocMem(allocationSize));


Seems like the schema could go in the low frequency heap but maybe that's more trouble than it's worth.

Well, the jit interface api allows for doing things like that (which is why the Offset field is a full size_t), but for now it would cause problems for the superpmi implementation I've come up with, and it's not super clear how valuable the high vs low frequency heap distinction actually is.

AndyAyersMS

Thanks for the updates.

The code in zapinfo still looks wrong to me; it won't build a proper schema.

Would be good to run the jit-experimental CI leg, and maybe also a jitstress leg.

src/coreclr/jit/compiler.cpp

davidwrighton · 2021-01-08T01:13:03Z

@AndyAyersMS I've fixed the ZapInfo issues, and verified that traditional IBC continues to work. I've also scheduled the jit-experimental and jitstress runs as requested.

AndyAyersMS · 2021-01-09T02:05:14Z

In case you haven't yet looked at the experimental failures -- they seem to be related.

BruceForstall · 2021-01-14T00:38:31Z

@davidwrighton You didn't update the JIT-EE GUID with this. Can you please follow-up with a GUID change ASAP?

davidwrighton added 8 commits January 4, 2021 17:35

Add InstrumentationData event to the runtime

729602b

Improve stable hash to match the other R2R hashes
Use method IL body hash to replace token in pgo data
Reworked data storage for pgo to allow for more efficient lookup
Checkin before rebase
Jit no longer uses block count apis

VM builds

63f7a88

Fixup superpmi and remove not yet ready recordPgoInstrumentationBySch…

b6e29a0

…emaForAot api

Zapper updated

71d2165

It all builds

2754acd

Correct handling for reading pgo data

26208e6

- Handle count schema items correctly

5504b9a

- Handle large integers correctly

Fix Linux build

36a76b9

Dotnet-GitSync-Bot added the area-crossgen2-coreclr label Jan 6, 2021

davidwrighton requested a review from AndyAyersMS January 6, 2021 18:06

davidwrighton added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 6, 2021

davidwrighton added 3 commits January 6, 2021 12:56

Fix gcc build failure

bff1fec

Apply formatting patch

e79e1c4

Initialize m_pgoManager as needed

73e87a6

AndyAyersMS reviewed Jan 7, 2021

davidwrighton added 5 commits January 7, 2021 12:58

Fixup comments as requested

8e55edf

Remove unneccessary extra PgoInstrumentationKind enum

0c44c37

Remove unnecessary struct PgoInstrumentationSchema duplication

6bd7641

Extract pgo format processing logic to an independent header

3dd567a

- This will be better prepared for the port to managed
- No need to include these complex templates and such in all vm files

Correct !FEATURE_PGO stubs

9a5068f

AndyAyersMS reviewed Jan 7, 2021

davidwrighton added 2 commits January 7, 2021 16:12

Fix Zap IBC instrumentation path both reading and writing

249156a

Merge branch 'master' into pgo_prototype

d2c9e55

AndyAyersMS approved these changes Jan 8, 2021

davidwrighton added 2 commits January 11, 2021 14:29

Fix issues identified by jit experimental run

3de57b1

Needed to be a custom lock now that it actually does something

19c9e55

AndyAyersMS mentioned this pull request Jan 12, 2021

JIT: efficient profiling schemes #46882

Closed

runfoapp bot mentioned this pull request Jan 12, 2021

profiler/elt/slowpatheltenter/slowpatheltenter.sh test failed #46606

Closed

runfoapp bot mentioned this pull request Jan 12, 2021

profiler/elt/slowpatheltenter/slowpatheltenter.sh test failed #46608

Closed

SuperPMI fix 2

474b821

runfoapp bot mentioned this pull request Jan 13, 2021

Inability to unzip assets during build on Unix x64 #32805

Closed

Merge branch 'master' of github.com:dotnet/runtime into pgo_prototype

aac6c8c

davidwrighton merged commit 6ded57b into dotnet:master Jan 13, 2021

ghost locked as resolved and limited conversation to collaborators Feb 13, 2021

davidwrighton deleted the pgo_prototype branch April 20, 2021 17:45
