From f9ec5723df57f580474ae4686b3ccac9aa75225f Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Mon, 1 Dec 2025 20:30:28 +0000
Subject: [PATCH 1/4] Initial plan

From 06f558965b5a80e108d20ba7a84d5c627eab448d Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Mon, 1 Dec 2025 20:35:56 +0000
Subject: [PATCH 2/4] Add documentation for capturing ETW traces in Kubernetes pods

Co-authored-by: brianrob <6210322+brianrob@users.noreply.github.com>
---
 src/PerfView/SupportFiles/UsersGuide.htm | 113 +++++++++++++++++++++++
 1 file changed, 113 insertions(+)

diff --git a/src/PerfView/SupportFiles/UsersGuide.htm b/src/PerfView/SupportFiles/UsersGuide.htm
index b2fe6eef5..6cca4aeaf 100644
--- a/src/PerfView/SupportFiles/UsersGuide.htm
+++ b/src/PerfView/SupportFiles/UsersGuide.htm
@@ -6335,6 +6335,119 @@
+ When running Windows containers in Kubernetes using process-isolation mode (as opposed to Hyper-V isolation), + the containers share the host's kernel. While this enables ETW tracing from the host, it requires a specific + workflow to capture and analyze traces for processes running inside these containers. +
++ Important Limitation: Because ETW is a kernel-level feature and process-isolation containers + share the host kernel, you cannot start an ETW session from inside the container. All trace collection + must be initiated from the host node. +
+ ++ Start the trace collection on the Kubernetes host node (not inside the pod). Use the /EnableEventsInContainers + option to ensure that user-mode events from processes inside containers flow to the ETW session on the host. +
++ What /EnableEventsInContainers does: By default, an ETW session on the host only receives + user-mode events from processes running directly on the host. The /EnableEventsInContainers option enables + the ETW session to also receive user-mode events (such as .NET CLR events and custom EventSource events) + from processes running inside process-isolation containers. +
++ What happens if you don't use /EnableEventsInContainers: You will still capture all kernel + events (CPU sampling, context switches, etc.) for container processes, but you will miss user-mode events + like .NET garbage collection events, JIT events, exception events, and any custom EventSource events from + those processes. +
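As a rough illustration of the capture step described above, the following dry-run sketch only prints the host-side command rather than executing it. The executable name `PerfView.exe`, the placement of the option before the `collect` command, and the output name `trace.etl` are assumptions for illustration; only the `/EnableEventsInContainers` option itself comes from the text above.

```shell
#!/bin/sh
# Dry-run sketch: prints the host-side capture command instead of running it.
# TRACE_FILE is a hypothetical output name; adjust it for your environment.
TRACE_FILE="${TRACE_FILE:-trace.etl}"

capture_command() {
    # /EnableEventsInContainers asks the host session to also receive
    # user-mode events from processes inside process-isolation containers.
    printf 'PerfView.exe /EnableEventsInContainers collect %s\n' "$1"
}

capture_command "$TRACE_FILE"
```

The printed command is meant to be run on the Windows host node itself, not inside the pod.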
+ ++ If the container(s) containing the process(es) of interest are still running when you stop the trace, you + can open and analyze the trace directly on the host node. PerfView will be able to find binaries that it + needs both on the host and inside the running containers through the container's file system view. +
++ This is the simplest analysis path since no additional steps are required—just open the trace in PerfView + on the host node. +
+ ++ If you need to analyze the trace after the container has been shut down, or if you want to copy the trace + to another machine for analysis, you need to prepare the trace while the container is still accessible. + This is done using the merge command with the /ImageIDsOnly option. +
++ First, copy the trace file into the container: +
++ Then, inside the container, run the merge command to inject the necessary image identification data: +
++ What /ImageIDsOnly does: When you run merge with /ImageIDsOnly, PerfView reads through + the trace and for each DLL that was loaded by processes in the trace, it looks up the DLL's unique + identifier (signature/timestamp) and injects that information into the trace. This unique identifier + is what allows PerfView to later download the correct PDB symbols from a symbol server. Without this + information, PerfView cannot resolve method names for code in those DLLs. +
++ What happens if you don't run merge with /ImageIDsOnly: If you skip this step and later + try to analyze the trace on another machine after the container is gone, PerfView will be unable to find + the symbol files for DLLs that were loaded inside the container. Your stack traces will show addresses + like "0x7ffe12345678" instead of method names for those DLLs. The managed code (.NET) symbols may still + work since they come from NGEN PDBs or crossgen2 metadata, but native code (including the runtime itself) + will have missing symbols. +
++ Why run merge inside the container: The merge command needs access to the actual DLL + files to read their unique identifiers. Running merge inside the container ensures it can access the + DLLs that were loaded by the container's processes. If you run merge on the host or on a different + machine, those container-specific DLLs may not be accessible. +
+ ++ After running merge with /ImageIDsOnly, copy the trace out of the container: +
++ You can now open this trace on any machine with PerfView installed. With the image identification + information embedded in the trace, PerfView can download symbols from symbol servers as needed. +
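The copy, merge, and copy-out steps above might be sketched as the dry-run below, which prints the commands instead of executing them. The pod name `my-pod`, the directory `/traces`, and the file name `trace.etl` are hypothetical placeholders, and using `PerfViewCollect.exe` through `kubectl exec` inside the container is an assumption; how `kubectl cp` handles paths in Windows containers may also differ in your cluster.

```shell
#!/bin/sh
# Dry-run sketch: prints the copy-in, merge, and copy-out commands.
# POD, REMOTE_DIR, and TRACE are hypothetical placeholders.
POD="${POD:-my-pod}"
REMOTE_DIR="${REMOTE_DIR:-/traces}"
TRACE="${TRACE:-trace.etl}"

offline_prep_commands() {
    pod=$1; dir=$2; trace=$3
    # 1. Copy the trace from the host into the container.
    printf 'kubectl cp %s %s:%s/%s\n' "$trace" "$pod" "$dir" "$trace"
    # 2. Inject image IDs from inside the container, where its DLLs are visible.
    printf 'kubectl exec %s -- PerfViewCollect.exe merge /ImageIDsOnly %s/%s\n' \
        "$pod" "$dir" "$trace"
    # 3. Copy the prepared trace back out of the container.
    printf 'kubectl cp %s:%s/%s %s\n' "$pod" "$dir" "$trace" "$trace"
}

offline_prep_commands "$POD" "$REMOTE_DIR" "$TRACE"
```

Printing the commands first makes it easy to review the pod and path names before running anything against a live cluster.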
+ ++ Here is the complete workflow: +
++ Note: If you analyze the trace on the host while the container is still running, you + can skip the copy and merge steps entirely. +
+- When running Windows containers in Kubernetes using process-isolation mode (as opposed to Hyper-V isolation), + When running Windows containers in Kubernetes using process-isolation mode (the default mode, as opposed to Hyper-V isolation), the containers share the host's kernel. While this enables ETW tracing from the host, it requires a specific workflow to capture and analyze traces for processes running inside these containers.
- Important Limitation: Because ETW is a kernel-level feature and process-isolation containers - share the host kernel, you cannot start an ETW session from inside the container. All trace collection + Note: If you are running containers in Hyper-V isolation mode, these instructions are not required. + In Hyper-V mode, each container has its own kernel, so you can capture traces directly inside the container + using the normal PerfView workflow. +
++ Important Limitation: In process-isolation mode, kernel ETW sessions cannot be started from + inside the container. Since PerfView almost always captures a kernel session, all trace collection must be initiated from the host node.
-Start the trace collection on the Kubernetes host node (not inside the pod). Use the /EnableEventsInContainers option to ensure that user-mode events from processes inside containers flow to the ETW session on the host. @@ -6364,9 +6369,10 @@
What happens if you don't use /EnableEventsInContainers: You will still capture all kernel - events (CPU sampling, context switches, etc.) for container processes, but you will miss user-mode events - like .NET garbage collection events, JIT events, exception events, and any custom EventSource events from - those processes. + events (CPU sampling, context switches, etc.) for container processes, and you will still receive user-mode + events from processes running directly on the host node (outside of containers). However, you will miss + user-mode events like .NET garbage collection events, JIT events, exception events, and any custom + EventSource events from processes inside containers.
+ Note: PerfViewCollect needs to be built from source at + https://github.com/microsoft/perfview. + It is not currently shipped as a binary. See the "Windows Nanoserver and PerfViewCollect" + section above for build instructions. +
What /ImageIDsOnly does: When you run merge with /ImageIDsOnly, PerfView reads through - the trace and for each DLL that was loaded by processes in the trace, it looks up the DLL's unique - identifier (signature/timestamp) and injects that information into the trace. This unique identifier - is what allows PerfView to later download the correct PDB symbols from a symbol server. Without this - information, PerfView cannot resolve method names for code in those DLLs. + the trace and for each DLL that was loaded by processes in the trace, it looks up the DLL's PDB signature + and injects that information into the trace. This unique identifier is what allows PerfView to later + download the correct PDB symbols from a symbol server. Without this information, PerfView cannot resolve + method names for code in those DLLs.
What happens if you don't run merge with /ImageIDsOnly: If you skip this step and later
try to analyze the trace on another machine after the container is gone, PerfView will be unable to find
- the symbol files for DLLs that were loaded inside the container. Your stack traces will show addresses
- like "0x7ffe12345678" instead of method names for those DLLs. The managed code (.NET) symbols may still
- work since they come from NGEN PDBs or crossgen2 metadata, but native code (including the runtime itself)
- will have missing symbols.
+ the symbol files for DLLs that were loaded inside the container. Your stack traces will show the module
+ name with a question mark (for example: MyAssembly!? instead of MyAssembly!MyClass.MyMethod).
+ Jitted .NET code will still resolve correctly, but no other code from binaries inside the container will have symbols.
- Why run merge inside the container: The merge command needs access to the actual DLL - files to read their unique identifiers. Running merge inside the container ensures it can access the - DLLs that were loaded by the container's processes. If you run merge on the host or on a different - machine, those container-specific DLLs may not be accessible. + Why run merge inside the container: When run from the host, the merge command cannot look + inside containers. Running merge inside the container ensures it can access the DLLs that + were loaded by the container's processes. If you run merge on the host or on a different machine, those + container-specific DLLs will not be accessible.
Start the trace collection on the Kubernetes host node (not inside the pod). Use the /EnableEventsInContainers - option to ensure that user-mode events from processes inside containers flow to the ETW session on the host. + option to ensure that user-mode events from processes inside containers flow to the ETW session on the host. Example capture command:
What /EnableEventsInContainers does: By default, an ETW session on the host only receives @@ -6377,9 +6377,10 @@
- If the container(s) containing the process(es) of interest are still running when you stop the trace, you - can open and analyze the trace directly on the host node. PerfView will be able to find binaries that it - needs both on the host and inside the running containers through the container's file system view. + If the container(s) containing the process(es) of interest are still running when you stop the trace, you + can open and analyze the trace directly on the host node. PerfView will be able to find binaries that it + needs both on the host and inside the running containers through the container's file system view. NOTE: This works only while + the container is running.
This is the simplest analysis path since no additional steps are required—just open the trace in PerfView