Detect DotNet processes with IPC mmap operation #229

Merged
brianrob merged 6 commits into microsoft:main from brianrob:dev/brianrob/ipc-mapping
Feb 5, 2026
@brianrob
Member

One-collect currently depends upon the existence of an mmap call in the .NET runtime for a mapping called doublemapper in order to detect that the process is a .NET process. This is fragile as doublemapper is an implementation detail used by the W^X implementation.

Rather than depending on this mapping, depend upon the existence of a mapping called dotnet_ipc_created, which will be created by the .NET runtime once the IPC channel has been created.

For the purposes of backwards compatibility, we'll keep checking for doublemapper, though ideally the new mapping is backported.

Fixes #226
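
The check described above can be sketched as a small classifier over mapping names. This is only an illustration, not the code in this PR: the two name prefixes come from the change itself, while `DotnetSignal` and `classify_mapping` are hypothetical names introduced here. The new mapping is tested first and `doublemapper` is kept as a best-effort fallback.

```rust
/// Mapping created by the .NET runtime once the diagnostics IPC channel exists.
const DOTNET_IPC_MAPPING: &str = "/memfd:dotnet_ipc_created";
/// W^X implementation detail, kept only as a best-effort fallback.
const DOUBLEMAPPER_MAPPING: &str = "/memfd:doublemapper";

/// Hypothetical result type: which signal identified the process as .NET.
#[derive(Debug, PartialEq, Eq)]
enum DotnetSignal {
    IpcCreated,
    DoubleMapper,
}

/// Returns the signal a mapping name represents, or `None` when the mapping
/// says nothing about whether the process is a .NET process.
fn classify_mapping(filename: &str) -> Option<DotnetSignal> {
    if filename.starts_with(DOTNET_IPC_MAPPING) {
        Some(DotnetSignal::IpcCreated)
    } else if filename.starts_with(DOUBLEMAPPER_MAPPING) {
        Some(DotnetSignal::DoubleMapper)
    } else {
        None
    }
}
```

Using a prefix match rather than equality means a suffix appended to the reported name does not defeat the check.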


Could record-trace also look for non-executable mappings to find dotnet_ipc_created? https://github.com/dotnet/runtime/pull/123779/changes#r2743888505

Member Author

It's technically possible, but right now we filter out non-executable mappings. Are you thinking that you want to keep permissions to be a minimum (just read)?

Sorry, realized I linked the wrong thing, this is the conversation dotnet/runtime#123779 (comment).

I think the motivation is from a security standpoint, where it would be safer to have fewer executable pages. I don't currently know the vulnerabilities, especially if it's a 0-filled page, but there's currently no reason the runtime needs to make it executable besides record-trace discovery.

Is the main driver for filtering out non-executable mappings that it's more performant?

Member

> security standpoint

Creating executable mappings that are not backed by a (trusted) binary on disk is a suspect operation. I would not be surprised if it is blanket-disabled in locked-down environments, irrespective of whether the mapping is writeable.

Member

> It's technically possible, but right now we filter out non-executable mappings

If it makes the filtering cheaper on average, we can give the mapping a unique size, such as 42, that gets checked before the name.

Member Author

@beaubelgrave, do you have any concerns with just removing the protection check all-up for these call sites? If needed, we can benchmark this, but I'm not too concerned. The filename is already preloaded (not lazily) and many mappings will be file-backed and will start with a '/'. I suspect that the filename checks are likely to bail pretty quickly if they aren't going to match.

Member Author

> Is the main driver for filtering out non-executable mappings that it's more performant?

There are other mmap closures where we filter out non-executable mappings because we only need executable mappings and processing all of them becomes much more costly. I suspect that the protection check here is just because we were looking for doublemapper which needs to be executable (at least in its current implementation).

Collaborator

> @beaubelgrave, do you have any concerns with just removing the protection check all-up for these call sites? If needed, we can benchmark this, but I'm not too concerned. The filename is already preloaded (not lazily) and many mappings will be file-backed and will start with a '/'. I suspect that the filename checks are likely to bail pretty quickly if they aren't going to match.

mmaps are not high-volume events (except at startup). The CLR hook gets mmap events even if they are not executable, so we can remove that check just for the DotNet part. I think the impact should be manageable, especially if the mmap name is uncommon, i.e., strcmp will bail very quickly within the string.

I'm not concerned with this until we have data indicating we need to do something different. I agree with @jkotas that if we keep the size the same in each release, we could use that if needed to get more perf.
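
The size idea floated in this thread could look roughly like the sketch below, where a cheap integer comparison runs before the string comparison. Everything here is hypothetical: `MappingInfo`, `is_ipc_marker`, and the `expected_len` parameter are illustrative names, and no specific size was settled on in the discussion. Note also that mmap works in whole pages, so the length a tracer observes may be page-rounded rather than the exact value requested.

```rust
/// Mapping name used as the IPC-ready signal (from the PR).
const IPC_MARKER: &str = "/memfd:dotnet_ipc_created";

/// Hypothetical view of the fields a tracer has for one mapping.
struct MappingInfo<'a> {
    len: u64,
    filename: &'a str,
}

/// Checks the length first so that most mappings are rejected by a single
/// integer comparison; the name is only compared when the length matches.
/// `expected_len` is an assumption: it only helps if the runtime keeps the
/// marker's size stable across releases.
fn is_ipc_marker(mapping: &MappingInfo, expected_len: u64) -> bool {
    mapping.len == expected_len && mapping.filename.starts_with(IPC_MARKER)
}
```

As noted in the comment above, this is an optimization to reach for only if measurements show the plain name check is too slow.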

Member Author

Sounds good. I've just pushed a fix for this.


  /* Check if dotnet process */
- if filename.starts_with("/memfd:doublemapper") {
+ if filename.starts_with("/memfd:doublemapper") || filename.starts_with("/memfd:dotnet_ipc_created") {
Member

How much backward compatibility do we care about here? What are the .NET runtime versions that we expect this to work on?

If we really need to keep this around, it should have a comment that the doublemapper check is just a best effort check for backward compatibility.

Member Author

.NET events will only work on .NET 10, but perfmap support exists much further back and ideally we'd cover older supported versions (e.g. .NET 8+). I think it's reasonable to say that doublemapper is really a best-effort approach in all cases. I'll add a comment to this effect.

@brianrob brianrob requested a review from beaubelgrave February 3, 2026 23:14
@brianrob brianrob force-pushed the dev/brianrob/ipc-mapping branch from 4e1d859 to 60125a2 Compare February 3, 2026 23:34
Dotnet 10+ processes will create a memfd mapping called
dotnet_ipc_created to signal that the diagnostics IPC channel has been
created. The mapping uses protection PROT_NONE and so one_collect needs
to be able to listen for non-executable mappings when dotnet processes
are involved.

Rather than always listening to all mappings, use the presence of the
DotnetHelper as the signal to listen to all mmap events instead of just
executable mmap events.
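
A minimal sketch of the filtering decision this commit message describes, under the assumption that it can be reduced to a two-state filter. `MmapFilter`, `select_filter`, and `accepts` are hypothetical names for illustration and are not taken from the one_collect source.

```rust
/// Which mmap events the collector listens to.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MmapFilter {
    /// Default: only executable mappings are of interest.
    ExecutableOnly,
    /// Needed to see the PROT_NONE `dotnet_ipc_created` mapping.
    All,
}

/// Widens the filter only when .NET support (the DotnetHelper) is in use.
fn select_filter(dotnet_helper_present: bool) -> MmapFilter {
    if dotnet_helper_present {
        MmapFilter::All
    } else {
        MmapFilter::ExecutableOnly
    }
}

/// Decides whether a single mmap event passes the chosen filter.
fn accepts(filter: MmapFilter, is_executable: bool) -> bool {
    match filter {
        MmapFilter::All => true,
        MmapFilter::ExecutableOnly => is_executable,
    }
}
```

With this shape, sessions that do not involve .NET keep the cheaper executable-only path, while .NET sessions can observe the non-executable marker mapping.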
@brianrob brianrob merged commit 2b083e1 into microsoft:main Feb 5, 2026
11 checks passed
@brianrob brianrob deleted the dev/brianrob/ipc-mapping branch February 5, 2026 19:35
mdh1418 added a commit to dotnet/runtime that referenced this pull request Feb 5, 2026
External tools that connect to the runtime's diagnostic ports benefit
from a low-overhead signal that a .NET process is ready to receive IPC
commands, instead of scanning every known temp file directory for each
process's diagnostic ports.

Following the discussion in
microsoft/one-collect#226, this PR adds a new
mapping, `dotnet_ipc_created`, that is created once a .NET process'
singular listen port is successfully created.

## Testing

userevents runtime tests now work on NativeAOT with the record-trace
change microsoft/one-collect#229
```
mihw@CPC-mihw-6KMZDM:~/repo/runtime$ ./artifacts/tests/coreclr/linux.x64.Release/tracing/userevents/basic/basic/basic.sh
BEGIN EXECUTION
/home/mihw/repo/runtime/src/tests/Common/scripts/nativeaottest.sh /home/mihw/repo/runtime/artifacts/tests/coreclr/linux.x64.Release/tracing/userevents/basic/basic/ basic.dll ''
traceeAssemblyPath:
Starting record-trace: sudo -n /home/mihw/repo/runtime/artifacts/tests/coreclr/linux.x64.Release/tracing/userevents/common/userevents_common/record-trace --script-file /home/mihw/repo/runtime/artifacts/tests/coreclr/linux.x64.Release/tracing/userevents/basic/basic/native/../basic.script --out /tmp/tmpBoli5K.nettrace --log-mode console --log-filter error,one_collect::helpers::dotnet::os::linux=debug
record-trace started with PID: 1543079
[record-trace] 2026-01-29T22:55:19.967428Z DEBUG one_collect::helpers::dotnet::os::linux: Registered .NET tracepoint: name=OC_DotNet_Microsoft_Windows_DotNETRuntime_1543081_All, callstacks=false, use_names=true
Starting tracee process: /home/mihw/repo/runtime/artifacts/tests/coreclr/linux.x64.Release/tracing/userevents/basic/basic/native/basic tracee
Tracee process started with PID: 1543083
Waiting for tracee process to exit...
[record-trace] 2026-01-29T22:55:20.093460Z DEBUG one_collect::helpers::dotnet::os::linux: Opened diagnostic socket: pid=1543063, nspid=1543063
[record-trace] 2026-01-29T22:55:20.093476Z DEBUG one_collect::helpers::dotnet::os::linux: Opened diagnostic socket: pid=1543063, nspid=1543063
[record-trace] 2026-01-29T22:55:20.094446Z DEBUG one_collect::helpers::dotnet::os::linux: Opened diagnostic socket: pid=1543063, nspid=1543063
[record-trace] Recording started.  Press CTRL+C to stop.
[record-trace] 2026-01-29T22:55:20.097795Z  INFO one_collect::helpers::dotnet::os::linux: Enabled .NET events for process: pid=1543063
[record-trace] 2026-01-29T22:55:20.098955Z DEBUG one_collect::helpers::dotnet::os::linux: Opened diagnostic socket: pid=1543083, nspid=1543083
[record-trace] 2026-01-29T22:55:20.099085Z DEBUG one_collect::helpers::dotnet::os::linux: Opened diagnostic socket: pid=1543083, nspid=1543083
[record-trace] 2026-01-29T22:55:20.100017Z DEBUG one_collect::helpers::dotnet::os::linux: Opened diagnostic socket: pid=1543083, nspid=1543083
[record-trace] 2026-01-29T22:55:20.104842Z  INFO one_collect::helpers::dotnet::os::linux: Enabled .NET events for process: pid=1543083
Stopping record-trace with SIGINT.
Waiting for record-trace to exit...
[record-trace] Recording stopped.
[record-trace] Resolving symbols.
[record-trace] Finished recording trace.
[record-trace] Trace written to /tmp/tmpBoli5K.nettrace
Expected: 100
Actual: 100
END EXECUTION - PASSED
```
steveisok pushed a commit to dotnet/runtime that referenced this pull request Feb 8, 2026
Backport of #123779 to release/10.0

/cc @mdh1418

## Customer Impact

- [ ] Customer reported
- [x] Found internally

User_events support was added in .NET 10. The officially supported way
to enable user_events is through One-Collect's
[record-trace](https://github.com/microsoft/one-collect) (which
dotnet-trace wraps around). Record-trace relied on an implementation
detail of W^X that NativeAOT doesn't support since it doesn't support
PerfMaps (see microsoft/one-collect#226), so
it was belatedly discovered that user_events doesn't work for NativeAOT.

Through discussions in
microsoft/one-collect#226, this
`dotnet_ipc_created` mapping with no permissions (PROT_NONE) was found
as an acceptable minimal OS interaction to signal when the .NET process'
diagnostic ports are available.

## Regression

- [ ] Yes
- [x] No

## Testing

I tested the [User_events runtime
tests](https://github.com/dotnet/runtime/tree/main/src/tests/tracing/userevents)
against a locally built record-trace based on
microsoft/one-collect#229 in both CoreCLR and
NativeAOT on my WSL2 instance.

## Risk

Low. The mapping being introduced has minimal permissions as it is a
private mapping created with `PROT_NONE`.
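
For checking by hand that a process carries the marker with minimal permissions, one option is to look at its `/proc/<pid>/maps` entries. The sketch below parses a single maps line; it is not how record-trace detects the mapping (that works from mmap events), and `MapsEntry`, `parse_maps_line`, and `is_private_prot_none_marker` are hypothetical helper names. It relies on the Linux conventions that memfd mappings are listed as `/memfd:<name>` and that a private `PROT_NONE` mapping shows permissions `---p`.

```rust
/// The pieces of one /proc/<pid>/maps line that matter for this check.
struct MapsEntry<'a> {
    perms: &'a str,
    path: &'a str,
}

/// Splits a maps line into its whitespace-separated fields:
/// address range, permissions, offset, device, inode, optional path.
fn parse_maps_line(line: &str) -> Option<MapsEntry<'_>> {
    let mut fields = line.split_whitespace();
    let _range = fields.next()?;
    let perms = fields.next()?;
    let _offset = fields.next()?;
    let _dev = fields.next()?;
    let _inode = fields.next()?;
    // Anonymous mappings have no path field at all.
    let path = fields.next().unwrap_or("");
    Some(MapsEntry { perms, path })
}

/// True when the line describes the IPC marker as a private mapping with
/// no read, write, or execute permission.
fn is_private_prot_none_marker(line: &str) -> bool {
    match parse_maps_line(line) {
        Some(entry) => {
            entry.perms == "---p" && entry.path.starts_with("/memfd:dotnet_ipc_created")
        }
        None => false,
    }
}
```

Feeding each line of a process's maps file through `is_private_prot_none_marker` would report whether the marker is present and non-executable.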

---------

Co-authored-by: mdh1418 <mitchhwang1418@gmail.com>
lewing pushed a commit to lewing/runtime that referenced this pull request Feb 9, 2026
mdh1418 added a commit to dotnet/runtime that referenced this pull request Feb 20, 2026
…ativeAOT tests (#124616)

## Description

`Microsoft.OneCollect.RecordTrace` `0.1.33304` (from the
`dotnet-diagnostics-tests` feed) contains the fix to detect .NET
processes without perfmaps
([one-collect#229](microsoft/one-collect#229)),
unblocking UserEvents tracing for NativeAOT.

Changes:
- **`eng/Versions.props`**: Bump `MicrosoftOneCollectRecordTraceVersion`
from `0.1.32221` → `0.1.33304`
- **`src/tests/tracing/userevents/Directory.Build.props`**: Remove
`NativeAotIncompatible` (completing the #123697 checklist). The general
`CLRTestTargetUnsupported` disable (#123442) is kept since tests are
still flaky — validated via `/azp run runtime-nativeaot-outerloop` that
NativeAOT UserEvents tests pass with the updated package.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdh1418 <16830051+mdh1418@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

[record-trace][.NET] Discover .NET apps diagnostics ports without perfmaps

4 participants