Co-authored-by: brianrob <6210322+brianrob@users.noreply.github.com>
| public PEHeader(ReadOnlySpan<byte> peFileData) | ||
| { | ||
| // We need to copy the span data to a byte array since we can't store spans as fields | ||
| m_buffer = peFileData.ToArray(); |
I would like to avoid copying the buffer here. Would it make more sense to fold PEHeader into PEFile so that the copy doesn't need to occur?
Addressed by eliminating the buffer copy. Added an internal PEHeader(byte[] buffer, int offset, int length) constructor that references PEBuffer's existing byte array without copying. PEFile now uses PEBuffer.GetBufferInfo() to get the buffer parameters and constructs PEHeader with zero allocation overhead. The public PEHeader(ReadOnlySpan<byte>) constructor remains available for external callers that provide their own buffers. (commit fbbf797)
Co-authored-by: brianrob <6210322+brianrob@users.noreply.github.com>
| // Internal method to get buffer parameters for zero-copy PEHeader construction | ||
| internal void GetBufferInfo(int filePos, int size, out byte[] buffer, out int offset, out int length) | ||
| { |
Renamed to EnsureRead in commit ec49a3a.
| } | ||
| return &m_buffPtr[filePos - m_buffPos]; | ||
| } | ||
| public ReadOnlySpan<byte> FetchSpan(int filePos, int size) |
Rename PEBuffer to PEBufferedReader.
Renamed to PEBufferedReader in commit ec49a3a.
| // Internal method to get buffer parameters for zero-copy PEHeader construction | ||
| internal void GetBufferInfo(int filePos, int size, out byte[] buffer, out int offset, out int length) | ||
| { |
Rather than returning 3 out parameters, please return a struct called PEBufferedSlice that contains properties for Buffer, Offset and Length. Please also add a method to the struct called AsSpan() that returns a ReadOnlySpan<byte> representing the slice.
Created PEBufferedSlice struct with Buffer, Offset, Length properties and AsSpan() method in commit ec49a3a.
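A minimal sketch of what such a slice type could look like, assuming only what the review comment asks for (Buffer, Offset, and Length properties plus AsSpan()). The constructor validation and the `readonly struct` modifier are assumptions added for illustration; the type actually committed in ec49a3a may differ.

```csharp
using System;

// Hypothetical sketch of the slice type requested in the review; the definition
// in the PR may differ in accessibility, validation, and member layout.
internal readonly struct PEBufferedSlice
{
    public PEBufferedSlice(byte[] buffer, int offset, int length)
    {
        if (buffer == null) throw new ArgumentNullException(nameof(buffer));
        // Reject ranges outside the backing array (phrased to avoid integer overflow).
        if (offset < 0 || length < 0 || offset > buffer.Length - length)
            throw new ArgumentOutOfRangeException(nameof(offset));

        Buffer = buffer;
        Offset = offset;
        Length = length;
    }

    public byte[] Buffer { get; }
    public int Offset { get; }
    public int Length { get; }

    // Views the slice without copying; indexing past Length throws
    // instead of reading neighbouring bytes in the shared buffer.
    public ReadOnlySpan<byte> AsSpan() => new ReadOnlySpan<byte>(Buffer, Offset, Length);
}
```

Returning one value instead of three out parameters keeps the buffer, offset, and length together, so a caller cannot pair an offset with the wrong array.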
| /// <summary> | ||
| /// Returns a PEHeader for ReadOnlySpan of bytes in memory. Validates buffer bounds. | ||
| /// </summary> | ||
| public PEHeader(ReadOnlySpan<byte> peFileData) |
This constructor is unused. Please remove it.
Removed the unused constructor in commit ec49a3a.
| /// <summary> | ||
| /// Returns a PEHeader that references an existing buffer without copying. Validates buffer bounds. | ||
| /// </summary> | ||
| internal PEHeader(byte[] buffer, int offset, int length) |
This should take the new struct as a parameter. Then, just save the struct instead of m_buffer, m_bufferOffset, and m_bufferLength.
Updated to take PEBufferedSlice and store it directly in commit ec49a3a.
| } | ||
|
|
||
| IMAGE_DOS_HEADER dosHdr; | ||
| fixed (byte* bufferPtr = m_buffer) |
You should be able to call AsSpan() here and then operate with the ReadOnlySpan instead of using unsafe here. That will ensure that we don't read off the end of the buffer.
Updated to use AsSpan() and MemoryMarshal.Read instead of unsafe pointer access in commit ec49a3a.
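As a rough illustration of the pattern discussed here, the sketch below reads a DOS header from a span with MemoryMarshal.Read rather than a `fixed` pointer. `DosHeaderSketch` and `SpanHeaderReader` are hypothetical names, and the struct is a trimmed stand-in for the real IMAGE_DOS_HEADER that keeps only the two fields needed. It assumes a little-endian host.

```csharp
using System;
using System.Runtime.InteropServices;

// Illustrative stand-in for IMAGE_DOS_HEADER (64 bytes). Only the magic number
// and e_lfanew are declared; the real struct names every field.
[StructLayout(LayoutKind.Explicit, Size = 64)]
internal struct DosHeaderSketch
{
    [FieldOffset(0)] public ushort e_magic;   // "MZ" = 0x5A4D
    [FieldOffset(60)] public int e_lfanew;    // file offset of the PE signature
}

internal static class SpanHeaderReader
{
    public static int ReadPEHeaderOffset(ReadOnlySpan<byte> image)
    {
        // MemoryMarshal.Read throws ArgumentOutOfRangeException when the span is
        // shorter than the struct, so a truncated file cannot cause an over-read.
        DosHeaderSketch dos = MemoryMarshal.Read<DosHeaderSketch>(image);
        if (dos.e_magic != 0x5A4D)
            throw new InvalidOperationException("Not a PE image: missing MZ signature.");
        return dos.e_lfanew;
    }
}
```

MemoryMarshal.Read copies the struct out of the span, which is cheap for a small header and needs no `unsafe` context.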
…PEBufferedReader, use AsSpan Co-authored-by: brianrob <6210322+brianrob@users.noreply.github.com>
All feedback addressed in commit ec49a3a. Created PEBufferedSlice struct, renamed to PEBufferedReader, renamed method to EnsureRead, removed unused constructor, and updated to use AsSpan() with MemoryMarshal.Read.
| throw new InvalidOperationException("GetDirectory only available with span-based PEHeader."); | ||
| } | ||
|
|
||
| private IMAGE_DOS_HEADER* dosHeader; |
@copilot, we shouldn't have two different paths with different fields that only work based on the constructor that was used. Instead, please move everything to be span based and remove the constructor that takes a raw pointer.
Removed all dual-path logic in commit 7f1edfe. Eliminated the pointer-based constructor and all pointer fields (dosHeader, ntHeader, sections). Everything now uses the span-based approach exclusively with PEBufferedSlice and MemoryMarshal for safe, bounds-checked access.
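One way the single span-based path might look is sketched below: a header class that stores only a byte array and hands out read-only references through MemoryMarshal.Cast, with no pointer fields. `SpanBackedHeader` and `SectionHeaderSketch` are hypothetical names, and the two-field struct is a trimmed placeholder, not the real IMAGE_SECTION_HEADER layout.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical, trimmed layout used only for illustration.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
internal struct SectionHeaderSketch
{
    public uint VirtualSize;
    public uint VirtualAddress;
}

internal sealed class SpanBackedHeader
{
    private readonly byte[] m_buffer;

    public SpanBackedHeader(byte[] buffer) =>
        m_buffer = buffer ?? throw new ArgumentNullException(nameof(buffer));

    // The span constructor throws ArgumentOutOfRangeException when the
    // requested range falls outside the buffer.
    private ReadOnlySpan<byte> GetBufferSpan(int offset, int length) =>
        new ReadOnlySpan<byte>(m_buffer, offset, length);

    // Returns a read-only reference into the buffer: no copy, no pointer field.
    public ref readonly SectionHeaderSketch Section(int offset)
    {
        ReadOnlySpan<byte> span = GetBufferSpan(offset, Marshal.SizeOf<SectionHeaderSketch>());
        return ref MemoryMarshal.Cast<byte, SectionHeaderSketch>(span)[0];
    }
}
```

Because every accessor goes through the same bounds-checked helper, there is no state that is valid only for one constructor.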
…out PEHeader Co-authored-by: brianrob <6210322+brianrob@users.noreply.github.com>
|
@copilot, please add some tests to TraceEvent.Tests to validate that the changes in the PEFile class work.
Co-authored-by: brianrob <6210322+brianrob@users.noreply.github.com>
…tation failure Co-authored-by: brianrob <6210322+brianrob@users.noreply.github.com>
Fixed in commit a04994b. The generator now creates a PE file with:
The test output shows:
Run with: |
marklio left a comment
This looks pretty reasonable. While this looks correct, I'm curious why you didn't just move to the built-in PE reader (or Marklio.Metadata ;) )
| return ref MemoryMarshal.Cast<byte, IMAGE_SECTION_HEADER>(span)[0]; | ||
| } | ||
|
|
||
| private ref readonly IMAGE_OPTIONAL_HEADER32 OptionalHeader32Span |
Why does this have a Span suffix?
This is an artifact of an earlier iteration of the change when copilot was trying to build this class to have two constructors - one that took a pointer and one that took a span. I've removed the suffix.
I did not know about either of these. :) There is also a benefit to having fewer dependencies because PerfView then must carry them all in order to support its "autoextract" capability.
Full disclosure, I hate our built-in PE reader :), which is why I continue to use and support Marklio.Metadata. :) I did a big span conversion a few years ago, so a lot of this is very familiar. It supports memory-mapped files, streams, and lots of different "I have a bunch of bytes that are pieces of binaries" scenarios.
Overview
This PR completely refactors PEFile and PEHeader to use ReadOnlySpan<byte> exclusively instead of raw unsafe pointers, providing automatic bounds checking to prevent reading outside allocated buffers. The implementation uses zero-copy buffer sharing for optimal performance and eliminates all dual-path logic for a cleaner, more maintainable codebase.
Motivation
The existing implementation uses unsafe pointers (byte*, void*) to read PE file headers, which has several risks:
Using ReadOnlySpan<byte> provides:
Key Design Pattern - Progressive Reads
m_sectionsOffset for use by PEHeaderSize property
Header.PEHeaderSize > 1024 and re-reads with correct size if needed
Safety Guarantees
Performance
Compatibility
Testing
Comprehensive Test Suite
Added 10 comprehensive tests in src/TraceEvent/TraceEvent.Tests/Utilities/PEFileTests.cs:
All tests pass (9/10 on Linux, all 10 on Windows)
Test Applications - Demonstrating the Improvement
Added standalone test applications in src/TestApps/LargePEHeaderTest/ that clearly demonstrate the limitations of the old implementation:
Generated PE File Characteristics
Test Results
Running TestBothImplementations.csproj:
Old Implementation (OldPEFile.cs):
Fails the check: if (!(sizeof(IMAGE_DOS_HEADER) <= imageHeaderOffset && imageHeaderOffset <= 512))
New Implementation (PEFile with ReadOnlySpan):
Running the Tests
The test applications clearly demonstrate that:
imageHeaderOffset > 512 bytes
Implementation Details
PEBufferedReader (renamed from PEBuffer)
FetchSpan(int filePos, int size) returning ReadOnlySpan<byte>
EnsureRead(int filePos, int size) returning PEBufferedSlice struct for zero-copy construction
Fetch() method returning byte* for backward compatibility
PEBufferedSlice (new struct)
AsSpan() method returning ReadOnlySpan<byte>
PEHeader (fully span-based)
PEHeader(void*)
PEHeader(PEBufferedSlice slice) for zero-allocation performance
MemoryMarshal.Read<T>() and MemoryMarshal.Cast<T>() exclusively for safe structure access
GetBufferSpan(int offset, int length) helper for bounds-checked span slicing
PEFile (public API unchanged)
EnsureRead() and zero-copy PEHeader constructor