Add drmemtrace replay-as-traced support to record_scheduler_t #6712

Closed
derekbruening opened this issue Mar 19, 2024 · 1 comment · Fixed by #6714

Trying to use record_filter running as-traced on a smallish app based on the threadsig sample fails:

Encoding size 9 != instr size 3 for PC 0x7ffa29f26ba9
[scheduler] next_record[6]: from 2 @1710797586236606: type=10 size=2 addr=0x7ffa2b0582fc
[scheduler] next_record[6]: from 2 @1710797586236631: type=47 size=6 addr=0x12c840f
[scheduler] next_record[6]: waiting because timestamp 13355270234506704 is ahead of output 0
[scheduler] next_record[6]: replay segment in=2 (@909) type=0 start=876 end=909
next_record[7]: from 2 @1710797586238158: next_record[3]: from 4 @1710797586238082: [scheduler] [record_filter] next_record[6]: advancing to input 3 instr #0
next_record[6]: replay segment in=3 (@0) type=0 start=0 end=-1
next_record[6]: from 3 @1710797586237492: type=25 size=0 addr=0x6
next_record[0]: from 1 @1710797586238563: next_record[6]: from 3 @1710797586238516: type=28 size=2 addr=0x2f728c2a350003
next_record[6]: from 3 @1710797586238712: type=28 size=9 addr=0xe40
[scheduler] next_record[6]: from 3 @1710797586238981: type=22 size=4 addr=0x41194
[scheduler] next_record[6]: from 3 @1710797586239016: type=24 size=4 addr=0x4115c
[scheduler] next_record[6]: from 3 @1710797586239212: type=28 size=10 addr=0x40
[scheduler] [scheduler] next_record[6]: from 3 @1710797586239336: type=28 size=20 addr=0x989680
[record_filter] [scheduler] next_record[6]: from 3 @1710797586239495: type=28 size=18 addr=0x1000
[scheduler] next_record[6]: from 3 @1710797586239827: [record_filter] type=28 size=2 addr=0x2f728c2a3505d0
[scheduler] next_record[3]: from 4 @1710797586239853: next_record[6]: from 3 @1710797586239920: type=28 size=3 addr=0x7
[scheduler] next_record[6]: from 3 @1710797586240088: type=47 size=3 addr=0xc08548
[scheduler] next_record[6]: from 3 @1710797586240206: type=10 size=3 addr=0x7ffa29f26ba9

It's the segment endpoint: we have to stop before the encoding.
This would also affect skips, including for regions of interest (ROI).
Would it be better to change record_file_reader to increment the instr count at the encoding
instead of at the instr record?
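
To make the idea concrete, here is a minimal sketch (not the actual record_file_reader_t code; the helper struct and the simplified record handling are assumptions) of counting the instruction ordinal at the first encoding record rather than at the instruction record itself, so an instruction-ordinal endpoint can never land between an encoding and its instruction. (As the follow-up comment below notes, the final fix also has to treat TRACE_MARKER_TYPE_BRANCH_TARGET markers as part of the instruction.)

    #include <cstdint>
    #include "trace_entry.h" // drmemtrace record definitions (trace_entry_t, etc.)

    using namespace dynamorio::drmemtrace; // assuming the current namespace layout

    // Sketch only: a counter that attributes an instruction to its first
    // encoding record, if any, instead of to the instruction record.
    struct instr_ordinal_counter_t {
        uint64_t instr_ordinal = 0;
        bool prev_was_encoding = false;

        void observe(const trace_entry_t &entry)
        {
            if (entry.type == TRACE_TYPE_ENCODING) {
                // The first encoding record of an instruction starts it.
                if (!prev_was_encoding)
                    ++instr_ordinal;
                prev_was_encoding = true;
            } else if (type_is_instr(static_cast<trace_type_t>(entry.type))) {
                // Count here only if there was no preceding encoding record
                // (e.g., an instruction whose encoding was emitted earlier).
                if (!prev_was_encoding)
                    ++instr_ordinal;
                prev_was_encoding = false;
            } else {
                prev_was_encoding = false;
            }
        }
    };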

derekbruening self-assigned this Mar 19, 2024
derekbruening added a commit that referenced this issue Mar 19, 2024
Adds record_scheduler_t (for the record_filter tool) support for
replaying as-traced.  The key change here is having record_reader_t
increment the instruction ordinal on an encoding record instead of an
instruction record.  This avoids splitting encodings from instructions
when operating at instruction boundaries in the scheduler (during
replay, but this would also affect skipping for scheduler regions of
interest).

Adds a test of record_filter on the checked-in threadsig trace in
as-traced replay mode.

Fixes #6712
derekbruening (author) commented:

This is a little more complicated: we also have to account for TRACE_MARKER_TYPE_BRANCH_TARGET, which appears between the encoding and the instruction records.

derekbruening added a commit that referenced this issue Mar 21, 2024
Adds record_scheduler_t (for the record_filter tool) support for
replaying as-traced.  The key change here is having record_reader_t's
instruction ordinal, record_scheduler_t's output stream instruction
ordinal, and record_scheduler_t's switch boundaries all consider the
first in any sequence of encoding records or
TRACE_MARKER_TYPE_BRANCH_TARGET markers to start an instruction
instead of waiting for the instruction record.  (Previously the
scheduler switch point did consider encodings, but not branch targets,
and the ordinals only considered instructions.)

Moving the boundary back avoids splitting encodings from instructions
when operating at instruction boundaries in the scheduler (encountered
during replay, but this would also affect skipping for scheduler
regions of interest).

Adds tests of all 3 boundary types to the record_scheduler_t unit
test.

Adds a test of record_filter on the checked-in threadsig trace in
as-traced replay mode.

Fixes #6712
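
As a rough sketch of the boundary rule described in that commit message (a hypothetical helper, not the scheduler's actual code; it assumes the drmemtrace trace_entry_t layout where a marker's subtype is stored in its size field), a record starts a new instruction if it is an encoding record, a TRACE_MARKER_TYPE_BRANCH_TARGET marker, or an instruction record, and the previous record was not one of the two "pre-instruction" kinds:

    #include "trace_entry.h" // drmemtrace record definitions

    using namespace dynamorio::drmemtrace; // assuming the current namespace layout

    // Sketch of the boundary predicate: the first record in any run of
    // encoding records and/or TRACE_MARKER_TYPE_BRANCH_TARGET markers
    // (terminated by the instruction record itself) starts the instruction.
    static bool
    starts_new_instr(const trace_entry_t &entry, const trace_entry_t &prev)
    {
        auto is_pre_instr = [](const trace_entry_t &e) {
            return e.type == TRACE_TYPE_ENCODING ||
                (e.type == TRACE_TYPE_MARKER &&
                 e.size == TRACE_MARKER_TYPE_BRANCH_TARGET);
        };
        bool in_instr_group = is_pre_instr(entry) ||
            type_is_instr(static_cast<trace_type_t>(entry.type));
        // The previous instruction's group is closed by its instruction
        // record, so a new group can only start right after a record that is
        // not a pre-instruction record.
        return in_instr_group && !is_pre_instr(prev);
    }

With a rule like this, both the instruction ordinals and the scheduler switch points stop just before the encoding (or branch-target marker) rather than between it and its instruction.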
derekbruening added a commit that referenced this issue Apr 2, 2024
Adds two sanity checks developed to identify what was at first
believed to be a new bug but turned out to be #6712:

+ Ensure that last_encoding is empty on an input switch, to avoid
  recording the wrong encoding in pc2encoding.

+ Ensure the encoding size matches the instr size when inserting
  chunk-new encodings.

Tested by running record_filter on large proprietary inputs: no checks
failed.

Issue: #6712
derekbruening added a commit that referenced this issue Apr 2, 2024
Adds two sanity checks developed to identify what was at first believed
to be a new bug but turned out to be #6712:

+ Ensure that last_encoding is empty on an input switch, to avoid
  recording the wrong encoding in pc2encoding.

+ Ensure the encoding size matches the instr size when inserting
  chunk-new encodings.

+ Augment the reader check for mismatching encoding vs instr sizes, and
  abort in release build too.

Although #6712 is fixed, these checks are still useful to prevent
regressions. Plus, the reader_t check detects other bugs as well, and we
know at least one other is out there in #6303. Making the reader check
abort in release build should help avoid wasted work like what happened
here, where the printed warnings in a release build went unnoticed.

Tested by running record_filter on large proprietary inputs: no checks
failed.

Issue: #6712
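
For illustration, the first two checks might look roughly like the following (hypothetical function shapes and container types; the real record_filter code keeps last_encoding and pc2encoding as internal per-shard state):

    #include <cassert>
    #include <cstddef>
    #include <vector>
    #include "trace_entry.h" // drmemtrace record definitions

    using namespace dynamorio::drmemtrace; // assuming the current namespace layout

    // Sketch of check 1: no buffered encoding records may be carried across
    // an input switch, or they could later be attributed to the wrong PC in
    // pc2encoding.
    static void
    check_no_stale_encoding(const std::vector<trace_entry_t> &last_encoding)
    {
        assert(last_encoding.empty() &&
               "stale encoding carried across an input switch");
    }

    // Sketch of check 2: when inserting a chunk-new encoding, the total
    // encoding bytes should match the instruction's length; a mismatch is
    // the symptom shown in the log at the top of this issue
    // ("Encoding size 9 != instr size 3").
    static void
    check_encoding_matches_instr(const std::vector<trace_entry_t> &encoding,
                                 const trace_entry_t &instr)
    {
        size_t encoding_bytes = 0;
        for (const trace_entry_t &e : encoding)
            encoding_bytes += e.size;
        assert(encoding_bytes == instr.size && "Encoding size != instr size");
    }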