i#5694 core-sharded: Add get_shard_index() + get_tid() #6568

Merged (11 commits) on Jan 25, 2024

@derekbruening derekbruening commented Jan 19, 2024

Adds 2 new memtrace_stream_t interfaces to simplify generalizing tools to handle either thread or core sharded operation:

  • get_shard_index() returns a 0-based shard ordinal regardless of whether core-sharded or thread-sharded.
  • get_tid() returns the thread id of the current input. This is a convenience method for use in parallel_shard_init_stream() prior to access to any memref_t records.
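As a rough illustration of how a tool might use these two queries at shard-init time, here is a minimal sketch. It does not use the real drcachesim headers: `stream_sketch_t`, `fixed_stream_t`, `shard_info_t`, `init_shard()`, and `shard_label()` are hypothetical names invented for this example, and only the `get_shard_index()` and `get_tid()` method names come from the description above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the two new memtrace_stream_t queries; the real
// class has many more methods.
class stream_sketch_t {
public:
    virtual ~stream_sketch_t() = default;
    virtual int get_shard_index() const = 0;
    virtual int64_t get_tid() const = 0;
};

// Hypothetical stream that reports fixed values, for illustration only.
class fixed_stream_t : public stream_sketch_t {
public:
    fixed_stream_t(int shard_index, int64_t tid)
        : shard_index_(shard_index)
        , tid_(tid)
    {
    }
    int get_shard_index() const override { return shard_index_; }
    int64_t get_tid() const override { return tid_; }

private:
    int shard_index_;
    int64_t tid_;
};

// Per-shard state captured at init time, before any memref_t has been seen.
struct shard_info_t {
    int index;
    int64_t tid;
};

// Shaped like parallel_shard_init_stream(): receives the shard index and the
// shard's stream.
shard_info_t
init_shard(int shard_index, const stream_sketch_t &stream)
{
    // The description above guarantees a 0-based ordinal here.
    assert(shard_index >= 0);
    return shard_info_t{ shard_index, stream.get_tid() };
}

// Labels output by core ordinal or by thread id, depending on the sharding mode.
std::string
shard_label(bool core_sharded, const shard_info_t &info)
{
    if (core_sharded)
        return "core " + std::to_string(info.index);
    return "thread " + std::to_string(info.tid);
}
```

The point of the sketch is that the tool never needs to inspect a record to learn which shard or thread it is handling, so the same init code serves both sharding modes.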

For online analysis where there's a single input, the scheduler remembers and returns the last memref.data.tid for get_tid() and uses the dynamic tid discovery order for get_shard_index().
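The online behavior described above can be sketched as a small tracker that remembers the last tid and hands out ordinals in the order tids are first seen. This is only an illustration of the idea, not the scheduler's code: `tid_ordinal_tracker_t`, `observe()`, and the `-1` "nothing seen yet" sentinel are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical tracker: remembers the most recent tid and assigns each new tid
// a 0-based ordinal in discovery order.
class tid_ordinal_tracker_t {
public:
    // Called once per record with that record's thread id.
    void
    observe(int64_t tid)
    {
        // The sentinel must not collide with a real tid (an assumption here).
        assert(tid != INVALID_TID);
        last_tid_ = tid;
        auto it = ordinals_.find(tid);
        if (it == ordinals_.end()) {
            // First sighting: the next ordinal is the count of tids seen so far.
            it = ordinals_.emplace(tid, static_cast<int>(ordinals_.size())).first;
        }
        last_ordinal_ = it->second;
    }
    // Last tid observed, or INVALID_TID before any record.
    int64_t get_tid() const { return last_tid_; }
    // Discovery-order ordinal of the last tid, or -1 before any record.
    int get_shard_index() const { return last_ordinal_; }

    static constexpr int64_t INVALID_TID = -1;

private:
    std::unordered_map<int64_t, int> ordinals_;
    int64_t last_tid_ = INVALID_TID;
    int last_ordinal_ = -1;
};
```

A returning thread keeps the ordinal it was given on first sight, so shard indices stay stable as the single input interleaves threads.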

Changes an existing interface:

  • Guarantees that the shard_index passed to parallel_shard_init_stream() is a 0-based ordinal.

Implements the 2 new interfaces in the scheduler and adds two new interfaces there:

  • get_output_stream_ordinal() to get the underlying output when using single_lockstep_output.
  • get_output_cpuid(ord) taking in an ordinal so the analyzer or other user can get the cpuids statically when using single_lockstep_output. Analysis tools must dynamically discover the cpuids (stopped short of making this a memtrace_stream_t interface, as analysis tools in general must dynamically discover most things already).

Removes dr$sim's manual mapping of cpuid to core index in favor of using the new get_shard_index().
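To show why a 0-based shard ordinal removes the need for a tool-side cpuid map, here is a minimal sketch of a simulator-like class that indexes its per-core state directly by `get_shard_index()`. It is not dr$sim's code: `stream_sketch_t`, `fixed_stream_t`, `core_stats_t`, and `simulator_sketch_t` are hypothetical names for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the one stream query this sketch needs.
class stream_sketch_t {
public:
    virtual ~stream_sketch_t() = default;
    virtual int get_shard_index() const = 0;
};

// Hypothetical stream reporting a fixed shard index, for illustration only.
class fixed_stream_t : public stream_sketch_t {
public:
    explicit fixed_stream_t(int shard_index)
        : shard_index_(shard_index)
    {
    }
    int get_shard_index() const override { return shard_index_; }

private:
    int shard_index_;
};

struct core_stats_t {
    uint64_t accesses = 0;
};

// Hypothetical simulator: per-core state lives in a vector indexed by the
// stream's shard ordinal, with no cpuid-to-core lookup table.
class simulator_sketch_t {
public:
    explicit simulator_sketch_t(int num_cores)
        : cores_(static_cast<size_t>(num_cores))
    {
        assert(num_cores > 0);
    }
    // Returns false if the shard ordinal does not name a simulated core.
    bool
    process_access(const stream_sketch_t &stream)
    {
        int core = stream.get_shard_index();
        if (core < 0 || static_cast<size_t>(core) >= cores_.size())
            return false;
        ++cores_[static_cast<size_t>(core)].accesses;
        return true;
    }
    uint64_t accesses(int core) const { return cores_.at(static_cast<size_t>(core)).accesses; }

private:
    std::vector<core_stats_t> cores_;
};
```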

Updates all the analysis tools to use the new interfaces and to generalize their code to either handle both thread and core shards (reuse_time, reuse_distance, basic_counts, histogram, opcode_mix, syscall_mix, record_filter) or explicitly return an error for core-sharded modes (func_view, invariant_checker). (schedule_stats and record_filter needed no changes.)
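For tools in the second group, rejecting core-sharded operation up front might look roughly like the sketch below. The names `shard_mode_sketch_t` and `check_shard_mode()` are hypothetical, as is the convention that an empty string means success; the sketch only illustrates failing early with a clear message rather than producing misleading per-thread results.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum naming the two sharding modes discussed in this PR.
enum class shard_mode_sketch_t { BY_THREAD, BY_CORE };

// Returns an empty string when the mode is supported, else an error message.
// The empty-string-means-success convention is an assumption of this sketch.
std::string
check_shard_mode(const std::string &tool_name, shard_mode_sketch_t mode)
{
    assert(!tool_name.empty());
    if (mode == shard_mode_sketch_t::BY_CORE)
        return tool_name + " does not support core-sharded operation";
    return "";
}
```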

Updates several unit tests to handle these changes:

  • Expands the default_memtrace_stream_t to be suitable as a mock stream for unit tests with the new interfaces.
  • Skips invariant stream checks for the mock stream by checking its input interface, since the stream itself is no longer null.
  • Fixes drcachesim unit tests which were not initializing tid.
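A unit test built on a mock stream might be shaped like the following sketch. It is not the real default_memtrace_stream_t: `stream_sketch_t`, `mock_stream_t`, and `per_thread_counter_t` are hypothetical, and the plain setters are a simplification chosen for this example. It shows why a test has to initialize the tid before driving the tool.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-in for the stream queries a tool under test calls.
class stream_sketch_t {
public:
    virtual ~stream_sketch_t() = default;
    virtual int get_shard_index() const = 0;
    virtual int64_t get_tid() const = 0;
};

// Hypothetical mock with plain setters so a test controls what the tool sees.
class mock_stream_t : public stream_sketch_t {
public:
    void set_tid(int64_t tid) { tid_ = tid; }
    void
    set_shard_index(int index)
    {
        assert(index >= 0);
        shard_index_ = index;
    }
    int get_shard_index() const override { return shard_index_; }
    int64_t get_tid() const override { return tid_; }

private:
    // -1 marks "not yet set" in this sketch.
    int64_t tid_ = -1;
    int shard_index_ = -1;
};

// Hypothetical tool under test: counts records per tid reported by the stream.
class per_thread_counter_t {
public:
    void process_record(const stream_sketch_t &stream) { ++counts_[stream.get_tid()]; }
    uint64_t
    count_for(int64_t tid) const
    {
        auto it = counts_.find(tid);
        return it == counts_.end() ? 0 : it->second;
    }

private:
    std::map<int64_t, uint64_t> counts_;
};
```

If the test forgot to call `set_tid()`, every record would be attributed to the unset value, which is the kind of mistake the last bullet above refers to.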

Adds some sanity tests on the new interfaces.

Adds a new end-to-end test running the newly-updated tools as -core_sharded. Limits the reuse_time histogram printing output to avoid hanging CMake's regex matcher in this test.

Issue: #5694

Adds 2 new memtrace_stream_t interfaces to simplify generalizing
tools to handle either thread or core sharded operation:

+ get_shard_index() returns a 0-based shard ordinal regardless
  of whether core-sharded or thread-sharded.
+ get_input_tid() returns the thread id of the current input.
  This is a convenience method for use in parallel_shard_init_stream()
  prior to access to any memref_t records.

Changes an existing interface:

+ Guarantees that the shard_index passed to parallel_shard_init_stream()
  is a 0-based ordinal.

Implements the 2 new interfaces in the scheduler and adds two new
interfaces there:

+ get_output_stream_ordinal() to get the underlying output when using
  single_lockstep_output.
+ get_output_cpuid(ord) taking in an ordinal so the analyzer or other
  user can get the cpuids statically when using single_lockstep_output.

Removes dr$sim's manual mapping of cpuid to core index in favor of
using the new get_shard_index().

Updates all the analysis tools to use the new interfaces and to
generalize their code to either handle both thread and core shards
(reuse_time, reuse_distance, basic_counts, histogram, opcode_mix,
syscall_mix, record_filter) or explicitly return an error for
core-sharded modes (func_view, invariant_checker).  (schedule_stats
and record_filter needed no changes.)

Adds some sanity tests on the new interfaces.

Adds a new end-to-end test running the newly-updated tools as
-core_sharded.

Issue: #5694
where the scheduler remembers and returns the last memref.data.tid for
get_shard_index().

Expand mock stream for unit tests.

Skip invariant stream checks for mock stream.
@derekbruening changed the title from "i#5694 core-sharded: Add get_shard_index() + get_input_tid()" to "i#5694 core-sharded: Add get_shard_index() + get_tid()" on Jan 25, 2024
+ s/get_input_tid/get_tid/
+ is_combined_stream helper
+ is_a_unit_test helper
+ use sentinels
+ update comments
+ s/set_input_tid/set_tid/
+ Have it set the shard index to the dynamic-discovery-order tid ordinal
  to make it easier on tests

Revert the sentinel for last_thread_ back to -1 to support many tests
setting memref tids to 0.
@derekbruening

ub22 failure is api.rseq #6185; x64 failure is attach_test #6452

@derekbruening derekbruening merged commit 910f82d into master Jan 25, 2024
12 of 15 checks passed
@derekbruening derekbruening deleted the i5694-get-shard-id branch January 25, 2024 03:09